# OpenVPN Data Gatherer Analysis
This report details the internal mechanics of `openvpn_gatherer_v3.py`, responsible for collecting, processing, and storing OpenVPN usage metrics.
## 1. Raw Data Collection (`run_monitoring_cycle`)
The gatherer runs a continuous loop (default interval: **10 seconds**).
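The continuous loop can be sketched as below. `run_monitoring_cycle` here stands in for the real function of the same name; the `iterations` parameter is an addition for testability and not part of the actual gatherer.

```python
import time

INTERVAL_S = 10  # default sampling interval from the report

def main_loop(run_monitoring_cycle, iterations=None):
    """Run one monitoring pass every INTERVAL_S seconds.

    iterations=None loops forever; a number limits the passes (useful
    for testing this sketch without blocking).
    """
    done = 0
    while iterations is None or done < iterations:
        run_monitoring_cycle()
        done += 1
        if iterations is None or done < iterations:
            time.sleep(INTERVAL_S)
```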
### A. Log Parsing
- **Source**: `/var/log/openvpn/openvpn-status.log` (Status File v2, CSV format).
- **Target Fields**: `Common Name`, `Real Address`, `Bytes Sent`, `Bytes Received`.
- **Filtering**: Only lines starting with `CLIENT_LIST`.
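The filtering step amounts to reading the status file as CSV and keeping only `CLIENT_LIST` rows. A minimal sketch follows; the column positions assumed here (Common Name = 1, Real Address = 2, Bytes Received = 4, Bytes Sent = 5) vary between OpenVPN versions and are not necessarily the gatherer's actual mapping.

```python
import csv

def parse_client_list(lines):
    """Yield (common_name, real_address, bytes_received, bytes_sent)
    for every CLIENT_LIST row in a Status File v2 CSV dump."""
    for row in csv.reader(lines):
        if not row or row[0] != "CLIENT_LIST":
            continue  # skip TITLE, HEADER, ROUTING_TABLE, GLOBAL_STATS, ...
        yield row[1], row[2], int(row[4]), int(row[5])
```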
### B. Delta Calculation (`update_client_status_and_bytes`)
The log provides *lifetime* counters for a session. The gatherer calculates the traffic *delta* (increment) since the last check.
- **Logic**: `Increment = Current Value - Last Saved Value`.
- **Reset Detection**: If `Current Value < Last Saved Value`, it assumes a session/server restart and counts the entire `Current Value` as the increment.
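The delta and reset-detection logic reduces to a small pure function, sketched here under the assumptions above:

```python
def traffic_increment(current: int, last: int) -> int:
    """Bytes transferred since the previous sample.

    A counter that went backwards means the session or server
    restarted, so the whole current value counts as new traffic.
    """
    if current < last:
        return current  # reset detected
    return current - last
```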
### C. Rate Calculation
- **Speed**: Calculated as `Increment * 8 / (Interval * 1_000_000)`, converting bytes to bits and scaling to **Mbps**.
- **Storage**: Raw samples (10s resolution) including speed and volume are stored in the `usage_history` table.
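The speed formula, expressed directly as code (a sketch; the function name is illustrative):

```python
def speed_mbps(increment_bytes: int, interval_s: float = 10.0) -> float:
    """Convert a byte delta over the sampling interval to megabits/s."""
    return increment_bytes * 8 / (interval_s * 1_000_000)
```

For example, 1,250,000 bytes in a 10-second window is 10,000,000 bits over 10,000,000 microseconds-worth of scaling, i.e. 1 Mbps.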
## 2. Data Aggregation (TSDB)
To support long-term statistics without storing billions of raw rows, the `TimeSeriesAggregator` performs real-time rollups into 5 aggregated tables using an `UPSERT` strategy (insert a new row for the bucket, or add the increment to the existing sum).
| Table | Resolution | Timestamp Alignment | Retention (Default) |
|-------|------------|---------------------|---------------------|
| `usage_history` | **10 sec** | Exact time | 7 Days |
| `stats_5min` | **5 min** | 00:00, 00:05... | 14 Days |
| `stats_15min` | **15 min** | 00:00, 00:15... | 28 Days |
| `stats_hourly` | **1 Hour** | XX:00:00 | 90 Days |
| `stats_6h` | **6 Hours** | 00:00, 06:00, 12:00... | 180 Days |
| `stats_daily` | **1 Day** | 00:00:00 | 365 Days |
**Logic**: Every 10s cycle, the calculated `Increment` is added to the sum of *all* relevant overlapping buckets. A single 5MB download contributes immediately to the current 5min, 15min, Hourly, 6h, and Daily counters simultaneously.
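The rollup above can be sketched with SQLite's `ON CONFLICT ... DO UPDATE` upsert. The table and column names (`bucket_ts`, `client`, `total_bytes`) and the schema helper are assumptions for illustration, not the gatherer's actual schema:

```python
import sqlite3

# Bucket widths in seconds for each aggregation table.
BUCKETS = {
    "stats_5min": 300,
    "stats_15min": 900,
    "stats_hourly": 3600,
    "stats_6h": 21600,
    "stats_daily": 86400,
}

def init_schema(conn: sqlite3.Connection) -> None:
    for table in BUCKETS:
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} ("
            " bucket_ts INTEGER, client TEXT, total_bytes INTEGER,"
            " PRIMARY KEY (bucket_ts, client))"
        )

def align(ts: int, bucket_s: int) -> int:
    """Floor a Unix timestamp to the start of its bucket."""
    return ts - ts % bucket_s

def rollup(conn: sqlite3.Connection, ts: int, client: str, inc_bytes: int) -> None:
    """Add one 10s increment to every overlapping aggregation bucket."""
    for table, width in BUCKETS.items():
        conn.execute(
            f"INSERT INTO {table} (bucket_ts, client, total_bytes)"
            " VALUES (?, ?, ?)"
            " ON CONFLICT(bucket_ts, client)"
            " DO UPDATE SET total_bytes = total_bytes + excluded.total_bytes",
            (align(ts, width), client, inc_bytes),
        )
```

Because `align` floors the timestamp once per table, a single increment lands in the current 5min, 15min, hourly, 6h, and daily rows in one pass, matching the "write-optimized" design described above.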
## 3. Data Retention
A cleanup job runs once every 24 hours (on day change).
- It executes `DELETE FROM table WHERE timestamp < cutoff_date`.
- Thresholds are configurable in `config.ini` under `[retention]`.
## Summary
The system employs a "Write-Optimized" approach. Instead of calculating heavy aggregates on-read (which would be slow), it pre-calculates them on-write. This ensures instant dashboard loading times even with years of historical data.