# RIPE AS CIDR & FQDN IP Collector

This project collects CIDR prefixes for specified Autonomous Systems (AS) from the RIPE NCC API and resolves IP addresses for specified FQDNs. It accumulates these addresses over time, maintaining a history of discovered prefixes. It also provides a FastAPI-based HTTP interface to retrieve the collected data.

## 1. Preparation and Installation

### Prerequisites

- Python 3.8+
- `pip` and `venv`

### Installation Steps

1. **Clone the repository** (or copy the files) to your desired location, e.g., `/opt/ripe_collector`.

   ```bash
   mkdir -p /opt/ripe_collector
   cd /opt/ripe_collector
   # Copy files: cidr_collector.py, api_server.py, requirements.txt, config.json
   ```

2. **Create a Virtual Environment**:

   ```bash
   python3 -m venv venv
   ```

3. **Install Dependencies**:

   ```bash
   source venv/bin/activate
   pip install -r requirements.txt
   deactivate
   ```

4. **Initial Configuration**:

   Edit `config.json` to set your initial ASNs and FQDNs.

   ```json
   {
       "asns": [62041],
       "fqdns": ["google.com"]
   }
   ```

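
Before the first run it can help to sanity-check `config.json`. The following is a minimal sketch, not the project's own loader: it assumes only the two keys shown above (`asns`, `fqdns`), and the function name `load_config` is hypothetical.

```python
import json
from pathlib import Path


def load_config(path):
    """Read a config file shaped like the example above and return (asns, fqdns).

    Hypothetical helper: the real collector may validate differently.
    """
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    asns = raw.get("asns", [])
    fqdns = raw.get("fqdns", [])
    # ASNs are expected to be positive integers (bool is an int subclass, so exclude it).
    if not all(isinstance(a, int) and not isinstance(a, bool) and a > 0 for a in asns):
        raise ValueError("'asns' must be a list of positive integers")
    # FQDNs are expected to be non-empty strings.
    if not all(isinstance(f, str) and f.strip() for f in fqdns):
        raise ValueError("'fqdns' must be a list of non-empty strings")
    return asns, fqdns
```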
---

## 2. Running the Collector (Periodic Task)

The collector script `cidr_collector.py` is designed to run once per day to fetch updates.

### Manual Run

```bash
/opt/ripe_collector/venv/bin/python3 /opt/ripe_collector/cidr_collector.py run
```

### Setup Cron Job (Recommended)

To run the collector daily at 02:00:

1. Open crontab:

   ```bash
   crontab -e
   ```

2. Add the line:

   ```cron
   0 2 * * * /opt/ripe_collector/venv/bin/python3 /opt/ripe_collector/cidr_collector.py run >> /var/log/ripe_collector.log 2>&1
   ```

---

## 3. Application Setup: Systemd (Ubuntu, Debian)

This section describes how to run the **API Server** (`api_server.py`) as a system service.

### Create Service File

Create `/etc/systemd/system/ripe-api.service`:

```ini
[Unit]
Description=RIPE CIDR Collector API
After=network.target

[Service]
# To run as an unprivileged user instead of root, change User= and make sure
# that user has write access to data.json and fqdn_data.json (and to
# config.json, which POST /schedule updates).
User=root
WorkingDirectory=/opt/ripe_collector
ExecStart=/opt/ripe_collector/venv/bin/uvicorn api_server:app --host 0.0.0.0 --port 8000
Restart=always

[Install]
WantedBy=multi-user.target
```

### Enable and Start

```bash
# Reload systemd
sudo systemctl daemon-reload

# Enable service to start on boot
sudo systemctl enable ripe-api

# Start service immediately
sudo systemctl start ripe-api

# Check status
sudo systemctl status ripe-api
```

---

## 4. Application Setup: RC-Script (Alpine Linux)

This section describes how to run the API server as an OpenRC service on Alpine Linux.

### Create Init Script

Create `/etc/init.d/ripe-api`:

```sh
#!/sbin/openrc-run

name="ripe-api"
description="RIPE CIDR Collector API"
command="/opt/ripe_collector/venv/bin/uvicorn"
# The module path (module:app), --host, and --port are passed as arguments
command_args="api_server:app --host 0.0.0.0 --port 8000"
command_background="yes"
pidfile="/run/${RC_SVCNAME}.pid"
directory="/opt/ripe_collector"

depend() {
    need net
}
```

### Make Executable

```bash
chmod +x /etc/init.d/ripe-api
```

### Enable and Start

```bash
# Add to default runlevel
rc-update add ripe-api default

# Start service
service ripe-api start

# Check status
service ripe-api status
```

---

## 5. API Usage Documentation

The API runs by default on port `8000`. It returns the collected data as a flat JSON list.

### Base URL

`http://<server-ip>:8000`

### Endpoint: Get Addresses

**GET** `/addresses`

Retrieves the list of collected IP addresses/CIDRs.

| Parameter | Type | Required | Default | Description |
| :--- | :--- | :--- | :--- | :--- |
| `type` | string | No | `all` | Filter by source type. Options: `cidr` (ASNs only), `fqdn` (domains only), `all` (both). |

#### Example 1: Get All Addresses (Default)

**Request:**

```bash
curl -X GET "http://localhost:8000/addresses"
```

**Response (JSON):**

```json
[
    "142.250.1.1",
    "149.154.160.0/22",
    "149.154.160.0/23",
    "2001:4860:4860::8888",
    "91.108.4.0/22"
]
```

#### Example 2: Get Only CIDRs (from ASNs)

**Request:**

```bash
curl -X GET "http://localhost:8000/addresses?type=cidr"
```

**Response (JSON):**

```json
[
    "149.154.160.0/22",
    "149.154.160.0/23",
    "91.108.4.0/22"
]
```

#### Example 3: Get Only Resolved IPs (from FQDNs)

**Request:**

```bash
curl -X GET "http://localhost:8000/addresses?type=fqdn"
```

**Response (JSON):**

```json
[
    "142.250.1.1",
    "2001:4860:4860::8888"
]
```

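
The same requests can be made from Python. This is a minimal client sketch using only the standard library; the helper names `build_addresses_url` and `fetch_addresses` are hypothetical, and the code assumes the endpoint behaves exactly as in the `curl` examples above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Values accepted by the `type` query parameter, per the table above.
VALID_TYPES = ("all", "cidr", "fqdn")


def build_addresses_url(base_url, source_type="all"):
    """Return the full /addresses URL for the given filter."""
    if source_type not in VALID_TYPES:
        raise ValueError(f"type must be one of {VALID_TYPES}")
    query = urlencode({"type": source_type})
    return f"{base_url.rstrip('/')}/addresses?{query}"


def fetch_addresses(base_url, source_type="all", timeout=10):
    """GET /addresses and return the decoded JSON list of strings."""
    with urlopen(build_addresses_url(base_url, source_type), timeout=timeout) as resp:
        return json.load(resp)
```

With the API server running locally, `fetch_addresses("http://localhost:8000", "fqdn")` should return a list shaped like the response in Example 3.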
### Endpoint: Manage Schedule

**GET** `/schedule`

Returns the current cron schedules.

**POST** `/schedule`

Updates the schedule for a specific collector type. The request body is a JSON object in which `type` is either `asn` or `fqdn` and `cron` is the new cron expression:

```json
{
    "type": "asn",
    "cron": "*/15 * * * *"
}
```

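
A schedule update can also be sent from Python. The sketch below builds the `POST /schedule` request with the standard library. The helper names are hypothetical, and because the shape of the server's response is not documented here, `update_schedule` returns only the HTTP status code.

```python
import json
from urllib.request import Request, urlopen


def build_schedule_request(base_url, collector_type, cron):
    """Build a POST /schedule request with the JSON body described above."""
    if collector_type not in ("asn", "fqdn"):
        raise ValueError("type must be 'asn' or 'fqdn'")
    body = json.dumps({"type": collector_type, "cron": cron}).encode("utf-8")
    return Request(
        f"{base_url.rstrip('/')}/schedule",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def update_schedule(base_url, collector_type, cron, timeout=10):
    """Send the request and return the HTTP status code (e.g. 200 on success)."""
    request = build_schedule_request(base_url, collector_type, cron)
    with urlopen(request, timeout=timeout) as resp:
        return resp.status
```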
---

## 6. Advanced CLI Usage

The collector script can run each collection mode independently:

```bash
# Run both (default)
python3 cidr_collector.py run

# Run only ASN collection
python3 cidr_collector.py run --mode asn

# Run only FQDN collection
python3 cidr_collector.py run --mode fqdn
```

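
For readers extending the CLI, a `run` subcommand with a `--mode` option could be parsed roughly as follows. This is an illustrative `argparse` sketch, not the script's actual code: the function name `build_parser` and the internal default value `all` are assumptions.

```python
import argparse


def build_parser():
    """Parser for a command line shaped like `cidr_collector.py run [--mode ...]`."""
    parser = argparse.ArgumentParser(prog="cidr_collector.py")
    subcommands = parser.add_subparsers(dest="command", required=True)
    run = subcommands.add_parser("run", help="collect once and exit")
    run.add_argument(
        "--mode",
        choices=("all", "asn", "fqdn"),
        default="all",  # assumed name for "run both"
        help="which collector(s) to run",
    )
    return parser
```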
---

## 7. Internal Logic & Architecture

### Collector Logic

When the collector runs (whether manually or via schedule):

1. **Instantiation**: Creates a new instance of `CIDRCollector` or `FQDNCollector`. This forces a fresh read of `config.json`, so any newly added ASNs/FQDNs are processed immediately.
2. **Fetching**:
    * **ASN**: Queries the RIPE NCC API (`stat.ripe.net`).
    * **FQDN**: Uses Python's `socket.getaddrinfo` to resolve A and AAAA records.
3. **Comparison**: Reads the existing `data.json`/`fqdn_data.json` and compares the fetched set with the stored set.
4. **Accumulation**: Performs a set union (old ∪ new), so previously seen entries are never dropped.
    * **If new items are found**: The list is updated and sorted, and the `last_updated` timestamp is refreshed for that specific resource.
    * **If no new items are found**: The file is left untouched.
5. **Persistence**: Data is written to disk only if changes actually occurred.

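
The resolve, compare, accumulate, and persist steps can be sketched as below. This is an illustration under assumptions, not the project's code: the function names are hypothetical, and the on-disk layout (one entry per resource with `items` and `last_updated`) is a guess at what `data.json`/`fqdn_data.json` might contain.

```python
import json
import socket
from datetime import datetime, timezone
from pathlib import Path


def resolve_fqdn(fqdn):
    """Return the set of IPv4/IPv6 addresses getaddrinfo reports for fqdn."""
    try:
        infos = socket.getaddrinfo(fqdn, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return set()  # unresolvable name: contribute nothing
    # info[4] is the sockaddr tuple; its first element is the address string.
    return {info[4][0] for info in infos}


def merge_and_save(path, key, fetched):
    """Union `fetched` into the stored list for `key`; write only on change.

    Returns True when the file was rewritten, False when nothing was new.
    """
    path = Path(path)
    data = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    stored = set(data.get(key, {}).get("items", []))
    merged = stored | set(fetched)  # accumulate: old entries are kept
    if merged == stored:
        return False  # no new items, leave the file untouched
    data[key] = {
        "items": sorted(merged),
        "last_updated": datetime.now(timezone.utc).isoformat(),
    }
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return True
```

For example, `merge_and_save("fqdn_data.json", "google.com", resolve_fqdn("google.com"))` would add any newly seen addresses and report whether the file changed.
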
### Scheduler Logic

`api_server.py` uses `APScheduler` (`BackgroundScheduler`).

1. **Startup**: When the server starts (`uvicorn`), `start_scheduler` is called. It loads the `schedule` block from `config.json` and creates two independent jobs (`asn_job`, `fqdn_job`).
2. **Runtime Updates (POST /schedule)**:
    * The server validates the new cron expression.
    * It updates `config.json` so the change survives restarts.
    * It calls `scheduler.add_job(..., replace_existing=True)`. This hot-swaps the trigger for the running job.
3. **Concurrency**: If a scheduled job is already running when a new schedule is posted, the running job completes normally. The new schedule applies to the *next* calculated run time.

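
The validate-then-persist part of a runtime update can be sketched with the standard library alone. Several things here are assumptions: the function names, the layout of the `schedule` block (`{"asn": "...", "fqdn": "..."}`), and the deliberately simplified cron check, which accepts only numeric five-field expressions. The real server presumably relies on APScheduler's own parsing.

```python
import json
import re
from pathlib import Path

# One cron field: "*", a number, or a range, each with an optional "/step",
# optionally repeated as a comma-separated list. Names like "mon" are rejected.
_FIELD = re.compile(r"^(\*|\d+(-\d+)?)(/\d+)?(,(\*|\d+(-\d+)?)(/\d+)?)*$")


def is_plausible_cron(expr):
    """Rough syntactic check for a five-field cron expression."""
    fields = expr.split()
    return len(fields) == 5 and all(_FIELD.match(f) for f in fields)


def persist_schedule(config_path, collector_type, cron):
    """Validate the input and store it under config['schedule'][collector_type]."""
    if collector_type not in ("asn", "fqdn"):
        raise ValueError("type must be 'asn' or 'fqdn'")
    if not is_plausible_cron(cron):
        raise ValueError(f"invalid cron expression: {cron!r}")
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config.setdefault("schedule", {})[collector_type] = cron
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    # In the server, the running job would then be re-registered, e.g. via
    # scheduler.add_job(..., replace_existing=True) as described above.
    return config
```

Writing `config.json` before swapping the job keeps the stored schedule and the live schedule from diverging if the process restarts in between.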