# RIPE AS CIDR & FQDN IP Collector

This project collects CIDR prefixes for specified Autonomous Systems (AS) from the RIPE NCC API and resolves IP addresses for specified FQDNs. It accumulates these addresses over time, maintaining a history of discovered prefixes. It also provides a FastAPI-based HTTP interface to retrieve the collected data.

## 1. Preparation and Installation

### Prerequisites
- Python 3.8+
- `pip` and `venv`

### Installation Steps
1. **Clone the repository** (or copy the files) to your desired location, e.g., `/opt/ripe_collector`.
   ```bash
   mkdir -p /opt/ripe_collector
   cd /opt/ripe_collector
   # Copy files: cidr_collector.py, api_server.py, requirements.txt, config.json
   ```

2. **Create a Virtual Environment**:
   ```bash
   python3 -m venv venv
   ```

3. **Install Dependencies**:
   ```bash
   source venv/bin/activate
   pip install -r requirements.txt
   deactivate
   ```

4. **Initial Configuration**:
   Edit `config.json` to set your initial ASNs and FQDNs.
   ```json
   {
     "asns": [62041],
     "fqdns": ["google.com"]
   }
   ```
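Before the first run it can help to sanity-check the edited file. The sketch below is not part of the project: `load_config` is a hypothetical helper, and it assumes `config.json` contains only the two keys shown above (`asns` as integers, `fqdns` as strings).

```python
import json
from pathlib import Path


def load_config(path):
    """Read config.json and return (asns, fqdns) after basic type checks."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    asns = config.get("asns", [])
    fqdns = config.get("fqdns", [])
    if not isinstance(asns, list) or not isinstance(fqdns, list):
        raise ValueError("'asns' and 'fqdns' must both be lists")
    # bool is a subclass of int in Python, so reject it explicitly.
    for asn in asns:
        if isinstance(asn, bool) or not isinstance(asn, int) or asn <= 0:
            raise ValueError(f"invalid ASN: {asn!r} (expected a positive integer)")
    for fqdn in fqdns:
        if not isinstance(fqdn, str) or not fqdn.strip():
            raise ValueError(f"invalid FQDN: {fqdn!r} (expected a non-empty string)")
    return asns, fqdns
```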

---

## 2. Running the Collector (Periodic Task)

The collector script `cidr_collector.py` is designed to run once per day to fetch updates.

### Manual Run
```bash
/opt/ripe_collector/venv/bin/python3 /opt/ripe_collector/cidr_collector.py run
```

### Setup Cron Job (Recommended)
To run the collector daily at 02:00:

1. Open crontab:
   ```bash
   crontab -e
   ```
2. Add the line:
   ```cron
   0 2 * * * /opt/ripe_collector/venv/bin/python3 /opt/ripe_collector/cidr_collector.py run >> /var/log/ripe_collector.log 2>&1
   ```

---

## 3. Application Setup: Systemd (Ubuntu, Debian)

This section describes how to run the **API Server** (`api_server.py`) as a system service.

### Create Service File
Create `/etc/systemd/system/ripe-api.service`:

```ini
[Unit]
Description=RIPE CIDR Collector API
After=network.target

[Service]
User=root
# Change User=root to an unprivileged user if desired; that user needs write access to data.json, fqdn_data.json, and config.json
WorkingDirectory=/opt/ripe_collector
ExecStart=/opt/ripe_collector/venv/bin/uvicorn api_server:app --host 0.0.0.0 --port 8000
Restart=always

[Install]
WantedBy=multi-user.target
```

### Enable and Start
```bash
# Reload systemd
sudo systemctl daemon-reload

# Enable service to start on boot
sudo systemctl enable ripe-api

# Start service immediately
sudo systemctl start ripe-api

# Check status
sudo systemctl status ripe-api
```

---

## 4. Application Setup: RC-Script (Alpine Linux)

This section describes how to run the API server as a service on Alpine Linux, which uses OpenRC.

### Create Init Script
Create `/etc/init.d/ripe-api`:

```sh
#!/sbin/openrc-run

name="ripe-api"
description="RIPE CIDR Collector API"
command="/opt/ripe_collector/venv/bin/uvicorn"
# module:app, --host, and --port are passed as arguments
command_args="api_server:app --host 0.0.0.0 --port 8000"
command_background="yes"
pidfile="/run/${RC_SVCNAME}.pid"
directory="/opt/ripe_collector"

depend() {
    need net
}
```

### Make Executable
```bash
chmod +x /etc/init.d/ripe-api
```

### Enable and Start
```bash
# Add to default runlevel
rc-update add ripe-api default

# Start service
service ripe-api start

# Check status
service ripe-api status
```

---

## 5. API Usage Documentation

The API runs on port `8000` by default. It returns the collected data as a flat JSON list.

### Base URL
`http://<server-ip>:8000`

### Endpoint: Get Addresses
**GET** `/addresses`

Retrieves the list of collected IP addresses/CIDRs.

| Parameter | Type | Required | Default | Description |
| :--- | :--- | :--- | :--- | :--- |
| `type` | string | No | `all` | Filter by source type. Options: `cidr` (ASNs only), `fqdn` (Domains only), `all` (Both). |

#### Example 1: Get All Addresses (Default)

**Request:**
```bash
curl -X GET "http://localhost:8000/addresses"
```

**Response (JSON):**
```json
[
  "142.250.1.1",
  "149.154.160.0/22",
  "149.154.160.0/23",
  "2001:4860:4860::8888",
  "91.108.4.0/22"
]
```

#### Example 2: Get Only CIDRs (from ASNs)

**Request:**
```bash
curl -X GET "http://localhost:8000/addresses?type=cidr"
```

**Response (JSON):**
```json
[
  "149.154.160.0/22",
  "149.154.160.0/23",
  "91.108.4.0/22"
]
```

#### Example 3: Get Only Resolved IPs (from FQDNs)

**Request:**
```bash
curl -X GET "http://localhost:8000/addresses?type=fqdn"
```

**Response (JSON):**
```json
[
  "142.250.1.1",
  "2001:4860:4860::8888"
]
```
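Because the flat list mixes bare addresses with CIDR prefixes, a consumer may want to normalize every entry to a network object. The following is a minimal client sketch using only the standard library; the helper names `addresses_url`, `fetch_addresses`, and `to_networks` are illustrative and not part of the project.

```python
import ipaddress
import json
import urllib.parse
import urllib.request


def addresses_url(base_url, source_type="all"):
    """Build the /addresses URL for one of the documented types: all, cidr, fqdn."""
    query = urllib.parse.urlencode({"type": source_type})
    return f"{base_url.rstrip('/')}/addresses?{query}"


def fetch_addresses(base_url, source_type="all", timeout=10):
    """Return the flat JSON list served by the API as a Python list of strings."""
    with urllib.request.urlopen(addresses_url(base_url, source_type), timeout=timeout) as resp:
        return json.load(resp)


def to_networks(entries):
    """Parse each entry as a network; a bare IP becomes a /32 or /128 network."""
    # strict=False tolerates prefixes whose host bits happen to be set.
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]
```

With the API server running locally, `to_networks(fetch_addresses("http://localhost:8000", "cidr"))` would return the ASN-derived prefixes as `ipaddress` network objects.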

### Endpoint: Manage Schedule
**GET** `/schedule`

Returns the current cron schedules.

**POST** `/schedule`

Updates the schedule for a specific collector type.

Body:
```json
{
  "type": "asn",
  "cron": "*/15 * * * *"
}
```
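A request carrying this body can be built with the standard library alone. This is a sketch, assuming the server accepts the JSON document exactly as shown; `schedule_request` is a hypothetical helper, and the response format is not documented here, so the example only constructs the request.

```python
import json
import urllib.request


def schedule_request(base_url, collector_type, cron):
    """Build (but do not send) a POST /schedule request with a JSON body."""
    body = json.dumps({"type": collector_type, "cron": cron}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/schedule",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = schedule_request("http://localhost:8000", "asn", "*/15 * * * *")
# To send it against a running server:
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       print(resp.status, resp.read().decode("utf-8"))
```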
*Note: `type` can be `asn` or `fqdn`.*

---

## 6. Advanced CLI Usage

The collector script supports running modes independently:

```bash
# Run both (Default)
python3 cidr_collector.py run

# Run only ASN collection
python3 cidr_collector.py run --mode asn

# Run only FQDN collection
python3 cidr_collector.py run --mode fqdn
```

---

## 7. Internal Logic & Architecture

### Collector Logic
When the collector runs (whether manually or via schedule):
1. **Instantiation**: Creates a new instance of `CIDRCollector` or `FQDNCollector`. This forces a fresh read of `config.json`, ensuring any added ASNs/FQDNs are immediately processed.
2. **Fetching**:
    * **ASN**: Queries the RIPE NCC API (`stat.ripe.net`).
    * **FQDN**: Uses Python's `socket.getaddrinfo` to resolve A and AAAA records.
3. **Comparison**: Reads the existing `data.json`/`fqdn_data.json` and compares the fetched set with the stored set.
4. **Accumulation**: Takes the union of the stored and fetched sets (old ∪ new), so entries are added but never removed.
    * **If new items are found**: The list is updated and sorted, and the `last_updated` timestamp is refreshed for that specific resource.
    * **If no new items are found**: The file is left untouched.
5. **Persistence**: Data is written to disk only if something actually changed.
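
The comparison, accumulation, and persistence steps can be sketched as a single function. This is an illustration rather than the project's code: the on-disk layout (one record per resource with `entries` and `last_updated` keys) and the name `accumulate` are assumptions made for the example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def accumulate(path, resource, fetched):
    """Merge fetched entries for one resource into the JSON store at path.

    Returns True if the file was rewritten, False if nothing new was found.
    """
    store_path = Path(path)
    store = {}
    if store_path.exists():
        store = json.loads(store_path.read_text(encoding="utf-8"))
    # Assumed record layout: {"entries": [...], "last_updated": "<ISO timestamp>"}
    record = store.get(resource, {"entries": [], "last_updated": None})
    known = set(record["entries"])
    merged = known | set(fetched)  # old ∪ new: stored entries are never dropped
    if merged == known:
        return False  # no new items, so the file is not touched
    record["entries"] = sorted(merged)
    record["last_updated"] = datetime.now(timezone.utc).isoformat()
    store[resource] = record
    store_path.write_text(json.dumps(store, indent=2), encoding="utf-8")
    return True
```

Calling `accumulate("data.json", "AS62041", prefixes)` twice with the same prefixes rewrites the file only on the first call. A later call with a smaller list leaves previously stored prefixes in place.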

### Scheduler Logic
`api_server.py` uses `APScheduler` (`BackgroundScheduler`).

1. **Startup**: When the server starts (`uvicorn`), `start_scheduler` is called. It loads the `schedule` block from `config.json` and creates two independent jobs (`asn_job`, `fqdn_job`).
2. **Runtime Updates (POST /schedule)**:
    * The server validates the new cron expression.
    * It updates `config.json` so the change survives restarts.
    * It calls `scheduler.add_job(..., replace_existing=True)`. This hot-swaps the trigger for the running job.
3. **Concurrency**: If a scheduled job is already running when a new schedule is posted, the running job completes normally. The new schedule applies to the *next* calculated run time.
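
A client can catch obviously malformed expressions before posting them. The sketch below is a simplified, standard-library pre-check for five-field cron expressions. It is not the validation `api_server.py` performs: it accepts only numeric fields, `*`, ranges, lists, and steps, and its bounds follow classic cron conventions, which the scheduler library may interpret differently.

```python
import re

# (min, max) bounds for minute, hour, day of month, month, day of week
FIELD_BOUNDS = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]
# One list element: "*" or "N" or "N-M", optionally followed by "/step"
PART = re.compile(r"^(\*|\d+(?:-\d+)?)(?:/(\d+))?$")


def is_valid_cron(expr):
    """Return True if expr looks like a well-formed five-field cron expression."""
    fields = expr.split()
    if len(fields) != 5:
        return False
    for field, (low, high) in zip(fields, FIELD_BOUNDS):
        for part in field.split(","):
            match = PART.match(part)
            if not match:
                return False
            base, step = match.groups()
            if step is not None and int(step) == 0:
                return False  # a step of zero would never advance
            if base != "*":
                numbers = [int(n) for n in base.split("-")]
                if any(n < low or n > high for n in numbers):
                    return False
                if len(numbers) == 2 and numbers[0] > numbers[1]:
                    return False  # reversed range such as "5-1"
    return True
```

The server remains the authority on what it accepts, so a request that passes this check can still be rejected.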