ayurishchev/ASN-and-FQDN-IP-Collector

Антон c104b13dfd initial build

2026-02-02 11:02:32 +03:00


RIPE AS CIDR & FQDN IP Collector

This project collects CIDR prefixes for specified Autonomous Systems (AS) from the RIPE NCC API and resolves IP addresses for specified FQDNs. It accumulates these addresses over time, maintaining a history of discovered prefixes. It also provides a FastAPI-based HTTP interface to retrieve the collected data.

1. Preparation and Installation

Prerequisites

- Python 3.8+
- pip and venv

Installation Steps

Clone the repository (or copy the files) to your desired location, e.g., /opt/ripe_collector.

mkdir -p /opt/ripe_collector
cd /opt/ripe_collector
# Copy files: cidr_collector.py, api_server.py, requirements.txt, config.json

Create a Virtual Environment:
```
python3 -m venv venv
```

Install Dependencies:

source venv/bin/activate
pip install -r requirements.txt
deactivate

Initial Configuration: Edit config.json to set your initial ASNs and FQDNs.
```
{
    "asns": [62041],
    "fqdns": ["google.com"]
}
```
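A minimal sketch of how a script might read and sanity-check this file before use. The function name `load_config` and the validation rules are illustrative assumptions; the project's own loader may behave differently.

```python
import json
from pathlib import Path


def load_config(path="config.json"):
    """Read config.json and return (asns, fqdns) after basic type checks."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    asns = raw.get("asns", [])
    fqdns = raw.get("fqdns", [])
    # The sample config stores ASNs as plain integers (62041, not "AS62041").
    for asn in asns:
        if isinstance(asn, bool) or not isinstance(asn, int) or asn <= 0:
            raise ValueError("asns must be a list of positive integers")
    for fqdn in fqdns:
        if not isinstance(fqdn, str) or not fqdn.strip():
            raise ValueError("fqdns must be a list of non-empty strings")
    return asns, fqdns
```

Calling `load_config("/opt/ripe_collector/config.json")` on the sample above would return the two lists unchanged; a malformed entry raises `ValueError` instead of failing later during collection.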

2. Running the Collector (Periodic Task)

The collector script cidr_collector.py is designed to run once per day to fetch updates.

Manual Run

/opt/ripe_collector/venv/bin/python3 /opt/ripe_collector/cidr_collector.py run

Setup Cron Job (Recommended)

To run the collector daily at 02:00:

Open crontab:
```
crontab -e
```

Add the line:

0 2 * * * /opt/ripe_collector/venv/bin/python3 /opt/ripe_collector/cidr_collector.py run >> /var/log/ripe_collector.log 2>&1

3. Application Setup: Systemd (Ubuntu, Debian)

This section describes how to run the API Server (api_server.py) as a system service.

Create Service File

Create /etc/systemd/system/ripe-api.service:

[Unit]
Description=RIPE CIDR Collector API
After=network.target

[Service]
User=root
# Running as root is not required. Prefer a dedicated unprivileged user with write access to data.json, fqdn_data.json, and config.json (POST /schedule updates it).
WorkingDirectory=/opt/ripe_collector
ExecStart=/opt/ripe_collector/venv/bin/uvicorn api_server:app --host 0.0.0.0 --port 8000
Restart=always

[Install]
WantedBy=multi-user.target

Enable and Start

# Reload systemd
sudo systemctl daemon-reload

# Enable service to start on boot
sudo systemctl enable ripe-api

# Start service immediately
sudo systemctl start ripe-api

# Check status
sudo systemctl status ripe-api

4. Application Setup: RC-Script (Alpine Linux)

For Alpine Linux using OpenRC.

Create Init Script

Create /etc/init.d/ripe-api:

#!/sbin/openrc-run

name="ripe-api"
description="RIPE CIDR Collector API"
command="/opt/ripe_collector/venv/bin/uvicorn"
# module:app, --host, and --port are passed as arguments
command_args="api_server:app --host 0.0.0.0 --port 8000"
command_background="yes"
pidfile="/run/${RC_SVCNAME}.pid"
directory="/opt/ripe_collector"

depend() {
    need net
}

Make Executable

chmod +x /etc/init.d/ripe-api

Enable and Start

# Add to default runlevel
rc-update add ripe-api default

# Start service
service ripe-api start

# Check status
service ripe-api status

5. API Usage Documentation

The API listens on port 8000 by default and returns the collected data as a flat JSON list.

Base URL

http://<server-ip>:8000

Endpoint: Get Addresses

GET /addresses

Retrieves the list of collected IP addresses/CIDRs.

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `type` | string | No | `all` | Filter by source type. Options: `cidr` (ASNs only), `fqdn` (domains only), `all` (both). |

Example 1: Get All Addresses (Default)

Request:

curl -X GET "http://localhost:8000/addresses"

Response (JSON):

[
  "142.250.1.1",
  "149.154.160.0/22",
  "149.154.160.0/23",
  "2001:4860:4860::8888",
  "91.108.4.0/22"
]

Example 2: Get Only CIDRs (from ASNs)

Request:

curl -X GET "http://localhost:8000/addresses?type=cidr"

Response (JSON):

[
  "149.154.160.0/22",
  "149.154.160.0/23",
  "91.108.4.0/22"
]

Example 3: Get Only Resolved IPs (from FQDNs)

Request:

curl -X GET "http://localhost:8000/addresses?type=fqdn"

Response (JSON):

[
  "142.250.1.1",
  "2001:4860:4860::8888"
]
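The three requests above can also be issued from Python. The following is a sketch using only the standard library; the helper names (`addresses_url`, `fetch_addresses`, `split_by_family`) are hypothetical and not part of the project. It assumes the response is the flat list of strings shown in the examples.

```python
import ipaddress
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def addresses_url(base_url, source_type="all"):
    """Build the /addresses URL; source_type is 'cidr', 'fqdn' or 'all'."""
    if source_type not in ("cidr", "fqdn", "all"):
        raise ValueError("type must be 'cidr', 'fqdn' or 'all'")
    query = urlencode({"type": source_type})
    return f"{base_url.rstrip('/')}/addresses?{query}"


def fetch_addresses(base_url, source_type="all", timeout=10):
    """Call the API and return the flat JSON list of strings."""
    with urlopen(addresses_url(base_url, source_type), timeout=timeout) as resp:
        return json.load(resp)


def split_by_family(entries):
    """Group entries into IPv4 and IPv6 networks.

    Bare addresses are treated as single-host networks (/32 or /128).
    """
    v4, v6 = [], []
    for entry in entries:
        net = ipaddress.ip_network(entry, strict=False)
        (v4 if net.version == 4 else v6).append(net)
    return v4, v6
```

For example, `split_by_family(fetch_addresses("http://localhost:8000"))` separates the mixed list into two lists of `ipaddress` network objects, which is convenient when the consumer (a firewall or routing table, say) handles the two families differently.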

Endpoint: Manage Schedule

GET /schedule

Returns the current cron schedules.

POST /schedule

Updates the schedule for a specific collector type. Body:

{
    "type": "asn", 
    "cron": "*/15 * * * *"
}

Note: type can be asn or fqdn.
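As an illustration, the same update could be sent from Python with the standard library. The helper names are hypothetical, and the shape of the server's reply is not documented here, so decoding it as JSON is an assumption.

```python
import json
from urllib.request import Request, urlopen


def schedule_request(base_url, collector_type, cron):
    """Build (but do not send) a POST /schedule request."""
    if collector_type not in ("asn", "fqdn"):
        raise ValueError("type must be 'asn' or 'fqdn'")
    body = json.dumps({"type": collector_type, "cron": cron}).encode("utf-8")
    return Request(
        f"{base_url.rstrip('/')}/schedule",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def update_schedule(base_url, collector_type, cron, timeout=10):
    """Send the request; assumes the reply body is JSON (not confirmed)."""
    request = schedule_request(base_url, collector_type, cron)
    with urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

`update_schedule("http://localhost:8000", "asn", "*/15 * * * *")` would send the body shown above. Building the request separately from sending it keeps the payload easy to inspect or log.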

6. Advanced CLI Usage

The collector script can run each collection mode independently:

# Run both (Default)
python3 cidr_collector.py run

# Run only ASN collection
python3 cidr_collector.py run --mode asn

# Run only FQDN collection
python3 cidr_collector.py run --mode fqdn
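A command line of this shape could be parsed with `argparse` roughly as follows. This is a sketch, not the script's actual parser: in particular, `both` as the name of the default mode value is an assumption, since only `asn` and `fqdn` appear above.

```python
import argparse


def build_parser():
    """Parser for: cidr_collector.py run [--mode asn|fqdn|both]."""
    parser = argparse.ArgumentParser(prog="cidr_collector.py")
    sub = parser.add_subparsers(dest="command", required=True)
    run = sub.add_parser("run", help="collect once and exit")
    # "both" as the default value name is an assumption for this sketch.
    run.add_argument("--mode", choices=("asn", "fqdn", "both"), default="both")
    return parser


def selected_collectors(mode):
    """Map a --mode value to the list of collectors that should run."""
    return ["asn", "fqdn"] if mode == "both" else [mode]
```

With this layout, `build_parser().parse_args(["run", "--mode", "asn"])` yields `mode == "asn"`, and omitting `--mode` selects both collectors.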

7. Internal Logic & Architecture

Collector Logic

When the collector runs (whether manually or via schedule):

Instantiation: A new CIDRCollector or FQDNCollector instance is created on every run. This forces a fresh read of config.json, so any added ASNs/FQDNs are processed on that run.
Fetching:
- ASN: Queries the RIPE NCC API (stat.ripe.net).
- FQDN: Uses Python's socket.getaddrinfo to resolve IPv4 (A) and IPv6 (AAAA) addresses.
Comparison: Reads the existing data.json/fqdn_data.json and compares the fetched set with the stored set.
Accumulation: The result is the union of both sets (Old ∪ New), so previously seen entries are never removed.
- If new items are found: the list is updated and re-sorted, and the last_updated timestamp is refreshed for that specific resource.
- If no new items are found: the file is left untouched.
Persistence: Data is written to disk only if the merge actually changed something.
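The resolve, merge, and persist steps can be sketched as below. The function names and the per-resource layout (`items` plus `last_updated`) are assumptions made for illustration; the real structure of data.json/fqdn_data.json may differ. Plain string sorting is used because it matches the ordering in the API examples.

```python
import json
import socket
from datetime import datetime, timezone
from pathlib import Path


def resolve_fqdn(fqdn):
    """Return the sorted unique IPv4/IPv6 addresses for a host name."""
    # Restricting to TCP avoids one duplicate result per socket type.
    infos = socket.getaddrinfo(fqdn, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def accumulate(stored, key, fetched, now=None):
    """Merge fetched items into stored[key]; return True if anything was added."""
    old = stored.get(key, {}).get("items", [])
    merged = set(old) | set(fetched)  # Old ∪ New: nothing is ever removed
    if merged == set(old):
        return False  # no new items, leave the entry and timestamp alone
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    stored[key] = {"items": sorted(merged), "last_updated": stamp}
    return True


def save_if_changed(path, stored, changed):
    """Write the JSON file only when a merge reported a change."""
    if changed:
        Path(path).write_text(json.dumps(stored, indent=2), encoding="utf-8")
```

A run over one FQDN would then look like `changed = accumulate(stored, "google.com", resolve_fqdn("google.com"))` followed by `save_if_changed("fqdn_data.json", stored, changed)`. Because the merge is a union, an address that a domain stops resolving to stays in the file.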

Scheduler Logic

api_server.py uses APScheduler's BackgroundScheduler.

Startup: When uvicorn starts the server, start_scheduler is called. It loads the schedule block from config.json and creates two independent jobs (asn_job, fqdn_job).
Runtime Updates (POST /schedule):
- The server validates the new cron expression.
- It updates config.json so the change survives restarts.
- It calls scheduler.add_job(..., replace_existing=True). This hot-swaps the trigger for the running job.
Concurrency: If a scheduled job is already running when a new schedule is posted, the running job completes normally. The new schedule applies to the next calculated run time.
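The validate-and-persist part of a runtime update might look like the sketch below. It is standard-library only and deliberately simplified: the cron check only verifies five fields with plausible characters, whereas the server presumably relies on APScheduler for real validation. The layout `config["schedule"][type]` and both function names are assumptions.

```python
import json
import re
from pathlib import Path

# Simplified check: each field may contain digits, '*', ',', '/' and '-'.
_CRON_FIELD = re.compile(r"^[\d*,/-]+$")


def is_plausible_cron(expr):
    """True if expr has exactly five fields made of cron-like characters."""
    fields = expr.split()
    return len(fields) == 5 and all(_CRON_FIELD.match(f) for f in fields)


def persist_schedule(config_path, collector_type, cron):
    """Validate, then store the cron string under config['schedule'][type]."""
    if collector_type not in ("asn", "fqdn"):
        raise ValueError("type must be 'asn' or 'fqdn'")
    if not is_plausible_cron(cron):
        raise ValueError(f"not a 5-field cron expression: {cron!r}")
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config.setdefault("schedule", {})[collector_type] = cron
    # Write a temporary file and rename it so a crash cannot truncate config.json.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(config, indent=4), encoding="utf-8")
    tmp.replace(path)
    return config
```

After a successful `persist_schedule(...)`, the server would re-register the job with `scheduler.add_job(..., replace_existing=True)` as described above. Persisting before swapping the trigger means a restart never reverts to a schedule the user has already replaced.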