PPF Project Instructions
Architecture
┌──────────┬─────────────┬────────────────────────────────────────────────────────┐
│ Host     │ Role        │ Notes                                                  │
├──────────┼─────────────┼────────────────────────────────────────────────────────┤
│ odin     │ Master      │ API server + SSL-only proxy verification, port 8081    │
│ cassius  │ Worker      │ Tests proxies, reports to master via WireGuard         │
│ edge     │ Worker      │ Tests proxies, reports to master via WireGuard         │
│ sentinel │ Worker      │ Tests proxies, reports to master via WireGuard         │
└──────────┴─────────────┴────────────────────────────────────────────────────────┘
Role Separation
- Odin (Master): API server + SSL-only proxy verification (10 threads). No URL cycling (workers handle it via /api/claim-urls). Local Tor only.
- Workers: All URL fetching (via /api/claim-urls) and proxy testing. Each uses only local Tor (127.0.0.1:9050).
CRITICAL: Directory Structure Differences
┌──────────┬─────────────────────────┬──────────────────────────────────────────┐
│ Host     │ Code Location           │ Container Mount                          │
├──────────┼─────────────────────────┼──────────────────────────────────────────┤
│ odin     │ /home/podman/ppf/*.py   │ Mounts ppf/ directly to /app             │
│ workers  │ /home/podman/ppf/src/   │ Mounts ppf/src/ to /app (via compose)    │
└──────────┴─────────────────────────┴──────────────────────────────────────────┘
ODIN uses root ppf/ directory. WORKERS use ppf/src/ subdirectory.
Operations Toolkit
All deployment and service management are handled by tools/:
tools/
  lib/ppf-common.sh   shared library (hosts, wrappers, colors)
  ppf-deploy          deploy wrapper (local validation + playbook)
  ppf-logs            view container logs
  ppf-service         manage containers (status/start/stop/restart)
  ppf-db              database operations (stats/purge-proxies/vacuum)
  ppf-status          cluster overview (containers, workers, queue)
  playbooks/
    deploy.yml        ansible playbook (sync, compose, restart)
    inventory.ini     hosts with WireGuard IPs + SSH key
    group_vars/
      all.yml         shared vars (ppf_base, ppf_owner)
      master.yml      odin paths + compose file
      workers.yml     worker paths + compose file
Symlinked to ~/.local/bin/ for direct use.
Connectivity
All tools connect over WireGuard (10.200.1.0/24) as user ansible
with the SSH key at /opt/ansible/secrets/ssh/ansible.
Deployment
ppf-deploy validates syntax locally, then runs the Ansible playbook.
Hosts execute in parallel; containers restart only when files change.
ppf-deploy # all nodes: validate, sync, restart
ppf-deploy odin # master only
ppf-deploy workers # cassius, edge, sentinel
ppf-deploy cassius edge # specific hosts
ppf-deploy --no-restart # sync only, skip restart
ppf-deploy --check # dry run (ansible --check --diff)
ppf-deploy -v # verbose ansible output
Playbook steps (per host, in parallel):
- Rsync *.py + servers.txt (role-aware destination via group_vars)
- Copy compose file per role (compose.master.yml / compose.worker.yml)
- Fix ownership (podman:podman, recursive)
- Restart containers via handler (only if files changed)
- Show container status
Container Logs
ppf-logs # last 40 lines from odin
ppf-logs cassius # specific worker
ppf-logs -f edge # follow mode
ppf-logs -n 100 sentinel # last N lines
Service Management
ppf-service status # all nodes: compose ps + health
ppf-service status workers # workers only
ppf-service restart odin # restart master
ppf-service stop cassius # stop specific worker
ppf-service start workers # start all workers
Database Management
ppf-db stats # proxy and URL counts
ppf-db purge-proxies # stop odin, delete all proxies, restart
ppf-db vacuum # reclaim disk space
Cluster Status
ppf-status # full overview: containers, DB, workers, queue
ppf-status --json # raw JSON from odin API
Direct Ansible (for operations not covered by tools)
Use the toolkit inventory for ad-hoc commands over WireGuard:
cd /opt/ansible && source venv/bin/activate
INV=/home/user/git/ppf/tools/playbooks/inventory.ini
# Check worker config
ansible -i $INV workers -m shell \
-a "grep -E 'threads|timeout|ssl' /home/podman/ppf/config.ini"
# Modify config option
ansible -i $INV workers -m lineinfile \
-a "path=/home/podman/ppf/config.ini line='ssl_only = 1' insertafter='ssl_first'"
Podman User IDs
┌──────────┬───────┬─────────────────────────────┐
│ Host     │ UID   │ XDG_RUNTIME_DIR             │
├──────────┼───────┼─────────────────────────────┤
│ odin     │ 1005  │ /run/user/1005              │
│ cassius  │ 993   │ /run/user/993               │
│ edge     │ 993   │ /run/user/993               │
│ sentinel │ 992   │ /run/user/992               │
└──────────┴───────┴─────────────────────────────┘
Prefer dynamic UID discovery (uid=$(id -u podman)) over hardcoded values.
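The same dynamic lookup can be done from Python when scripting against these hosts (a minimal sketch; the helper name is ours, not part of the codebase):

```python
import pwd

def podman_runtime(user: str = "podman") -> tuple:
    """Resolve the user's UID and derive XDG_RUNTIME_DIR, instead of
    hardcoding the per-host values from the table above."""
    uid = pwd.getpwnam(user).pw_uid
    return uid, f"/run/user/{uid}"
```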
Configuration
Odin config.ini
[common]
tor_hosts = 127.0.0.1:9050 # Local Tor ONLY
[watchd]
threads = 10 # SSL-only verification of worker-reported proxies
timeout = 7
checktype = none # No secondary check
ssl_first = 1
ssl_only = 1
database = data/proxies.sqlite
[ppf]
threads = 0 # NO URL cycling (workers handle it)
database = data/websites.sqlite
[scraper]
enabled = 0 # Disabled on master
Worker config.ini
[common]
tor_hosts = 127.0.0.1:9050 # Local Tor ONLY
[watchd]
threads = 35
timeout = 9
ssl_first = 1 # Try SSL handshake first
ssl_only = 0 # Set to 1 to skip secondary check on SSL failure
checktype = head # Secondary check: head, irc, judges, none (SSL-only)
Config Options
┌───────────────┬─────────┬────────────────────────────────────────────────────┐
│ Option        │ Default │ Description                                        │
├───────────────┼─────────┼────────────────────────────────────────────────────┤
│ ssl_first     │ 1       │ Try SSL handshake first, fall back to checktype    │
│ ssl_only      │ 0       │ Skip secondary check when SSL fails (faster)       │
│ checktype     │ head    │ Secondary check: head, irc, judges, none/false     │
│ threads       │ 20      │ Number of test threads                             │
│ timeout       │ 15      │ Socket timeout in seconds                          │
└───────────────┴─────────┴────────────────────────────────────────────────────┘
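How the three check options interact can be made concrete with a small sketch (illustrative only; the function and argument names are ours, not the actual watchd code):

```python
def checks_attempted(ssl_handshake_ok: bool,
                     ssl_first: int = 1, ssl_only: int = 0,
                     checktype: str = "head") -> list:
    """Return the ordered list of checks a proxy goes through,
    given the outcome of the SSL handshake."""
    if not ssl_first:
        # No SSL attempt: only the secondary check applies (if any)
        return [checktype] if checktype not in ("none", "false") else []
    attempts = ["ssl"]
    if not ssl_handshake_ok and not ssl_only and checktype not in ("none", "false"):
        attempts.append(checktype)  # fall back to the secondary check
    return attempts
```

With the worker defaults (ssl_first=1, ssl_only=0, checktype=head), an SSL failure falls through to a HEAD check; with odin's ssl_only=1 the proxy passes or fails on the handshake alone.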
Work Distribution
Fair distribution algorithm (httpd.py):
fair_share = (due_proxies / active_workers) * 1.2
batch_size = clamp(fair_share, min=100, max=1000)
- Master calculates batch size based on queue and active workers
- Workers shuffle their batch locally to avoid testing same proxies simultaneously
- Claims expire after 5 minutes if not completed
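The formula and the claim expiry above can be sketched in Python (a minimal illustration; function and parameter names are ours, the real logic lives in httpd.py):

```python
CLAIM_TTL = 300  # claims expire after 5 minutes

def batch_size(due_proxies: int, active_workers: int,
               factor: float = 1.2, lo: int = 100, hi: int = 1000) -> int:
    """Fair-share batch size: each worker's share of the due queue
    plus 20% headroom, clamped to [lo, hi]."""
    workers = max(1, active_workers)          # avoid division by zero
    fair_share = (due_proxies / workers) * factor
    return int(max(lo, min(hi, fair_share)))

def claim_expired(claimed_at: float, now: float, ttl: int = CLAIM_TTL) -> bool:
    """A claim not completed within ttl seconds goes back to the queue."""
    return (now - claimed_at) > ttl
```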
Container Management
All nodes run via podman-compose with role-specific compose files:
- Odin: compose.master.yml -> deployed as compose.yml
- Workers: compose.worker.yml -> deployed as compose.yml
Containers are managed exclusively through compose. No systemd user
services or standalone podman run commands.
Rebuilding Images
# Workers - from ppf/ directory (Dockerfile copies from src/)
ansible cassius,edge,sentinel -m raw \
-a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf-worker:latest ."
# Odin - from ppf/ directory
ansible odin -m raw \
-a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf:latest ."
API Endpoints
/dashboard Web UI with live statistics
/map Interactive world map
/health Health check
/api/stats Runtime statistics (JSON)
/api/workers Connected worker status
/api/countries Proxy counts by country
/api/claim-urls Claim URL batch for worker-driven fetching (GET)
/api/report-urls Report URL fetch results (POST)
/api/report-proxies Report working proxies (POST)
/proxies Working proxies list
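From the worker side these endpoints are plain HTTP; a sketch of building the requests with urllib (the master address comes from the worker example in Troubleshooting; the JSON payload shape for reports is an assumption):

```python
import json
from urllib import request

MASTER = "http://10.200.1.250:8081"  # master's WireGuard address

def claim_urls_req(base: str = MASTER) -> request.Request:
    """GET request a worker would send to claim a URL batch."""
    return request.Request(f"{base}/api/claim-urls", method="GET")

def report_proxies_req(proxies: list, base: str = MASTER) -> request.Request:
    """POST request reporting working proxies (payload shape is hypothetical)."""
    body = json.dumps({"proxies": proxies}).encode()
    return request.Request(f"{base}/api/report-proxies", data=body,
                           headers={"Content-Type": "application/json"},
                           method="POST")
```

Sending is then request.urlopen(claim_urls_req()) from inside the WireGuard network.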
Troubleshooting
Missing servers.txt
Redeploy syncs servers.txt automatically:
ppf-deploy workers
Exit Code 126 (Permission/Storage)
sudo -u podman podman system reset --force
# Then rebuild image
Dashboard Shows NaN or Missing Data
Odin likely running old code:
ppf-deploy odin
Worker Keeps Crashing
- Check status: ppf-service status workers
- Check logs: ppf-logs -n 50 cassius
- Redeploy (fixes ownership + servers.txt): ppf-deploy cassius
- If still failing, run manually on the host to see the error:
sudo -u podman podman run --rm --network=host \
-v /home/podman/ppf/src:/app:ro,Z \
-v /home/podman/ppf/data:/app/data:Z \
-v /home/podman/ppf/config.ini:/app/config.ini:ro,Z \
-v /home/podman/ppf/servers.txt:/app/servers.txt:ro,Z \
localhost/ppf-worker:latest \
python -u ppf.py --worker --server http://10.200.1.250:8081
Files to Deploy
- All *.py files
- servers.txt
Do NOT Deploy
- config.ini (server-specific)
- data/ contents
- *.sqlite files