# PPF Project Instructions

## Architecture

```
┌──────────┬─────────────┬────────────────────────────────────────────────────────┐
│ Host     │ Role        │ Notes                                                  │
├──────────┼─────────────┼────────────────────────────────────────────────────────┤
│ odin     │ Master      │ API server + SSL-only proxy verification, port 8081    │
│ cassius  │ Worker      │ Tests proxies, reports to master via WireGuard         │
│ edge     │ Worker      │ Tests proxies, reports to master via WireGuard         │
│ sentinel │ Worker      │ Tests proxies, reports to master via WireGuard         │
└──────────┴─────────────┴────────────────────────────────────────────────────────┘
```

### Role Separation

- **Odin (Master)**: API server + SSL-only verification of worker-reported proxies (10 threads). No URL cycling (workers handle it via `/api/claim-urls`). Local Tor only.
- **Workers**: All URL fetching (via `/api/claim-urls`) and proxy testing. Each uses only its local Tor (127.0.0.1:9050).

## CRITICAL: Directory Structure Differences

```
┌──────────┬─────────────────────────┬──────────────────────────────────────────┐
│ Host     │ Code Location           │ Container Mount                          │
├──────────┼─────────────────────────┼──────────────────────────────────────────┤
│ odin     │ /home/podman/ppf/*.py   │ Mounts ppf/ directly to /app             │
│ workers  │ /home/podman/ppf/src/   │ Mounts ppf/src/ to /app (via compose)    │
└──────────┴─────────────────────────┴──────────────────────────────────────────┘
```

**ODIN uses root ppf/ directory.
WORKERS use ppf/src/ subdirectory.**

## Operations Toolkit

All deployment and service management is handled by `tools/`:

```
tools/
  lib/ppf-common.sh   shared library (hosts, wrappers, colors)
  ppf-deploy          deploy wrapper (local validation + playbook)
  ppf-logs            view container logs
  ppf-service         manage containers (status/start/stop/restart)
  ppf-db              database operations (stats/purge-proxies/vacuum)
  ppf-status          cluster overview (containers, workers, queue)
  playbooks/
    deploy.yml        ansible playbook (sync, compose, restart)
    inventory.ini     hosts with WireGuard IPs + SSH key
    group_vars/
      all.yml         shared vars (ppf_base, ppf_owner)
      master.yml      odin paths + compose file
      workers.yml     worker paths + compose file
```

Symlinked to `~/.local/bin/` for direct use.

### Connectivity

All tools connect over WireGuard (`10.200.1.0/24`) as user `ansible` with the SSH key at `/opt/ansible/secrets/ssh/ansible`.

### Deployment

`ppf-deploy` validates syntax locally, then runs the Ansible playbook. Hosts execute in parallel; containers restart only when files change.

```bash
ppf-deploy                # all nodes: validate, sync, restart
ppf-deploy odin           # master only
ppf-deploy workers        # cassius, edge, sentinel
ppf-deploy cassius edge   # specific hosts
ppf-deploy --no-restart   # sync only, skip restart
ppf-deploy --check        # dry run (ansible --check --diff)
ppf-deploy -v             # verbose ansible output
```

Playbook steps (per host, in parallel):

1. Rsync `*.py` + `servers.txt` (role-aware destination via group_vars)
2. Copy compose file per role (`compose.master.yml` / `compose.worker.yml`)
3. Fix ownership (`podman:podman`, recursive)
4. Restart containers via handler (only if files changed)
5.
Show container status

### Container Logs

```bash
ppf-logs                  # last 40 lines from odin
ppf-logs cassius          # specific worker
ppf-logs -f edge          # follow mode
ppf-logs -n 100 sentinel  # last N lines
```

### Service Management

```bash
ppf-service status          # all nodes: compose ps + health
ppf-service status workers  # workers only
ppf-service restart odin    # restart master
ppf-service stop cassius    # stop specific worker
ppf-service start workers   # start all workers
```

### Database Management

```bash
ppf-db stats          # proxy and URL counts
ppf-db purge-proxies  # stop odin, delete all proxies, restart
ppf-db vacuum         # reclaim disk space
```

### Cluster Status

```bash
ppf-status         # full overview: containers, DB, workers, queue
ppf-status --json  # raw JSON from odin API
```

### Direct Ansible (for operations not covered by tools)

Use the toolkit inventory for ad-hoc commands over WireGuard:

```bash
cd /opt/ansible && source venv/bin/activate
INV=/home/user/git/ppf/tools/playbooks/inventory.ini

# Check worker config
ansible -i $INV workers -m shell \
  -a "grep -E 'threads|timeout|ssl' /home/podman/ppf/config.ini"

# Modify config option
ansible -i $INV workers -m lineinfile \
  -a "path=/home/podman/ppf/config.ini line='ssl_only = 1' insertafter='ssl_first'"
```

## Podman User IDs

```
┌──────────┬───────┬─────────────────────────────┐
│ Host     │ UID   │ XDG_RUNTIME_DIR             │
├──────────┼───────┼─────────────────────────────┤
│ odin     │ 1005  │ /run/user/1005              │
│ cassius  │ 993   │ /run/user/993               │
│ edge     │ 993   │ /run/user/993               │
│ sentinel │ 992   │ /run/user/992               │
└──────────┴───────┴─────────────────────────────┘
```

**Prefer dynamic UID discovery** (`uid=$(id -u podman)`) over hardcoded values.
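The same dynamic-discovery advice applies to any Python tooling that touches the hosts: resolve the UID from the passwd database rather than hardcoding the per-host values from the table. A minimal sketch (the helper name `runtime_dir_for` is illustrative, not part of the codebase):

```python
import pwd

def runtime_dir_for(user: str) -> str:
    """Resolve a user's XDG_RUNTIME_DIR from the passwd database,
    equivalent to /run/user/$(id -u USER)."""
    uid = pwd.getpwnam(user).pw_uid  # raises KeyError if the user is missing
    return f"/run/user/{uid}"

# Example: works on any host regardless of the podman UID
# env = {"XDG_RUNTIME_DIR": runtime_dir_for("podman")}
```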
## Configuration

### Odin config.ini

```ini
[common]
tor_hosts = 127.0.0.1:9050  # Local Tor ONLY

[watchd]
threads = 10      # SSL-only verification of worker-reported proxies
timeout = 7
checktype = none  # No secondary check
ssl_first = 1
ssl_only = 1
database = data/proxies.sqlite

[ppf]
threads = 0       # NO URL cycling (workers handle it)
database = data/websites.sqlite

[scraper]
enabled = 0       # Disabled on master
```

### Worker config.ini

```ini
[common]
tor_hosts = 127.0.0.1:9050  # Local Tor ONLY

[watchd]
threads = 35
timeout = 9
ssl_first = 1     # Try SSL handshake first
ssl_only = 0      # Set to 1 to skip secondary check on SSL failure
checktype = head  # Secondary check: head, irc, judges, none (SSL-only)
```

### Config Options

```
┌───────────────┬─────────┬────────────────────────────────────────────────────┐
│ Option        │ Default │ Description                                        │
├───────────────┼─────────┼────────────────────────────────────────────────────┤
│ ssl_first     │ 1       │ Try SSL handshake first, fall back to checktype    │
│ ssl_only      │ 0       │ Skip secondary check when SSL fails (faster)       │
│ checktype     │ head    │ Secondary check: head, irc, judges, none/false     │
│ threads       │ 20      │ Number of test threads                             │
│ timeout       │ 15      │ Socket timeout in seconds                          │
└───────────────┴─────────┴────────────────────────────────────────────────────┘
```

## Work Distribution

Fair distribution algorithm (httpd.py):

```
fair_share = (due_proxies / active_workers) * 1.2
batch_size = clamp(fair_share, min=100, max=1000)
```

- Master calculates the batch size from queue depth and the number of active workers
- Workers shuffle their batch locally to avoid testing the same proxies simultaneously
- Claims expire after 5 minutes if not completed

## Container Management

All nodes run via `podman-compose` with role-specific compose files:

- **Odin**: `compose.master.yml` -> deployed as `compose.yml`
- **Workers**: `compose.worker.yml` -> deployed as `compose.yml`

Containers are managed exclusively through compose. No systemd user services or standalone `podman run` commands.
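The fair-share batch sizing described under Work Distribution can be sketched in Python. This is an illustrative sketch of the rule, not the actual `httpd.py` code; the function name and the zero-worker guard are assumptions:

```python
def batch_size(due_proxies: int, active_workers: int,
               lo: int = 100, hi: int = 1000) -> int:
    """Fair-share batch sizing: each worker claims its share of the
    due queue plus 20% headroom, clamped to [lo, hi]."""
    workers = max(active_workers, 1)            # guard against division by zero
    fair_share = (due_proxies / workers) * 1.2  # 20% headroom
    return int(min(max(fair_share, lo), hi))    # clamp(fair_share, lo, hi)
```

The clamp keeps tiny queues from producing wastefully small batches and keeps a single worker from claiming the whole queue at once.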
## Rebuilding Images

```bash
# Workers - from the ppf/ directory (Dockerfile copies from src/)
ansible cassius,edge,sentinel -m raw \
  -a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf-worker:latest ."

# Odin - from the ppf/ directory
ansible odin -m raw \
  -a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf:latest ."
```

## API Endpoints

```
/dashboard           Web UI with live statistics
/map                 Interactive world map
/health              Health check
/api/stats           Runtime statistics (JSON)
/api/workers         Connected worker status
/api/countries       Proxy counts by country
/api/claim-urls      Claim URL batch for worker-driven fetching (GET)
/api/report-urls     Report URL fetch results (POST)
/api/report-proxies  Report working proxies (POST)
/proxies             Working proxies list
```

## Troubleshooting

### Missing servers.txt

A redeploy syncs `servers.txt` automatically:

```bash
ppf-deploy workers
```

### Exit Code 126 (Permission/Storage)

```bash
sudo -u podman podman system reset --force
# Then rebuild the image
```

### Dashboard Shows NaN or Missing Data

Odin is likely running old code:

```bash
ppf-deploy odin
```

### Worker Keeps Crashing

1. Check status: `ppf-service status workers`
2. Check logs: `ppf-logs -n 50 cassius`
3. Redeploy (fixes ownership + servers.txt): `ppf-deploy cassius`
4. If it still fails, run the container manually on the host to see the error:

```bash
sudo -u podman podman run --rm --network=host \
  -v /home/podman/ppf/src:/app:ro,Z \
  -v /home/podman/ppf/data:/app/data:Z \
  -v /home/podman/ppf/config.ini:/app/config.ini:ro,Z \
  -v /home/podman/ppf/servers.txt:/app/servers.txt:ro,Z \
  localhost/ppf-worker:latest \
  python -u ppf.py --worker --server http://10.200.1.250:8081
```

## Files to Deploy

- All `*.py` files
- `servers.txt`

## Do NOT Deploy

- `config.ini` (server-specific)
- `data/` contents
- `*.sqlite` files
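The deploy/exclude lists above can be expressed as a single predicate. This is an illustrative sketch only; the real filtering is done by rsync in `deploy.yml`, and `should_deploy` is not a function in the codebase:

```python
from pathlib import PurePosixPath

def should_deploy(path: str) -> bool:
    """True for files the playbook syncs (*.py, servers.txt);
    False for server-specific config and data files."""
    p = PurePosixPath(path)
    if p.name == "config.ini" or p.suffix == ".sqlite":
        return False                 # server-specific / database files
    if "data" in p.parts:
        return False                 # data/ contents never deploy
    return p.suffix == ".py" or p.name == "servers.txt"
```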