# PPF Project Instructions

## Architecture

```
┌──────────┬────────┬──────────────────────────────────────────────────────┐
│ Host     │ Role   │ Notes                                                │
├──────────┼────────┼──────────────────────────────────────────────────────┤
│ odin     │ Master │ Scrapes proxy lists, verifies conflicts, port 8081   │
│ cassius  │ Worker │ Tests proxies, reports to master via WireGuard       │
│ edge     │ Worker │ Tests proxies, reports to master via WireGuard       │
│ sentinel │ Worker │ Tests proxies, reports to master via WireGuard       │
└──────────┴────────┴──────────────────────────────────────────────────────┘
```

### Role Separation

- **Odin (Master)**: Scrapes proxy sources, does verification tests only. No routine testing. Local Tor only.
- **Workers**: All routine proxy testing. Each uses only local Tor (127.0.0.1:9050).

## CRITICAL: Directory Structure Differences

```
┌──────────┬───────────────────────┬───────────────────────────────────────┐
│ Host     │ Code Location         │ Container Mount                       │
├──────────┼───────────────────────┼───────────────────────────────────────┤
│ odin     │ /home/podman/ppf/*.py │ Mounts ppf/ directly to /app          │
│ workers  │ /home/podman/ppf/src/ │ Mounts ppf/src/ to /app (via systemd) │
└──────────┴───────────────────────┴───────────────────────────────────────┘
```

**ODIN uses the root ppf/ directory.
WORKERS use the ppf/src/ subdirectory.**

## Host Access

**ALWAYS use Ansible from `/opt/ansible` with the venv activated:**

```bash
cd /opt/ansible && source venv/bin/activate
```

### Quick Reference Commands

```bash
# Check worker status
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m shell -a "hostname"

# Check worker config
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m shell \
  -a "grep -E 'threads|timeout|ssl' /home/podman/ppf/config.ini"

# Check worker logs (dynamic UID)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius -m raw \
  -a "uid=\$(id -u podman) && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman logs --tail 20 ppf-worker"

# Modify a config option
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m lineinfile \
  -a "path=/home/podman/ppf/config.ini line='ssl_only = 1' insertafter='ssl_first'"

# Restart workers via compose
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m raw \
  -a "uid=\$(id -u podman) && cd /home/podman/ppf && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman-compose restart"
```

## Full Deployment Procedure

All hosts use `podman-compose` with `compose.yml` for container management. Rsync deploys the code; compose handles the restart.
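The `uid=$(id -u podman) && … XDG_RUNTIME_DIR=…` incantation recurs throughout these commands. A small helper that builds (but does not run) the invocation keeps them consistent — a sketch only; `build_compose_cmd` is illustrative, not part of the project:

```bash
# Build the rootless podman-compose invocation for a given UID and
# subcommand, so it can be inspected or reused in scripts.
# On a real host the UID comes from: uid=$(id -u podman)
build_compose_cmd() {
    local uid="$1"; shift
    printf 'sudo -u podman XDG_RUNTIME_DIR=/run/user/%s podman-compose %s\n' \
        "$uid" "$*"
}

build_compose_cmd 993 restart
# prints: sudo -u podman XDG_RUNTIME_DIR=/run/user/993 podman-compose restart
```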
### Step 1: Validate Syntax Locally

```bash
cd /home/user/git/ppf
for f in *.py; do python3 -m py_compile "$f" && echo "OK: $f"; done
```

### Step 2: Deploy to ALL Hosts

```bash
cd /opt/ansible && source venv/bin/activate

# Deploy to ODIN (root ppf/ directory + compose.master.yml as compose.yml)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m synchronize \
  -a "src=/home/user/git/ppf/ dest=/home/podman/ppf/ rsync_opts='--include=*.py,--include=servers.txt,--include=Dockerfile,--exclude=*'"
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m copy \
  -a "src=/home/user/git/ppf/compose.master.yml dest=/home/podman/ppf/compose.yml owner=podman group=podman"

# Deploy to WORKERS (ppf/src/ subdirectory + compose.worker.yml as compose.yml)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m synchronize \
  -a "src=/home/user/git/ppf/ dest=/home/podman/ppf/src/ rsync_opts='--include=*.py,--include=servers.txt,--include=Dockerfile,--exclude=*'"
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m copy \
  -a "src=/home/user/git/ppf/compose.worker.yml dest=/home/podman/ppf/compose.yml owner=podman group=podman"

# CRITICAL: Fix ownership on ALL hosts (rsync runs as the ansible user; containers need podman)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin,cassius,edge,sentinel -m raw \
  -a "chown -R podman:podman /home/podman/ppf/"
```

**Note:** Ownership must be fixed after every deploy. rsync runs as the ansible user, but the containers run as the podman user. Skipping the ownership fix causes `ImportError: No module named X` errors.
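Step 1's loop keeps going after a failure and exits with the status of the last file only. A variant that returns nonzero when *any* file fails to byte-compile makes a usable pre-deploy gate — a sketch; `check_syntax` is illustrative, not part of the project:

```bash
# check_syntax DIR — byte-compile every .py file in DIR; return nonzero
# if any file fails, so a deploy script can abort before rsync runs.
check_syntax() {
    local dir="$1" f fail=0
    for f in "$dir"/*.py; do
        [ -e "$f" ] || continue        # glob matched nothing: no .py files
        if python3 -m py_compile "$f" 2>/dev/null; then
            echo "OK: ${f##*/}"
        else
            echo "FAIL: ${f##*/}" >&2
            fail=1
        fi
    done
    return "$fail"
}

check_syntax /home/user/git/ppf && echo "safe to deploy"
```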
### Step 3: Restart Services

```bash
# Restart ODIN via compose
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m raw \
  -a "uid=\$(id -u podman) && cd /home/podman/ppf && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman-compose restart"

# Restart WORKERS via compose
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m raw \
  -a "uid=\$(id -u podman) && cd /home/podman/ppf && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman-compose restart"
```

### Step 4: Verify All Running

```bash
# Check all hosts via compose
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin,cassius,edge,sentinel -m raw \
  -a "uid=\$(id -u podman) && cd /home/podman/ppf && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman-compose ps"
```

## Podman User IDs

```
┌──────────┬──────┬─────────────────┐
│ Host     │ UID  │ XDG_RUNTIME_DIR │
├──────────┼──────┼─────────────────┤
│ odin     │ 1005 │ /run/user/1005  │
│ cassius  │ 993  │ /run/user/993   │
│ edge     │ 993  │ /run/user/993   │
│ sentinel │ 992  │ /run/user/992   │
└──────────┴──────┴─────────────────┘
```

**Prefer dynamic UID discovery** (`uid=$(id -u podman)`) over hardcoded values.
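If `podman-compose` complains about the runtime directory, `/run/user/<uid>` may simply not exist for the podman user; on systemd hosts, `loginctl enable-linger podman` usually creates it at boot (an assumption about these hosts, not something the project documents). A quick check — `check_runtime_dir` is a hypothetical helper:

```bash
# check_runtime_dir USER [BASE] — verify that BASE/<uid> exists for USER.
# BASE defaults to /run/user, matching the table above.
check_runtime_dir() {
    local user="$1" base="${2:-/run/user}" uid
    uid=$(id -u "$user") || return 1
    if [ -d "${base}/${uid}" ]; then
        echo "ok: ${base}/${uid}"
    else
        echo "missing: ${base}/${uid} (try: loginctl enable-linger ${user})" >&2
        return 1
    fi
}
```

Run it on each host before digging into compose itself.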
## Configuration

### Odin config.ini

```ini
[common]
tor_hosts = 127.0.0.1:9050  # Local Tor ONLY

[watchd]
threads = 0                 # NO routine testing
database = data/ppf.sqlite

[scraper]
threads = 10
```

### Worker config.ini

```ini
[common]
tor_hosts = 127.0.0.1:9050  # Local Tor ONLY

[watchd]
threads = 35
timeout = 9
ssl_first = 1     # Try SSL handshake first
ssl_only = 0      # Set to 1 to skip secondary check on SSL failure
checktype = head  # Secondary check: head, irc, judges, none (SSL-only)
```

### Config Options

```
┌───────────┬─────────┬────────────────────────────────────────────────┐
│ Option    │ Default │ Description                                    │
├───────────┼─────────┼────────────────────────────────────────────────┤
│ ssl_first │ 1       │ Try SSL handshake first, fallback to checktype │
│ ssl_only  │ 0       │ Skip secondary check when SSL fails (faster)   │
│ checktype │ head    │ Secondary check: head, irc, judges, none/false │
│ threads   │ 20      │ Number of test threads                         │
│ timeout   │ 15      │ Socket timeout in seconds                      │
└───────────┴─────────┴────────────────────────────────────────────────┘
```

## Work Distribution

Fair distribution algorithm (httpd.py):

```
fair_share = (due_proxies / active_workers) * 1.2
batch_size = clamp(fair_share, min=100, max=1000)
```

- Master calculates the batch size based on the queue and active workers
- Workers shuffle their batch locally to avoid testing the same proxies simultaneously
- Claims expire after 5 minutes if not completed

## Worker Container

Workers run as podman containers with `--restart=unless-stopped`:

```bash
podman run -d --name ppf-worker --network=host --restart=unless-stopped \
  -e PYTHONUNBUFFERED=1 \
  -v /home/podman/ppf/src:/app:ro,Z \
  -v /home/podman/ppf/data:/app/data:Z \
  -v /home/podman/ppf/config.ini:/app/config.ini:ro,Z \
  -v /home/podman/ppf/servers.txt:/app/servers.txt:ro,Z \
  localhost/ppf-worker:latest \
  python -u ppf.py --worker --server http://10.200.1.250:8081
```

## Rebuilding Images

```bash
# Workers - from ppf/ directory (Dockerfile copies from src/)
ansible cassius,edge,sentinel -m raw \
  -a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf-worker:latest ."

# Odin - from ppf/ directory
ansible odin -m raw \
  -a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf:latest ."
```

## API Endpoints

```
/dashboard           Web UI with live statistics
/map                 Interactive world map
/health              Health check
/api/stats           Runtime statistics (JSON)
/api/workers         Connected worker status
/api/countries       Proxy counts by country
/api/claim-urls      Claim URL batch for worker-driven fetching (GET)
/api/report-urls     Report URL fetch results (POST)
/api/report-proxies  Report working proxies (POST)
/proxies             Working proxies list
```

## Troubleshooting

### Missing servers.txt

Workers need `servers.txt` in src/:

```bash
ansible cassius,edge,sentinel -m copy \
  -a "src=/home/user/git/ppf/servers.txt dest=/home/podman/ppf/src/servers.txt owner=podman group=podman"
```

### Exit Code 126 (Permission/Storage)

```bash
sudo -u podman podman system reset --force
# Then rebuild the image
```

### Dashboard Shows NaN or Missing Data

Odin is likely running old code. Redeploy to odin:

```bash
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m synchronize \
  -a "src=/home/user/git/ppf/ dest=/home/podman/ppf/ rsync_opts='--include=*.py,--include=servers.txt,--exclude=*'"
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m raw -a "chown -R podman:podman /home/podman/ppf/"
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m raw \
  -a "uid=\$(id -u podman) && cd /home/podman/ppf && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman-compose restart"
```

### Worker Keeps Crashing

1. Check container status: `sudo -u podman podman ps -a`
2. Check logs: `sudo -u podman podman logs --tail 50 ppf-worker`
3. Verify servers.txt exists in src/
4. Check ownership: `ls -la /home/podman/ppf/src/`
5.
Run manually to see the error:

```bash
sudo -u podman podman run --rm --network=host \
  -v /home/podman/ppf/src:/app:ro,Z \
  -v /home/podman/ppf/data:/app/data:Z \
  -v /home/podman/ppf/config.ini:/app/config.ini:ro,Z \
  -v /home/podman/ppf/servers.txt:/app/servers.txt:ro,Z \
  localhost/ppf-worker:latest \
  python -u ppf.py --worker --server http://10.200.1.250:8081
```

## Files to Deploy

- All *.py files
- servers.txt

## Do NOT Deploy

- config.ini (server-specific)
- data/ contents
- *.sqlite files
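## Appendix: Batch Sizing Sketch

The fair-share formula in the Work Distribution section can be sketched in integer shell arithmetic for quick sanity checks (the real logic lives in httpd.py and is not reproduced here; the 1.2 factor becomes `* 12 / 10`):

```bash
# batch_size DUE WORKERS — fair_share = (due / workers) * 1.2,
# clamped to [100, 1000], mirroring the Work Distribution section.
batch_size() {
    local due="$1" workers="$2" share
    share=$(( due * 12 / (workers * 10) ))
    if [ "$share" -lt 100 ];  then share=100;  fi
    if [ "$share" -gt 1000 ]; then share=1000; fi
    echo "$share"
}

batch_size 5000 4   # fair share 1500, clamped down -> prints 1000
batch_size 100 4    # fair share 30, clamped up    -> prints 100
```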