diff --git a/CLAUDE.md b/CLAUDE.md
index b817bda..091c629 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,96 +1,295 @@
 # PPF Project Instructions
-## Current State
-
-PPF is a Python 2 proxy scraping and validation framework with:
-- Multi-target validation (2/3 majority voting)
-- SSL/TLS proxy testing with MITM detection
-- Web dashboard with electric cyan theme
-- Interactive world map (/map endpoint)
-- Memory profiling (/api/memory endpoint)
-- Tor connection pooling with health monitoring
-
-## Deployment to Odin
-
-When deploying PPF to the odin server, follow these specifications exactly:
-
-### Production Deployment
+## Architecture
 
 ```
-Host: odin
-Container User: podman
-Source Path: /home/podman/ppf/src/
-Data Path: /home/podman/ppf/data/
-Config: /home/podman/ppf/config.ini
-HTTP Port: 8081
-Container: ppf (localhost/ppf:latest)
+┌──────────┬─────────────┬────────────────────────────────────────────────────┐
+│ Host     │ Role        │ Notes                                              │
+├──────────┼─────────────┼────────────────────────────────────────────────────┤
+│ odin     │ Master      │ Scrapes proxy lists, verifies conflicts, port 8081 │
+│ forge    │ Worker      │ Tests proxies, reports to master via WireGuard     │
+│ hermes   │ Worker      │ Tests proxies, reports to master via WireGuard     │
+│ janus    │ Worker      │ Tests proxies, reports to master via WireGuard     │
+└──────────┴─────────────┴────────────────────────────────────────────────────┘
 ```
-
-### Deployment Steps
+### Role Separation
 
-1. **Validate syntax** - Run `python3 -m py_compile` on all .py files
-2. **Sync to staging** - rsync Python files to `odin:/tmp/ppf-update/`
-3. **Copy to container source** - `sudo -u podman cp /tmp/ppf-update/*.py /home/podman/ppf/src/`
-4. **Restart container** - `sudo -u podman podman restart ppf`
-5. **Clean staging** - Remove `/tmp/ppf-update/`
-6. **Verify** - Check container logs for successful startup
+- **Odin (Master)**: Scrapes proxy sources and runs verification tests only. No routine testing. Local Tor only.
+- **Workers**: Handle all routine proxy testing via the claim/report cycle sketched below. Each uses only its local Tor (127.0.0.1:9050).
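+
+The claim/report cycle between workers and the master is easiest to see as
+code. The sketch below is illustrative only: the endpoint names (`/api/claim`,
+`/api/report`) and JSON fields are assumptions, not the actual wire protocol,
+which is implemented in `ppf.py --worker` and `httpd.py`.
+
+```python
+# Hypothetical sketch of one worker cycle. Endpoint names and payload
+# shapes are illustrative assumptions, not the real PPF protocol.
+import json
+import random
+import urllib.request
+
+MASTER = "http://10.200.1.250:8081"
+
+def test_proxy(proxy):
+    # Placeholder for the real test (SSL handshake / secondary check,
+    # routed through the local Tor instance).
+    return {"proxy": proxy, "ok": False}
+
+def run_once():
+    # Claim a batch sized by the master's fair-share algorithm.
+    with urllib.request.urlopen(MASTER + "/api/claim") as resp:
+        batch = json.load(resp).get("proxies", [])
+    # Shuffle locally so workers don't test the same proxies in lockstep.
+    random.shuffle(batch)
+    results = [test_proxy(p) for p in batch]
+    # Report before the 5-minute claim expiry, or the work is reissued.
+    req = urllib.request.Request(
+        MASTER + "/api/report",
+        data=json.dumps({"results": results}).encode(),
+        headers={"Content-Type": "application/json"},
+    )
+    urllib.request.urlopen(req).close()
+```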
-
-### Test Pod (Development)
+
+## CRITICAL: Directory Structure Differences
 
-If running a test container, use:
-- Different port: 8082
-- Different container name: ppf-test
-- Separate data directory: /home/podman/ppf-test/data/
+```
+┌──────────┬─────────────────────────┬──────────────────────────────────────────┐
+│ Host     │ Code Location           │ Container Mount                          │
+├──────────┼─────────────────────────┼──────────────────────────────────────────┤
+│ odin     │ /home/podman/ppf/*.py   │ Mounts ppf/ directly to /app             │
+│ workers  │ /home/podman/ppf/src/   │ Mounts ppf/src/ to /app (via systemd)    │
+└──────────┴─────────────────────────┴──────────────────────────────────────────┘
+```
+
+**ODIN uses the root ppf/ directory. WORKERS use the ppf/src/ subdirectory.**
+
+## Host Access
+
+**ALWAYS use Ansible from `/opt/ansible` with the venv activated:**
 
 ```bash
-# Test container example
-sudo -u podman podman run -d --name ppf-test \
-  -p 8082:8081 \
-  -v /home/podman/ppf/src:/app/src:ro \
-  -v /home/podman/ppf-test/data:/app/data \
-  localhost/ppf:latest
+cd /opt/ansible && source venv/bin/activate
 ```
 
-### Files to Deploy
+### Quick Reference Commands
 
-All Python files in project root:
-- proxywatchd.py (main daemon)
-- httpd.py (web dashboard)
-- config.py, dbs.py, fetch.py, etc.
+```bash
+# Check worker status
+ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible forge,hermes,janus -m shell -a "hostname"
 
-### Do NOT Deploy
+# Check worker config
+ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible forge,hermes,janus -m shell -a "grep -E 'threads|timeout|ssl' /home/podman/ppf/config.ini"
 
-- config.ini (server-specific)
-- data/ directory contents
-- *.sqlite files
+# Check worker logs
+ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible forge -m shell -a "sudo -u podman journalctl --user -u ppf-worker -n 20"
+
+# Modify a config option
+ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible forge,hermes,janus -m lineinfile -a "path=/home/podman/ppf/config.ini line='ssl_only = 1' insertafter='ssl_first'"
+
+# Restart workers (note the different UIDs!)
+ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible janus,forge -m raw -a "sudo -u podman XDG_RUNTIME_DIR=/run/user/996 systemctl --user restart ppf-worker"
+ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible hermes -m raw -a "sudo -u podman XDG_RUNTIME_DIR=/run/user/1001 systemctl --user restart ppf-worker"
+```
+
+## Full Deployment Procedure
+
+### Step 1: Validate Syntax Locally
+
+```bash
+cd /home/user/git/ppf
+for f in *.py; do python3 -m py_compile "$f" && echo "OK: $f"; done
+```
+
+### Step 2: Deploy to ALL Hosts
+
+```bash
+cd /opt/ansible && source venv/bin/activate
+
+# Deploy to ODIN (root ppf/ directory)
+ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m synchronize \
+  -a "src=/home/user/git/ppf/ dest=/home/podman/ppf/ rsync_opts='--include=*.py,--include=servers.txt,--exclude=*'"
+
+# Deploy to WORKERS (ppf/src/ subdirectory)
+ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible forge,hermes,janus -m synchronize \
+  -a "src=/home/user/git/ppf/ dest=/home/podman/ppf/src/ rsync_opts='--include=*.py,--include=servers.txt,--exclude=*'"
+
+# CRITICAL: Fix ownership on ALL hosts (rsync runs as the ansible user; containers need podman)
+ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin,forge,hermes,janus -m raw \
+  -a "chown -R podman:podman /home/podman/ppf/"
+```
+
+**Note:** Ownership must be fixed after every deploy: rsync runs as the ansible user, but the containers run as the podman user. Skipping the ownership fix causes `ImportError: No module named X` errors.
+
+### Step 3: Restart Services
+
+```bash
+# Restart ODIN (UID 1005)
+ansible odin -m raw \
+  -a "cd /tmp && XDG_RUNTIME_DIR=/run/user/1005 runuser -u podman -- podman restart ppf"
+
+# Restart WORKERS (note the different UIDs)
+ansible janus,forge -m raw \
+  -a "sudo -u podman XDG_RUNTIME_DIR=/run/user/996 systemctl --user restart ppf-worker"
+ansible hermes -m raw \
+  -a "sudo -u podman XDG_RUNTIME_DIR=/run/user/1001 systemctl --user restart ppf-worker"
+```
+
+### Step 4: Verify All Running
+
+```bash
+# Check odin (UID 1005)
+ansible odin -m raw \
+  -a "cd /tmp && XDG_RUNTIME_DIR=/run/user/1005 runuser -u podman -- podman ps"
+
+# Check workers
+ansible janus,forge -m raw \
+  -a "sudo -u podman XDG_RUNTIME_DIR=/run/user/996 systemctl --user is-active ppf-worker"
+ansible hermes -m raw \
+  -a "sudo -u podman XDG_RUNTIME_DIR=/run/user/1001 systemctl --user is-active ppf-worker"
+```
+
+## Podman User IDs
+
+```
+┌──────────┬───────┬─────────────────────────────┐
+│ Host     │ UID   │ XDG_RUNTIME_DIR             │
+├──────────┼───────┼─────────────────────────────┤
+│ odin     │ 1005  │ /run/user/1005              │
+│ hermes   │ 1001  │ /run/user/1001              │
+│ janus    │ 996   │ /run/user/996               │
+│ forge    │ 996   │ /run/user/996               │
+└──────────┴───────┴─────────────────────────────┘
+```
+
+## Configuration
+
+### Odin config.ini
+
+```ini
+[common]
+tor_hosts = 127.0.0.1:9050   # Local Tor ONLY
+
+[watchd]
+threads = 0                  # NO routine testing
+database = data/ppf.sqlite
+
+[scraper]
+threads = 10
+```
+
+### Worker config.ini
+
+```ini
+[common]
+tor_hosts = 127.0.0.1:9050   # Local Tor ONLY
+
+[watchd]
+threads = 35
+timeout = 9
+ssl_first = 1                # Try SSL handshake first
+ssl_only = 0                 # Set to 1 to skip the secondary check on SSL failure
+checktype = head             # Secondary check type: head, irc, judges
+```
+
+### Config Options
+
+The SSL options combine as sketched after the table.
+
+```
+┌───────────────┬─────────┬────────────────────────────────────────────────────┐
+│ Option        │ Default │ Description                                        │
+├───────────────┼─────────┼────────────────────────────────────────────────────┤
+│ ssl_first     │ 1       │ Try SSL handshake first, fall back to checktype    │
+│ ssl_only      │ 0       │ Skip secondary check when SSL fails (faster)       │
+│ checktype     │ head    │ Secondary check: head, irc, judges                 │
+│ threads       │ 20      │ Number of test threads                             │
+│ timeout       │ 15      │ Socket timeout in seconds                          │
+└───────────────┴─────────┴────────────────────────────────────────────────────┘
+```
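+
+How these three options interact is easier to read as code than prose. A
+minimal sketch using the worker defaults above; `ssl_handshake_ok()` and
+`secondary_check()` are hypothetical stand-ins for the worker's real test
+helpers:
+
+```python
+# Illustrative decision flow for ssl_first / ssl_only / checktype.
+def ssl_handshake_ok(proxy, timeout):
+    return False  # stub: the real helper attempts a TLS handshake
+
+def secondary_check(proxy, kind, timeout):
+    return False  # stub: the real helper runs a head/irc/judges check
+
+def test_proxy(proxy, ssl_first=1, ssl_only=0, checktype="head", timeout=9):
+    if ssl_first and ssl_handshake_ok(proxy, timeout):
+        return True        # SSL handshake succeeded; proxy passes
+    if ssl_first and ssl_only:
+        return False       # SSL failed and ssl_only=1: no fallback (faster)
+    # Otherwise run the configured secondary check.
+    return secondary_check(proxy, checktype, timeout)
+```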
+
+## Work Distribution
+
+Fair distribution algorithm (httpd.py):
+
+```
+fair_share = (due_proxies / active_workers) * 1.2
+batch_size = clamp(fair_share, min=100, max=1000)
+```
+
+- The master calculates the batch size from the due queue and the number of active workers (see the sketch below)
+- Workers shuffle their batch locally to avoid testing the same proxies simultaneously
+- Claims expire after 5 minutes if not completed
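+
+A direct transcription of the formula, as a sketch (names are illustrative,
+not necessarily those used in httpd.py):
+
+```python
+# Fair-share batch sizing as described above: 20% headroom over an even
+# split of the due queue, clamped to [100, 1000].
+def batch_size(due_proxies, active_workers, lo=100, hi=1000):
+    if active_workers == 0:
+        return 0  # nothing to hand out without workers
+    fair_share = (due_proxies / active_workers) * 1.2
+    return max(lo, min(hi, int(fair_share)))
+```
+
+For example, with 10,000 due proxies and 3 active workers, fair_share is
+4,000, which the clamp caps at 1,000.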
+
+## Worker systemd Unit
+
+Located at `/home/podman/.config/systemd/user/ppf-worker.service`:
+
+```ini
+[Unit]
+Description=PPF Worker Container
+After=network-online.target tor.service
+
+[Service]
+Type=simple
+Restart=on-failure
+RestartSec=10
+WorkingDirectory=%h
+ExecStartPre=-/usr/bin/podman stop -t 10 ppf-worker
+ExecStartPre=-/usr/bin/podman rm -f ppf-worker
+ExecStart=/usr/bin/podman run \
+  --name ppf-worker --rm --log-driver=journald --network=host \
+  -v %h/ppf/src:/app:ro \
+  -v %h/ppf/data:/app/data \
+  -v %h/ppf/config.ini:/app/config.ini:ro \
+  -e PYTHONUNBUFFERED=1 \
+  localhost/ppf-worker:latest \
+  python -u ppf.py --worker --server http://10.200.1.250:8081
+ExecStop=/usr/bin/podman stop -t 10 ppf-worker
+
+[Install]
+WantedBy=default.target
+```
+
+## Rebuilding Images
+
+```bash
+# Workers - from ppf/ directory (the Dockerfile copies from src/)
+ansible forge,hermes,janus -m raw \
+  -a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf-worker:latest ."
+
+# Odin - from ppf/ directory
+ansible odin -m raw \
+  -a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf:latest ."
+```
 
 ## API Endpoints
 
 ```
 /dashboard      Web UI with live statistics
-/map            Interactive world map (Leaflet.js)
-/health         Health check: {"status": "ok"}
+/map            Interactive world map
+/health         Health check
 /api/stats      Runtime statistics (JSON)
-/api/memory     Memory profiling data (JSON)
+/api/workers    Connected worker status
+/api/memory     Memory profiling data
 /api/countries  Proxy counts by country
-/api/locations  Precise proxy locations (requires DB5)
-/proxies        Working proxies (limit, proto, country params)
+/proxies        Working proxies list
 ```
 
-## Memory Analysis
+## Troubleshooting
 
-Query production memory state:
+### Missing servers.txt
+
+Workers need `servers.txt` in `src/`:
 
 ```bash
-ssh odin "curl -s localhost:8081/api/memory" | python3 -m json.tool
+ansible forge,hermes,janus -m copy \
+  -a "src=/home/user/git/ppf/servers.txt dest=/home/podman/ppf/src/servers.txt owner=podman group=podman"
 ```
 
-Key metrics:
-- `start_rss` / `process.VmRSS` - memory growth
-- `objgraph_common` - top object types by count
-- `samples` - RSS history over time
-- `gc.objects` - total GC-tracked objects
+### Exit Code 126 (Permission/Storage)
 
-Current baseline (~260k queue):
-- Start: 442 MB
-- Running: 1.6 GB
-- Per-job overhead: ~4.5 KB
+```bash
+sudo -u podman podman system reset --force
+# Then rebuild the image
+```
+
+### Dashboard Shows NaN or Missing Data
+
+Odin is likely running old code. Redeploy to odin:
+
+```bash
+ansible odin -m synchronize \
+  -a "src=/home/user/git/ppf/ dest=/home/podman/ppf/ rsync_opts='--include=*.py,--include=servers.txt,--exclude=*'"
+ansible odin -m raw -a "chown -R podman:podman /home/podman/ppf/"
+ansible odin -m raw -a "cd /tmp; sudo -u podman podman restart ppf"
+```
+
+### Worker Keeps Crashing
+
+1. Check the systemd status with the correct UID
+2. Verify that servers.txt exists in src/
+3. Check ownership
+4. Run manually to see the error:
+
+```bash
+sudo -u podman podman run --rm --network=host \
+  -v /home/podman/ppf/src:/app:ro \
+  -v /home/podman/ppf/data:/app/data \
+  -v /home/podman/ppf/config.ini:/app/config.ini:ro \
+  localhost/ppf-worker:latest \
+  python -u ppf.py --worker --server http://10.200.1.250:8081
+```
+
+## Files to Deploy
+
+- All *.py files
+- servers.txt
+
+## Do NOT Deploy
+
+- config.ini (server-specific)
+- data/ contents
+- *.sqlite files
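+
+After a full deploy, a quick end-to-end check against the master is worth
+running. A minimal sketch: it assumes `/health` still returns
+`{"status": "ok"}` and simply dumps `/api/workers`, whose exact JSON shape
+is not documented here.
+
+```python
+# Post-deploy sanity check, run from any host that can reach odin.
+import json
+import urllib.request
+
+MASTER = "http://10.200.1.250:8081"
+
+with urllib.request.urlopen(MASTER + "/health", timeout=10) as resp:
+    assert json.load(resp).get("status") == "ok"
+
+# All of forge, hermes, and janus should appear as connected workers.
+with urllib.request.urlopen(MASTER + "/api/workers", timeout=10) as resp:
+    print(json.dumps(json.load(resp), indent=2))
+```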