PPF Project Instructions
Architecture
┌──────────┬─────────────┬────────────────────────────────────────────────────────┐
│ Host     │ Role        │ Notes                                                  │
├──────────┼─────────────┼────────────────────────────────────────────────────────┤
│ odin     │ Master      │ Scrapes proxy lists, verifies conflicts, port 8081     │
│ cassius  │ Worker      │ Tests proxies, reports to master via WireGuard         │
│ edge     │ Worker      │ Tests proxies, reports to master via WireGuard         │
│ sentinel │ Worker      │ Tests proxies, reports to master via WireGuard         │
└──────────┴─────────────┴────────────────────────────────────────────────────────┘
Role Separation
- Odin (Master): Scrapes proxy sources, does verification tests only. No routine testing. Local Tor only.
- Workers: All routine proxy testing. Each uses only local Tor (127.0.0.1:9050).
CRITICAL: Directory Structure Differences
┌──────────┬─────────────────────────┬──────────────────────────────────────────┐
│ Host     │ Code Location           │ Container Mount                          │
├──────────┼─────────────────────────┼──────────────────────────────────────────┤
│ odin     │ /home/podman/ppf/*.py   │ Mounts ppf/ directly to /app             │
│ workers  │ /home/podman/ppf/src/   │ Mounts ppf/src/ to /app (via systemd)    │
└──────────┴─────────────────────────┴──────────────────────────────────────────┘
ODIN uses root ppf/ directory. WORKERS use ppf/src/ subdirectory.
Host Access
ALWAYS use Ansible from /opt/ansible with venv activated:
cd /opt/ansible && source venv/bin/activate
Quick Reference Commands
# Check worker status
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m shell -a "hostname"
# Check worker config
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m shell -a "grep -E 'threads|timeout|ssl' /home/podman/ppf/config.ini"
# Check worker logs (dynamic UID)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius -m raw \
-a "uid=\$(id -u podman) && sudo -u podman podman logs --tail 20 ppf-worker"
# Modify config option
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m lineinfile -a "path=/home/podman/ppf/config.ini line='ssl_only = 1' insertafter='ssl_first'"
# Restart workers via compose
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m raw \
-a "uid=\$(id -u podman) && cd /home/podman/ppf && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman-compose restart"
Full Deployment Procedure
All hosts use podman-compose with compose.yml for container management.
Rsync deploys code; compose handles restart.
Step 1: Validate Syntax Locally
cd /home/user/git/ppf
for f in *.py; do python3 -m py_compile "$f" && echo "OK: $f"; done
Step 2: Deploy to ALL Hosts
cd /opt/ansible && source venv/bin/activate
# Deploy to ODIN (root ppf/ directory + compose.master.yml as compose.yml)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m synchronize \
-a "src=/home/user/git/ppf/ dest=/home/podman/ppf/ rsync_opts='--include=*.py,--include=servers.txt,--include=Dockerfile,--exclude=*'"
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m copy \
-a "src=/home/user/git/ppf/compose.master.yml dest=/home/podman/ppf/compose.yml owner=podman group=podman"
# Deploy to WORKERS (ppf/src/ subdirectory + compose.worker.yml as compose.yml)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m synchronize \
-a "src=/home/user/git/ppf/ dest=/home/podman/ppf/src/ rsync_opts='--include=*.py,--include=servers.txt,--include=Dockerfile,--exclude=*'"
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m copy \
-a "src=/home/user/git/ppf/compose.worker.yml dest=/home/podman/ppf/compose.yml owner=podman group=podman"
# CRITICAL: Fix ownership on ALL hosts (rsync uses ansible user, containers need podman)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin,cassius,edge,sentinel -m raw \
-a "chown -R podman:podman /home/podman/ppf/"
Note: Ownership must be fixed after every deploy. rsync runs as the ansible user, but the containers run as the podman user; skipping the ownership fix causes ImportError: No module named X errors inside the containers.
Step 3: Restart Services
# Restart ODIN via compose
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m raw \
-a "uid=\$(id -u podman) && cd /home/podman/ppf && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman-compose restart"
# Restart WORKERS via compose
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m raw \
-a "uid=\$(id -u podman) && cd /home/podman/ppf && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman-compose restart"
Step 4: Verify All Running
# Check all hosts via compose
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin,cassius,edge,sentinel -m raw \
-a "uid=\$(id -u podman) && cd /home/podman/ppf && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman-compose ps"
Podman User IDs
┌──────────┬───────┬─────────────────────────────┐
│ Host     │ UID   │ XDG_RUNTIME_DIR             │
├──────────┼───────┼─────────────────────────────┤
│ odin     │ 1005  │ /run/user/1005              │
│ cassius  │ 993   │ /run/user/993               │
│ edge     │ 993   │ /run/user/993               │
│ sentinel │ 992   │ /run/user/992               │
└──────────┴───────┴─────────────────────────────┘
Prefer dynamic UID discovery (uid=$(id -u podman)) over hardcoded values.
Configuration
Odin config.ini
[common]
tor_hosts = 127.0.0.1:9050 # Local Tor ONLY
[watchd]
threads = 0 # NO routine testing
database = data/ppf.sqlite
[scraper]
threads = 10
Worker config.ini
[common]
tor_hosts = 127.0.0.1:9050 # Local Tor ONLY
[watchd]
threads = 35
timeout = 9
ssl_first = 1 # Try SSL handshake first
ssl_only = 0 # Set to 1 to skip secondary check on SSL failure
checktype = head # Secondary check: head, irc, judges, none (SSL-only)
Config Options
┌───────────────┬─────────┬────────────────────────────────────────────────────┐
│ Option        │ Default │ Description                                        │
├───────────────┼─────────┼────────────────────────────────────────────────────┤
│ ssl_first     │ 1       │ Try SSL handshake first, fallback to checktype     │
│ ssl_only      │ 0       │ Skip secondary check when SSL fails (faster)       │
│ checktype     │ head    │ Secondary check: head, irc, judges, none/false     │
│ threads       │ 20      │ Number of test threads                             │
│ timeout       │ 15      │ Socket timeout in seconds                          │
└───────────────┴─────────┴────────────────────────────────────────────────────┘
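The defaults above can be read with configparser fallbacks, roughly like this (a sketch only; ppf's actual config loading may be structured differently):

```python
import configparser

# Sample worker config; ssl_only and checktype are deliberately absent
# so the documented defaults apply.
WORKER_INI = """
[common]
tor_hosts = 127.0.0.1:9050

[watchd]
threads = 35
timeout = 9
ssl_first = 1
"""

cfg = configparser.ConfigParser()
cfg.read_string(WORKER_INI)

# Fallbacks mirror the defaults table: options missing from the file
# fall back to their documented default values.
watchd = {
    "threads":   cfg.getint("watchd", "threads", fallback=20),
    "timeout":   cfg.getint("watchd", "timeout", fallback=15),
    "ssl_first": cfg.getint("watchd", "ssl_first", fallback=1),
    "ssl_only":  cfg.getint("watchd", "ssl_only", fallback=0),
    "checktype": cfg.get("watchd", "checktype", fallback="head"),
}
print(watchd)
```

Note that configparser does not strip `#` inline comments by default; keep comments on their own lines or pass inline_comment_prefixes when parsing the real config.ini.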
Work Distribution
Fair distribution algorithm (httpd.py):
fair_share = (due_proxies / active_workers) * 1.2
batch_size = clamp(fair_share, min=100, max=1000)
- Master calculates batch size based on queue and active workers
- Workers shuffle their batch locally to avoid testing same proxies simultaneously
- Claims expire after 5 minutes if not completed
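The fair-share formula above can be sketched as a small function (illustrative; the real httpd.py implementation may differ in details):

```python
def batch_size(due_proxies: int, active_workers: int,
               lo: int = 100, hi: int = 1000) -> int:
    """Fair-share batch sizing as described above."""
    # Each worker's fair share, with 20% headroom so the queue keeps
    # draining even when some workers report back late.
    fair_share = (due_proxies / max(active_workers, 1)) * 1.2
    # Clamp: tiny batches waste round-trips, huge batches risk
    # expiring before completion (5-minute claim TTL).
    return int(min(max(fair_share, lo), hi))

print(batch_size(5000, 3))  # fair share 2000.0 -> clamped to 1000
print(batch_size(120, 4))   # fair share 36.0   -> clamped to 100
```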
Worker Container
Workers run as podman containers with --restart=unless-stopped:
podman run -d --name ppf-worker --network=host --restart=unless-stopped \
-e PYTHONUNBUFFERED=1 \
-v /home/podman/ppf/src:/app:ro,Z \
-v /home/podman/ppf/data:/app/data:Z \
-v /home/podman/ppf/config.ini:/app/config.ini:ro,Z \
-v /home/podman/ppf/servers.txt:/app/servers.txt:ro,Z \
localhost/ppf-worker:latest \
python -u ppf.py --worker --server http://10.200.1.250:8081
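The deployed compose.worker.yml is not shown in this document; a sketch inferred from the podman run command above would look roughly like the following (the actual file may differ):

```yaml
services:
  ppf-worker:
    image: localhost/ppf-worker:latest
    network_mode: host
    restart: unless-stopped
    environment:
      - PYTHONUNBUFFERED=1
    volumes:
      - /home/podman/ppf/src:/app:ro,Z
      - /home/podman/ppf/data:/app/data:Z
      - /home/podman/ppf/config.ini:/app/config.ini:ro,Z
      - /home/podman/ppf/servers.txt:/app/servers.txt:ro,Z
    command: python -u ppf.py --worker --server http://10.200.1.250:8081
```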
Rebuilding Images
# Workers - from ppf/ directory (Dockerfile copies from src/)
ansible cassius,edge,sentinel -m raw \
-a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf-worker:latest ."
# Odin - from ppf/ directory
ansible odin -m raw \
-a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf:latest ."
API Endpoints
/dashboard Web UI with live statistics
/map Interactive world map
/health Health check
/api/stats Runtime statistics (JSON)
/api/workers Connected worker status
/api/countries Proxy counts by country
/api/claim-urls Claim URL batch for worker-driven fetching (GET)
/api/report-urls Report URL fetch results (POST)
/api/report-proxies Report working proxies (POST)
/proxies Working proxies list
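The JSON endpoints can be scripted against with plain urllib. The sketch below runs a tiny local stub in place of odin's real httpd so it is self-contained; the response fields (proxies_working, workers) are illustrative assumptions, not the documented schema:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class StubMaster(BaseHTTPRequestHandler):
    """Stands in for odin's /api/stats endpoint with made-up fields."""
    def do_GET(self):
        if self.path == "/api/stats":
            body = json.dumps({"proxies_working": 1234, "workers": 3}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind an ephemeral port and serve in the background.
server = HTTPServer(("127.0.0.1", 0), StubMaster)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Against the real master this would be http://10.200.1.250:8081/api/stats.
stats = json.load(urlopen(f"http://127.0.0.1:{port}/api/stats"))
print(stats)
server.shutdown()
```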
Troubleshooting
Missing servers.txt
Workers need servers.txt in src/:
ansible cassius,edge,sentinel -m copy \
-a "src=/home/user/git/ppf/servers.txt dest=/home/podman/ppf/src/servers.txt owner=podman group=podman"
Exit Code 126 (Permission/Storage)
sudo -u podman podman system reset --force
# Then rebuild image
Dashboard Shows NaN or Missing Data
Odin is likely running old code. Redeploy to odin:
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m synchronize \
-a "src=/home/user/git/ppf/ dest=/home/podman/ppf/ rsync_opts='--include=*.py,--include=servers.txt,--exclude=*'"
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m raw -a "chown -R podman:podman /home/podman/ppf/"
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m raw \
-a "uid=\$(id -u podman) && cd /home/podman/ppf && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman-compose restart"
Worker Keeps Crashing
- Check container status: sudo -u podman podman ps -a
- Check logs: sudo -u podman podman logs --tail 50 ppf-worker
- Verify servers.txt exists in src/
- Check ownership: ls -la /home/podman/ppf/src/
- Run manually to see the error:
sudo -u podman podman run --rm --network=host \
-v /home/podman/ppf/src:/app:ro,Z \
-v /home/podman/ppf/data:/app/data:Z \
-v /home/podman/ppf/config.ini:/app/config.ini:ro,Z \
-v /home/podman/ppf/servers.txt:/app/servers.txt:ro,Z \
localhost/ppf-worker:latest \
python -u ppf.py --worker --server http://10.200.1.250:8081
Files to Deploy
- All *.py files
- servers.txt
Do NOT Deploy
- config.ini (server-specific)
- data/ contents
- *.sqlite files