PPF Project Instructions
Architecture
┌──────────┬─────────────┬────────────────────────────────────────────────────────┐
│ Host     │ Role        │ Notes                                                  │
├──────────┼─────────────┼────────────────────────────────────────────────────────┤
│ odin     │ Master      │ API server + SSL-only proxy verification, port 8081    │
│ cassius  │ Worker      │ Tests proxies, reports to master via WireGuard         │
│ edge     │ Worker      │ Tests proxies, reports to master via WireGuard         │
│ sentinel │ Worker      │ Tests proxies, reports to master via WireGuard         │
└──────────┴─────────────┴────────────────────────────────────────────────────────┘
Role Separation
- Odin (Master): API server + SSL-only proxy verification (10 threads). No URL cycling (workers handle it via /api/claim-urls). Local Tor only.
- Workers: All URL fetching (via /api/claim-urls) and proxy testing. Each uses only local Tor (127.0.0.1:9050).
CRITICAL: Directory Structure Differences
┌──────────┬─────────────────────────┬──────────────────────────────────────────┐
│ Host     │ Code Location           │ Container Mount                          │
├──────────┼─────────────────────────┼──────────────────────────────────────────┤
│ odin     │ /home/podman/ppf/*.py   │ Mounts ppf/ directly to /app             │
│ workers  │ /home/podman/ppf/src/   │ Mounts ppf/src/ to /app (via compose)    │
└──────────┴─────────────────────────┴──────────────────────────────────────────┘
ODIN uses root ppf/ directory. WORKERS use ppf/src/ subdirectory.
Operations Toolkit
All deployment and service management are handled by tools/:
tools/
  lib/ppf-common.sh   shared library (hosts, wrappers, colors)
  ppf-deploy          deploy wrapper (local validation + playbook)
  ppf-logs            view container logs
  ppf-service         manage containers (status/start/stop/restart)
  ppf-db              database operations (stats/purge-proxies/vacuum)
  ppf-status          cluster overview (containers, workers, queue)
  playbooks/
    deploy.yml        ansible playbook (sync, compose, restart)
    inventory.ini     hosts with WireGuard IPs + SSH key
    group_vars/
      all.yml         shared vars (ppf_base, ppf_owner)
      master.yml      odin paths + compose file
      workers.yml     worker paths + compose file
Symlinked to ~/.local/bin/ for direct use.
Connectivity
All tools connect over WireGuard (10.200.1.0/24) as user ansible
with the SSH key at /opt/ansible/secrets/ssh/ansible.
Deployment
ppf-deploy validates syntax locally, then runs the Ansible playbook.
Hosts execute in parallel; containers restart only when files change.
ppf-deploy # all nodes: validate, sync, restart
ppf-deploy odin # master only
ppf-deploy workers # cassius, edge, sentinel
ppf-deploy cassius edge # specific hosts
ppf-deploy --no-restart # sync only, skip restart
ppf-deploy --check # dry run (ansible --check --diff)
ppf-deploy -v # verbose ansible output
Playbook steps (per host, in parallel):
- Rsync *.py + servers.txt (role-aware destination via group_vars)
- Copy compose file per role (compose.master.yml / compose.worker.yml)
- Fix ownership (podman:podman, recursive)
- Restart containers via handler (only if files changed)
- Show container status
Container Logs
ppf-logs # last 40 lines from odin
ppf-logs cassius # specific worker
ppf-logs -f edge # follow mode
ppf-logs -n 100 sentinel # last N lines
Service Management
ppf-service status # all nodes: compose ps + health
ppf-service status workers # workers only
ppf-service restart odin # restart master
ppf-service stop cassius # stop specific worker
ppf-service start workers # start all workers
Database Management
ppf-db stats # proxy and URL counts
ppf-db purge-proxies # stop odin, delete all proxies, restart
ppf-db vacuum # reclaim disk space
Cluster Status
ppf-status # full overview: containers, DB, workers, queue
ppf-status --json # raw JSON from odin API
Direct Ansible (for operations not covered by tools)
Use the toolkit inventory for ad-hoc commands over WireGuard:
cd /opt/ansible && source venv/bin/activate
INV=/home/user/git/ppf/tools/playbooks/inventory.ini
# Check worker config
ansible -i $INV workers -m shell \
-a "grep -E 'threads|timeout|ssl' /home/podman/ppf/config.ini"
# Modify config option
ansible -i $INV workers -m lineinfile \
-a "path=/home/podman/ppf/config.ini line='ssl_only = 1' insertafter='ssl_first'"
Podman User IDs
┌──────────┬───────┬─────────────────────────────┐
│ Host     │ UID   │ XDG_RUNTIME_DIR             │
├──────────┼───────┼─────────────────────────────┤
│ odin     │ 1005  │ /run/user/1005              │
│ cassius  │ 993   │ /run/user/993               │
│ edge     │ 993   │ /run/user/993               │
│ sentinel │ 992   │ /run/user/992               │
└──────────┴───────┴─────────────────────────────┘
Prefer dynamic UID discovery (uid=$(id -u podman)) over hardcoded values.
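The same dynamic lookup can be done from Python when scripting against these hosts (a minimal sketch; the helper name is ours, not part of the codebase):

```python
import pwd

def podman_runtime(user: str = "podman") -> tuple:
    """Resolve the user's UID and derive XDG_RUNTIME_DIR, instead of
    hardcoding the per-host values from the table above."""
    uid = pwd.getpwnam(user).pw_uid
    return uid, f"/run/user/{uid}"
```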
Configuration
Odin config.ini
[common]
tor_hosts = 127.0.0.1:9050 # Local Tor ONLY
[watchd]
threads = 10 # SSL-only verification of worker-reported proxies
timeout = 7
checktype = none # No secondary check
ssl_first = 1
ssl_only = 1
database = data/proxies.sqlite
[ppf]
threads = 0 # NO URL cycling (workers handle it)
database = data/websites.sqlite
[scraper]
enabled = 0 # Disabled on master
Worker config.ini
[common]
tor_hosts = 127.0.0.1:9050 # Local Tor ONLY
[watchd]
threads = 35
timeout = 9
ssl_first = 1 # Try SSL handshake first
ssl_only = 0 # Set to 1 to skip secondary check on SSL failure
checktype = head # Secondary check: head, irc, judges, none (SSL-only)
Config Options
┌───────────────┬─────────┬────────────────────────────────────────────────────┐
│ Option        │ Default │ Description                                        │
├───────────────┼─────────┼────────────────────────────────────────────────────┤
│ ssl_first     │ 1       │ Try SSL handshake first, fall back to checktype    │
│ ssl_only      │ 0       │ Skip secondary check when SSL fails (faster)       │
│ checktype     │ head    │ Secondary check: head, irc, judges, none/false     │
│ threads       │ 20      │ Number of test threads                             │
│ timeout       │ 15      │ Socket timeout in seconds                          │
└───────────────┴─────────┴────────────────────────────────────────────────────┘
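How the three check options interact can be made concrete with a small sketch (illustrative only; the function and argument names are ours, not the actual watchd code):

```python
def checks_attempted(ssl_handshake_ok: bool,
                     ssl_first: int = 1, ssl_only: int = 0,
                     checktype: str = "head") -> list:
    """Return the ordered list of checks a proxy goes through,
    given the outcome of the SSL handshake."""
    if not ssl_first:
        # No SSL attempt: only the secondary check applies (if any)
        return [checktype] if checktype not in ("none", "false") else []
    attempts = ["ssl"]
    if not ssl_handshake_ok and not ssl_only and checktype not in ("none", "false"):
        attempts.append(checktype)  # fall back to the secondary check
    return attempts
```

With the worker defaults (ssl_first=1, ssl_only=0, checktype=head), an SSL failure falls through to a HEAD check; with odin's ssl_only=1 the proxy passes or fails on the handshake alone.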
Work Distribution
Fair distribution algorithm (httpd.py):
fair_share = (due_proxies / active_workers) * 1.2
batch_size = clamp(fair_share, min=100, max=1000)
- Master calculates batch size based on queue and active workers
- Workers shuffle their batch locally to avoid testing same proxies simultaneously
- Claims expire after 5 minutes if not completed
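The formula and the claim expiry above can be sketched in Python (a minimal illustration; function and parameter names are ours, the real logic lives in httpd.py):

```python
CLAIM_TTL = 300  # claims expire after 5 minutes

def batch_size(due_proxies: int, active_workers: int,
               factor: float = 1.2, lo: int = 100, hi: int = 1000) -> int:
    """Fair-share batch size: each worker's share of the due queue
    plus 20% headroom, clamped to [lo, hi]."""
    workers = max(1, active_workers)          # avoid division by zero
    fair_share = (due_proxies / workers) * factor
    return int(max(lo, min(hi, fair_share)))

def claim_expired(claimed_at: float, now: float, ttl: int = CLAIM_TTL) -> bool:
    """A claim not completed within ttl seconds goes back to the queue."""
    return (now - claimed_at) > ttl
```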
Container Management
All nodes run via podman-compose with role-specific compose files:
- Odin: compose.master.yml -> deployed as compose.yml
- Workers: compose.worker.yml -> deployed as compose.yml
Containers are managed exclusively through compose. No systemd user
services or standalone podman run commands.
Rebuilding Images
# Workers - from ppf/ directory (Dockerfile copies from src/)
ansible cassius,edge,sentinel -m raw \
-a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf-worker:latest ."
# Odin - from ppf/ directory
ansible odin -m raw \
-a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf:latest ."
API Endpoints
/dashboard Web UI with live statistics
/map Interactive world map
/health Health check
/api/stats Runtime statistics (JSON)
/api/workers Connected worker status
/api/countries Proxy counts by country
/api/claim-urls Claim URL batch for worker-driven fetching (GET)
/api/report-urls Report URL fetch results (POST)
/api/report-proxies Report working proxies (POST)
/proxies Working proxies list
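From the worker side these endpoints are plain HTTP; a sketch of building the requests with urllib (the master address comes from the worker example in Troubleshooting; the JSON payload shape for reports is an assumption):

```python
import json
from urllib import request

MASTER = "http://10.200.1.250:8081"  # master's WireGuard address

def claim_urls_req(base: str = MASTER) -> request.Request:
    """GET request a worker would send to claim a URL batch."""
    return request.Request(f"{base}/api/claim-urls", method="GET")

def report_proxies_req(proxies: list, base: str = MASTER) -> request.Request:
    """POST request reporting working proxies (payload shape is hypothetical)."""
    body = json.dumps({"proxies": proxies}).encode()
    return request.Request(f"{base}/api/report-proxies", data=body,
                           headers={"Content-Type": "application/json"},
                           method="POST")
```

Sending is then request.urlopen(claim_urls_req()) from inside the WireGuard network.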
Troubleshooting
Missing servers.txt
Redeploy syncs servers.txt automatically:
ppf-deploy workers
Exit Code 126 (Permission/Storage)
sudo -u podman podman system reset --force
# Then rebuild image
Dashboard Shows NaN or Missing Data
Odin likely running old code:
ppf-deploy odin
Worker Keeps Crashing
- Check status: ppf-service status workers
- Check logs: ppf-logs -n 50 cassius
- Redeploy (fixes ownership + servers.txt): ppf-deploy cassius
- If still failing, run manually on the host to see the error:
sudo -u podman podman run --rm --network=host \
-v /home/podman/ppf/src:/app:ro,Z \
-v /home/podman/ppf/data:/app/data:Z \
-v /home/podman/ppf/config.ini:/app/config.ini:ro,Z \
-v /home/podman/ppf/servers.txt:/app/servers.txt:ro,Z \
localhost/ppf-worker:latest \
python -u ppf.py --worker --server http://10.200.1.250:8081
Files to Deploy
- All *.py files
- servers.txt
Do NOT Deploy
- config.ini (server-specific)
- data/ contents
- *.sqlite files