# PPF Project Instructions

## Architecture

```
┌──────────┬─────────────┬────────────────────────────────────────────────────┐
│ Host     │ Role        │ Notes                                              │
├──────────┼─────────────┼────────────────────────────────────────────────────┤
│ odin     │ Master      │ Scrapes proxy lists, verifies conflicts, port 8081 │
│ cassius  │ Worker      │ Tests proxies, reports to master via WireGuard     │
│ edge     │ Worker      │ Tests proxies, reports to master via WireGuard     │
│ sentinel │ Worker      │ Tests proxies, reports to master via WireGuard     │
└──────────┴─────────────┴────────────────────────────────────────────────────┘
```

### Role Separation

- **Odin (Master)**: Scrapes proxy sources and runs verification tests only. No routine testing. Local Tor only.
- **Workers**: Handle all routine proxy testing. Each uses only its local Tor (127.0.0.1:9050).

## CRITICAL: Directory Structure Differences

```
┌──────────┬─────────────────────────┬────────────────────────────────────────┐
│ Host     │ Code Location           │ Container Mount                        │
├──────────┼─────────────────────────┼────────────────────────────────────────┤
│ odin     │ /home/podman/ppf/*.py   │ Mounts ppf/ directly to /app           │
│ workers  │ /home/podman/ppf/src/   │ Mounts ppf/src/ to /app (via systemd)  │
└──────────┴─────────────────────────┴────────────────────────────────────────┘
```

**ODIN uses the root ppf/ directory. WORKERS use the ppf/src/ subdirectory.**

## Host Access

**ALWAYS use Ansible from `/opt/ansible` with the venv activated:**

```bash
cd /opt/ansible && source venv/bin/activate
```

### Quick Reference Commands

```bash
# Check worker status
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m shell -a "hostname"

# Check worker config
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m shell -a "grep -E 'threads|timeout|ssl' /home/podman/ppf/config.ini"

# Check worker logs (dynamic UID)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius -m raw \
  -a "uid=\$(id -u podman) && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman logs --tail 20 ppf-worker"

# Modify a config option
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m lineinfile -a "path=/home/podman/ppf/config.ini line='ssl_only = 1' insertafter='ssl_first'"

# Restart workers (dynamic UID discovery)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m raw \
  -a "uid=\$(id -u podman) && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman restart ppf-worker"
```

## Full Deployment Procedure

### Step 1: Validate Syntax Locally

```bash
cd /home/user/git/ppf
for f in *.py; do python3 -m py_compile "$f" && echo "OK: $f"; done
```

### Step 2: Deploy to ALL Hosts

```bash
cd /opt/ansible && source venv/bin/activate

# Deploy to ODIN (root ppf/ directory)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin -m synchronize \
  -a "src=/home/user/git/ppf/ dest=/home/podman/ppf/ rsync_opts='--include=*.py,--include=servers.txt,--exclude=*'"

# Deploy to WORKERS (ppf/src/ subdirectory)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible cassius,edge,sentinel -m synchronize \
  -a "src=/home/user/git/ppf/ dest=/home/podman/ppf/src/ rsync_opts='--include=*.py,--include=servers.txt,--exclude=*'"

# CRITICAL: Fix ownership on ALL hosts (rsync runs as the ansible user, containers need podman)
ANSIBLE_REMOTE_TMP=/tmp/.ansible ansible odin,cassius,edge,sentinel -m raw \
  -a "chown -R podman:podman /home/podman/ppf/"
```

**Note:** Ownership must be fixed after every deploy. rsync runs as the ansible user, but the containers run as the podman user. Skipping the ownership fix causes `ImportError: No module named X` errors at container startup.

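The include/exclude ordering in those `rsync_opts` matters: rsync evaluates filter rules in order, and the first rule that matches a name decides its fate. A minimal Python sketch of that first-match-wins behavior, assuming plain filenames only (real rsync also applies rules to directories; `transferred` is an illustrative helper, not part of rsync or ppf):

```python
import fnmatch

# Mirrors --include=*.py --include=servers.txt --exclude=* in order
RULES = [("include", "*.py"), ("include", "servers.txt"), ("exclude", "*")]

def transferred(name: str) -> bool:
    """Return True if rsync's filter chain would transfer this filename."""
    for action, pattern in RULES:
        if fnmatch.fnmatch(name, pattern):
            return action == "include"  # first matching rule wins
    return True  # no rule matched: rsync transfers by default

print(transferred("ppf.py"))      # True
print(transferred("config.ini"))  # False
```

This is why `config.ini` and the sqlite files survive a deploy: the trailing `--exclude=*` catches everything the two includes didn't.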
### Step 3: Restart Services

```bash
# Restart ODIN (UID 1005)
ansible odin -m raw \
  -a "cd /tmp && XDG_RUNTIME_DIR=/run/user/1005 runuser -u podman -- podman restart ppf"

# Restart WORKERS (dynamic UID discovery)
ansible cassius,edge,sentinel -m raw \
  -a "uid=\$(id -u podman) && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman restart ppf-worker"
```

### Step 4: Verify All Running

```bash
# Check odin (UID 1005)
ansible odin -m raw \
  -a "cd /tmp && XDG_RUNTIME_DIR=/run/user/1005 runuser -u podman -- podman ps"

# Check workers (dynamic UID discovery)
ansible cassius,edge,sentinel -m raw \
  -a "uid=\$(id -u podman) && sudo -u podman XDG_RUNTIME_DIR=/run/user/\$uid podman ps --format '{{.Names}} {{.Status}}'"
```

## Podman User IDs

```
┌──────────┬───────┬─────────────────────────────┐
│ Host     │ UID   │ XDG_RUNTIME_DIR             │
├──────────┼───────┼─────────────────────────────┤
│ odin     │ 1005  │ /run/user/1005              │
│ cassius  │ 993   │ /run/user/993               │
│ edge     │ 993   │ /run/user/993               │
│ sentinel │ 992   │ /run/user/992               │
└──────────┴───────┴─────────────────────────────┘
```

**Prefer dynamic UID discovery** (`uid=$(id -u podman)`) over hardcoded values.

## Configuration

### Odin config.ini

```ini
[common]
tor_hosts = 127.0.0.1:9050  # Local Tor ONLY

[watchd]
threads = 0  # NO routine testing
database = data/ppf.sqlite

[scraper]
threads = 10
```

### Worker config.ini

```ini
[common]
tor_hosts = 127.0.0.1:9050  # Local Tor ONLY

[watchd]
threads = 35
timeout = 9
ssl_first = 1  # Try SSL handshake first
ssl_only = 0   # Set to 1 to skip the secondary check on SSL failure
checktype = head  # Secondary check type: head, irc, judges
```

### Config Options

```
┌───────────────┬─────────┬────────────────────────────────────────────────────┐
│ Option        │ Default │ Description                                        │
├───────────────┼─────────┼────────────────────────────────────────────────────┤
│ ssl_first     │ 1       │ Try SSL handshake first, fall back to checktype    │
│ ssl_only      │ 0       │ Skip secondary check when SSL fails (faster)       │
│ checktype     │ head    │ Secondary check: head, irc, judges                 │
│ threads       │ 20      │ Number of test threads                             │
│ timeout       │ 15      │ Socket timeout in seconds                          │
└───────────────┴─────────┴────────────────────────────────────────────────────┘
```

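How `ssl_first`, `ssl_only`, and `checktype` interact boils down to which checks run, and in what order, for each proxy. A hedged Python sketch of those semantics (the function name and argument shape are assumptions for illustration, not ppf's actual API):

```python
def planned_checks(ssl_first=1, ssl_only=0, checktype="head"):
    """Return the ordered list of checks a worker would attempt per proxy.

    Illustrative sketch of the option semantics above -- not ppf's real code.
    """
    if ssl_first:
        if ssl_only:
            return ["ssl"]            # SSL failure is final; no fallback
        return ["ssl", checktype]     # fall back to checktype if SSL fails
    return [checktype]                # no SSL attempt at all

print(planned_checks())                              # ['ssl', 'head']
print(planned_checks(ssl_only=1))                    # ['ssl']
print(planned_checks(ssl_first=0, checktype="irc"))  # ['irc']
```

Turning on `ssl_only` is the speed knob: failed proxies cost one handshake instead of two round trips.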
## Work Distribution

Fair distribution algorithm (httpd.py):

```
fair_share = (due_proxies / active_workers) * 1.2
batch_size = clamp(fair_share, min=100, max=1000)
```

- The master calculates the batch size from the queue depth and the number of active workers
- Workers shuffle their batch locally to avoid testing the same proxies simultaneously
- Claims expire after 5 minutes if not completed

## Worker Container

Workers run as podman containers with `--restart=unless-stopped`:

```bash
podman run -d --name ppf-worker --network=host --restart=unless-stopped \
  -e PYTHONUNBUFFERED=1 \
  -v /home/podman/ppf/src:/app:ro,Z \
  -v /home/podman/ppf/data:/app/data:Z \
  -v /home/podman/ppf/config.ini:/app/config.ini:ro,Z \
  -v /home/podman/ppf/servers.txt:/app/servers.txt:ro,Z \
  localhost/ppf-worker:latest \
  python -u ppf.py --worker --server http://10.200.1.250:8081
```

## Rebuilding Images

```bash
# Workers - built from the ppf/ directory (the Dockerfile copies from src/)
ansible cassius,edge,sentinel -m raw \
  -a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf-worker:latest ."

# Odin - built from the ppf/ directory
ansible odin -m raw \
  -a "cd /home/podman/ppf && sudo -u podman podman build -t localhost/ppf:latest ."
```

## API Endpoints

```
/dashboard           Web UI with live statistics
/map                 Interactive world map
/health              Health check
/api/stats           Runtime statistics (JSON)
/api/workers         Connected worker status
/api/countries       Proxy counts by country
/api/claim-urls      Claim a URL batch for worker-driven fetching (GET)
/api/report-urls     Report URL fetch results (POST)
/api/report-proxies  Report working proxies (POST)
/proxies             Working proxies list
```

## Troubleshooting

### Missing servers.txt

Workers need `servers.txt` in src/:

```bash
ansible cassius,edge,sentinel -m copy \
  -a "src=/home/user/git/ppf/servers.txt dest=/home/podman/ppf/src/servers.txt owner=podman group=podman"
```

### Exit Code 126 (Permission/Storage)

```bash
sudo -u podman podman system reset --force
# Then rebuild the image
```

### Dashboard Shows NaN or Missing Data

Odin is likely running old code. Redeploy to odin:

```bash
ansible odin -m synchronize \
  -a "src=/home/user/git/ppf/ dest=/home/podman/ppf/ rsync_opts='--include=*.py,--include=servers.txt,--exclude=*'"
ansible odin -m raw -a "chown -R podman:podman /home/podman/ppf/"
ansible odin -m raw -a "cd /tmp && XDG_RUNTIME_DIR=/run/user/1005 runuser -u podman -- podman restart ppf"
```

### Worker Keeps Crashing

1. Check container status: `sudo -u podman podman ps -a`
2. Check logs: `sudo -u podman podman logs --tail 50 ppf-worker`
3. Verify `servers.txt` exists in src/
4. Check ownership: `ls -la /home/podman/ppf/src/`
5. Run it manually to see the error:

```bash
sudo -u podman podman run --rm --network=host \
  -v /home/podman/ppf/src:/app:ro,Z \
  -v /home/podman/ppf/data:/app/data:Z \
  -v /home/podman/ppf/config.ini:/app/config.ini:ro,Z \
  -v /home/podman/ppf/servers.txt:/app/servers.txt:ro,Z \
  localhost/ppf-worker:latest \
  python -u ppf.py --worker --server http://10.200.1.250:8081
```

## Files to Deploy

- All `*.py` files
- `servers.txt`

## Do NOT Deploy

- `config.ini` (server-specific)
- `data/` contents
- `*.sqlite` files