ppf/ROADMAP.md
docs: update roadmap, todo, and add tasklist
Restructure roadmap into phases. Clean up todo as intake buffer.
Add execution tasklist with prioritized items.
2026-02-22 13:58:37 +01:00


PPF Roadmap

Architecture

                    ┌──────────────────────────────────────────┐
                    │              Odin (Master)                │
                    │  httpd.py ─ API + SSL-only verification   │
                    │  proxywatchd.py ─ proxy recheck daemon    │
                    │  SQLite: proxies.db, websites.db          │
                    └──────────┬───────────────────────────────┘
                               │ WireGuard (10.200.1.0/24)
              ┌────────────────┼────────────────┐
              v                v                v
        ┌───────────┐   ┌───────────┐   ┌───────────┐
        │  cassius   │   │   edge    │   │ sentinel  │
        │  Worker    │   │  Worker   │   │  Worker   │
        │  ppf.py    │   │  ppf.py   │   │  ppf.py   │
        └───────────┘   └───────────┘   └───────────┘

Workers claim URLs, extract proxies, test them, and report back. The master verifies reported proxies (SSL-only), serves the API, and coordinates work distribution.

Constraints

  • Python 2.7 runtime (container-based)
  • Minimal external dependencies
  • All traffic via Tor

Phase 1: Performance and Quality (current)

Profiling-driven optimizations and source pipeline hardening.

| Item | Status | Description |
|------|--------|-------------|
| Extraction short-circuits | done | Guard clauses in fetch.py extractors |
| Skip shutdown on failed sockets | pending | Avoid 39s/session wasted on dead connections |
| SQLite connection reuse (odin) | pending | Cache per-greenlet, eliminate 2.7k opens/session |
| Lazy-load ASN database | pending | Defer 3.6s startup cost to first lookup |
| Add more seed sources (100+) | pending | Expand beyond 37 hardcoded URLs |
| Protocol-aware source weighting | pending | Prioritize SOCKS5-yielding sources |

Phase 2: Proxy Diversity and Consumer API

Address customer-reported quality gaps.

| Item | Status | Description |
|------|--------|-------------|
| ASN diversity scoring | pending | Deprioritize over-represented ASNs in testing |
| Graduated recheck intervals | pending | Fresh proxies rechecked more often than stale |
| API filters (proto/country/ASN/latency) | pending | Consumer-facing query parameters on /proxies |
| Latency-based ranking | pending | Expose latency percentiles per proxy |
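
One way to read "graduated recheck intervals" is as a tier table keyed on how long ago a proxy was last seen working. The sketch below assumes that reading; the tier boundaries, the interval values, and the names `recheck_interval` and `is_due` are all made up for illustration and would need tuning against real data.

```python
# Hypothetical tiers: (maximum age of last_seen in seconds, recheck interval in seconds).
TIERS = [
    (3600, 300),         # seen within the last hour: recheck every 5 minutes
    (86400, 1800),       # seen within the last day: recheck every 30 minutes
    (7 * 86400, 21600),  # seen within the last week: recheck every 6 hours
]
STALE_INTERVAL = 86400   # anything older: recheck once a day


def recheck_interval(now, last_seen):
    """Return the recheck interval for a proxy last seen working at last_seen."""
    age = max(0, now - last_seen)
    for max_age, interval in TIERS:
        if age <= max_age:
            return interval
    return STALE_INTERVAL


def is_due(now, last_seen, last_checked):
    """True when enough time has passed since the last check for this tier."""
    return now - last_checked >= recheck_interval(now, last_seen)
```

With these assumed values, a proxy seen 100 seconds ago is rechecked every 300 seconds, while one last seen months ago is only retried daily.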

Phase 3: Self-Expanding Source Pool

Worker-driven link discovery from productive pages.

| Item | Status | Description |
|------|--------|-------------|
| Link extraction from productive pages | pending | Parse HTML for links when page yields proxies |
| Report discovered URLs to master | pending | New endpoint for worker URL submissions |
| Conditional discovery | pending | Only extract links from confirmed-productive pages |
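
A minimal sketch of conditional link discovery, using only the standard library so it fits the minimal-dependency constraint. The names `LinkCollector` and `discover_links` and the `min_proxies` threshold are hypothetical; how discovered URLs would be submitted to the master is not shown because that endpoint does not exist yet.

```python
try:
    from html.parser import HTMLParser   # Python 3
    from urllib.parse import urljoin
except ImportError:
    from HTMLParser import HTMLParser    # Python 2.7
    from urlparse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute http(s) links from anchor tags, in document order."""

    def __init__(self, base_url):
        HTMLParser.__init__(self)
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                url = urljoin(self.base_url, value)
                # Keep only http(s) targets and drop duplicates.
                if url.startswith(("http://", "https://")) and url not in self.links:
                    self.links.append(url)


def discover_links(base_url, html, proxies_found, min_proxies=1):
    """Extract links only when the page was productive (yielded proxies)."""
    if proxies_found < min_proxies:
        return []
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

Gating on `proxies_found` keeps the crawl from fanning out across unproductive pages, which is the point of the "conditional discovery" item.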

Phase 4: Long-Term

| Item | Status | Description |
|------|--------|-------------|
| Python 3 migration | deferred | Unblocks modern deps, security patches, pyasn native |
| Worker trust scoring | pending | Activate spot-check verification framework |
| Dynamic target pool | pending | Auto-discover and rotate validation targets |
| Geographic target spread | pending | Ensure targets span multiple regions |

Completed

| Item | Date | Description |
|------|------|-------------|
| last_seen freshness fix | 2026-02-22 | Watchd updates last_seen on verification |
| Periodic re-seeding | 2026-02-22 | Reset errored sources every 6h |
| ASN enrichment | 2026-02-22 | Pure-Python ipasn.dat reader + backfill |
| URL pipeline stats | 2026-02-22 | /api/stats exposes source health metrics |
| Extraction short-circuits | 2026-02-22 | Guard clauses + precompiled table regexes |
| Target health tracking | prior | Cooldown-based health for all target pools |
| MITM field in proxy list | prior | Expose mitm boolean in JSON endpoints |
| V1 worker protocol removal | prior | Cleaned up legacy --worker code path |

File Reference

| File | Purpose |
|------|---------|
| ppf.py | URL harvester, worker main loop |
| proxywatchd.py | Proxy validation daemon |
| fetch.py | HTTP fetching, proxy extraction |
| httpd.py | API server, worker coordination |
| dbs.py | Database schema, seed sources |
| config.py | Configuration management |
| rocksock.py | Socket/proxy abstraction |
| http2.py | HTTP client implementation |
| tools/ppf-deploy | Deployment wrapper |