docs: add README and update ROADMAP

- README.md: installation, configuration, usage, deployment
- ROADMAP.md: mark completed items (pooling, scaling, latency, containers)
- priority matrix updated with completion status
51
ROADMAP.md
@@ -177,18 +177,18 @@ PPF (Proxy Fetcher) is a Python 2 proxy scraping and validation framework design
├──────────────────────────┬──────────────────────────────────────────────────┤
│ HIGH IMPACT / LOW EFFORT │ HIGH IMPACT / HIGH EFFORT                        │
│                          │                                                  │
│ ● Unify _known_proxies   │ ● Connection pooling                             │
│ ● Graceful DB errors     │ ● Dynamic thread scaling                         │
│ ● Batch inserts          │ ● Unit test infrastructure                       │
│ ● WAL mode for SQLite    │ ● Latency tracking                               │
│ [x] Unify _known_proxies │ [x] Connection pooling                           │
│ [x] Graceful DB errors   │ [x] Dynamic thread scaling                       │
│ [x] Batch inserts        │ [ ] Unit test infrastructure                     │
│ [x] WAL mode for SQLite  │ [x] Latency tracking                             │
│                          │                                                  │
├──────────────────────────┼──────────────────────────────────────────────────┤
│ LOW IMPACT / LOW EFFORT  │ LOW IMPACT / HIGH EFFORT                         │
│                          │                                                  │
│ ● Standardize logging    │ ● Geographic validation                          │
│ ● Config validation      │ ● Additional scrapers                            │
│ ● Export functionality   │ ● API sources                                    │
│ ● Status output          │ ● Protocol fingerprinting                        │
│ [x] Standardize logging  │ [ ] Geographic validation                        │
│ [x] Config validation    │ [x] Additional scrapers                          │
│ [ ] Export functionality │ [ ] API sources                                  │
│ [x] Status output        │ [x] Protocol fingerprinting                      │
│                          │                                                  │
└──────────────────────────┴──────────────────────────────────────────────────┘
```
@@ -233,6 +233,41 @@ PPF (Proxy Fetcher) is a Python 2 proxy scraping and validation framework design
- [x] Stale proxy cleanup (cleanup_stale() with configurable stale_days)
- [x] Timeout config options (timeout_connect, timeout_read)

### Connection Pooling (Done)
- [x] TorHostState class tracking per-host health and latency
- [x] TorConnectionPool with worker affinity for circuit reuse
- [x] Exponential backoff (5s, 10s, 20s, 40s, max 60s) on failures
- [x] Pool warmup and health status reporting
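
The backoff schedule listed above (5s, 10s, 20s, 40s, capped at 60s) can be sketched as a small pure function. This is an illustration only: `backoff_delay` and its `base`/`cap` parameters are hypothetical names, not taken from the PPF source, and the real `TorHostState` may compute its delay differently.

```python
def backoff_delay(failures, base=5.0, cap=60.0):
    """Seconds to wait after `failures` consecutive failures.

    Hypothetical helper: the name and parameters are assumptions made
    for this sketch, not the project's actual API.
    """
    if failures <= 0:
        return 0.0
    # Double the delay for each additional failure, starting at `base`,
    # and never exceed `cap`.
    return min(base * (2 ** (failures - 1)), cap)


# Delays after 1..6 consecutive failures.
delays = [backoff_delay(n) for n in range(1, 7)]
print(delays)
```

The fifth consecutive failure would give 80s uncapped, so it clamps to the 60s maximum and stays there.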

### Priority Queue (Done)
- [x] PriorityJobQueue class with heap-based ordering
- [x] calculate_priority() assigns priority 0-4 by proxy state
- [x] New proxies tested first, high-fail proxies last
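
A minimal sketch of heap-based ordering with priorities 0-4 follows. The class and function names are borrowed from the list above, but the signatures, the failure thresholds, and the lack of locking are assumptions made for illustration; the actual implementation may differ on all three.

```python
import heapq
import itertools


def calculate_priority(tested, failed):
    """Map proxy state to a priority 0-4 (lower is tested sooner).

    `tested` is the last test timestamp (None if never tested) and
    `failed` the consecutive failure count. Thresholds are assumed.
    """
    if tested is None:
        return 0  # new proxy: test first
    if failed == 0:
        return 1
    if failed < 3:
        return 2
    if failed < 5:
        return 3
    return 4  # high-fail proxy: test last


class PriorityJobQueue(object):
    """Heap-ordered job queue (sketch; not thread-safe)."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so equal-priority jobs stay FIFO
        # and the jobs themselves are never compared.
        self._counter = itertools.count()

    def put(self, job, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def get(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


q = PriorityJobQueue()
q.put("high-fail", calculate_priority(tested=1700000000, failed=6))
q.put("new", calculate_priority(tested=None, failed=0))
q.put("healthy", calculate_priority(tested=1700000000, failed=0))
order = [q.get() for _ in range(3)]
print(order)
```

A shared queue used by worker threads would additionally need a lock or condition variable around `put` and `get`.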

### Dynamic Thread Scaling (Done)
- [x] ThreadScaler class adjusts thread count dynamically
- [x] Scales up when queue deep and success rate acceptable
- [x] Scales down when queue shallow or success rate drops
- [x] Respects min/max bounds with cooldown period
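
The scaling rules above could look roughly like this. Only the class name comes from the roadmap; the constructor parameters, the thresholds, the step size, and the injectable `clock` are all hypothetical choices for this sketch.

```python
import time


class ThreadScaler(object):
    """Decide a target thread count from queue depth and success rate."""

    def __init__(self, min_threads=2, max_threads=50, cooldown=30.0,
                 step=2, deep_queue=100, shallow_queue=10,
                 min_success=0.2, clock=time.time):
        self.min_threads = min_threads
        self.max_threads = max_threads
        self.cooldown = cooldown
        self.step = step
        self.deep_queue = deep_queue
        self.shallow_queue = shallow_queue
        self.min_success = min_success
        self.clock = clock
        self._last_change = None

    def target(self, current, queue_depth, success_rate):
        now = self.clock()
        # Hold steady while the cooldown from the last change is active.
        if (self._last_change is not None
                and now - self._last_change < self.cooldown):
            return current
        if queue_depth >= self.deep_queue and success_rate >= self.min_success:
            wanted = current + self.step
        elif queue_depth <= self.shallow_queue or success_rate < self.min_success:
            wanted = current - self.step
        else:
            wanted = current
        # Clamp to the configured bounds.
        wanted = max(self.min_threads, min(self.max_threads, wanted))
        if wanted != current:
            self._last_change = now
        return wanted


# Drive the scaler with a fake clock so the cooldown is visible.
ticks = [0.0]
scaler = ThreadScaler(max_threads=10, clock=lambda: ticks[0])
results = [scaler.target(4, 500, 0.5)]            # deep queue: scale up
results.append(scaler.target(results[-1], 500, 0.5))  # within cooldown
ticks[0] = 31.0
results.append(scaler.target(results[-1], 5, 0.5))    # shallow queue
ticks[0] = 62.0
results.append(scaler.target(results[-1], 500, 0.05)) # success rate dropped
print(results)
```

Injecting the clock keeps the cooldown logic testable without sleeping.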

### Latency Tracking (Done)
- [x] avg_latency, latency_samples columns in proxylist
- [x] Exponential moving average calculation
- [x] Migration function for existing databases
- [x] Latency recorded for successful proxy tests
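
As an illustration of the items above, the sketch below adds the two columns to an existing `proxylist` table and updates an exponential moving average. The table and column names come from the roadmap; the column types, the `alpha` smoothing factor, and both function names are assumptions, and the real schema has more columns than shown.

```python
import sqlite3


def migrate_latency_columns(conn):
    """Add the latency columns if missing (safe to run repeatedly)."""
    existing = set(row[1] for row in conn.execute("PRAGMA table_info(proxylist)"))
    if "avg_latency" not in existing:
        conn.execute("ALTER TABLE proxylist ADD COLUMN avg_latency REAL DEFAULT 0")
    if "latency_samples" not in existing:
        conn.execute("ALTER TABLE proxylist ADD COLUMN latency_samples INTEGER DEFAULT 0")
    conn.commit()


def update_latency(avg_latency, latency_samples, new_latency, alpha=0.3):
    """Return (new_average, new_sample_count) using an EMA."""
    if not latency_samples:
        # The first sample seeds the average.
        return new_latency, 1
    avg = alpha * new_latency + (1.0 - alpha) * avg_latency
    return avg, latency_samples + 1


# Demo against an in-memory database with a deliberately tiny schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proxylist (proxy TEXT PRIMARY KEY)")
migrate_latency_columns(conn)
migrate_latency_columns(conn)  # second run is a no-op
columns = [row[1] for row in conn.execute("PRAGMA table_info(proxylist)")]

first = update_latency(0.0, 0, 200.0)
second = update_latency(first[0], first[1], 100.0, alpha=0.5)
print(columns, first, second)
```

A larger `alpha` weights recent measurements more heavily; a smaller one smooths out spikes.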

### Container Support (Done)
- [x] Dockerfile with Python 2.7-slim base
- [x] docker-compose.yml for local development
- [x] Rootless podman deployment documentation
- [x] Volume mounts for persistent data

### Code Style (Done)
- [x] Normalized indentation (4-space, no tabs)
- [x] Removed dead code and unused imports
- [x] Added docstrings to classes and functions
- [x] Python 2/3 compatible imports (Queue/queue)
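
The Queue/queue item refers to the module rename between Python 2 and Python 3. A common way to support both, shown here as a sketch rather than the project's exact code, is a guarded import; the alias `queue` and the sample address are illustrative choices.

```python
try:
    import Queue as queue  # Python 2 module name
except ImportError:
    import queue  # Python 3 module name

# The rest of the code can use one name on either interpreter.
jobs = queue.Queue()
jobs.put("192.0.2.1:8080")
first_job = jobs.get_nowait()
print(first_job)
```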

---

## Technical Debt