From d09f6d5e08ab2311373a1f1a5e65e716ee631760 Mon Sep 17 00:00:00 2001 From: Username Date: Thu, 25 Dec 2025 11:14:27 +0100 Subject: [PATCH] docs: update roadmap and todo --- ROADMAP.md | 23 ++++ TODO.md | 320 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 342 insertions(+), 1 deletion(-) diff --git a/ROADMAP.md b/ROADMAP.md index b850dc7..e8e4b3a 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -306,6 +306,29 @@ PPF (Proxy Fetcher) is a Python 2 proxy scraping and validation framework design - [x] SQLite ANALYZE/VACUUM functions for query optimization - [x] Lightweight design: client-side polling, minimal DOM updates +### Dashboard Enhancements v3 (Done) +- [x] Electric cyan theme with translucent glass-morphism effects +- [x] Unified wrapper styling (.chart-wrap, .histo-wrap, .stats-wrap, .lb-wrap, .pie-wrap) +- [x] Consistent backdrop-filter blur and electric glow borders +- [x] Tor Exit Nodes cards with hover effects (.tor-card) +- [x] Lighter background/tile color scheme (#1e2738 bg, #181f2a card) +- [x] Map endpoint restyled to match dashboard (electric cyan theme) +- [x] Map markers updated from gold to cyan for approximate locations + +### Memory Profiling & Analysis (Done) +- [x] /api/memory endpoint with comprehensive memory stats +- [x] objgraph integration for object type counting +- [x] pympler integration for memory summaries +- [x] Memory sample history tracking (RSS over time) +- [x] Process memory from /proc/self/status (VmRSS, VmPeak, VmData, etc.) 
+- [x] GC statistics (collections, objects, thresholds) + +### MITM Detection Optimization (Done) +- [x] MITM re-test skip optimization - avoid redundant SSL checks for known MITM proxies +- [x] mitm_retest_skipped stats counter for tracking optimization effectiveness +- [x] Content hash deduplication for stale proxy list detection +- [x] stale_count reset when content hash changes + --- ## Technical Debt diff --git a/TODO.md b/TODO.md index 1ea0c04..8ec2644 100644 --- a/TODO.md +++ b/TODO.md @@ -460,15 +460,88 @@ above target the remaining 31.3% of CPU-bound operations. ### [ ] Dashboard Feature Ideas **Low priority - consider when time permits:** +- [x] Geographic map visualization - /map endpoint with Leaflet.js - [ ] Dark/light theme toggle - [ ] Export stats as CSV/JSON from dashboard - [ ] Historical graphs (24h, 7d) using stats_history table - [ ] Per-ASN performance analysis -- [ ] Geographic map visualization (requires JS library) - [ ] Alert thresholds (success rate < X%, MITM detected) - [ ] Mobile-responsive improvements - [ ] Keyboard shortcuts (r=refresh, t=toggle sections) +### [ ] Local JS Library Serving + +**Goal:** Serve all JavaScript libraries locally instead of CDN for reliability and offline use. 
+ +**Current CDN dependencies:** +- Leaflet.js 1.9.4 (map) - https://unpkg.com/leaflet@1.9.4/ + +**Implementation:** +- [ ] Bundle libraries into container image +- [ ] Serve from /static/lib/ endpoint +- [ ] Update HTML to reference local paths + +**Candidate libraries for future enhancements:** + +``` +┌─────────────────┬─────────┬───────────────────────────────────────────────┐ +│ Library │ Size │ Use Case +├─────────────────┼─────────┼───────────────────────────────────────────────┤ +│ Chart.js │ 65 KB │ Line/bar/pie charts (simpler API than D3) +│ uPlot │ 15 KB │ Fast time-series charts (minimal, performant) +│ ApexCharts │ 125 KB │ Modern charts with animations +│ Frappe Charts │ 25 KB │ Simple, modern SVG charts +│ Sparkline │ 2 KB │ Tiny inline charts (already have custom impl) +├─────────────────┼─────────┼───────────────────────────────────────────────┤ +│ D3.js │ 85 KB │ Full control, complex visualizations +│ D3-geo │ 30 KB │ Geographic projections (alternative to Leaflet) +├─────────────────┼─────────┼───────────────────────────────────────────────┤ +│ Leaflet │ 40 KB │ Interactive maps (already using) +│ Leaflet.heat │ 5 KB │ Heatmap layer for proxy density +│ Leaflet.cluster │ 10 KB │ Marker clustering for many points +└─────────────────┴─────────┴───────────────────────────────────────────────┘ + +Recommendations: + ● uPlot - Best for time-series (rate history, success rate history) + ● Chart.js - Best for pie/bar charts (failure breakdown, protocol stats) + ● Leaflet - Keep for maps, add heatmap plugin for density viz +``` + +**Current custom implementations (no library):** +- Sparkline charts (Test Rate History, Success Rate History) - inline SVG +- Histogram bars (Response Time Distribution) - CSS divs +- Pie charts (Failure Breakdown, Protocol Stats) - CSS conic-gradient + +**Decision:** Current custom implementations are lightweight and sufficient. +Add libraries only when custom becomes unmaintainable or new features needed. 
+ +### [ ] Memory Optimization Candidates + +**Based on memory analysis (production metrics):** +``` +Current State (260k queue): + Start RSS: 442 MB + Current RSS: 1,615 MB + Per-job: ~4.5 KB overhead + +Object Distribution: + 259,863 TargetTestJob (1 per job) + 259,863 ProxyTestState (1 per job) + 259,950 LockType (1 per job - threading locks) + 523,395 dict (2 per job - state + metadata) + 522,807 list (2 per job - results + targets) +``` + +**Potential optimizations (not yet implemented):** +- [ ] Lock consolidation - reduce per-proxy locks (260k LockType objects) +- [ ] Leaner state objects - reduce dict/list count per job +- [ ] Slot-based classes - use `__slots__` on hot objects +- [ ] Object pooling - reuse ProxyTestState/TargetTestJob objects + +**Verdict:** Memory scales linearly with queue (~4.5 KB/job). No leaks detected. +Current usage acceptable for production workloads. Optimize only if memory +becomes a constraint. + --- ## Completed @@ -553,3 +626,248 @@ above target the remaining 31.3% of CPU-bound operations. 
err_msg = type(e).__name__ ``` - Handles Korean/CJK characters in search queries without crashing + +### [x] Interactive World Map (/map endpoint) +- Added Leaflet.js interactive map showing proxy distribution by country +- Modern glassmorphism UI with `backdrop-filter: blur(12px)` +- CartoDB dark tiles for dark theme +- Circle markers sized proportionally to proxy count per country +- Hover effects with smooth transitions +- Stats overlay showing total countries/proxies +- Legend with proxy count scale +- Country coordinates and names lookup tables + +### [x] Dashboard v3 - Electric Cyan Theme +- Translucent glass-morphism effects with `backdrop-filter: blur()` +- Electric cyan glow borders `rgba(56,189,248,...)` on all graph wrappers +- Gradient overlays using `::before` pseudo-elements +- Unified styling across: .chart-wrap, .histo-wrap, .stats-wrap, .lb-wrap, .pie-wrap +- New .tor-card wrapper for Tor Exit Nodes with hover effects +- Lighter background color scheme (#1e2738 bg, #181f2a card) + +### [x] Map Endpoint Styling Update +- Converted from gold/bronze theme (#c8b48c) to electric cyan (#38bdf8) +- Glass panels with electric glow matching dashboard +- Map markers for approximate locations now cyan instead of gold +- Unified map_bg color with dashboard background (#1e2738) +- Updated Leaflet controls, popups, and legend to cyan theme + +### [x] MITM Re-test Optimization +- Skip redundant SSL checks for proxies already known to be MITM +- Added `mitm_retest_skipped` counter to Stats class +- Optimization in `_try_ssl_check()` checks existing MITM flag before testing +- Avoids 6k+ unnecessary re-tests per session (based on production metrics) + +### [x] Memory Profiling Endpoint +- /api/memory endpoint with comprehensive memory analysis +- objgraph integration for object type distribution +- pympler integration for memory summaries +- Memory sample history tracking (RSS over time) +- Process memory from /proc/self/status +- GC statistics and collection 
counts + +--- + +## Deployment Troubleshooting Log + +### [x] Container Crash on Startup (2025-12-24) + +**Symptoms:** +- Container starts then immediately disappears +- `podman ps` shows no running containers +- `podman logs ppf` returns "no such container" +- Port 8081 not listening + +**Debugging Process:** + +1. **Initial diagnosis** - SSH to odin, checked container state: + ```bash + sudo -u podman podman ps -a # Empty + sudo ss -tlnp | grep 8081 # Nothing listening + ``` + +2. **Ran container in foreground** to capture output: + ```bash + sudo -u podman bash -c 'cd /home/podman/ppf && \ + timeout 25 podman run --rm --name ppf --network=host \ + -v ./src:/app:ro -v ./data:/app/data \ + -v ./config.ini:/app/config.ini:ro \ + localhost/ppf python2 -u proxywatchd.py 2>&1' + ``` + +3. **Found the error** in httpd thread startup: + ``` + error: [Errno 98] Address already in use: ('0.0.0.0', 8081) + ``` + Container started, httpd failed to bind, process continued but HTTP unavailable. + +4. **Identified root cause** - orphaned processes from previous debug attempts: + ```bash + ps aux | grep -E "[p]pf|[p]roxy" + # Found: python2 ppf.py (PID 6421) still running, holding port 8081 + # Found: conmon, timeout, bash processes from stale container + ``` + +5. **Why orphans existed:** + - Previous `timeout 15 podman run` commands timed out + - `podman rm -f` doesn't kill processes when container metadata is corrupted + - Orphaned python2 process kept running with port bound + +**Root Cause:** +Stale container processes from interrupted debug sessions held port 8081. +The container started successfully but httpd thread failed to bind, +causing silent failure (no HTTP endpoints) while proxy testing continued. 
+ +**Fix Applied:** +```bash +# Force kill all orphaned processes +sudo pkill -9 -f "ppf.py" +sudo pkill -9 -f "proxywatchd.py" +sudo pkill -9 -f "conmon.*ppf" +sleep 2 + +# Verify port is free +sudo ss -tlnp | grep 8081 # Should show nothing + +# Clean podman state +sudo -u podman podman rm -f -a +sudo -u podman podman container prune -f + +# Start fresh +sudo -u podman bash -c 'cd /home/podman/ppf && \ + podman run -d --rm --name ppf --network=host \ + -v ./src:/app:ro -v ./data:/app/data \ + -v ./config.ini:/app/config.ini:ro \ + localhost/ppf python2 -u proxywatchd.py' +``` + +**Verification:** +```bash +curl -sf http://localhost:8081/health +# {"status": "ok", "timestamp": 1766573885} +``` + +**Prevention:** +- Use `podman-compose` for reliable container management +- Use `pkill -9 -f` to kill orphaned processes before restart +- Check port availability before starting: `ss -tlnp | grep 8081` +- Run the container in the foreground first to capture startup errors + +**Correct Deployment Procedure:** +```bash +# As root or with sudo +sudo -i -u podman bash +cd /home/podman/ppf +podman-compose down +podman-compose up -d +podman ps +podman logs -f ppf +``` + +**docker-compose.yml (updated):** +```yaml +version: '3.8' + +services: + ppf: + image: localhost/ppf:latest + container_name: ppf + network_mode: host + volumes: + - ./src:/app:ro + - ./data:/app/data + - ./config.ini:/app/config.ini:ro + command: python2 -u proxywatchd.py + restart: unless-stopped + environment: + - PYTHONUNBUFFERED=1 +``` + +--- + +### [x] SSH Connection Flooding / fail2ban (2025-12-24) + +**Symptoms:** +- SSH connections timing out or reset +- "Connection refused" errors +- Intermittent access to odin + +**Root Cause:** +Multiple individual SSH commands triggered fail2ban rate limiting. + +**Fix Applied:** +Created `~/.claude/rules/ssh-usage.md` with batching best practices. 
+ +**Key Pattern:** +```bash +# BAD: 3 separate connections +ssh host 'cmd1' +ssh host 'cmd2' +ssh host 'cmd3' + +# GOOD: 1 connection, all commands +ssh host bash <<'EOF' +cmd1 +cmd2 +cmd3 +EOF +``` + +--- + +### [!] Podman Container Metadata Disappears (2025-12-24) + +**Symptoms:** +- `podman ps -a` shows empty even though process is running +- `podman logs ppf` returns "no such container" +- Port is listening and service responds to health checks + +**Observed Behavior:** +``` +# Container starts +podman run -d --name ppf ... +# Returns container ID: dc55f0a218b7... + +# Immediately after +podman ps -a # Empty! +ss -tlnp | grep 8081 # Shows python2 listening +curl localhost:8081/health # {"status": "ok"} +``` + +**Analysis:** +- The process runs correctly inside the container namespace +- Container metadata in podman's database is lost/corrupted +- May be related to `--rm` flag interaction with detached mode +- Rootless podman with overlayfs can have state sync issues + +**Workaround:** +Service works despite missing metadata. Monitor via: +- `ss -tlnp | grep 8081` - port listening +- `ps aux | grep proxywatchd` - process running +- `curl localhost:8081/health` - service responding + +**Impact:** Low. Service functions correctly. Only `podman logs` unavailable. 
+ +--- + +### Container Debugging Checklist + +When container fails to start or crashes: + +``` +┌───┬─────────────────────────────────────────────────────────────────────────┐ +│ 1 │ Check for orphans: ps aux | grep -E "[p]rocess_name" +│ 2 │ Check port conflicts: ss -tlnp | grep PORT +│ 3 │ Run foreground: podman run --rm (no -d) to see output +│ 4 │ Check podman state: podman ps -a +│ 5 │ Clean stale: pkill -9 -f "pattern" && podman rm -f -a +│ 6 │ Verify deps: config files, data dirs, volumes exist +│ 7 │ Check logs: podman logs container_name 2>&1 | tail -50 +│ 8 │ Health check: curl -sf http://localhost:PORT/health +└───┴─────────────────────────────────────────────────────────────────────────┘ + +Note: If podman ps shows empty but port is listening and health check passes, +the service is running correctly despite metadata issues. See "Podman Container +Metadata Disappears" section above. +``` +- Dashboard: pause API polling for inactive tabs (only update persistent items + active tab)