Username
82c909d7c0
rename --worker-v2 to --worker
No V1 means no need for the suffix. Update flag, function name,
compose command, log messages, and docs.
2026-02-17 22:30:09 +01:00
Username
2782e6d754
ppf: remove V1 worker functions and main loop
Drop worker_get_work(), worker_submit_results(), and the entire
worker_main() V1 loop. Rewire --register to use worker_v2_main().
2026-02-17 22:10:38 +01:00
Username
dfcd8f0c00
add test provenance columns and worker report fields
Add last_check/last_target columns to proxylist schema with migration.
Include checktype and target in V2 worker report payload.
2026-02-17 21:06:21 +01:00
Username
e74782ad3f
ppf: fix worker_id undefined when using --worker-key
2026-02-17 16:15:04 +01:00
Username
c710555aad
ppf: pass url scoring config to httpd module
2026-02-17 15:20:15 +01:00
Username
862eeed5c8
ppf: add worker_v2_main() for URL-driven discovery
2026-02-17 14:23:58 +01:00
Username
0685c2bc4c
ppf: add HTTP client functions for V2 worker endpoints
2026-02-17 14:23:44 +01:00
Username
5197c3b7e6
httpd: pass url database to api server
2026-02-17 13:42:01 +01:00
Username
2156319bad
ppf: worker heartbeat includes thread count
2026-01-08 09:05:30 +01:00
Username
c1ec5d593b
worker: check tor every 30s instead of exponential backoff
2025-12-28 18:41:05 +01:00
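The change from exponential backoff to a fixed 30-second check can be illustrated with a minimal sketch. The names `wait_for_tor`, `is_tor_up`, and the injectable `sleep` parameter are hypothetical and not taken from ppf.py; the actual Tor probe is left to the caller.

```python
import time


def wait_for_tor(is_tor_up, interval=30, sleep=time.sleep):
    """Poll a Tor probe at a fixed interval until it reports success.

    is_tor_up is a caller-supplied callable returning True or False
    (hypothetical; the real check in ppf.py is not shown in this log).
    Returns the number of failed checks before Tor came up.
    """
    failed = 0
    while not is_tor_up():
        failed += 1
        # Fixed delay: every retry waits the same 30 s, so the wait
        # never grows the way an exponential backoff would.
        sleep(interval)
    return failed
```

A fixed interval bounds how long a worker stays idle after Tor recovers, whereas an exponential backoff can leave it waiting for many minutes.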
Username
2bc00d3ebd
worker: check tor before claiming work
CI / syntax-check (push) Successful in 3s
CI / memory-leak-check (push) Successful in 11s
2025-12-28 16:09:40 +01:00
Username
f4286ea515
ppf: remove num_targets param (removed in phase 2)
CI / syntax-check (push) Successful in 3s
CI / memory-leak-check (push) Successful in 11s
2025-12-28 15:16:52 +01:00
Username
d219cc567f
phase 2: code cleanup and simplification
CI / syntax-check (push) Successful in 3s
CI / memory-leak-check (push) Successful in 11s
- Remove unused result_queue from WorkerThread and worker mode
- Remove num_targets abstraction, simplify to single-target mode
- Add _db_context() context manager for database connections
- Refactor 5 call sites to use context manager (finish, init, cleanup_stale, periodic saves)
- Mark _prep_db/_close_db as deprecated
- Add __version__ = '2.0.0' to ppf.py
- Add thread spawn stagger (0-100ms) in worker mode for Tor-friendly startup
2025-12-28 14:31:37 +01:00
Username
72a2dcdaf4
ppf: add worker mode with distributed testing
CI / syntax-check (push) Successful in 3s
CI / memory-leak-check (push) Successful in 11s
- Add --worker mode for distributed proxy testing
- Workers claim batches from manager, test via local Tor, submit results
- Add --register to register new workers with manager
- Add thread spawn stagger (0-100ms) to avoid overwhelming Tor
- Verify Tor connectivity before claiming work
- Add heartbeat and batch timeout handling
- Track worker profiling state for dashboard display
2025-12-28 14:12:59 +01:00
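The thread spawn stagger (0-100ms) mentioned above could be sketched as follows. `spawn_staggered` and its parameters are hypothetical names for illustration, not functions from ppf.py.

```python
import random
import threading
import time


def spawn_staggered(target, count, max_delay=0.1, sleep=time.sleep):
    """Start count threads, pausing a random 0..max_delay seconds after each.

    Spreading the starts out avoids having every thread open a Tor
    circuit at the same instant. All names here are illustrative.
    """
    threads = []
    for index in range(count):
        thread = threading.Thread(target=target, args=(index,))
        thread.daemon = True
        thread.start()
        threads.append(thread)
        sleep(random.uniform(0, max_delay))
    return threads
```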
Username
7232846b0f
ppf: add --reset flag to clear all state
2025-12-26 20:57:15 +01:00
Username
a20b5525f0
ppf: handle confidence field in proxy tuples
2025-12-26 19:34:22 +01:00
Username
269fed55ff
refactor core modules, integrate network stats
2025-12-25 11:13:20 +01:00
Username
9360c35add
ppf: add format_duration helper and stale log improvements
- Add format_duration() for compact time display
- Improve stale proxy logging with duration info
2025-12-24 00:20:13 +01:00
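A compact duration formatter in the spirit of `format_duration()` might look like this. The exact output format used by ppf.py is not shown in this log, so the two-unit style below is an assumption.

```python
def format_duration(seconds):
    """Render a duration compactly using at most two units (assumed format)."""
    seconds = int(seconds)
    if seconds < 60:
        return '%ds' % seconds
    minutes, secs = divmod(seconds, 60)
    if minutes < 60:
        return '%dm%02ds' % (minutes, secs)
    hours, minutes = divmod(minutes, 60)
    if hours < 24:
        return '%dh%02dm' % (hours, minutes)
    days, hours = divmod(hours, 24)
    return '%dd%02dh' % (days, hours)
```

For example, `format_duration(125)` gives `'2m05s'` and `format_duration(3700)` gives `'1h01m'`.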
Username
68a34f2638
fetch: detect proxy protocol from source URL path
- detect_proto_from_path() infers socks4/socks5/http from URL
- extract_proxies() now returns (address, proto) tuples
- ppf.py updated to handle protocol-tagged proxies
- profiler signal handler for SIGTERM stats dump
2025-12-23 17:23:17 +01:00
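Inferring the protocol from a source URL path, as `detect_proto_from_path()` is described above, could work roughly as sketched here. The substring matching rule and the `None` fallback are assumptions; the real heuristics may differ.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2


def detect_proto_from_path(url):
    """Guess socks5/socks4/http from the URL path, or None if no hint.

    Only the path is inspected, so an https:// scheme alone does not
    count as an 'http' hint. The matching rule is an assumption.
    """
    path = urlparse(url).path.lower()
    for proto in ('socks5', 'socks4', 'http'):
        if proto in path:
            return proto
    return None
```

A caller could then tag every address scraped from that URL with the detected protocol, yielding (address, proto) tuples.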
Username
267035802a
ppf: reset stale_count when content hash changes
2025-12-22 00:05:06 +01:00
Username
f382a4ab6a
ppf: add content hash for duplicate proxy list detection
2025-12-22 00:03:12 +01:00
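The content-hash idea, together with the stale_count reset from the commit listed just above, can be sketched as below. The choice of SHA-256, the function names, and the rule that an unchanged hash increments the counter are all assumptions for illustration.

```python
import hashlib


def content_hash(body):
    """Return a hex digest of a fetched page body (SHA-256 is an assumption)."""
    if not isinstance(body, bytes):
        body = body.encode('utf-8')
    return hashlib.sha256(body).hexdigest()


def update_stale_count(prev_hash, stale_count, body):
    """Return (new_hash, new_stale_count) for a freshly fetched body.

    Identical content counts as one more stale fetch; changed content
    resets the counter to 0. This policy is a guess, not taken from ppf.py.
    """
    new_hash = content_hash(body)
    if new_hash == prev_hash:
        return new_hash, stale_count + 1
    return new_hash, 0
```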
Username
747e6dd7aa
ppf: improve exception handling and logging
2025-12-21 23:37:57 +01:00
Username
e24f68500c
style: normalize indentation and improve code style
- convert tabs to 4-space indentation
- add docstrings to modules and classes
- remove unused import (copy)
- use explicit object inheritance
- use 'while True' over 'while 1'
- use 'while args' over 'while len(args)'
- use '{}' over 'dict()'
- consistent string formatting
- Python 2/3 compatible Queue import
2025-12-20 23:18:45 +01:00
Username
4780b6f095
fetch: consolidate extract_proxies into single implementation
2025-12-20 22:50:39 +01:00
Username
c759f7197e
ppf: use shared proxy cache from fetch module
2025-12-20 22:28:42 +01:00
Username
1d865d5250
ppf: use soup_parser instead of direct bs4 import
2025-12-20 17:33:40 +01:00
Username
57a7687b08
ppf: remove dead http server code
2025-12-20 16:46:08 +01:00
Your Name
15ff16b8d6
force py2 usage
2021-10-30 07:13:04 +02:00
Your Name
ee481ea31e
ppf: make scraper use extra proxies if available
2021-07-27 22:36:15 +02:00
Your Name
6b6cd94cec
spaces to tabs
2021-06-27 12:31:15 +02:00
Your Name
d3d83e1d90
changes
2021-05-12 08:06:03 +02:00
Your Name
cae6f75643
changes
2021-05-02 00:22:12 +02:00
Your Name
1a4d51f08c
ppf: play nice with cpu
2021-02-10 22:26:27 +01:00
Your Name
60c78be3fb
import new url as bulk list, misc cleansing
2021-02-06 23:25:12 +01:00
Your Name
7e91ae5237
changes
2021-02-06 21:50:08 +01:00
Your Name
68394da9ab
misc changes and fixes
2021-02-06 15:36:14 +01:00
Your Name
b29c734002
fix: url → self.url, make thread option configurable
2021-02-06 14:33:44 +01:00
Your Name
5965312a9a
make leeching multithreaded, misc changes
2021-02-06 14:30:07 +01:00
Your Name
dd3d3c3518
fix: always check if is_bad_url
2021-02-06 12:20:34 +01:00
Your Name
01bded472f
tabs to space
2021-02-06 12:14:22 +01:00
Your Name
78b29a1187
some changes
2021-01-24 03:52:56 +01:00
Mickaël Serneels
eeedf9d0a1
extract urls only from the same domain? (default: False)
setting this option will make ppf not follow external links when extracting uris
2019-05-14 21:24:29 +02:00
Mickaël Serneels
b226bc0b03
check if bad url *after* building the url
2019-05-14 19:31:19 +02:00
Mickaël Serneels
eeae849e12
space2tab
2019-05-14 19:29:30 +02:00
Mickaël Serneels
bcaf7af0e7
extract_urls(): only when stale_count = 0
2019-05-13 23:49:35 +02:00
Mickaël Serneels
e2122a27d9
ppf: strip extracted uris
2019-05-13 23:48:55 +02:00
Mickaël Serneels
225b76462c
import_from_file: don't add empty url
2019-05-13 23:48:55 +02:00
Mickaël Serneels
c241f1a766
make use of dbs.insert_urls()
2019-05-01 23:19:50 +02:00
Mickaël Serneels
c8d594fb73
add url extraction
urls get extracted from a webpage when the page contains proxies
this allows ppf to "learn" as many links as possible from a working website
2019-05-01 22:58:23 +02:00
Mickaël Serneels
0fb706eeae
clean code
2019-05-01 17:43:29 +02:00