# s5p -- Usage
## Basic Usage
```bash
# Direct proxy (no chain, just a SOCKS5 server)
s5p
# Through Tor
s5p -C socks5://127.0.0.1:9050
# Through Tor + another proxy
s5p -C socks5://127.0.0.1:9050,socks5://proxy:1080
# Custom listen address
s5p -l 0.0.0.0:9999 -C socks5://127.0.0.1:9050
# From config file
s5p -c config/s5p.yaml
# With proxy source API (rotate exit proxy per-connection)
s5p -C socks5://127.0.0.1:9050 -S http://10.200.1.250:8081/proxies
# Debug mode
s5p -v -C socks5://127.0.0.1:9050
```
## Configuration
Copy the tracked example to create your live config:
```bash
cp config/example.yaml config/s5p.yaml
```
| File | Tracked | Purpose |
|------|---------|---------|
| `config/example.yaml` | yes | Template with placeholder addresses |
| `config/s5p.yaml` | no (gitignored) | Live config with real proxy addresses |
```yaml
timeout: 10
retries: 3
log_level: info
max_connections: 256 # concurrent connection limit (backpressure)
pool_size: 0         # pre-warmed TCP connections to first hop (0 = disabled)
pool_max_idle: 30    # max idle time for pooled connections (seconds)
api_listen: ""       # control API bind address (empty = disabled)

# Named proxy pools (each with its own sources and filters)
proxy_pools:
  clean:
    sources:
      - url: http://10.200.1.250:8081/proxies/all
        mitm: false
    refresh: 300
    test_interval: 120
    test_timeout: 8
    max_fails: 3

# Multi-listener (each port gets its own chain depth and pool)
listeners:
  - listen: 0.0.0.0:1080
    pool: clean
    chain:
      - socks5://127.0.0.1:9050
      - pool # Tor + 2 clean proxies
      - pool
  - listen: 0.0.0.0:1081
    pool: clean
    chain:
      - socks5://127.0.0.1:9050
      - pool # Tor + 1 clean proxy

# Or single-listener (old format):
# listen: 127.0.0.1:1080
# chain:
#   - socks5://127.0.0.1:9050
```
## Multi-Tor Round-Robin
Distribute traffic across multiple Tor nodes instead of funneling everything
through a single one. When `tor_nodes` is configured, the first hop in each
listener's chain is replaced at connection time by round-robin selection.
Health tests also rotate across all nodes.
```yaml
tor_nodes:
  - socks5://10.200.1.1:9050
  - socks5://10.200.1.254:9050
  - socks5://10.200.1.250:9050
  - socks5://10.200.1.13:9050
```
When `tor_nodes` is absent, listeners use their configured first hop as before.
When present, `tor_nodes` overrides the first hop everywhere.
If `pool_size > 0`, pre-warmed connection pools are created for all nodes
automatically.
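The per-connection selection can be sketched in Python (illustrative only; `TorRotation` is a hypothetical name, not s5p's API):

```python
from itertools import cycle

class TorRotation:
    """Round-robin over configured tor_nodes; each new connection
    takes the next node as its first hop (sketch)."""
    def __init__(self, nodes):
        self._cycle = cycle(nodes)

    def next_node(self):
        return next(self._cycle)

rotation = TorRotation([
    "socks5://10.200.1.1:9050",
    "socks5://10.200.1.254:9050",
])
```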
### API
`tor_nodes` appears in both `/config` and `/status` responses:
```bash
curl -s http://127.0.0.1:1090/config | jq '.tor_nodes'
curl -s http://127.0.0.1:1090/status | jq '.tor_nodes'
```
## Named Proxy Pools
Define multiple proxy pools with different source filters. Each listener can
reference a specific pool by name via the `pool:` key.
```yaml
proxy_pools:
  clean:
    sources:
      - url: http://10.200.1.250:8081/proxies/all
        mitm: false
    state_file: /data/pool-clean.json
    refresh: 300
    test_interval: 120
    test_timeout: 8
    max_fails: 3
  mitm:
    sources:
      - url: http://10.200.1.250:8081/proxies/all
        mitm: true
    state_file: /data/pool-mitm.json
    refresh: 300
    test_interval: 120
    test_timeout: 8
    max_fails: 3
```
Each pool has independent health testing, state persistence, and source
refresh cycles. The `mitm` source filter adds `?mitm=0` or `?mitm=1` to
API requests.
### Backward compatibility
The singular `proxy_pool:` key still works -- it registers as pool `"default"`.
If both `proxy_pool:` and `proxy_pools:` are present, `proxy_pools:` wins;
the singular is registered as `"default"` only when not already defined.
## Multi-Listener Mode
Run multiple listeners on different ports, each with a different number
of proxy hops and pool assignment. Config-file only (not available via CLI).
```yaml
listeners:
  - listen: 0.0.0.0:1080
    pool: clean
    chain:
      - socks5://10.200.1.13:9050
      - pool # Tor + 2 clean proxies
      - pool
  - listen: 0.0.0.0:1081
    pool: clean
    chain:
      - socks5://10.200.1.13:9050
      - pool # Tor + 1 clean proxy
  - listen: 0.0.0.0:1082
    chain:
      - socks5://10.200.1.13:9050 # Tor only (no pool)
  - listen: 0.0.0.0:1083
    pool: mitm
    chain:
      - socks5://10.200.1.13:9050
      - pool # Tor + 2 MITM proxies
      - pool
```
### Per-hop pool references
Use `pool:name` to draw from a specific named pool at that hop position.
Bare `pool` uses the listener's `pool:` default. This lets a single listener
mix pools in one chain.
```yaml
listeners:
  - listen: 0.0.0.0:1080
    pool: clean # default for bare "pool"
    chain:
      - socks5://10.200.1.13:9050
      - pool:clean # explicit: from clean pool
      - pool:mitm  # explicit: from mitm pool
  - listen: 0.0.0.0:1081
    pool: clean
    chain:
      - socks5://10.200.1.13:9050
      - pool       # bare: uses default "clean"
      - pool:mitm  # explicit: from mitm pool
```
| Syntax | Resolves to |
|--------|-------------|
| `pool` | Listener's `pool:` value, or `"default"` if unset |
| `pool:name` | Named pool `name` (case-sensitive) |
| `pool:` | Same as bare `pool` (empty name = default) |
| `Pool:name` | Prefix is case-insensitive; name is case-sensitive |
The `pool` keyword in a chain means "append a random alive proxy from the
assigned pool". Multiple `pool` entries = multiple pool hops (deeper chaining).
When `pool:` is omitted on a listener with pool hops, it defaults to
`"default"`. A listener referencing an unknown pool name causes a fatal
error at startup. Listeners without pool hops ignore the `pool:` key.
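The resolution rules above can be sketched as follows (illustrative Python; the function name and signature are hypothetical, not s5p's actual code):

```python
def resolve_pool_ref(token, listener_pool=None):
    """Resolve a chain entry to a pool name, or None for proxy URLs.

    - "pool"       -> listener's default pool, or "default"
    - "pool:"      -> same as bare "pool" (empty name)
    - "pool:name"  -> named pool (name is case-sensitive)
    - prefix match is case-insensitive ("Pool:name" works)
    """
    prefix, sep, name = token.partition(":")
    if prefix.lower() != "pool":
        return None          # ordinary hop, e.g. socks5://...
    if not sep or not name:  # bare "pool" or trailing-colon "pool:"
        return listener_pool or "default"
    return name
```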
| Resource | Scope | Notes |
|----------|-------|-------|
| ProxyPool | per name | Each named pool is independent |
| TorController | shared | One Tor instance |
| Metrics | shared | Aggregate stats across listeners |
| Semaphore | shared | Global `max_connections` cap |
| API server | shared | One control endpoint |
| FirstHopPool | per unique first hop | Listeners with same first hop share it |
| Chain + pool_hops | per listener | Each listener has its own chain depth |
### Backward compatibility
When no `listeners:` key is present, the old `listen`/`chain` format creates
a single listener. If `proxy_pool` is configured without explicit `pool` in
the chain, legacy behavior is preserved (1 pool hop auto-appended).
Settings that require a restart: `listeners`, `listen`, `chain`, `pool_size`,
`pool_max_idle`, `api_listen`.
## Proxy URL Format
```
protocol://[username:password@]host[:port]
```
| Protocol | Default Port | Auth Support |
|----------|-------------|-------------|
| socks5 | 1080 | username/password |
| socks4 | 1080 | none |
| http | 8080 | Basic |
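A URL in this format can be parsed with the standard library; this sketch applies the default ports from the table (illustrative, not s5p's parser):

```python
from urllib.parse import urlparse

DEFAULT_PORTS = {"socks5": 1080, "socks4": 1080, "http": 8080}

def parse_proxy(url):
    # Split protocol://[username:password@]host[:port] into parts,
    # filling in the protocol's default port when omitted.
    u = urlparse(url)
    port = u.port or DEFAULT_PORTS[u.scheme]
    return u.scheme, u.hostname, port, u.username, u.password
```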
## Container
```bash
make build # build image
make up # start container (detached)
make logs # follow logs
make down # stop and remove container
```
Source (`./src`) and config (`./config/s5p.yaml`) are mounted read-only
into the container. `~/.cache/s5p` is mounted as `/data` for pool state
and profile output. Edit locally, restart to pick up changes.
## Proxy Pool
Managed proxy pool with multiple sources, health testing, and persistence.
Appends an alive proxy after the static chain on each connection, weighted
by recency of last successful health test.
```yaml
proxy_pool:
  sources:
    - url: http://10.200.1.250:8081/proxies
      proto: socks5   # optional: filter by protocol
      country: US     # optional: filter by country
      limit: 1000     # max proxies to fetch from API
      mitm: false     # optional: filter by MITM status (true/false)
    - file: /etc/s5p/proxies.txt # text file, one proxy URL per line
  refresh: 300        # re-fetch sources every 300 seconds
  test_interval: 120  # health test cycle every 120 seconds
  test_targets:       # TLS handshake targets (round-robin)
    - www.google.com
    - www.cloudflare.com
    - www.amazon.com
  test_timeout: 15    # per-test timeout (seconds)
  test_concurrency: 25 # max parallel tests (auto-scales to ~10% of pool)
  max_fails: 3        # evict after N consecutive failures
  state_file: ""      # empty = ~/.cache/s5p/pool[-name].json
  report_url: ""      # POST dead proxies here (optional)
```
### Sources
| Type | Config key | Description |
|------|-----------|-------------|
| HTTP API | `url` | JSON: `{"proxies": [{"proto": "socks5", "proxy": "host:port"}, ...]}` |
| Text file | `file` | One proxy URL per line, `#` comments, blank lines ignored |
### Source filters
| Filter | Values | Effect |
|--------|--------|--------|
| `proto` | `socks5`, `socks4`, `http` | Adds `?proto=...` to API URL |
| `country` | ISO 3166-1 alpha-2 | Adds `?country=...` to API URL |
| `limit` | integer | Adds `?limit=...` to API URL |
| `mitm` | `true` / `false` | Adds `?mitm=1` / `?mitm=0` to API URL |
The `mitm` filter is silently ignored for file sources.
### Proxy file format
```
# Exit proxies
socks5://1.2.3.4:1080
socks5://user:pass@5.6.7.8:1080
http://proxy.example.com:8080
```
### Health testing
Each cycle tests all proxies through the full chain (static chain + proxy)
by performing a TLS handshake against one of the `test_targets` (rotated
round-robin). A successful handshake marks the proxy alive. After `max_fails`
consecutive failures, a proxy is evicted.
Concurrency auto-scales to ~10% of the proxy count, capped by
`test_concurrency` (default 25, minimum 3). For example, a pool of 73 proxies
tests 7 at a time rather than saturating the upstream Tor node.
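The auto-scaling rule can be expressed in one line (sketch; the function name is illustrative):

```python
def effective_concurrency(pool_size, cap=25, floor=3):
    """~10% of the pool, clamped to [floor, cap] -- mirrors the
    behavior described above (73 proxies -> 7 parallel tests)."""
    return max(floor, min(cap, pool_size // 10))
```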
Before each health test cycle, the static chain is tested without any pool
proxy. If the chain itself is unreachable (e.g., Tor is down), proxy tests
are skipped entirely and a warning is logged. This prevents false mass-failure
and unnecessary evictions.
Mass-failure guard: if >90% of tests fail in one cycle, eviction is skipped
(likely the static chain is broken, not the proxies).
### Selection weight
Alive proxies are selected with probability proportional to their recency
weight: `1 / (1 + age / 300)`, where `age` is seconds since the last
successful health test. This favors freshly-verified proxies over stale ones:
| Age | Weight |
|-----|--------|
| 0 s (just tested) | ~1.0 |
| 5 min | ~0.5 |
| 10 min | ~0.33 |
| 30 min | ~0.14 |
| Never tested | 0.01 |
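The weight formula and weighted draw can be sketched as (illustrative names; not s5p's internals):

```python
import random

def recency_weight(age_seconds, never_tested=False):
    # 1 / (1 + age/300); never-tested proxies get the 0.01 floor.
    if never_tested:
        return 0.01
    return 1.0 / (1.0 + age_seconds / 300.0)

def pick_proxy(proxies, ages):
    # Select one alive proxy with probability proportional to weight.
    weights = [recency_weight(a) for a in ages]
    return random.choices(proxies, weights=weights, k=1)[0]
```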
### Failure backoff
When a proxy fails during an actual connection attempt (not just a health
test), its weight is penalized for 60 seconds. The penalty ramps linearly
from floor (0.01) back to normal over that window. This prevents retries
from repeatedly selecting a proxy that just failed.
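The linear ramp can be sketched as a multiplier on the selection weight (illustrative; constants match the text, names are hypothetical):

```python
import time

PENALTY_FLOOR = 0.01
PENALTY_WINDOW = 60.0  # seconds

def backoff_factor(last_failure, now=None):
    """Weight multiplier after a connection failure: starts at the
    floor (0.01) and ramps linearly back to 1.0 over 60 seconds."""
    now = time.monotonic() if now is None else now
    elapsed = now - last_failure
    if elapsed >= PENALTY_WINDOW:
        return 1.0
    frac = elapsed / PENALTY_WINDOW
    return PENALTY_FLOOR + (1.0 - PENALTY_FLOOR) * frac
```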
### Stale proxy expiry
Proxies not returned by any source for 3 consecutive refresh cycles and
not currently alive are automatically evicted. This cleans up proxies
removed upstream faster than waiting for `max_fails` health test failures.
### Persistence
Pool state is saved to `state_file` (default: `~/.cache/s5p/pool.json`) after
each refresh/health cycle and on shutdown. On startup, previously-alive proxies
are loaded for fast warm starts.
### Warm start
When restarting with an existing state file, the server trusts the cached
alive state and begins accepting connections immediately. A full health test
of all proxies runs in the background. Startup takes seconds regardless of
pool size. Cold starts (no state file) test all proxies before serving.
### Dead proxy reporting
When `report_url` is set, evicted proxies are POSTed to the upstream API
after each health test cycle. This helps the source maintain list quality.
```yaml
proxy_pool:
  report_url: http://10.200.1.250:8081/proxies/report
```
Payload format:
```json
{"dead": [{"proto": "socks5", "proxy": "1.2.3.4:1080"}, ...]}
```
Reporting is fire-and-forget; failures are logged at debug level only.
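A fire-and-forget reporter might look like this (sketch using the payload format above; function names are illustrative, not s5p's code):

```python
import json
import logging
import urllib.request

log = logging.getLogger("s5p.pool")

def build_report(dead):
    # dead: iterable of (proto, "host:port") pairs
    return {"dead": [{"proto": p, "proxy": hp} for p, hp in dead]}

def report_dead(report_url, dead, timeout=5):
    """POST evicted proxies upstream; failures are only logged
    at debug level (fire-and-forget, per the text above)."""
    body = json.dumps(build_report(dead)).encode()
    req = urllib.request.Request(
        report_url, data=body,
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req, timeout=timeout).close()
    except OSError as exc:
        log.debug("dead-proxy report failed: %s", exc)
```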
### CLI shorthand
```bash
s5p -C socks5://127.0.0.1:9050 -S http://10.200.1.250:8081/proxies
```
The `-S` flag creates a pool with a single API source (uses defaults for all
other pool settings).
### Legacy config
The old `proxy_source` key is still supported and auto-converts to `proxy_pool`
with a single API source. `proxy_pool` takes precedence if both are present.
## Control API
Built-in HTTP API for runtime inspection and management. Disabled by default;
enable with `api_listen` in config or `--api` on the command line.
```yaml
api_listen: 127.0.0.1:1081
```
```bash
s5p --api 127.0.0.1:1081 -c config/s5p.yaml
```
### Read endpoints
| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/status` | Combined summary: uptime, metrics, pool stats, chain |
| `GET` | `/metrics` | Full metrics counters (connections, bytes, rate, latency) |
| `GET` | `/pool` | All proxies with per-entry state |
| `GET` | `/pool/alive` | Alive proxies only |
| `GET` | `/config` | Current runtime config (sanitized) |
### Write endpoints
| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/reload` | Re-read config file (replaces SIGHUP) |
| `POST` | `/pool/test` | Trigger immediate health test cycle |
| `POST` | `/pool/refresh` | Trigger immediate source re-fetch |
All responses are `application/json`. Errors return `{"error": "message"}` with
an appropriate status code (400, 404, 405, 500).
### Examples
```bash
# Runtime status
curl -s http://127.0.0.1:1081/status | jq .
# Full metrics
curl -s http://127.0.0.1:1081/metrics | jq .
# Pool state (all proxies)
curl -s http://127.0.0.1:1081/pool | jq .
# Alive proxies only
curl -s http://127.0.0.1:1081/pool/alive | jq '.proxies | length'
# Current config
curl -s http://127.0.0.1:1081/config | jq .
# Reload config (like SIGHUP)
curl -s -X POST http://127.0.0.1:1081/reload | jq .
# Trigger health tests now
curl -s -X POST http://127.0.0.1:1081/pool/test | jq .
# Re-fetch proxy sources now
curl -s -X POST http://127.0.0.1:1081/pool/refresh | jq .
```
Settings that require a restart: `listeners`, `listen`, `chain`, `pool_size`, `pool_max_idle`, `api_listen`.
## Tor Control Port
Optional integration with Tor's control protocol for circuit management.
When enabled, s5p connects to Tor's control port and can send NEWNYM signals
to request new circuits (new exit node) on demand or on a timer.
### Configuration
```yaml
tor:
  control_host: 127.0.0.1 # Tor control address
  control_port: 9051      # Tor control port
  password: ""            # HashedControlPassword (torrc)
  cookie_file: ""         # CookieAuthentication file path
  newnym_interval: 0      # periodic NEWNYM in seconds (0 = manual only)
```
Requires Tor's `ControlPort` enabled in `torrc`:
```
ControlPort 9051
HashedControlPassword 16:... # or CookieAuthentication 1
```
### Authentication modes
| Mode | Config | torrc |
|------|--------|-------|
| Password | `password: "secret"` | `HashedControlPassword 16:...` |
| Cookie | `cookie_file: /var/run/tor/control.authcookie` | `CookieAuthentication 1` |
| None | (leave both empty) | No auth configured |
### Rate limiting
Tor enforces a minimum 10-second interval between NEWNYM signals. s5p
applies the same client-side rate limit to avoid unnecessary rejections.
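The client-side limiter can be sketched as (illustrative; `NewnymLimiter` is a hypothetical name):

```python
import time

NEWNYM_MIN_INTERVAL = 10.0  # Tor's enforced minimum, in seconds

class NewnymLimiter:
    """Mirror Tor's 10 s NEWNYM rate limit locally, so signals that
    would be rejected are never sent (sketch)."""
    def __init__(self):
        self._last = float("-inf")

    def try_acquire(self, now=None):
        now = time.monotonic() if now is None else now
        if now - self._last < NEWNYM_MIN_INTERVAL:
            return False  # too soon; caller skips the signal
        self._last = now
        return True
```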
### API endpoints
| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/tor` | Controller status (enabled, connected, last NEWNYM) |
| `POST` | `/tor/newnym` | Trigger NEWNYM signal (new circuit) |
```bash
# Check tor controller status
curl -s http://127.0.0.1:1081/tor | jq .
# Request new circuit
curl -s -X POST http://127.0.0.1:1081/tor/newnym | jq .
```
### Periodic NEWNYM
Set `newnym_interval` to automatically rotate circuits:
```yaml
tor:
  newnym_interval: 60 # new circuit every 60 seconds
```
Values below 10 are clamped to Tor's minimum interval.
## Connection Retry
When a proxy pool is active, s5p retries failed connections with a different
random proxy. Controlled by the `retries` setting (default: 3). Static-only
chains do not retry (retrying the same chain is pointless).
```yaml
retries: 5 # try up to 5 different proxies per connection
```
```bash
s5p -r 5 -C socks5://127.0.0.1:9050 -S http://api:8081/proxies
```
## Hot Reload
Send `SIGHUP` to reload the config file without restarting:
```bash
kill -HUP $(pidof s5p)
# or in a container:
podman kill --signal HUP s5p
```
Settings reloaded on SIGHUP:
| Setting | Effect |
|---------|--------|
| `timeout` | Per-connection timeout |
| `retries` | Max retry attempts |
| `log_level` | Logging verbosity |
| `max_connections` | Concurrent connection limit |
| `proxy_pool.*` | Sources, intervals, thresholds |
Settings that require a restart: `listeners`, `listen`, `chain`, `pool_size`, `pool_max_idle`, `api_listen`.
Requires `-c` / `--config` to know which file to re-read. Without a
config file, SIGHUP is ignored with a warning.
## Metrics
s5p tracks connection metrics and logs a summary every 60 seconds and on
shutdown:
```
metrics: conn=1842 ok=1790 fail=52 retries=67 active=3 in=50.0M out=1.0G rate=4.72/s p50=198.3ms p95=890.1ms up=1h01m01s pool=42/65
```
| Counter | Meaning |
|---------|---------|
| `conn` | Total incoming connections |
| `ok` | Successfully connected + relayed |
| `fail` | All retries exhausted |
| `retries` | Total retry attempts |
| `active` | Currently relaying |
| `in` | Bytes client -> remote |
| `out` | Bytes remote -> client |
| `rate` | Connection rate (events/sec, rolling window) |
| `p50` | Median chain setup latency in ms |
| `p95` | 95th percentile chain setup latency in ms |
| `up` | Server uptime |
| `pool` | Alive/total proxies (only when pool is active) |
### `/metrics` JSON response
`GET /metrics` returns all counters plus rate, latency percentiles, and
per-listener latency breakdowns:
```json
{
  "connections": 1842,
  "success": 1790,
  "failed": 52,
  "retries": 67,
  "active": 3,
  "bytes_in": 52428800,
  "bytes_out": 1073741824,
  "uptime": 3661.2,
  "rate": 4.72,
  "latency": {
    "count": 1000,
    "min": 45.2,
    "max": 2841.7,
    "avg": 312.4,
    "p50": 198.3,
    "p95": 890.1,
    "p99": 1523.6
  },
  "listener_latency": {
    "0.0.0.0:1080": {"count": 500, "min": 800.1, "max": 12400.3, "avg": 2100.5, "p50": 1800.2, "p95": 8200.1, "p99": 10500.3},
    "0.0.0.0:1081": {"count": 300, "min": 400.5, "max": 5200.1, "avg": 1200.3, "p50": 1000.1, "p95": 3500.2, "p99": 4800.7},
    "0.0.0.0:1082": {"count": 200, "min": 150.2, "max": 2000.1, "avg": 500.3, "p50": 400.1, "p95": 1200.5, "p99": 1800.2}
  }
}
```
| Field | Type | Description |
|-------|------|-------------|
| `rate` | float | Connections/sec (rolling window of last 256 events) |
| `latency` | object/null | Aggregate chain setup latency in ms (null if no samples) |
| `latency.count` | int | Number of samples in buffer (max 1000) |
| `latency.p50` | float | Median latency (ms) |
| `latency.p95` | float | 95th percentile (ms) |
| `latency.p99` | float | 99th percentile (ms) |
| `listener_latency` | object | Per-listener latency, keyed by `host:port` |
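A bounded sample buffer with percentile readout could be implemented like this (sketch of the behavior described above; names are illustrative):

```python
from collections import deque

class LatencyBuffer:
    """Keep the last `maxlen` chain-setup latencies and report
    percentiles over them (nearest-rank, sketch)."""
    def __init__(self, maxlen=1000):
        self._samples = deque(maxlen=maxlen)  # old samples roll off

    def add(self, latency_ms):
        self._samples.append(latency_ms)

    def percentile(self, p):
        if not self._samples:
            return None  # /metrics reports null with no samples
        s = sorted(self._samples)
        idx = min(len(s) - 1, int(p / 100 * len(s)))
        return s[idx]
```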
### Per-listener latency
Each listener tracks chain setup latency independently. The `/status`
endpoint includes a `latency` field on each listener entry:
```json
{
  "listeners": [
    {
      "listen": "0.0.0.0:1080",
      "chain": ["socks5://10.200.1.13:9050"],
      "pool_hops": 2,
      "latency": {"count": 500, "p50": 1800.2, "p95": 8200.1, "...": "..."}
    }
  ]
}
```
The aggregate `latency` in `/metrics` combines all listeners. Use
`listener_latency` or the per-listener `latency` in `/status` to
isolate latency by chain depth.
## Profiling
```bash
# Run with cProfile enabled
s5p --cprofile -c config/s5p.yaml
# Custom output file
s5p --cprofile output.prof -c config/s5p.yaml
# Container: uncomment the command line in compose.yaml
# command: ["-c", "/app/config/s5p.yaml", "--cprofile", "/data/s5p.prof"]
# Profile output persists at ~/.cache/s5p/s5p.prof on the host.
# Analyze after stopping
python -m pstats s5p.prof
# Memory profiling with tracemalloc (top 10 allocators on exit)
s5p --tracemalloc -c config/s5p.yaml
# Show top 20 allocators
s5p --tracemalloc 20 -c config/s5p.yaml
# Both profilers simultaneously
s5p --cprofile --tracemalloc -c config/s5p.yaml
```
## Testing the Proxy
```bash
# Check exit IP via Tor
curl --proxy socks5h://127.0.0.1:1080 https://check.torproject.org/api/ip
# Fetch a page
curl --proxy socks5h://127.0.0.1:1080 https://example.com
# Use with Firefox: set SOCKS5 proxy to 127.0.0.1:1080, enable remote DNS
```
Note: use `socks5h://` (not `socks5://`) with curl to send DNS through the proxy.
## Connection Limit
s5p caps concurrent connections with an `asyncio.Semaphore`. When all
slots are taken, new connections are still accepted at the TCP level,
but their handlers wait for a free slot (backpressure).
```yaml
max_connections: 256 # default
```
```bash
s5p -m 512 # override via CLI
```
The `max_connections` value is reloaded on SIGHUP. Connections already
in flight are not affected -- new limits take effect as active connections
close.
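The semaphore-based cap can be sketched as (illustrative; `ConnectionGate` is a hypothetical name, not s5p's API):

```python
import asyncio

class ConnectionGate:
    """Global concurrency cap: accepted connections wait here until a
    slot frees up, producing backpressure rather than rejections."""
    def __init__(self, max_connections=256):
        self._sem = asyncio.Semaphore(max_connections)

    async def run(self, handler):
        async with self._sem:       # blocks when all slots are busy
            return await handler()  # e.g. the SOCKS relay coroutine
```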
## First-Hop Connection Pool
Pre-warms TCP connections to the first hop in the chain, avoiding the
TCP handshake latency on each client request. Only the raw TCP
connection is pooled -- once SOCKS/HTTP CONNECT negotiation begins,
the connection is consumed.
```yaml
pool_size: 8 # 0 = disabled (default)
pool_max_idle: 30 # seconds before a pooled connection is evicted
```
Connections idle longer than `pool_max_idle` are discarded and replaced.
A background task tops up the pool at half the idle interval. Requires
at least one hop in `chain` -- ignored if chain is empty.
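The idle-eviction behavior can be sketched as (illustrative data structure; not s5p's implementation):

```python
import time
from collections import deque

class FirstHopPool:
    """Pre-warmed connections with idle expiry: entries older than
    max_idle are dropped when taken, mirroring the rule above."""
    def __init__(self, max_idle=30.0):
        self.max_idle = max_idle
        self._conns = deque()  # (connection, stored_at) pairs

    def put(self, conn, now=None):
        self._conns.append((conn, time.monotonic() if now is None else now))

    def take(self, now=None):
        now = time.monotonic() if now is None else now
        while self._conns:
            conn, stored = self._conns.popleft()
            if now - stored <= self.max_idle:
                return conn  # still fresh: hand it out
            # stale: discard and try the next pooled connection
        return None  # pool empty -> caller dials a fresh connection
```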
## Chain Order
Hops are traversed left-to-right:
```
-C hop1,hop2,hop3
Client -> s5p -> hop1 -> hop2 -> hop3 -> Destination
```
Each hop only sees its immediate neighbors.