Commit Graph

164 Commits

Author SHA1 Message Date
user
8cabe0f8e8 feat: add URL title preview plugin
Event-driven plugin that auto-fetches page titles for URLs posted in
channel messages. HEAD-then-GET via SOCKS5 pool, og:title priority,
cooldown dedup, !-suppression, binary/host filtering. 52 tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 21:57:00 +01:00
user
7606280358 fix: repair broken tests across alert, chanmgmt, and integration
- test_alert: remove stale _MAX_ANNOUNCE import/test, update _errors
  assertions for per-backend dict, fix announcement checks (action vs
  send), mock _fetch_og_batch in seeding tests, fix YouTube/SearX mock
  targets (urllib.request.urlopen), include keyword in fake data titles
- test_chanmgmt: add _FakeState to _FakeBot (on_invite now persists)
- test_integration: update help assertion for new output format

696 tests pass, 0 failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 21:14:44 +01:00
user
94f563d55a feat: connection pooling via urllib3 + batch OG fetching
Replace per-request SOCKS5+TLS handshakes with urllib3 SOCKSProxyManager
connection pool (20 pools, 4 conns/host). Batch _fetch_og calls via
ThreadPoolExecutor to parallelize OG tag enrichment in alert polling.
Cache flaskpaste SSL context at module level.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 20:52:22 +01:00
user
e11994f320 docs: update for v1.2.1 performance changes
- USAGE.md: alert output format, background seeding, per-backend errors,
  concurrent fetches
- CHEATSHEET.md: updated alert section
- DEBUG.md: added profiling section (cProfile + tracemalloc)
- ROADMAP.md: added v1.2.1 milestone

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 18:09:53 +01:00
user
a2a607baa2 fix: write tracemalloc dump to file instead of logger
Podman's log buffer truncates the output. Write full traceback dump
to data/derp.malloc with per-allocation stack traces.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 13:16:22 +01:00
user
404800af94 docs: update TASKS.md with v1.2.1 performance work
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 12:47:43 +01:00
user
694c775782 fix: remove title truncation from alert backend builders
Mastodon, Bluesky, GitHub, GitLab, npm, PyPI, and arXiv backends
no longer truncate content/descriptions in titles. Full text is
shown on the PRIVMSG line; only !alert history keeps truncation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 12:44:48 +01:00
user
9672e325c2 fix: show full alert titles, split metadata into ACTION line
ACTION carries the tag/date/URL, PRIVMSG carries the uncropped title.
Removes _truncate on alert output for better readability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 12:22:23 +01:00
user
76301ac8f2 perf: concurrent fetches for multi-instance alert backends
Add _fetch_many() helper using ThreadPoolExecutor to query instances
in parallel. Refactors PeerTube, Mastodon, Lemmy, and SearXNG from
sequential to concurrent fetches. Also adds retries parameter to
derp.http.urlopen; multi-instance backends use retries=1 since
instance redundancy already provides resilience.

Worst-case wall time per backend drops from N*timeout to 1*timeout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 12:02:57 +01:00
user
da908a45e4 fix: track alert backend errors independently
Per-backend error counts with exponential backoff: after 5 consecutive
failures a backend is skipped every 2^(n-5) cycles (capped at 32).
Working backends are no longer penalized by one flaky backend doubling
the entire poll interval.

Migrates last_error (string) to last_errors (dict per backend).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 10:51:42 +01:00
user
f2199f2bec perf: seed alert seen IDs in background on add
Reply immediately with empty seen, run a silent _poll_once in a
background task to populate seen IDs, then start the poller.
Eliminates the 30-120s blocking wait from 27 sequential backend queries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 10:45:25 +01:00
user
a123eba32a feat: add --tracemalloc flag for memory profiling
Starts tracemalloc before the event loop and dumps the top 25
allocations on shutdown. Accepts optional nframes depth (default 10).
Can be combined with --cprofile.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 10:41:45 +01:00
user
933d9e1ddd perf: cache default HTTP opener at module level
Avoid rebuilding _ProxyHandler + build_opener() on every request.
Default-context callers (16 of 18 plugins) reuse one cached opener;
custom-context callers still get a fresh one.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 10:15:20 +01:00
user
3c505dd825 fix: persist short URLs in alert history, regenerate on expiry
Store shortened URLs in the results DB at poll time alongside the
original URL. History output uses the stored short URL directly,
only regenerating (and persisting) when no short URL exists yet.
Original URL always preserved for re-shortening if needed.
2026-02-16 23:24:26 +01:00
user
c92fdbfc30 refactor: remove !paste command, keep as internal helper
Paste creation is only used internally by the bot for multi-line
output. The create_paste() helper remains importable by other plugins.
2026-02-16 23:17:53 +01:00
user
546570d21b fix: mount secrets volume for flaskpaste mTLS certs 2026-02-16 23:15:10 +01:00
user
ffa75670e2 fix: use mTLS client cert to bypass PoW on flaskpaste
When secrets/flaskpaste/derp.crt and derp.key are present, load them
into the SSL context for mutual TLS auth and skip the PoW challenge
entirely. Fall back to PoW only when no client cert is available.
2026-02-16 23:13:09 +01:00
user
3cdc00c285 feat: add flaskpaste plugin with paste/shorten commands
- PoW-authenticated paste creation and URL shortening via FlaskPaste
- !paste <text> creates a paste, !shorten <url> shortens a URL
- Module-level shorten_url/create_paste helpers for cross-plugin use
- Alert plugin auto-shortens URLs in announcements and history output
- Custom TLS CA cert support via secrets/flaskpaste/derp.crt
- No SOCKS proxy -- direct urllib.request to FlaskPaste instance
2026-02-16 23:10:59 +01:00
user
35acc744ac fix: use DNS-over-HTTPS with provider rotation for emailcheck
Tor exit nodes poison plain DNS on port 53 and MITM some HTTPS
connections. Replace raw TCP DNS with DoH (Google, Cloudflare, Quad9)
and retry up to 5 times across providers to find a clean exit node.
MX results are now sorted by priority.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 22:40:43 +01:00
user
e8d803abe6 fix: account for server prefix in IRC line splitting
The 512-byte IRC limit includes the :nick!user@host prefix the server
prepends when relaying. Reserve 64 bytes for it and prefer splitting at
space boundaries instead of mid-word. Also strip the command prefix and
"Commands:" label from help output to keep the listing compact.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 22:02:52 +01:00
user
eb37fef730 feat: add jwt, mac, abuseipdb, virustotal, and emailcheck plugins
v2.0.0 sprint 1 -- five standalone plugins requiring no core changes:

- jwt: decode JWT header/payload, flag alg=none/expired/nbf issues
- mac: IEEE OUI vendor lookup, random MAC generation, OUI download
- abuseipdb: IP reputation check + abuse reporting (admin) via API
- virustotal: hash/IP/domain/URL lookup via VT APIv3, 4/min rate limit
- emailcheck: SMTP RCPT TO verification via MX + SOCKS proxy (admin)

Also adds update_oui() to update-data.sh and documents all five
plugins in USAGE.md and CHEATSHEET.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 21:04:43 +01:00
user
75c6ab1e62 docs: expand v2.0.0 roadmap with integrations and new plugins
Add FlaskPaste integration (paste overflow, URL shortener),
webhook listener, granular ACLs, and 10 new plugin targets
(virustotal, abuseipdb, jwt, mac, pastemoni, cron, paste,
shorten, emailcheck, canary). Reorganize TODO.md by category.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 20:14:10 +01:00
user
8e2b94fef0 feat: add 11 alert backends and fix PyPI/DEV.to search
Add Wikipedia, Stack Exchange, GitLab, npm, PyPI, Docker Hub,
arXiv, Lobsters, DEV.to, Medium, and Hugging Face backends to
the alert plugin (16 -> 27 total). Fix PyPI backend to use RSS
updates feed (web search now requires JS challenge). Fix DEV.to
to use public articles API (feed_content endpoint returns empty).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 20:07:01 +01:00
user
34d5dd6f8d fix: resolve YouTube channel ID via InnerTube for video URLs
Video URLs (watch, shorts, embed, youtu.be) now resolve the channel
ID through the InnerTube player API -- a small JSON POST instead of
fetching the full 1MB watch page. Much more resilient to transient
proxy failures. Page scraping remains as fallback for handle URLs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 18:39:32 +01:00
user
daa3370433 feat: add short IDs to alert results with !alert info command
Each alert result gets a deterministic 8-char base36 ID derived from
backend:item_id. IDs appear in announcements and history, and can be
looked up with !alert info <id> for full details. Existing rows are
backfilled on startup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 23:20:56 +01:00
user
5ded8186dd feat: add Hacker News and GitHub backends to alert plugin
Hacker News (hn) uses Algolia search_by_date API for stories,
appends point count to title, falls back to HN discussion URL
when no external link. GitHub (gh) searches repositories sorted
by recently updated, shows star count and truncated description.
Both routed through SOCKS5 proxy via _urlopen.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 23:10:00 +01:00
user
f0b198d98a feat: add Bluesky, Lemmy, Odysee, and Archive.org alert backends
Bluesky (bs) searches public post API, constructs bsky.app URLs
from at:// URIs. Lemmy (ly) queries 4 instances (lemmy.ml,
lemmy.world, programming.dev, infosec.pub) with cross-instance
dedup. Odysee (od) uses LBRY JSON-RPC claim_search for video,
audio, and documents with lbry:// to odysee.com URL conversion.
Archive.org (ia) searches via advanced search API sorted by date.
All routed through SOCKS5 proxy via _urlopen.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 23:07:09 +01:00
user
52c49609b3 feat: add Kick, Dailymotion, and PeerTube backends to alert plugin
Kick (kk) searches channels and livestreams via public search API.
Dailymotion (dm) queries video API sorted by recent. PeerTube (pt)
searches across 4 federated instances with per-instance timeout.
All routed through SOCKS5 proxy via _urlopen.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 23:01:21 +01:00
user
80677343bf feat: add DuckDuckGo and Google News backends to alert plugin
DuckDuckGo (dg) searches via HTML lite endpoint with HTMLParser,
resolves DDG redirect URLs to actual targets. Google News (gn)
queries public RSS feed, parses RFC 822 dates. Both routed through
SOCKS5 proxy via _urlopen.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 22:51:52 +01:00
user
e70c22a510 feat: search SearXNG across categories with day filter
Query general, news, videos, and social media categories
separately with time_range=day. Dedup results by URL across
categories to avoid announcing the same item twice.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 22:44:55 +01:00
user
f84723f66d feat: add Reddit and Mastodon backends to alert plugin
Search Reddit posts (rd) via JSON API and Mastodon hashtag
timelines (ft) across 4 fediverse instances. Both public,
no auth required.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 22:42:06 +01:00
user
83a1d37b98 feat: persist invite-joined channels for auto-rejoin on connect
When the bot accepts an admin INVITE, the channel is stored in
bot.state under chanmgmt/autojoin:<channel>. On reconnect, persisted
channels are rejoined alongside configured ones. If the bot is kicked,
the channel is removed from the auto-rejoin list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 22:22:24 +01:00
user
122785b1f3 feat: persist alert results to SQLite history table
Matched results were announced then discarded. Add a dedicated SQLite
database (data/alert_history.db) to store every announced result with
channel, alert name, backend, title, URL, date, and timestamp. Add
!alert history <name> [n] subcommand to query recent results.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 22:09:01 +01:00
user
181d6dbfad fix: call YouTube API directly, announce all matched results
YouTube InnerTube is a public API -- no need to route through SOCKS5
proxy, which was causing SSL EOF errors on every poll. Switch to
direct urllib.request.urlopen.

Remove _MAX_ANNOUNCE cap; all matched results are now announced
individually instead of truncating with "... and N more".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 22:04:15 +01:00
user
6d6f4e7343 fix: handle null publishedDate from SearXNG results
SearXNG can return "publishedDate": null, which bypasses the default
value in dict.get() and passes None to _parse_date / re.search.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 21:42:07 +01:00
user
7698d079f2 fix: switch to k8s-file log driver for reliable log capture
journald was dropping early startup logs. k8s-file writes directly
to disk, captures from process start, and is lighter on the Pi.
Capped at 10 MB with automatic rotation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 21:40:53 +01:00
user
604a0a5830 feat: display published date in alert announcements
Extract dates from multiple sources:
- SearXNG: publishedDate field from search results
- YouTube: publishedTimeText from InnerTube response
- OG fallback: article:published_time, og:updated_time, date,
  dc.date, dcterms.date, sailthru.date meta tags

Date is shown as (YYYY-MM-DD) or relative time after the tag prefix.
OG tags are fetched for date even when title/URL already matched.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 21:30:48 +01:00
user
e36ec350f5 feat: check og:title/og:description for keyword match in alerts
When a search result's title/URL doesn't contain the keyword, fetch
the page's first 64 KB and parse og:title and og:description meta
tags. If the keyword appears there, the result is announced. Prefers
og:title as display title when it's richer than the search result
title.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 21:28:48 +01:00
user
0d5855dda3 fix: filter alert results to require keyword match in title/URL
Search backends return loosely relevant results. Now only announces
items where the keyword actually appears in the title or URL
(case-insensitive). All results are still marked as seen to prevent
re-checking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 21:27:03 +01:00
user
33c6032329 fix: enable unbuffered Python output in container
Without PYTHONUNBUFFERED=1, Python buffers stdout causing
podman logs to show no output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 19:25:21 +01:00
user
118cf0de21 fix: centralize retry logic in proxy transport layer
Add exponential-backoff retry (3 attempts) for transient SSL,
connection, timeout, and OS errors to all three proxy functions:
urlopen, create_connection, open_connection. Remove per-plugin
retry from alert.py since transport layer now handles it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:55:21 +01:00
user
6d86e8d7f8 fix: retry transient SSL/connection errors in alert backends
Add retry loop (3 attempts, exponential backoff) for SSLError,
ConnectionError, TimeoutError, and OSError in alert poll cycle.
Non-transient errors fail immediately. Also fixes searx test
mocks to match direct urlopen usage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:51:28 +01:00
user
f046cced28 fix: use public SearXNG URL without proxy
Switch to searx.mymx.me (public domain) for SearXNG queries
without routing through SOCKS5 proxy. Internal address is not
reachable from the container network.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:28:29 +01:00
user
b973635445 fix: route SearXNG direct via static route, drop proxy
SearXNG instance at 192.168.122.119 is reachable via grokbox
static route -- no need to tunnel through SOCKS5. Reverts searx
and alert plugins to stdlib urlopen for SearXNG queries. YouTube
and Twitch in alert.py still use the proxy. Also removes cprofile
flag from docker-compose command.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 17:52:43 +01:00
user
23ba7dc474 feat: add graceful SIGTERM handling for clean shutdown
Install a SIGTERM signal handler that stops the bot loop and closes
the IRC connection, allowing cProfile stats to flush before exit.
2026-02-15 17:06:37 +01:00
user
29e77f97b2 fix: route searx and alert SearXNG traffic through SOCKS5 proxy
Both plugins called urllib.request.urlopen directly, bypassing the
proxy. Switch to derp.http.urlopen and update the SearXNG endpoint
to the public domain (searx.mymx.me). Update test mocks to match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:56:45 +01:00
user
6e591a85b2 fix: use host networking for container proxy access
Bridge networking can't reach the host's loopback. Switch to
network_mode: host so the container shares the host network stack
and can reach the SOCKS5 proxy at 127.0.0.1:1080. Revert proxy
address back to 127.0.0.1.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:46:24 +01:00
user
a7f0246dac fix: use LAN address for SOCKS5 proxy
Container can't reach 127.0.0.1 on the host. Use the host's LAN
address 192.168.129.11 so containerized plugins can reach the
SOCKS5 forwarder.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:43:45 +01:00
user
87b43e211a fix: install PySocks in container image
All plugins importing derp.http failed to load because PySocks
was missing from the container. Add it alongside maxminddb.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:37:51 +01:00
user
fd8e9f85b6 fix: point Tor DNS resolver at relay address 10.200.1.13
Was incorrectly set to 127.0.0.1. The Tor DNSPort runs on the
remote relay at 10.200.1.13:9053. Alt relays noted in comments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:20:08 +01:00