feat: add paste site keyword monitor plugin

Poll Pastebin archive and GitHub Gists for keyword matches,
announce hits to subscribed IRC channels. Follows rss.py
polling/subscription pattern with state persistence.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
user
2026-02-18 09:01:46 +01:00
parent 1836fa50af
commit c3b19feb0f
4 changed files with 1588 additions and 1 deletions


@@ -134,6 +134,7 @@ format = "text" # Log format: "text" (default) or "json"
| `!vt <hash\|ip\|domain\|url>` | VirusTotal lookup |
| `!emailcheck <email> [email2 ...]` | SMTP email verification (admin) |
| `!shorten <url>` | Shorten a URL via FlaskPaste |
| `!pastemoni <add\|del\|list\|check>` | Paste site keyword monitoring |
### Command Shorthand
@@ -872,6 +873,44 @@ https://paste.mymx.me/s/AbCdEfGh
- mTLS client cert skips PoW; falls back to PoW challenge if no cert
- Also used internally by `!alert` to shorten announcement URLs
### `!pastemoni` -- Paste Site Keyword Monitor
Monitor public paste sites for keywords (data leaks, credential dumps, brand
mentions). Polls Pastebin's archive and GitHub's public Gists API on a
schedule, checks new pastes for keyword matches, and announces hits to the
subscribed IRC channel.
```
!pastemoni add <name> <keyword> Add monitor (admin)
!pastemoni del <name> Remove monitor (admin)
!pastemoni list List monitors
!pastemoni check <name> Force-poll now
```
- `add` and `del` require admin privileges
- All subcommands must be used in a channel (not PM)
- Names must be lowercase alphanumeric + hyphens, 1-20 characters
- Maximum 20 monitors per channel
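
The naming and limit rules above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `validate_name`, `can_add`, and the constant names are hypothetical.

```python
import re

# Hypothetical names for illustration; the plugin's real identifiers may differ.
NAME_RE = re.compile(r"[a-z0-9-]{1,20}")  # lowercase alphanumeric + hyphens, 1-20 chars
MAX_MONITORS_PER_CHANNEL = 20


def validate_name(name):
    """Return True if the monitor name satisfies the documented naming rule."""
    return NAME_RE.fullmatch(name) is not None


def can_add(existing_names, name):
    """Check the naming rule and the per-channel limit before adding a monitor."""
    if not validate_name(name):
        return False
    return len(existing_names) < MAX_MONITORS_PER_CHANNEL
```

`fullmatch` is used rather than `match` so that trailing characters outside the allowed set are rejected.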
Backends:
- **Pastebin** (`pb`) -- Scrapes `pastebin.com/archive` for recent pastes,
  fetches raw content, case-insensitive keyword match against title + content
- **GitHub Gists** (`gh`) -- Queries `api.github.com/gists/public`, matches
  keyword against description and filenames
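
A case-insensitive keyword check over several text fields might look like the sketch below. The helper name `keyword_matches` is hypothetical, and applying the same case-insensitive rule to Gist descriptions and filenames is an assumption (the text above only states it for Pastebin).

```python
def keyword_matches(keyword, *fields):
    """Return True if keyword occurs in any field, ignoring case.

    Hypothetical helper: fields would be (title, content) for Pastebin
    or (description, filename, ...) for Gists. None values are treated
    as empty strings.
    """
    needle = keyword.casefold()
    return any(needle in (field or "").casefold() for field in fields)
```

`casefold()` is chosen over `lower()` because it handles non-ASCII case pairs more aggressively; either would satisfy a plain ASCII keyword.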
Polling and announcements:
- Monitors are polled every 5 minutes by default
- On `add`, existing items are seeded in the background, so matches already present are not announced (no flood)
- New matches announced as `[tag] Title -- snippet -- URL`
- Maximum 5 items announced per backend per poll; excess shown as `... and N more`
- Titles truncated to 60 characters, snippets to 80 characters
- 5 consecutive all-backend failures double the poll interval (max 1 hour)
- Subscriptions persist across bot restarts via `bot.state`
- `list` shows keyword and per-backend error counts
- `check` forces an immediate poll across all backends
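
The announcement format and backoff rule above can be sketched as follows. All function and constant names are hypothetical, the `...` truncation suffix is an assumption (the text only gives the length limits), and the sketch assumes the interval doubles on each poll once the failure threshold is reached.

```python
# Hypothetical constants mirroring the limits documented above.
MAX_ANNOUNCE = 5        # items announced per backend per poll
TITLE_LEN = 60
SNIPPET_LEN = 80
MAX_INTERVAL = 3600     # 1 hour cap, in seconds
FAILURE_THRESHOLD = 5   # consecutive all-backend failures


def truncate(text, limit):
    """Cut text to at most `limit` characters (assumed '...' suffix)."""
    return text if len(text) <= limit else text[: limit - 3] + "..."


def format_hits(tag, hits):
    """Build announcement lines from (title, snippet, url) tuples."""
    lines = [
        f"[{tag}] {truncate(title, TITLE_LEN)} -- "
        f"{truncate(snippet, SNIPPET_LEN)} -- {url}"
        for title, snippet, url in hits[:MAX_ANNOUNCE]
    ]
    extra = len(hits) - MAX_ANNOUNCE
    if extra > 0:
        lines.append(f"... and {extra} more")
    return lines


def next_interval(current, consecutive_failures):
    """Double the poll interval after repeated failures, capped at 1 hour."""
    if consecutive_failures >= FAILURE_THRESHOLD:
        return min(current * 2, MAX_INTERVAL)
    return current
```

With the default 5-minute interval (300 seconds), `next_interval(300, 5)` yields 600, and repeated doubling stops at 3600.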
### FlaskPaste Configuration
```toml