feat: add --llm mode for LLM-friendly stdout filtering

Split output when running with --llm: addressed messages from owners
go to stdout, everything else (chatter, logs, plugin loads) goes to
info.log. Adds owner privilege level (superset of admin) for gating
LLM access. Status lines (connect, ping, disconnect, reconnect) and
bot replies also appear on stdout for session awareness.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
user
2026-02-19 20:11:23 +01:00
parent 0c18ba8e3a
commit ea6f07914e
4 changed files with 147 additions and 12 deletions

View File

@@ -1,6 +1,25 @@
# derp - Tasks # derp - Tasks
## Current Sprint -- v1.2.7 Subscription Plugin Enrichment (2026-02-19) ## Current Sprint -- v1.2.9 LLM Mode (2026-02-19)
| Pri | Status | Task |
|-----|--------|------|
| P0 | [x] | `--llm` CLI flag: route logging to `info.log`, stdout for addressed messages |
| P0 | [x] | `_is_addressed()` method: DMs + nick-prefixed |
| P1 | [x] | Stdout routing: PRIVMSG in/out, PING, 001, disconnect, reconnect |
| P2 | [x] | Documentation update (USAGE.md CLI flags + LLM mode section) |
## Previous Sprint -- v1.2.8 ASN Backend Replacement (2026-02-19)
| Pri | Status | Task |
|-----|--------|------|
| P0 | [x] | Replace MaxMind ASN with iptoasn.com TSV backend (no license key) |
| P0 | [x] | Bisect-based lookup in `plugins/asn.py` (pure stdlib) |
| P1 | [x] | `update_asn()` in `scripts/update-data.sh` (SOCKS5 download) |
| P2 | [x] | Tests: load, lookup, command handler (30 cases, 906 total) |
| P2 | [x] | Documentation update (USAGE.md data directory layout) |
## Previous Sprint -- v1.2.7 Subscription Plugin Enrichment (2026-02-19)
| Pri | Status | Task | | Pri | Status | Task |
|-----|--------|------| |-----|--------|------|

View File

@@ -16,6 +16,7 @@ derp --config /path/to/derp.toml --verbose
|------|-------------| |------|-------------|
| `-c, --config PATH` | Config file path | | `-c, --config PATH` | Config file path |
| `-v, --verbose` | Debug logging | | `-v, --verbose` | Debug logging |
| `--llm` | LLM mode: addressed messages to stdout, rest to info.log |
| `--cprofile [PATH]` | Enable cProfile, dump to PATH [derp.prof] | | `--cprofile [PATH]` | Enable cProfile, dump to PATH [derp.prof] |
| `--tracemalloc [N]` | Enable tracemalloc, capture N frames deep [10] | | `--tracemalloc [N]` | Enable tracemalloc, capture N frames deep [10] |
| `-V, --version` | Print version | | `-V, --version` | Print version |
@@ -51,6 +52,7 @@ plugins_dir = "plugins" # Plugin directory path
rate_limit = 2.0 # Max messages per second (default: 2.0) rate_limit = 2.0 # Max messages per second (default: 2.0)
rate_burst = 5 # Burst capacity (default: 5) rate_burst = 5 # Burst capacity (default: 5)
paste_threshold = 4 # Max lines before overflow to FlaskPaste (default: 4) paste_threshold = 4 # Max lines before overflow to FlaskPaste (default: 4)
owner = [] # Owner hostmask patterns (fnmatch), grants admin + LLM access
admins = [] # Hostmask patterns (fnmatch), IRCOPs auto-detected admins = [] # Hostmask patterns (fnmatch), IRCOPs auto-detected
timezone = "UTC" # Timezone for calendar reminders (IANA tz name) timezone = "UTC" # Timezone for calendar reminders (IANA tz name)
@@ -216,14 +218,20 @@ Default format is `"text"` (human-readable, same as before).
## Admin System ## Admin System
Commands marked as `admin` require elevated permissions. Admin access is Commands marked as `admin` require elevated permissions. There are two
granted via: privilege levels:
1. **IRC operator status** -- detected automatically via `WHO` | Level | Source | Grants |
2. **Hostmask patterns** -- configured in `[bot] admins`, fnmatch-style |-------|--------|--------|
| **Owner** | `[bot] owner` hostmask patterns | Admin + LLM mode access |
| **Admin** | `[bot] admins` patterns, IRC operators | Admin commands only |
Owner is a superset of admin -- owners automatically have admin privileges.
Only owners can interact with the bot via `--llm` mode.
```toml ```toml
[bot] [bot]
owner = ["me!~user@my.host"]
admins = [ admins = [
"*!~user@trusted.host", "*!~user@trusted.host",
"ops!*@*.ops.net", "ops!*@*.ops.net",
@@ -368,7 +376,7 @@ The script is cron-friendly (exit 0/1, quiet unless `NO_COLOR` is unset).
``` ```
data/ data/
GeoLite2-City.mmdb # MaxMind GeoIP (requires license key) GeoLite2-City.mmdb # MaxMind GeoIP (requires license key)
GeoLite2-ASN.mmdb # MaxMind ASN (requires license key) ip2asn-v4.tsv # iptoasn.com ASN database (no key required)
tor-exit-nodes.txt # Tor exit node IPs tor-exit-nodes.txt # Tor exit node IPs
iprep/ # Firehol/ET blocklist feeds iprep/ # Firehol/ET blocklist feeds
firehol_level1.netset firehol_level1.netset
@@ -380,7 +388,8 @@ data/
... ...
``` ```
GeoLite2 databases require a free MaxMind license key. Set The ASN database is downloaded from iptoasn.com (no account required).
GeoLite2-City requires a free MaxMind license key -- set
`MAXMIND_LICENSE_KEY` when running the update script. `MAXMIND_LICENSE_KEY` when running the update script.
## Plugin Management ## Plugin Management
@@ -488,6 +497,39 @@ On connection loss, the bot reconnects with exponential backoff and jitter:
- Jitter: +/- 25% to avoid thundering herd - Jitter: +/- 25% to avoid thundering herd
- Resets to 5s after a successful connection - Resets to 5s after a successful connection
## LLM Mode
Run with `--llm` to split output for LLM consumption. All internal logging
(connection status, plugin loads, unaddressed chatter) goes to `info.log`.
Stdout receives only messages addressed to the bot and the bot's own replies.
```bash
derp --llm
derp --llm --config config/derp.toml
```
### What goes to stdout
- **DMs**: private messages from an owner
- **Nick-prefixed**: channel messages from an owner starting with `<botnick>:` or `<botnick>,`
- **Bot replies**: all messages sent by the bot (PRIVMSG and ACTION)
- **Status lines**: connection, ping, disconnect, reconnect events
### Output format
```
19:09 --- connected as derp
19:09 --- ping
19:09 #test <alice> derp: what is 1.1.1.1?
19:09 #test <derp> 1.1.1.1: AS13335 CLOUDFLARENET (US)
19:09 <bob> hey derp, check this out
19:09 <derp> I'm just a bot
19:09 #test * derp [rss/hackernews] New article -- URL
19:15 --- disconnected: Connection reset
19:15 --- reconnecting in 5s
19:15 --- connected as derp
```
### `!dork` -- Google Dork Query Builder ### `!dork` -- Google Dork Query Builder
Generate Google dork queries for a target domain. Template-based, no HTTP Generate Google dork queries for a target domain. Template-based, no HTTP

View File

@@ -7,6 +7,7 @@ import base64
import fnmatch import fnmatch
import logging import logging
import random import random
import sys
import time import time
from datetime import datetime, timezone from datetime import datetime, timezone
from pathlib import Path from pathlib import Path
@@ -77,9 +78,10 @@ class _TokenBucket:
class Bot: class Bot:
"""IRC bot: ties connection, config, and plugins together.""" """IRC bot: ties connection, config, and plugins together."""
def __init__(self, config: dict, registry: PluginRegistry) -> None: def __init__(self, config: dict, registry: PluginRegistry, *, llm: bool = False) -> None:
self.config = config self.config = config
self.registry = registry self.registry = registry
self._llm = llm
self.conn = IRCConnection( self.conn = IRCConnection(
host=config["server"]["host"], host=config["server"]["host"],
port=config["server"]["port"], port=config["server"]["port"],
@@ -92,6 +94,7 @@ class Bot:
self._started: float = time.monotonic() self._started: float = time.monotonic()
self._tasks: set[asyncio.Task] = set() self._tasks: set[asyncio.Task] = set()
self._reconnect_delay: float = 5.0 self._reconnect_delay: float = 5.0
self._owner: list[str] = config.get("bot", {}).get("owner", [])
self._admins: list[str] = config.get("bot", {}).get("admins", []) self._admins: list[str] = config.get("bot", {}).get("admins", [])
self._opers: set[str] = set() # hostmasks of known IRC operators self._opers: set[str] = set() # hostmasks of known IRC operators
self._caps: set[str] = set() # negotiated IRCv3 caps self._caps: set[str] = set() # negotiated IRCv3 caps
@@ -117,10 +120,18 @@ class Bot:
self._reconnect_delay = 5.0 # reset on clean run self._reconnect_delay = 5.0 # reset on clean run
except (OSError, ConnectionError) as exc: except (OSError, ConnectionError) as exc:
log.error("connection lost: %s", exc) log.error("connection lost: %s", exc)
if self._llm:
ts = time.strftime("%H:%M")
sys.stdout.write(f"{ts} --- disconnected: {exc}\n")
sys.stdout.flush()
if self._running: if self._running:
jitter = self._reconnect_delay * 0.25 * (2 * random.random() - 1) jitter = self._reconnect_delay * 0.25 * (2 * random.random() - 1)
delay = self._reconnect_delay + jitter delay = self._reconnect_delay + jitter
log.info("reconnecting in %.0fs...", delay) log.info("reconnecting in %.0fs...", delay)
if self._llm:
ts = time.strftime("%H:%M")
sys.stdout.write(f"{ts} --- reconnecting in {delay:.0f}s\n")
sys.stdout.flush()
await asyncio.sleep(delay) await asyncio.sleep(delay)
self._reconnect_delay = min(self._reconnect_delay * 2, 300.0) self._reconnect_delay = min(self._reconnect_delay * 2, 300.0)
@@ -259,11 +270,19 @@ class Bot:
# Protocol-level PING/PONG # Protocol-level PING/PONG
if msg.command == "PING": if msg.command == "PING":
await self.conn.send(format_msg("PONG", msg.params[0] if msg.params else "")) await self.conn.send(format_msg("PONG", msg.params[0] if msg.params else ""))
if self._llm:
ts = time.strftime("%H:%M")
sys.stdout.write(f"{ts} --- ping\n")
sys.stdout.flush()
return return
# RPL_WELCOME (001) — join channels and WHO for oper detection # RPL_WELCOME (001) — join channels and WHO for oper detection
if msg.command == "001": if msg.command == "001":
self.nick = msg.params[0] if msg.params else self.nick self.nick = msg.params[0] if msg.params else self.nick
if self._llm:
ts = time.strftime("%H:%M")
sys.stdout.write(f"{ts} --- connected as {self.nick}\n")
sys.stdout.flush()
for channel in self.config["bot"]["channels"]: for channel in self.config["bot"]["channels"]:
await self.join(channel) await self.join(channel)
await self.conn.send(format_msg("WHO", channel)) await self.conn.send(format_msg("WHO", channel))
@@ -300,6 +319,16 @@ class Bot:
self._spawn(self._handle_ctcp(msg), name="ctcp") self._spawn(self._handle_ctcp(msg), name="ctcp")
return return
# LLM mode: route PRIVMSG to stdout or info.log
if self._llm and msg.command == "PRIVMSG" and msg.text:
ts = time.strftime("%H:%M")
if self._is_addressed(msg) and self._is_owner(msg):
prefix = f"{msg.target} " if msg.is_channel else ""
sys.stdout.write(f"{ts} {prefix}<{msg.nick}> {msg.text}\n")
sys.stdout.flush()
else:
log.info("%s <%s> %s", msg.target, msg.nick, msg.text)
# Dispatch to event handlers (fire-and-forget) # Dispatch to event handlers (fire-and-forget)
channel = msg.target if msg.is_channel else None channel = msg.target if msg.is_channel else None
event_type = msg.command event_type = msg.command
@@ -359,14 +388,25 @@ class Bot:
return True return True
return plugin_name in allowed return plugin_name in allowed
def _is_owner(self, msg: Message) -> bool:
"""Check if the sender matches a configured owner hostmask pattern."""
if not msg.prefix:
return False
for pattern in self._owner:
if fnmatch.fnmatch(msg.prefix, pattern):
return True
return False
def _is_admin(self, msg: Message) -> bool: def _is_admin(self, msg: Message) -> bool:
"""Check if the message sender is a bot admin. """Check if the message sender is a bot admin.
Returns True if the sender is a known IRC operator or matches Returns True if the sender is an owner, a known IRC operator,
a configured hostmask pattern (fnmatch-style). or matches a configured admin hostmask pattern (fnmatch-style).
""" """
if not msg.prefix: if not msg.prefix:
return False return False
if self._is_owner(msg):
return True
if msg.prefix in self._opers: if msg.prefix in self._opers:
return True return True
for pattern in self._admins: for pattern in self._admins:
@@ -374,6 +414,18 @@ class Bot:
return True return True
return False return False
def _is_addressed(self, msg: Message) -> bool:
"""Check if a message is addressed to the bot (DM or nick-prefixed)."""
if not msg.is_channel:
return True
text = (msg.text or "").lstrip()
nick_lower = self.nick.lower()
if text.lower().startswith(nick_lower):
rest = text[len(nick_lower):]
if rest and rest[0] in ":, ":
return True
return False
def _dispatch_command(self, msg: Message) -> None: def _dispatch_command(self, msg: Message) -> None:
"""Check if a PRIVMSG is a bot command and spawn it.""" """Check if a PRIVMSG is a bot command and spawn it."""
text = msg.text text = msg.text
@@ -453,6 +505,16 @@ class Bot:
for chunk in _split_utf8(line, max_text): for chunk in _split_utf8(line, max_text):
await self._bucket.acquire() await self._bucket.acquire()
await self.conn.send(format_msg("PRIVMSG", target, chunk)) await self.conn.send(format_msg("PRIVMSG", target, chunk))
if self._llm:
ts = time.strftime("%H:%M")
display = chunk
if display.startswith("\x01ACTION ") and display.endswith("\x01"):
display = display[8:-1]
out = f"{ts} {target} * {self.nick} {display}"
else:
out = f"{ts} {target} <{self.nick}> {display}"
sys.stdout.write(out + "\n")
sys.stdout.flush()
async def reply(self, msg: Message, text: str) -> None: async def reply(self, msg: Message, text: str) -> None:
"""Reply to the source of a message (channel or PM).""" """Reply to the source of a message (channel or PM)."""

View File

@@ -33,6 +33,11 @@ def build_parser() -> argparse.ArgumentParser:
action="store_true", action="store_true",
help="enable debug logging", help="enable debug logging",
) )
p.add_argument(
"--llm",
action="store_true",
help="LLM mode: addressed messages to stdout, rest to info.log",
)
p.add_argument( p.add_argument(
"--cprofile", "--cprofile",
metavar="PATH", metavar="PATH",
@@ -111,7 +116,14 @@ def main(argv: list[str] | None = None) -> int:
level = logging.DEBUG if args.verbose else logging.INFO level = logging.DEBUG if args.verbose else logging.INFO
log_fmt = config.get("logging", {}).get("format", "text") log_fmt = config.get("logging", {}).get("format", "text")
if log_fmt == "json": if args.llm:
handler = logging.FileHandler("info.log")
if log_fmt == "json":
handler.setFormatter(JsonFormatter())
else:
handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt=LOG_DATE))
logging.basicConfig(handlers=[handler], level=level)
elif log_fmt == "json":
handler = logging.StreamHandler() handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter()) handler.setFormatter(JsonFormatter())
logging.basicConfig(handlers=[handler], level=level) logging.basicConfig(handlers=[handler], level=level)
@@ -122,7 +134,7 @@ def main(argv: list[str] | None = None) -> int:
log.info("derp %s starting", __version__) log.info("derp %s starting", __version__)
registry = PluginRegistry() registry = PluginRegistry()
bot = Bot(config, registry) bot = Bot(config, registry, llm=args.llm)
bot.load_plugins() bot.load_plugins()
if args.tracemalloc: if args.tracemalloc: