Initial commit — fireclaw multi-agent system

Firecracker microVM-based multi-agent system with IRC orchestration and local LLMs.

Features:
- Ephemeral command runner with VM snapshots (~1.1s)
- Multi-agent orchestration via overseer IRC bot
- 5 agent templates (worker, coder, researcher, quick, creative)
- Tool access (shell + podman containers inside VMs)
- Persistent workspace + memory system (MEMORY.md pattern)
- Agent hot-reload (model/persona swap via SSH + SIGHUP)
- Non-root agents, graceful shutdown, crash recovery
- Agent-to-agent communication via IRC
- DM support, /invite support
- Systemd service, 20 regression tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:28:29 +00:00
commit ff694d12f6
28 changed files with 5917 additions and 0 deletions

.gitignore vendored Normal file

@@ -0,0 +1,4 @@
node_modules/
dist/
*.js.map
.DS_Store

IDEAS.md Normal file

@@ -0,0 +1,142 @@
# Fireclaw Ideas
Future features and experiments, loosely prioritized by usefulness.
## Operator Tools
### !status command
Quick dashboard in IRC: agent count, RAM/CPU per VM, Ollama model currently loaded, system uptime, disk free. One command to see the health of everything.
### !logs <agent> [n]
Tail the last N interactions an agent had. Stored in the agent's workspace. Useful to see what an agent's been doing while you were away.
### !persona <agent> [new persona]
View or live-edit an agent's persona via IRC. "Make the worker more sarcastic" without touching files or restarting. Uses hot-reload under the hood.
### !pause / !resume <agent>
Temporarily mute an agent without destroying it. Agent stays alive but stops responding. Useful when you need a channel to yourself.
## Agent Tools
### Web search
Agents can search via the searx instance on mymx. Either bake the searx CLI into the rootfs, or add a proper `web_search(query)` tool that calls the searx API from inside the VM. Agents could actually research topics instead of relying on training data.
### Fetch URL
`fetch_url(url)` tool to grab a webpage, strip HTML, return text. Combined with web search, agents become genuine research assistants. Could use `curl | python3 -c "from html.parser import..."` or a lightweight readability script.
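A stdlib-only sketch of what such a tool could look like, matching the agent's zero-dependency constraint. `fetch_url` is the proposed name from above; everything else here is illustrative:

```python
# Hypothetical fetch_url tool: grab a page, strip HTML, return readable text.
import urllib.request
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def strip_html(html):
    parser = _TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)

def fetch_url(url, max_chars=2000):
    """Fetch a URL and return plain text, truncated like run_command output."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        page = resp.read().decode("utf-8", errors="replace")
    text = strip_html(page)
    if len(text) > max_chars:
        text = text[:max_chars] + "\n[output truncated]"
    return text
```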
### File sharing between agents
A shared `/shared` mount (third virtio drive, or a common ext4 image) that all agents can read/write. Drop a file from one agent, pick it up from another. Enables collaboration: researcher writes findings, coder reads and implements.
### Code execution sandbox
A `run_python(code)` tool that's safer than `run_command`. Executes in a subprocess with resource limits (timeout, memory cap). Better for code agents that need to test their own output.
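A sketch of the idea, assuming POSIX resource limits via the stdlib `resource` module; the limits and names are illustrative, not part of the current codebase:

```python
# Hypothetical run_python(code) tool: like run_command, but the child process
# gets explicit CPU and memory caps.
import resource
import subprocess
import sys

def _apply_limits():
    # Runs in the child before exec: cap CPU seconds and address space.
    resource.setrlimit(resource.RLIMIT_CPU, (10, 10))            # 10s of CPU
    resource.setrlimit(resource.RLIMIT_AS, (512 * 2**20,) * 2)   # 512 MiB

def run_python(code, timeout=30):
    """Execute Python code in a resource-limited subprocess, return its output."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
            preexec_fn=_apply_limits,  # POSIX-only, fine inside the Alpine VM
        )
    except subprocess.TimeoutExpired:
        return f"[timed out after {timeout}s]"
    out = result.stdout
    if result.stderr:
        out += f"\n[stderr] {result.stderr}"
    if result.returncode != 0:
        out += f"\n[exit code: {result.returncode}]"
    return out.strip() or "[no output]"
```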
## Automation
### Cron agents
Template gets an optional `schedule` field: `"schedule": "0 8 * * *"`. The overseer spawns the agent on schedule, it does its task, reports to #agents, and self-destructs. Use cases:
- Morning health check: "any disk/memory/service issues on grogbox?"
- Daily digest: "summarize what happened in #agents yesterday"
- Backup verification: "check that last night's backups completed"
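A template with the proposed `schedule` field might look like this; `schedule` and `task` are the new, hypothetical fields, while the rest mirrors existing template keys:

```json
{
  "nick": "daily-check",
  "model": "qwen2.5:7b",
  "channel": "#agents",
  "tools": true,
  "trigger": "mention",
  "schedule": "0 8 * * *",
  "task": "Check disk, memory, and failed systemd units; report findings to #agents, then exit."
}
```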
### Webhook triggers
HTTP endpoint on the host (e.g., `:8080/hook/<template>`) that spawns an ephemeral agent with the webhook payload as context. Examples:
- Gitea push webhook → coder agent reviews the commit in #dev
- Uptime monitor → agent investigates and reports
- RSS feed → researcher summarizes new articles
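The endpoint could be sketched with nothing but the stdlib. The `/hook/<template>` route follows the example above; spawning shells out to the existing `fireclaw agent start` command, and passing the payload into the agent's context would need a new mechanism, so it is only noted in a comment:

```python
# Sketch of the webhook endpoint: POST /hook/<template> spawns an agent.
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

def parse_hook_path(path):
    """Return the template name from /hook/<template>, or None if malformed."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "hook" and parts[1]:
        return parts[1]
    return None

class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        template = parse_hook_path(self.path)
        if template is None:
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length)  # TODO: hand this to the agent as context
        subprocess.Popen(["fireclaw", "agent", "start", template])
        self.send_response(202)
        self.end_headers()
        self.wfile.write(b"spawned\n")

# To run: HTTPServer(("0.0.0.0", 8080), HookHandler).serve_forever()
```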
### Alert forwarding
Pipe system alerts (fail2ban, smartmontools, systemd failures, journal errors) into #agents via a simple bridge script. An always-on agent could triage: "fail2ban banned 3 IPs today, all SSH brute force from China, nothing to worry about."
### Git integration
Agent can clone repos from Gitea (on mymx), read code, create branches, commit changes, open PRs. Would need git in the rootfs (already available via `apk add git`) and Gitea API access via the bridge network.
## Agent Personality & Memory
### Evolving personalities
Instruct agents to actively develop opinions, preferences, and communication styles over time. The memory system supports this — agents could save "I prefer concise answers" or "human likes dry humor" and adapt. Give them character arcs.
### Agent journals
Each agent maintains a daily journal in its workspace. Auto-saved summary of conversations, decisions made, things learned. Creates a narrative over time. Useful for debugging agent behavior and understanding their "thought process."
### Cross-agent memory
Agents can read (but not write) each other's MEMORY.md. A new agent spawned for a task can inherit context from an existing agent. "Spawn a coder that knows what the researcher found."
### Agent self-improvement
After each conversation, the agent reflects: "What could I have done better?" Saves lessons to memory. Over time, agents get better at their specific role. Needs a meta-prompt that triggers self-reflection.
## Multi-Agent Orchestration
### Task delegation
Human gives a complex task to one agent, it breaks it down and delegates subtasks to other agents via IRC. Researcher does the research, coder implements, worker tests. All visible in #agents.
### Agent voting
Multiple agents weigh in on a question. "Should we upgrade the kernel?" Each agent responds in #agents, human gets multiple perspectives. Could formalize with a `!poll` command.
### Agent debates
Two agents argue opposite sides of a technical decision. Useful for exploring trade-offs. "Should we use Rust or Go for this?" Coder argues one side, researcher the other.
## MCP Servers as Firecracker VMs
Run MCP tool servers in their own Firecracker VMs, same isolation model as agents. Managed by the overseer with the same lifecycle (!invoke, !destroy).
### Approach: single Firecracker VM with podman containers
```
Firecracker VMs (fcbr0, 172.16.0.x)
├── worker (agent VM)
├── coder (agent VM)
└── mcp-services (service VM, 172.16.0.10)
└── podman
├── mcp-fs (:8081)
├── mcp-git (:8082)
└── mcp-searx (:8083)
```
One VM hosts all MCP servers in separate containers. Firecracker isolates from the host, podman separates services from each other. Lightweight — MCP servers are just HTTP wrappers, don't need their own VMs.
Agents call them at `172.16.0.10:<port>`. Overseer manages the VM and lists available tools via `!services`.
One-VM-per-service is overkill for trusted MCP servers but could be used for untrusted third-party tools.
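Assuming each server in the service VM exposes a plain JSON-over-HTTP endpoint, an agent-side call might look like the following. The `/invoke` route and wire format are illustrative, not the actual MCP protocol:

```python
# Agent-side call to a tool server running in the mcp-services VM.
import json
import urllib.request

def build_invoke_request(host, port, tool, args):
    """Build the HTTP request for one tool invocation."""
    return urllib.request.Request(
        f"http://{host}:{port}/invoke",
        data=json.dumps({"tool": tool, "arguments": args}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def call_service(host, port, tool, args, timeout=30):
    """POST a tool invocation to a service container and return its JSON reply."""
    req = build_invoke_request(host, port, tool, args)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())

# e.g. call_service("172.16.0.10", 8083, "search", {"query": "firecracker jailer"})
```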
### Why a Firecracker VM instead of host podman
- MCP servers can't access the host filesystem directly
- Consistent isolation model with agents
- The VM is independently restartable without affecting the host
- Podman-in-Firecracker is already working in the agent rootfs
### Candidate MCP servers
- **filesystem** — read/write to a shared volume (mounted as virtio drive)
- **git** — clone, read, diff, commit (Gitea on mymx accessible via bridge)
- **searx** — web search via searx.mymx.me
- **database** — SQLite or PostgreSQL query tool
- **fetch** — HTTP fetch + readability extraction
## Infrastructure
### Agent metrics dashboard
Simple HTML page served from the host showing: running agents, response times, model usage, memory contents, conversation history. No framework — just a static page with data from agents.json and workspace files.
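The rendering step could be as small as this; the field names (`name`/`model`/`ip`) are assumed from the README examples, not a documented agents.json schema:

```python
# Sketch of the no-framework dashboard: agents.json in, static HTML out.
import html

def render_dashboard(agents):
    """Return a self-contained HTML page listing running agents."""
    rows = "".join(
        f"<tr><td>{html.escape(a['name'])}</td>"
        f"<td>{html.escape(a.get('model', '?'))}</td>"
        f"<td>{html.escape(a.get('ip', '?'))}</td></tr>"
        for a in agents
    )
    return (
        "<!doctype html><title>fireclaw agents</title>"
        "<table><tr><th>agent</th><th>model</th><th>ip</th></tr>"
        + rows + "</table>"
    )

# Usage: feed it the parsed contents of ~/.fireclaw/agents.json and write the
# result wherever a static web server on the host can see it.
```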
### Agent backup/restore
Export an agent's complete state (workspace, config, rootfs diff) as a tarball. Import on another machine. Portable agent identities.
### Multi-host agents
Run agents on multiple machines (grogbox + odin). Overseer manages VMs across hosts via SSH. Agents on different hosts communicate via IRC federation.
### GPU passthrough
When/if grogbox gets a GPU: pass it through to a single agent VM for fast inference. That agent becomes the "smart" one, others stay on CPU. Or run Ollama with GPU on the host and all agents benefit.
## Fun & Experimental
### Agent challenges
Post a challenge in #agents: "shortest Python script that sorts a list." Agents compete, see each other's answers, iterate. Gamified agent development.
### Honeypot agents
Agent with fake credentials, fake services, fake data. See what it tries to do. Test agent safety before trusting it with real access. Could also test prompt injection resistance.
### Agent-written agents
An agent creates a new template (persona + config) and asks the overseer to spawn it. Self-replicating agent system. Needs careful guardrails.
### IRC games
Agents play text-based games with each other or with humans. Trivia, 20 questions, collaborative storytelling. Tests agent personality and creativity in a low-stakes way.
### Dream mode
An agent left running overnight with `trigger: all` in an empty channel, talking to itself. Stream of consciousness. Review in the morning. Probably nonsense, but occasionally insightful.

README.md Normal file

@@ -0,0 +1,251 @@
# Fireclaw
Multi-agent system powered by Firecracker microVMs. Each AI agent runs in its own hardware-isolated VM, connects to IRC, and responds via local LLMs.
## What it does
**Command runner:** Execute arbitrary commands in ephemeral microVMs with KVM isolation.
```
$ fireclaw run "uname -a"
Linux 172.16.0.200 5.10.225 #1 SMP x86_64 Linux
```
**Multi-agent system:** Spawn AI agents as long-running VMs. Each agent connects to IRC, responds via Ollama, has tool access, and remembers across restarts.
```
# In IRC (#control):
<human> !invoke worker
<overseer> Agent "worker" started: worker [qwen2.5:7b] (172.16.0.2)
# In IRC (#agents):
<human> worker: what's your uptime?
<worker> System has been up for 3 minutes with load average 0.00.
# Private message:
/msg worker hello
<worker> Hey! How can I help?
# Hot-reload model:
<human> !model worker phi4-mini
<worker> [reloaded: model=phi4-mini]
```
## Architecture
```
grogbox host
├── fireclaw overseer systemd service, IRC bot in #control
├── Ollama 0.0.0.0:11434, 5 models available
└── nyx.fireclaw.local ngircd IRC server (FireclawNet)
Firecracker VMs (fcbr0 bridge, 172.16.0.0/24)
├── worker (172.16.0.2) general assistant in #agents
├── coder (172.16.0.3) code-focused agent in #agents
└── research (172.16.0.4) research agent in #agents
```
Each agent VM:
- Runs a Python IRC bot (stdlib only, zero deps)
- Connects to nyx.fireclaw.local at 172.16.0.1:6667
- Calls Ollama at 172.16.0.1:11434 for LLM responses
- Has tool access — shell commands + podman containers
- Has persistent workspace at `/workspace` (survives restarts)
- Has persistent memory — saves/loads facts across restarts
- Accepts `/invite` to join any channel
- Responds to DMs without mention
- Runs as unprivileged `agent` user
- Is fully isolated from the host and other agents
## Requirements
- Linux with KVM (`/dev/kvm`)
- Firecracker v1.15+ at `/usr/local/bin/firecracker`
- `sudo` access (for tap devices, rootfs mounting)
- Node.js 20+
- Ollama (for LLM responses)
- ngircd (for IRC)
## Quick start
```bash
cd ~/projects/fireclaw
npm install
npm run build
sudo npm link
# One-time setup
fireclaw setup
fireclaw snapshot create
# Start the overseer (or use systemd)
sudo systemctl start fireclaw-overseer
# Connect to IRC and start spawning agents
# irssi -c localhost -n human
# /join #control
# !invoke worker
# /join #agents
# worker: hello!
```
## IRC Channel Layout
| Channel | Purpose |
|---|---|
| `#control` | Overseer commands only (!invoke, !destroy, !list, etc.) |
| `#agents` | Common room — all agents join here |
| `/msg <nick>` | Private DM with an agent (no mention needed) |
| `/invite <nick> #room` | Pull an agent into any channel |
## CLI Reference
```
fireclaw run [options] "<command>" Run a command in an ephemeral microVM
-t, --timeout <seconds> Command timeout (default: 60)
-v, --verbose Show boot/cleanup progress
--no-snapshot Force cold boot
fireclaw overseer [options] Start the overseer daemon
--server <host> IRC server (default: localhost)
--port <port> IRC port (default: 6667)
--nick <nick> Bot nickname (default: overseer)
--channel <chan> Control channel (default: #control)
fireclaw agent start <template> Start an agent VM
--name <name> Override agent name
--model <model> Override LLM model
fireclaw agent stop <name> Stop an agent VM
fireclaw agent list List running agents
fireclaw snapshot create Create VM snapshot for fast restores
fireclaw setup One-time setup
```
## IRC Commands (via overseer in #control)
| Command | Description |
|---|---|
| `!invoke <template> [name]` | Spawn an agent VM |
| `!destroy <name>` | Kill an agent VM (graceful IRC QUIT) |
| `!list` | Show running agents |
| `!model <name> <model>` | Hot-reload agent's LLM model |
| `!templates` | List available agent templates |
| `!help` | Show commands |
## Agent Templates
Templates live in `~/.fireclaw/templates/`:
| Template | Nick | Model | Tools | Role |
|---|---|---|---|---|
| worker | worker | qwen2.5:7b | yes | General purpose |
| coder | coder | qwen2.5-coder:7b | yes | Code-focused |
| researcher | research | llama3.1:8b | yes | Thorough research |
| quick | quick | phi4-mini | no | Fast one-liners (~5s) |
| creative | muse | gemma3:4b | no | Writing, brainstorming |
## Available Models
| Model | Size | Speed (CPU) | Tools | Best for |
|---|---|---|---|---|
| qwen2.5-coder:7b | 4.7 GB | ~15s | yes | Code tasks |
| qwen2.5:7b | 4.7 GB | ~15s | partial | General chat |
| llama3.1:8b | 4.9 GB | ~15s | partial | Instruction following |
| gemma3:4b | 3.3 GB | ~8s | no | Creative, balanced |
| phi4-mini | 2.5 GB | ~5s | no | Fast, simple answers |
## Performance
| Mode | Time |
|---|---|
| Snapshot restore (ephemeral run) | ~1.1s |
| Cold boot (ephemeral run) | ~2.9s |
| Agent VM boot to IRC connect | ~5s |
## Security Model
- Agent processes run as unprivileged `agent` user inside VMs
- Root SSH retained for overseer management only
- Each VM gets hardware-level KVM isolation
- Persistent workspace is per-agent — no cross-agent access
- Overseer runs on host (trusted), agents in VMs (untrusted)
- Agent-to-agent cooldown (10s) prevents infinite loops
- Graceful shutdown — agents send IRC QUIT before VM kill
## Source Files
```
src/
index.ts Entry point
cli.ts CLI commands (commander)
vm.ts Ephemeral VM lifecycle (cold boot + snapshot)
firecracker-api.ts Firecracker REST client
snapshot.ts Snapshot creation workflow
overseer.ts Overseer daemon (IRC + agent lifecycle)
agent-manager.ts Start/stop/list/reload agent VMs
network.ts Bridge, tap devices, IP allocation
rootfs.ts Image copy, SSH key injection
ssh.ts SSH execution (ssh2)
cleanup.ts Signal handlers
config.ts Constants, paths
types.ts Interfaces
setup.ts One-time setup
agent/
agent.py Python IRC bot (stdlib only, baked into rootfs)
scripts/
setup-bridge.sh Create fcbr0 bridge + NAT rules
teardown-bridge.sh Remove bridge + NAT rules
tests/
test-suite.sh Regression tests (20 tests)
```
## Data Directory
```
~/.fireclaw/
vmlinux Firecracker kernel (5.10)
base-rootfs.ext4 Alpine base (ephemeral runs)
agent-rootfs.ext4 Agent image (1 GiB sparse, Alpine + Python + podman)
snapshot-rootfs.ext4 Snapshot rootfs
snapshot.{state,mem} VM snapshot
id_ed25519[.pub] SSH keypair
agents.json Running agent state
templates/ Agent persona templates
workspaces/ Persistent agent storage (64 MiB ext4 each)
runs/ Per-agent rootfs copies
ip-pool.json IP allocation
```
### Agent Workspace (inside VM at /workspace)
```
/workspace/
MEMORY.md Memory index — loaded into system prompt
memory/
user_prefs.md Learned facts about users
project_x.md Ongoing project context
```
## Testing
```bash
# Requires overseer running via systemd
./tests/test-suite.sh
```
20 tests covering: overseer commands, agent lifecycle, IRC interaction, tool access, error handling, CLI operations, crash recovery, graceful shutdown.
## Known Limitations
- Snapshot mode doesn't support concurrent ephemeral runs (single fixed IP/tap)
- `sudo` required for tap device and rootfs mount operations
- CPU inference: 5-30s per response depending on model
- No thin provisioning — full rootfs copy per agent (~146 MiB)
## License
Apache-2.0

ROADMAP.md Normal file

@@ -0,0 +1,65 @@
# Fireclaw Roadmap
## Phase 1: Core CLI (done)
- [x] Firecracker microVM lifecycle (boot, exec, destroy)
- [x] SSH-based command execution
- [x] Network isolation (tap + bridge + NAT)
- [x] IP pool management for concurrent VMs
- [x] Signal handling and cleanup
- [x] CLI interface (`fireclaw run`, `fireclaw setup`)
## Phase 2: Fast & Useful (done)
- [x] Alpine Linux rootfs (1 GiB sparse, 146 MiB on disk)
- [x] Precompiled binary, global `fireclaw` command
- [x] Snapshot & restore (~1.1s vs ~2.9s cold boot)
## Phase 3: Multi-Agent System (done)
- [x] ngircd configured (`nyx.fireclaw.local`, FireclawNet)
- [x] Channel layout: #control (overseer), #agents (common room), DMs, /invite
- [x] Ollama with 5 models (qwen2.5-coder, qwen2.5, llama3.1, gemma3, phi4-mini)
- [x] Agent rootfs — Alpine + Python IRC bot + podman + tools
- [x] Agent manager — start/stop/list/reload long-running VMs
- [x] Overseer — host-side IRC bot, !invoke/!destroy/!list/!model/!templates
- [x] 5 agent templates — worker, coder, researcher, quick, creative
- [x] Agent tool access — shell commands + podman containers
- [x] Persistent workspace — 64 MiB ext4 as second virtio drive at /workspace
- [x] Agent memory system — MEMORY.md + save_memory tool, survives restarts
- [x] Agent hot-reload — SSH config update + SIGHUP, no VM restart
- [x] Non-root agents — unprivileged `agent` user
- [x] Agent-to-agent via IRC mentions, 10s cooldown
- [x] DM support — private messages without mention
- [x] /invite support — agents auto-join invited channels
- [x] Overseer resilience — crash recovery, agent adoption, KillMode=process
- [x] Graceful shutdown — SSH SIGTERM → IRC QUIT → kill VM
- [x] Systemd service — fireclaw-overseer.service
- [x] Regression test suite — 20 tests
## Phase 4: Hardening & Performance
- [ ] Network policies per agent — iptables rules per tap device
- [ ] Warm pool — pre-booted VMs from snapshots for instant spawns
- [ ] Concurrent snapshot runs via network namespaces
- [ ] Thin provisioning — device-mapper snapshots instead of full rootfs copies
- [ ] Thread safety — lock around IRC socket writes in agent.py
- [ ] Agent health checks — overseer monitors and restarts dead agents
## Phase 5: Advanced Features
- [ ] Persistent agent memory v2 — richer structure, auto-save from conversations
- [ ] Scheduled/cron tasks — agents that run on a timer
- [ ] Advanced tool use — MCP tools, multi-step execution, file I/O
- [ ] Cost tracking — log duration, model, tokens per interaction
- [ ] Execution recording — full audit trail of agent actions
## Phase 6: Ideas & Experiments
- [ ] vsock — replace SSH with virtio-vsock for lower overhead
- [ ] Web dashboard — status page for running agents
- [ ] Podman-in-Firecracker — double isolation for untrusted container images
- [ ] Honeypot mode — test agent safety with fake credentials/services
- [ ] Self-healing rootfs — agents evolve their own images
- [ ] Claude API backend — for tasks requiring deep reasoning
- [ ] IRC federation — link nyx.fireclaw.local ↔ odin for external access

TODO.md Normal file

@@ -0,0 +1,38 @@
# TODO
## Done
- [x] Firecracker CLI runner with snapshots (~1.1s)
- [x] Alpine rootfs with ca-certificates, podman, python3
- [x] Global `fireclaw` command
- [x] Multi-agent system — overseer + agent VMs + IRC + Ollama
- [x] 5 agent templates (worker, coder, researcher, quick, creative)
- [x] 5 Ollama models (qwen2.5-coder, qwen2.5, llama3.1, gemma3, phi4-mini)
- [x] Agent tool access — shell commands + podman containers
- [x] Persistent workspace + memory system (MEMORY.md pattern)
- [x] Agent hot-reload — model/persona swap via SSH + SIGHUP
- [x] Non-root agents — unprivileged `agent` user
- [x] Agent-to-agent via IRC mentions (10s cooldown)
- [x] DM support — private messages, no mention needed
- [x] /invite support — agents auto-join invited channels
- [x] Channel layout — #control (commands), #agents (common), DMs
- [x] Overseer resilience — crash recovery, agent adoption
- [x] Graceful shutdown — IRC QUIT before VM kill
- [x] Systemd service (KillMode=process)
- [x] Regression test suite (20 tests)
## Next up
- [ ] Network policies per agent — restrict internet access
- [ ] Warm pool — pre-booted VMs for instant agent spawns
- [ ] Persistent agent memory improvements — richer memory structure, auto-save from conversations
- [ ] Thin provisioning — device-mapper snapshots instead of full rootfs copies
## Polish
- [ ] Fix trigger matching — only trigger when nick is at the start of the message, not anywhere in text. Currently "say hi to worker" triggers worker even when addressed to another agent.
- [ ] Cost tracking per agent interaction
- [ ] Execution recording / audit trail
- [ ] Agent health checks — overseer pings agents, restarts dead ones
- [ ] Thread safety in agent.py — lock around IRC socket writes
- [ ] Update regression tests for new channel layout

agent/agent.py Normal file

@@ -0,0 +1,528 @@
#!/usr/bin/env python3
"""Fireclaw IRC agent — connects to IRC, responds via Ollama with tool access."""
import socket
import json
import sys
import time
import subprocess
import urllib.request
import urllib.error
import signal
import threading
from collections import deque
# Load config
with open("/etc/agent/config.json") as f:
CONFIG = json.load(f)
PERSONA = ""
try:
with open("/etc/agent/persona.md") as f:
PERSONA = f.read().strip()
except FileNotFoundError:
PERSONA = "You are a helpful assistant."
NICK = CONFIG.get("nick", "agent")
CHANNEL = CONFIG.get("channel", "#agents")
SERVER = CONFIG.get("server", "172.16.0.1")
PORT = CONFIG.get("port", 6667)
OLLAMA_URL = CONFIG.get("ollama_url", "http://172.16.0.1:11434")
CONTEXT_SIZE = CONFIG.get("context_size", 20)
MAX_RESPONSE_LINES = CONFIG.get("max_response_lines", 50)
TOOLS_ENABLED = CONFIG.get("tools", True)
MAX_TOOL_ROUNDS = CONFIG.get("max_tool_rounds", 5)
WORKSPACE = "/workspace"
# Mutable runtime config — can be hot-reloaded via SIGHUP
RUNTIME = {
"model": CONFIG.get("model", "qwen2.5-coder:7b"),
"trigger": CONFIG.get("trigger", "mention"),
"persona": PERSONA,
}
# Recent messages for context
recent = deque(maxlen=CONTEXT_SIZE)
# Load persistent memory from workspace
AGENT_MEMORY = ""
try:
import os
with open(f"{WORKSPACE}/MEMORY.md") as f:
AGENT_MEMORY = f.read().strip()
# Also load all memory files referenced in the index
mem_dir = f"{WORKSPACE}/memory"
if os.path.isdir(mem_dir):
for fname in sorted(os.listdir(mem_dir)):
if fname.endswith(".md"):
try:
with open(f"{mem_dir}/{fname}") as f:
topic = fname.replace(".md", "")
AGENT_MEMORY += f"\n\n## {topic}\n{f.read().strip()}"
except Exception:
pass
except FileNotFoundError:
pass
# Tool definitions for Ollama chat API
TOOLS = [
{
"type": "function",
"function": {
"name": "run_command",
"description": "Execute a shell command on this system and return the output. Use this to check system info, run scripts, fetch URLs, process data, etc.",
"parameters": {
"type": "object",
"properties": {
"command": {
"type": "string",
"description": "The shell command to execute (bash)",
}
},
"required": ["command"],
},
},
},
{
"type": "function",
"function": {
"name": "save_memory",
"description": "Save something important to your persistent memory. Use this to remember facts about users, lessons learned, project context, or anything you want to recall in future conversations. Memories survive restarts.",
"parameters": {
"type": "object",
"properties": {
"topic": {
"type": "string",
"description": "Short topic name for the memory file (e.g. 'user_prefs', 'project_x', 'lessons')",
},
"content": {
"type": "string",
"description": "The memory content to save",
},
},
"required": ["topic", "content"],
},
},
},
]
def log(msg):
print(f"[agent:{NICK}] {msg}", flush=True)
class IRCClient:
def __init__(self, server, port, nick):
self.server = server
self.port = port
self.nick = nick
self.sock = None
self.buf = ""
def connect(self):
self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
self.sock.settimeout(300)
self.sock.connect((self.server, self.port))
self.send(f"NICK {self.nick}")
self.send(f"USER {self.nick} 0 * :Fireclaw Agent")
def send(self, msg):
self.sock.sendall(f"{msg}\r\n".encode("utf-8"))
def join(self, channel):
self.send(f"JOIN {channel}")
def say(self, target, text):
for line in text.split("\n"):
line = line.strip()
if line:
while len(line) > 400:
self.send(f"PRIVMSG {target} :{line[:400]}")
line = line[400:]
self.send(f"PRIVMSG {target} :{line}")
def set_bot_mode(self):
self.send(f"MODE {self.nick} +B")
def recv_lines(self):
try:
data = self.sock.recv(4096)
except socket.timeout:
return []
if not data:
raise ConnectionError("Connection closed")
self.buf += data.decode("utf-8", errors="replace")
lines = self.buf.split("\r\n")
self.buf = lines.pop()
return lines
def run_command(command):
"""Execute a shell command and return output."""
log(f"Running command: {command[:100]}")
try:
result = subprocess.run(
["bash", "-c", command],
capture_output=True,
text=True,
timeout=120,
)
output = result.stdout
if result.stderr:
output += f"\n[stderr] {result.stderr}"
if result.returncode != 0:
output += f"\n[exit code: {result.returncode}]"
# Truncate very long output
if len(output) > 2000:
output = output[:2000] + "\n[output truncated]"
return output.strip() or "[no output]"
except subprocess.TimeoutExpired:
return "[command timed out after 120s]"
except Exception as e:
return f"[error: {e}]"
def save_memory(topic, content):
"""Save a memory to the persistent workspace."""
import os
mem_dir = f"{WORKSPACE}/memory"
os.makedirs(mem_dir, exist_ok=True)
# Write the memory file
filepath = f"{mem_dir}/{topic}.md"
with open(filepath, "w") as f:
f.write(content + "\n")
# Update MEMORY.md index
index_path = f"{WORKSPACE}/MEMORY.md"
existing = ""
try:
with open(index_path) as f:
existing = f.read()
except FileNotFoundError:
existing = "# Agent Memory\n"
# Add or update entry
entry = f"- [{topic}](memory/{topic}.md)"
if topic not in existing:
with open(index_path, "a") as f:
f.write(f"\n{entry}")
# Reload memory into global
global AGENT_MEMORY
with open(index_path) as f:
AGENT_MEMORY = f.read().strip()
log(f"Memory saved: {topic}")
return f"Memory saved to {filepath}"
def try_parse_tool_call(text):
"""Try to parse a text-based tool call from model output.
Handles formats like:
{"name": "run_command", "arguments": {"command": "uptime"}}
<tool_call>{"name": "run_command", ...}</tool_call>
Returns (name, args) tuple or None.
"""
import re
# Strip tool_call tags if present
text = re.sub(r"</?tool_call>", "", text).strip()
# Try to find JSON in the text
for start in range(len(text)):
if text[start] == "{":
for end in range(len(text), start, -1):
if text[end - 1] == "}":
try:
obj = json.loads(text[start:end])
name = obj.get("name")
args = obj.get("arguments", {})
if name and isinstance(args, dict):
return (name, args)
except json.JSONDecodeError:
continue
return None
def ollama_request(payload):
"""Make a request to Ollama API."""
data = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(
f"{OLLAMA_URL}/api/chat",
data=data,
headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=120) as resp:
return json.loads(resp.read())
def query_ollama(messages):
"""Call Ollama chat API with tool support. Returns final response text."""
payload = {
"model": RUNTIME["model"],
"messages": messages,
"stream": False,
"options": {"num_predict": 512},
}
if TOOLS_ENABLED:
payload["tools"] = TOOLS
for round_num in range(MAX_TOOL_ROUNDS):
try:
data = ollama_request(payload)
except (urllib.error.URLError, TimeoutError) as e:
return f"[error: {e}]"
msg = data.get("message", {})
# Check for structured tool calls from API
tool_calls = msg.get("tool_calls")
if tool_calls:
messages.append(msg)
for tc in tool_calls:
fn = tc.get("function", {})
fn_name = fn.get("name", "")
fn_args = fn.get("arguments", {})
if fn_name == "run_command":
cmd = fn_args.get("command", "")
log(f"Tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: {cmd[:80]}")
result = run_command(cmd)
messages.append({"role": "tool", "content": result})
elif fn_name == "save_memory":
topic = fn_args.get("topic", "note")
content = fn_args.get("content", "")
log(f"Tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: save_memory({topic})")
result = save_memory(topic, content)
messages.append({"role": "tool", "content": result})
else:
messages.append({
"role": "tool",
"content": f"[unknown tool: {fn_name}]",
})
payload["messages"] = messages
continue
# Check for text-based tool calls (model dumped JSON as text)
content = (msg.get("content") or "").strip()  # content may be null alongside tool_calls
parsed_tool = try_parse_tool_call(content)
if parsed_tool:
fn_name, fn_args = parsed_tool
messages.append({"role": "assistant", "content": content})
if fn_name == "run_command":
cmd = fn_args.get("command", "")
log(f"Text tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: {cmd[:80]}")
result = run_command(cmd)
messages.append({"role": "user", "content": f"Command output:\n{result}\n\nNow provide your response to the user based on this output."})
elif fn_name == "save_memory":
topic = fn_args.get("topic", "note")
mem_content = fn_args.get("content", "")
log(f"Text tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: save_memory({topic})")
result = save_memory(topic, mem_content)
messages.append({"role": "user", "content": f"{result}\n\nNow respond to the user."})
payload["messages"] = messages
continue
# No tool calls — return the text response
return content
return "[max tool rounds reached]"
def build_messages(question, channel):
"""Build chat messages with system prompt and conversation history."""
system = RUNTIME["persona"]
if TOOLS_ENABLED:
system += "\n\nYou have access to tools:"
system += "\n- run_command: Execute shell commands on your system."
system += "\n- save_memory: Save important information to your persistent workspace (/workspace/memory/). Use this to remember things across restarts — user preferences, learned facts, project context."
system += "\nUse tools when needed rather than guessing. Your workspace at /workspace persists across restarts."
if AGENT_MEMORY and AGENT_MEMORY != "# Agent Memory":
system += f"\n\nIMPORTANT - Your persistent memory (facts you saved previously, use these to answer questions):\n{AGENT_MEMORY}"
system += f"\n\nYou are in IRC channel {channel}. Your nick is {NICK}. Keep responses concise — this is IRC."
messages = [{"role": "system", "content": system}]
# Build conversation history as alternating user/assistant messages
channel_msgs = [m for m in recent if m["channel"] == channel]
for msg in channel_msgs[-CONTEXT_SIZE:]:
if msg["nick"] == NICK:
messages.append({"role": "assistant", "content": msg["text"]})
else:
messages.append({"role": "user", "content": f"<{msg['nick']}> {msg['text']}"})
# Ensure the last message is from the user (the triggering question)
# If the deque already captured it, don't double-add
last = messages[-1] if len(messages) > 1 else None
if not last or last.get("role") != "user" or question not in last.get("content", ""):
messages.append({"role": "user", "content": question})
return messages
def should_trigger(text):
"""Check if this message should trigger a response."""
if RUNTIME["trigger"] == "all":
return True
lower = text.lower()
return NICK.lower() in lower or text.startswith("!ask ")
def extract_question(text):
"""Extract the actual question from the trigger."""
lower = text.lower()
for prefix in [
f"{NICK.lower()}: ",
f"{NICK.lower()}, ",
f"@{NICK.lower()} ",
f"{NICK.lower()} ",
]:
if lower.startswith(prefix):
return text[len(prefix):]
if text.startswith("!ask "):
return text[5:]
return text
# Track last response time to prevent agent-to-agent loops
_last_response_time = 0
_AGENT_COOLDOWN = 10 # seconds between responses to prevent loops
def handle_message(irc, source_nick, target, text):
"""Process an incoming PRIVMSG."""
global _last_response_time
    is_dm = not target.startswith("#")
    channel = source_nick if is_dm else target
    reply_to = source_nick if is_dm else target
    if source_nick == NICK:
        return  # our own messages are recorded in do_respond; don't double-add
    recent.append({"nick": source_nick, "text": text, "channel": channel})
# DMs always trigger, channel messages need mention
if not is_dm and not should_trigger(text):
return
# Cooldown to prevent agent-to-agent loops
now = time.time()
if now - _last_response_time < _AGENT_COOLDOWN:
log(f"Cooldown active, ignoring trigger from {source_nick}")
return
_last_response_time = now
question = extract_question(text) if not is_dm else text
log(f"Triggered by {source_nick} in {channel}: {question[:80]}")
def do_respond():
try:
messages = build_messages(question, channel)
response = query_ollama(messages)
if not response:
return
lines = response.split("\n")
if len(lines) > MAX_RESPONSE_LINES:
lines = lines[:MAX_RESPONSE_LINES]
lines.append(f"[truncated, {MAX_RESPONSE_LINES} lines max]")
irc.say(reply_to, "\n".join(lines))
recent.append({"nick": NICK, "text": response[:200], "channel": channel})
except Exception as e:
log(f"Error handling message: {e}")
try:
irc.say(reply_to, f"[error: {e}]")
except Exception:
pass
threading.Thread(target=do_respond, daemon=True).start()
def run():
log(f"Starting agent: nick={NICK} channel={CHANNEL} model={RUNTIME['model']} tools={TOOLS_ENABLED}")
while True:
try:
irc = IRCClient(SERVER, PORT, NICK)
log(f"Connecting to {SERVER}:{PORT}...")
irc.connect()
# Hot-reload on SIGHUP — re-read config and persona
def handle_sighup(signum, frame):
log("SIGHUP received, reloading config...")
try:
with open("/etc/agent/config.json") as f:
new_config = json.load(f)
RUNTIME["model"] = new_config.get("model", RUNTIME["model"])
RUNTIME["trigger"] = new_config.get("trigger", RUNTIME["trigger"])
try:
with open("/etc/agent/persona.md") as f:
RUNTIME["persona"] = f.read().strip()
except FileNotFoundError:
pass
log(f"Reloaded: model={RUNTIME['model']} trigger={RUNTIME['trigger']}")
irc.say(CHANNEL, f"[reloaded: model={RUNTIME['model']}]")
except Exception as e:
log(f"Reload failed: {e}")
signal.signal(signal.SIGHUP, handle_sighup)
# Graceful shutdown on SIGTERM — send IRC QUIT
def handle_sigterm(signum, frame):
log("SIGTERM received, quitting IRC...")
try:
irc.send("QUIT :Agent shutting down")
except Exception:
pass
time.sleep(0.5)
sys.exit(0)
signal.signal(signal.SIGTERM, handle_sigterm)
registered = False
while True:
lines = irc.recv_lines()
for line in lines:
if line.startswith("PING"):
irc.send(f"PONG {line.split(' ', 1)[1]}")
continue
parts = line.split(" ")
if len(parts) < 2:
continue
if parts[1] == "001" and not registered:
registered = True
log("Registered with server")
irc.set_bot_mode()
                        irc.join(CHANNEL)
                        log(f"Joined {CHANNEL}")
# Handle INVITE — auto-join invited channels
if parts[1] == "INVITE" and len(parts) >= 3:
invited_channel = parts[-1].lstrip(":")
inviter = parts[0].split("!")[0].lstrip(":")
log(f"Invited to {invited_channel} by {inviter}, joining...")
irc.join(invited_channel)
                    if parts[1] == "PRIVMSG" and len(parts) >= 4:
                        source_nick = parts[0].split("!")[0].lstrip(":")
                        target = parts[2]
                        # Split with maxsplit so internal whitespace in the
                        # message survives; strip only the single leading ":"
                        text = line.split(" ", 3)[3]
                        if text.startswith(":"):
                            text = text[1:]
                        handle_message(irc, source_nick, target, text)
except (ConnectionError, OSError, socket.timeout) as e:
log(f"Disconnected: {e}. Reconnecting in 5s...")
time.sleep(5)
except KeyboardInterrupt:
log("Shutting down.")
sys.exit(0)
if __name__ == "__main__":
run()

25
eslint.config.js Normal file
View File

@@ -0,0 +1,25 @@
import tseslint from "@typescript-eslint/eslint-plugin";
import tsparser from "@typescript-eslint/parser";
export default [
{
files: ["src/**/*.ts"],
languageOptions: {
parser: tsparser,
parserOptions: {
projectService: true,
},
},
plugins: {
"@typescript-eslint": tseslint,
},
rules: {
...tseslint.configs.recommended.rules,
"@typescript-eslint/no-unused-vars": [
"error",
{ argsIgnorePattern: "^_" },
],
"no-console": "warn",
},
},
];

2385
package-lock.json generated Normal file

File diff suppressed because it is too large Load Diff

27
package.json Normal file
View File

@@ -0,0 +1,27 @@
{
"name": "fireclaw",
"version": "0.1.0",
"description": "Run commands in ephemeral Firecracker microVMs",
"type": "module",
"bin": {
"fireclaw": "./dist/index.js"
},
"scripts": {
"build": "tsc",
    "dev": "tsx src/index.ts",
    "lint": "eslint src"
},
"dependencies": {
"commander": "^13.0.0",
"irc-framework": "^4.14.0",
"ssh2": "^1.16.0"
},
"devDependencies": {
"@types/node": "^22.0.0",
"@types/ssh2": "^1.15.0",
"@typescript-eslint/eslint-plugin": "^8.58.0",
"@typescript-eslint/parser": "^8.58.0",
"eslint": "^10.2.0",
"tsx": "^4.0.0",
"typescript": "^5.7.0"
}
}

30
scripts/setup-bridge.sh Executable file
View File

@@ -0,0 +1,30 @@
#!/bin/bash
# Set up the fireclaw bridge and NAT rules
# Run with sudo
set -euo pipefail
BRIDGE="fcbr0"
BRIDGE_IP="172.16.0.1/24"
SUBNET="172.16.0.0/24"
EXT_IFACE=$(ip route show default | awk '{print $5; exit}')
echo "Creating bridge ${BRIDGE}..."
ip link add ${BRIDGE} type bridge 2>/dev/null || echo "Bridge already exists"
ip addr add ${BRIDGE_IP} dev ${BRIDGE} 2>/dev/null || echo "Address already set"
ip link set ${BRIDGE} up
echo "Enabling IP forwarding..."
sysctl -w net.ipv4.ip_forward=1
echo "Setting up NAT via ${EXT_IFACE}..."
iptables -t nat -C POSTROUTING -s ${SUBNET} -o ${EXT_IFACE} -j MASQUERADE 2>/dev/null || \
iptables -t nat -A POSTROUTING -s ${SUBNET} -o ${EXT_IFACE} -j MASQUERADE
iptables -C FORWARD -i ${BRIDGE} -o ${EXT_IFACE} -j ACCEPT 2>/dev/null || \
iptables -A FORWARD -i ${BRIDGE} -o ${EXT_IFACE} -j ACCEPT
iptables -C FORWARD -i ${EXT_IFACE} -o ${BRIDGE} -m state --state RELATED,ESTABLISHED -j ACCEPT 2>/dev/null || \
iptables -A FORWARD -i ${EXT_IFACE} -o ${BRIDGE} -m state --state RELATED,ESTABLISHED -j ACCEPT
echo "Done. Bridge ${BRIDGE} ready."

20
scripts/teardown-bridge.sh Executable file
View File

@@ -0,0 +1,20 @@
#!/bin/bash
# Remove the fireclaw bridge and NAT rules
# Run with sudo
set -euo pipefail
BRIDGE="fcbr0"
SUBNET="172.16.0.0/24"
EXT_IFACE=$(ip route show default | awk '{print $5; exit}')
echo "Removing NAT rules..."
iptables -t nat -D POSTROUTING -s ${SUBNET} -o ${EXT_IFACE} -j MASQUERADE 2>/dev/null || true
iptables -D FORWARD -i ${BRIDGE} -o ${EXT_IFACE} -j ACCEPT 2>/dev/null || true
iptables -D FORWARD -i ${EXT_IFACE} -o ${BRIDGE} -m state --state RELATED,ESTABLISHED -j ACCEPT 2>/dev/null || true
echo "Removing bridge ${BRIDGE}..."
ip link set ${BRIDGE} down 2>/dev/null || true
ip link del ${BRIDGE} 2>/dev/null || true
echo "Done."

557
src/agent-manager.ts Normal file
View File

@@ -0,0 +1,557 @@
import {
  spawn,
  execFileSync,
  type ChildProcess,
} from "node:child_process";
import {
  existsSync,
  mkdirSync,
  readFileSync,
  writeFileSync,
  copyFileSync,
  unlinkSync,
  readdirSync,
} from "node:fs";
import { join } from "node:path";
import { CONFIG } from "./config.js";
import {
ensureBridge,
ensureNat,
allocateIp,
releaseIp,
createTap,
deleteTap,
macFromOctet,
} from "./network.js";
import * as api from "./firecracker-api.js";
export interface AgentInfo {
name: string;
nick: string;
model: string;
template: string;
ip: string;
octet: number;
tapDevice: string;
socketPath: string;
rootfsPath: string;
pid: number;
startedAt: string;
}
interface AgentTemplate {
name: string;
nick: string;
model: string;
trigger: string;
persona: string;
}
const AGENTS_FILE = join(CONFIG.baseDir, "agents.json");
const TEMPLATES_DIR = join(CONFIG.baseDir, "templates");
const AGENT_ROOTFS = join(CONFIG.baseDir, "agent-rootfs.ext4");
const WORKSPACES_DIR = CONFIG.workspacesDir;
function log(msg: string) {
process.stderr.write(`[agent-mgr] ${msg}\n`);
}
function loadAgents(): Record<string, AgentInfo> {
try {
return JSON.parse(readFileSync(AGENTS_FILE, "utf-8"));
} catch {
return {};
}
}
function saveAgents(agents: Record<string, AgentInfo>) {
writeFileSync(AGENTS_FILE, JSON.stringify(agents, null, 2));
}
export function loadTemplate(name: string): AgentTemplate {
const path = join(TEMPLATES_DIR, `${name}.json`);
if (!existsSync(path)) {
throw new Error(`Template "${name}" not found at ${path}`);
}
return JSON.parse(readFileSync(path, "utf-8"));
}
export function listTemplates(): string[] {
try {
return readdirSync(TEMPLATES_DIR)
.filter((f) => f.endsWith(".json"))
.map((f) => f.replace(".json", ""));
} catch {
return [];
}
}
function injectAgentConfig(
rootfsPath: string,
config: { nick: string; model: string; trigger: string },
persona: string
) {
const mountPoint = `/tmp/fireclaw-agent-${Date.now()}`;
mkdirSync(mountPoint, { recursive: true });
try {
execFileSync("sudo", ["mount", "-o", "loop", rootfsPath, mountPoint], {
stdio: "pipe",
});
execFileSync(
"sudo",
["mkdir", "-p", join(mountPoint, "etc/agent")],
{ stdio: "pipe" }
);
    // Write config — streamed via stdin ("sudo tee") rather than shell
    // interpolation, which breaks if the JSON contains a single quote
    const configJson = JSON.stringify({
      nick: config.nick,
      model: config.model,
      trigger: config.trigger,
      server: "172.16.0.1",
      port: 6667,
      ollama_url: "http://172.16.0.1:11434",
    });
    execFileSync(
      "sudo",
      ["tee", join(mountPoint, "etc/agent/config.json")],
      { input: configJson, stdio: "pipe" }
    );
    // Write persona — also via stdin, so quotes or heredoc-like lines in
    // the persona text cannot break the command
    execFileSync(
      "sudo",
      ["tee", join(mountPoint, "etc/agent/persona.md")],
      { input: persona, stdio: "pipe" }
    );
// Inject SSH key for debugging access
execFileSync("sudo", ["mkdir", "-p", join(mountPoint, "root/.ssh")], {
stdio: "pipe",
});
if (existsSync(CONFIG.sshPubKeyPath)) {
execFileSync(
"sudo",
[
"cp",
CONFIG.sshPubKeyPath,
join(mountPoint, "root/.ssh/authorized_keys"),
],
{ stdio: "pipe" }
);
execFileSync(
"sudo",
["chmod", "600", join(mountPoint, "root/.ssh/authorized_keys")],
{ stdio: "pipe" }
);
}
} finally {
try {
execFileSync("sudo", ["umount", mountPoint], { stdio: "pipe" });
} catch {}
try {
execFileSync("rmdir", [mountPoint], { stdio: "pipe" });
} catch {}
}
}
function ensureWorkspace(agentName: string): string {
mkdirSync(WORKSPACES_DIR, { recursive: true });
const imgPath = join(WORKSPACES_DIR, `${agentName}.ext4`);
if (!existsSync(imgPath)) {
log(`Creating workspace for "${agentName}" (${CONFIG.workspaceSizeMib} MiB)...`);
execFileSync("truncate", ["-s", `${CONFIG.workspaceSizeMib}M`, imgPath], {
stdio: "pipe",
});
execFileSync("sudo", ["/usr/sbin/mkfs.ext4", "-q", imgPath], {
stdio: "pipe",
});
// Seed with MEMORY.md template
const mountPoint = `/tmp/fireclaw-ws-${Date.now()}`;
mkdirSync(mountPoint, { recursive: true });
try {
execFileSync("sudo", ["mount", "-o", "loop", imgPath, mountPoint], {
stdio: "pipe",
});
execFileSync(
"sudo",
["bash", "-c", `mkdir -p ${mountPoint}/memory && echo '# Agent Memory' > ${mountPoint}/MEMORY.md`],
{ stdio: "pipe" }
);
execFileSync("sudo", ["chown", "-R", "0:0", mountPoint], {
stdio: "pipe",
});
} finally {
try { execFileSync("sudo", ["umount", mountPoint], { stdio: "pipe" }); } catch {}
try { execFileSync("rmdir", [mountPoint], { stdio: "pipe" }); } catch {}
}
}
return imgPath;
}
function waitForSocket(socketPath: string): Promise<void> {
return new Promise((resolve, reject) => {
const deadline = Date.now() + 5_000;
const check = () => {
if (existsSync(socketPath)) {
setTimeout(resolve, 200);
return;
}
if (Date.now() > deadline) {
reject(new Error("Firecracker socket did not appear"));
return;
}
setTimeout(check, 50);
};
check();
});
}
export async function startAgent(
templateName: string,
overrides?: { name?: string; model?: string }
): Promise<AgentInfo> {
if (!existsSync(AGENT_ROOTFS)) {
throw new Error(
`Agent rootfs not found at ${AGENT_ROOTFS}. Build it first.`
);
}
const template = loadTemplate(templateName);
const name = overrides?.name ?? template.name;
const nick = overrides?.name ?? template.nick;
const model = overrides?.model ?? template.model;
// Check not already running
const agents = loadAgents();
if (agents[name]) {
throw new Error(`Agent "${name}" is already running`);
}
log(`Starting agent "${name}" (template: ${templateName})...`);
// Allocate resources
const { ip, octet } = allocateIp();
const tapDevice = `fctap${octet}`;
const socketPath = join(CONFIG.socketDir, `agent-${name}.sock`);
const rootfsPath = join(CONFIG.runsDir, `agent-${name}.ext4`);
mkdirSync(CONFIG.socketDir, { recursive: true });
mkdirSync(CONFIG.runsDir, { recursive: true });
// Prepare rootfs
copyFileSync(AGENT_ROOTFS, rootfsPath);
injectAgentConfig(
rootfsPath,
{ nick, model, trigger: template.trigger },
template.persona
);
// Create/get persistent workspace
const workspacePath = ensureWorkspace(name);
// Setup network
ensureBridge();
ensureNat();
createTap(tapDevice);
// Boot VM
const proc = spawn(
CONFIG.firecrackerBin,
["--api-sock", socketPath],
{ stdio: "pipe", detached: true }
);
proc.unref();
await waitForSocket(socketPath);
const bootArgs = [
"console=ttyS0",
"reboot=k",
"panic=1",
"pci=off",
"root=/dev/vda",
"rw",
`ip=${ip}::${CONFIG.bridge.gateway}:${CONFIG.bridge.netmask}::eth0:off`,
].join(" ");
await api.putBootSource(socketPath, CONFIG.kernelPath, bootArgs);
await api.putDrive(socketPath, "rootfs", rootfsPath);
await api.putDrive(socketPath, "workspace", workspacePath, false, false);
await api.putNetworkInterface(
socketPath,
"eth0",
tapDevice,
macFromOctet(octet)
);
await api.putMachineConfig(
socketPath,
CONFIG.vm.vcpuCount,
CONFIG.vm.memSizeMib
);
await api.startInstance(socketPath);
const info: AgentInfo = {
name,
nick,
model,
template: templateName,
ip,
octet,
tapDevice,
socketPath,
rootfsPath,
pid: proc.pid!,
startedAt: new Date().toISOString(),
};
agents[name] = info;
saveAgents(agents);
log(`Agent "${name}" started: nick=${nick} ip=${ip}`);
return info;
}
export async function stopAgent(name: string) {
const agents = loadAgents();
const info = agents[name];
if (!info) {
throw new Error(`Agent "${name}" is not running`);
}
log(`Stopping agent "${name}"...`);
// Graceful shutdown: SSH in and kill the agent process so it sends IRC QUIT
try {
execFileSync(
"ssh",
[
"-o", "StrictHostKeyChecking=no",
"-o", "UserKnownHostsFile=/dev/null",
"-o", "ConnectTimeout=3",
"-i", CONFIG.sshKeyPath,
`root@${info.ip}`,
"killall python3 2>/dev/null; sleep 1",
],
{ stdio: "pipe", timeout: 5_000 }
);
} catch {
// Best effort — VM might already be unreachable
}
// Kill firecracker process and wait for it to die
try {
process.kill(info.pid, "SIGKILL");
// Wait for process to actually exit before cleaning up resources
for (let i = 0; i < 20; i++) {
try {
process.kill(info.pid, 0); // Check if alive
await new Promise((r) => setTimeout(r, 200));
} catch {
break; // Process is gone
}
}
} catch {
// Already dead
}
// Small delay to let kernel release the tap device
await new Promise((r) => setTimeout(r, 500));
// Cleanup with retry for tap
try {
unlinkSync(info.socketPath);
} catch {}
for (let attempt = 0; attempt < 3; attempt++) {
try {
deleteTap(info.tapDevice);
break;
} catch {
if (attempt < 2) await new Promise((r) => setTimeout(r, 1000));
}
}
releaseIp(info.octet);
try {
unlinkSync(info.rootfsPath);
} catch {}
delete agents[name];
saveAgents(agents);
log(`Agent "${name}" stopped.`);
}
export function listAgents(): AgentInfo[] {
const agents = loadAgents();
// Verify processes are still alive
for (const [name, info] of Object.entries(agents)) {
try {
process.kill(info.pid, 0);
} catch {
// Process is dead, clean up
log(`Agent "${name}" is dead, cleaning up...`);
try {
deleteTap(info.tapDevice);
} catch {}
try {
releaseIp(info.octet);
} catch {}
try {
unlinkSync(info.rootfsPath);
} catch {}
try {
unlinkSync(info.socketPath);
} catch {}
delete agents[name];
}
}
saveAgents(agents);
return Object.values(agents);
}
export async function reloadAgent(
name: string,
updates: { model?: string; persona?: string; trigger?: string }
) {
const agents = loadAgents();
const info = agents[name];
if (!info) {
throw new Error(`Agent "${name}" is not running`);
}
log(`Reloading agent "${name}"...`);
// Build updated config
const configUpdates: Record<string, string> = {};
if (updates.model) {
configUpdates.model = updates.model;
info.model = updates.model;
}
if (updates.trigger) configUpdates.trigger = updates.trigger;
// Write updated config as a temp file on the VM via SSH
const sshOpts = [
"-o", "StrictHostKeyChecking=no",
"-o", "UserKnownHostsFile=/dev/null",
"-o", "ConnectTimeout=5",
"-i", CONFIG.sshKeyPath,
];
const sshTarget = `root@${info.ip}`;
try {
if (Object.keys(configUpdates).length > 0) {
// Read current config from VM
const currentRaw = execFileSync(
"ssh",
[...sshOpts, sshTarget, "cat /etc/agent/config.json"],
{ encoding: "utf-8", timeout: 10_000 }
);
const current = JSON.parse(currentRaw);
Object.assign(current, configUpdates);
const newConfig = JSON.stringify(current);
// Write back via stdin
execFileSync(
"ssh",
[...sshOpts, sshTarget, `cat > /etc/agent/config.json`],
{ input: newConfig, timeout: 10_000 }
);
}
if (updates.persona) {
execFileSync(
"ssh",
[...sshOpts, sshTarget, `cat > /etc/agent/persona.md`],
{ input: updates.persona, timeout: 10_000 }
);
}
// Signal agent to reload
execFileSync(
"ssh",
[...sshOpts, sshTarget, "killall -HUP python3"],
{ stdio: "pipe", timeout: 10_000 }
);
} catch (err) {
throw new Error(`Failed to reload agent: ${err}`);
}
saveAgents(agents);
log(`Agent "${name}" reloaded.`);
}
export function reconcileAgents(): { adopted: string[]; cleaned: string[] } {
const agents = loadAgents();
const adopted: string[] = [];
const cleaned: string[] = [];
for (const [name, info] of Object.entries(agents)) {
let alive = false;
try {
process.kill(info.pid, 0);
alive = true;
} catch {
// Process is dead
}
if (alive) {
adopted.push(name);
log(`Adopted running agent "${name}" (PID ${info.pid}, ${info.ip})`);
} else {
log(`Cleaning dead agent "${name}" (PID ${info.pid} gone)...`);
// Clean up resources from dead agent
try { deleteTap(info.tapDevice); } catch {}
try { releaseIp(info.octet); } catch {}
try { unlinkSync(info.rootfsPath); } catch {}
try { unlinkSync(info.socketPath); } catch {}
delete agents[name];
cleaned.push(name);
}
}
// Scan for orphan firecracker processes not in agents.json
try {
const psOutput = execFileSync("pgrep", ["-a", "firecracker"], {
encoding: "utf-8",
});
for (const line of psOutput.trim().split("\n")) {
if (!line) continue;
const match = line.match(/agent-(\S+)\.sock/);
if (match) {
const agentName = match[1];
if (!agents[agentName]) {
const pid = parseInt(line.split(/\s+/)[0]);
log(`Found orphan firecracker process for "${agentName}" (PID ${pid}), killing...`);
try { process.kill(pid, "SIGKILL"); } catch {}
cleaned.push(`orphan:${agentName}`);
}
}
}
} catch {
// No firecracker processes running — that's fine
}
saveAgents(agents);
if (adopted.length === 0 && cleaned.length === 0) {
log("No agents to reconcile.");
} else {
log(`Reconciled: ${adopted.length} adopted, ${cleaned.length} cleaned.`);
}
return { adopted, cleaned };
}
export async function stopAllAgents() {
const agents = loadAgents();
for (const name of Object.keys(agents)) {
await stopAgent(name);
}
}

47
src/cleanup.ts Normal file
View File

@@ -0,0 +1,47 @@
import type { VMInstance } from "./vm.js";
const activeVms = new Set<VMInstance>();
export function registerVm(vm: VMInstance) {
activeVms.add(vm);
}
export function unregisterVm(vm: VMInstance) {
activeVms.delete(vm);
}
async function cleanupAll() {
const vms = Array.from(activeVms);
activeVms.clear();
await Promise.allSettled(vms.map((vm) => vm.destroy()));
}
let registered = false;
export function installSignalHandlers() {
if (registered) return;
registered = true;
const handler = async (signal: string) => {
process.stderr.write(`\n[fireclaw] Caught ${signal}, cleaning up...\n`);
await cleanupAll();
process.exit(signal === "SIGINT" ? 130 : 143);
};
process.on("SIGINT", () => handler("SIGINT"));
process.on("SIGTERM", () => handler("SIGTERM"));
process.on("uncaughtException", async (err) => {
process.stderr.write(`[fireclaw] Uncaught exception: ${err.message}\n`);
await cleanupAll();
process.exit(1);
});
process.on("unhandledRejection", async (reason) => {
process.stderr.write(
`[fireclaw] Unhandled rejection: ${reason}\n`
);
await cleanupAll();
process.exit(1);
});
}

134
src/cli.ts Normal file
View File

@@ -0,0 +1,134 @@
import { Command } from "commander";
import { VMInstance } from "./vm.js";
import { installSignalHandlers } from "./cleanup.js";
import { runSetup } from "./setup.js";
import { createSnapshot } from "./snapshot.js";
import { runOverseer } from "./overseer.js";
import {
startAgent,
stopAgent,
listAgents,
} from "./agent-manager.js";
export function createCli() {
const program = new Command();
program
.name("fireclaw")
.description("Run commands in ephemeral Firecracker microVMs")
.version("0.1.0");
program
.command("run")
.description("Run a command inside a fresh microVM")
.argument("<command>", "Command to execute inside the microVM")
.option("-t, --timeout <seconds>", "Timeout in seconds", "60")
.option("-v, --verbose", "Show detailed progress", false)
.option("--mem <mib>", "Memory in MiB", "256")
.option("--vcpu <count>", "Number of vCPUs", "1")
.option("--no-snapshot", "Force cold boot, skip snapshot restore")
.action(async (command: string, opts) => {
installSignalHandlers();
const result = await VMInstance.run(command, {
timeout: parseInt(opts.timeout) * 1000,
verbose: opts.verbose,
mem: parseInt(opts.mem),
vcpu: parseInt(opts.vcpu),
noSnapshot: opts.snapshot === false,
});
if (!opts.verbose) {
if (result.stdout) process.stdout.write(result.stdout);
if (result.stderr) process.stderr.write(result.stderr);
}
process.exit(result.exitCode);
});
program
.command("setup")
.description("Download kernel, rootfs, and configure networking")
.action(async () => {
await runSetup();
});
const snapshot = program
.command("snapshot")
.description("Manage VM snapshots");
snapshot
.command("create")
.description("Boot a VM and create a snapshot for fast restores")
.action(async () => {
installSignalHandlers();
await createSnapshot();
});
// Overseer
program
.command("overseer")
.description("Start the overseer daemon (IRC bot for agent management)")
.option("--server <host>", "IRC server", "localhost")
.option("--port <port>", "IRC port", "6667")
.option("--nick <nick>", "Bot nickname", "overseer")
.option("--channel <chan>", "Control channel", "#control")
.action(async (opts) => {
await runOverseer({
server: opts.server,
port: parseInt(opts.port),
nick: opts.nick,
channel: opts.channel,
});
});
// Agent management
const agent = program
.command("agent")
.description("Manage long-running agent VMs");
agent
.command("start")
.description("Start an agent VM from a template")
.argument("<template>", "Template name")
.option("--name <name>", "Override agent name")
.option("--model <model>", "Override LLM model")
.action(async (template: string, opts) => {
installSignalHandlers();
const info = await startAgent(template, {
name: opts.name,
model: opts.model,
});
console.log(
`Agent "${info.name}" started: ${info.nick} [${info.model}] (${info.ip})`
);
process.exit(0);
});
agent
.command("stop")
.description("Stop a running agent VM")
.argument("<name>", "Agent name")
.action(async (name: string) => {
await stopAgent(name);
console.log(`Agent "${name}" stopped.`);
});
agent
.command("list")
.description("List running agent VMs")
.action(() => {
const agents = listAgents();
if (agents.length === 0) {
console.log("No agents running.");
return;
}
for (const a of agents) {
console.log(
`${a.name} (${a.template}) — ${a.nick} [${a.model}] ip=${a.ip} since ${a.startedAt}`
);
}
});
return program;
}

57
src/config.ts Normal file
View File

@@ -0,0 +1,57 @@
import { homedir } from "node:os";
import { join } from "node:path";
const HOME = homedir();
export const CONFIG = {
firecrackerBin: "/usr/local/bin/firecracker",
baseDir: join(HOME, ".fireclaw"),
kernelPath: join(HOME, ".fireclaw", "vmlinux"),
baseRootfs: join(HOME, ".fireclaw", "base-rootfs.ext4"),
runsDir: join(HOME, ".fireclaw", "runs"),
sshKeyPath: join(HOME, ".fireclaw", "id_ed25519"),
sshPubKeyPath: join(HOME, ".fireclaw", "id_ed25519.pub"),
socketDir: "/tmp/fireclaw",
ipPoolFile: join(HOME, ".fireclaw", "ip-pool.json"),
ipPoolLock: join(HOME, ".fireclaw", "ip-pool.lock"),
bridge: {
name: "fcbr0",
ip: "172.16.0.1",
subnet: "172.16.0.0/24",
netmask: "255.255.255.0",
gateway: "172.16.0.1",
prefix: "172.16.0",
minHost: 2,
maxHost: 254,
},
vm: {
vcpuCount: 1,
memSizeMib: 256,
defaultTimeoutMs: 60_000,
bootTimeoutMs: 15_000,
sshPollIntervalMs: 100,
},
snapshot: {
rootfsPath: join(HOME, ".fireclaw", "snapshot-rootfs.ext4"),
statePath: join(HOME, ".fireclaw", "snapshot.state"),
memPath: join(HOME, ".fireclaw", "snapshot.mem"),
tapDevice: "fctap200",
ip: "172.16.0.200",
octet: 200,
},
workspacesDir: join(HOME, ".fireclaw", "workspaces"),
workspaceSizeMib: 64,
// S3 URLs for Firecracker CI assets
assets: {
kernelUrl:
"https://s3.amazonaws.com/spec.ccfc.min/firecracker-ci/v1.11/x86_64/vmlinux-5.10.225",
rootfsListUrl:
"http://spec.ccfc.min.s3.amazonaws.com/?prefix=firecracker-ci/v1.11/x86_64/ubuntu",
rootfsBaseUrl: "https://s3.amazonaws.com/spec.ccfc.min",
},
} as const;

152
src/firecracker-api.ts Normal file
View File

@@ -0,0 +1,152 @@
import http from "node:http";
function request(
socketPath: string,
method: string,
path: string,
body?: object
): Promise<{ status: number; body: string }> {
return new Promise((resolve, reject) => {
const payload = body ? JSON.stringify(body) : undefined;
const headers: Record<string, string> = {};
if (payload) {
headers["Content-Type"] = "application/json";
headers["Content-Length"] = Buffer.byteLength(payload).toString();
}
const opts: http.RequestOptions = {
socketPath,
path,
method,
headers,
};
const req = http.request(opts, (res) => {
const chunks: Buffer[] = [];
res.on("data", (chunk) => chunks.push(chunk));
res.on("end", () => {
resolve({
status: res.statusCode ?? 0,
body: Buffer.concat(chunks).toString(),
});
});
});
req.on("error", reject);
if (payload) {
req.write(payload);
}
req.end();
});
}
function assertOk(res: { status: number; body: string }, action: string) {
if (res.status < 200 || res.status >= 300) {
throw new Error(
`Firecracker API error (${action}): ${res.status} ${res.body}`
);
}
}
export async function putBootSource(
socketPath: string,
kernelPath: string,
bootArgs: string
) {
const res = await request(socketPath, "PUT", "/boot-source", {
kernel_image_path: kernelPath,
boot_args: bootArgs,
});
assertOk(res, "PUT /boot-source");
}
export async function putDrive(
socketPath: string,
driveId: string,
path: string,
readOnly = false,
isRoot = true
) {
const res = await request(socketPath, "PUT", `/drives/${driveId}`, {
drive_id: driveId,
path_on_host: path,
is_root_device: isRoot,
is_read_only: readOnly,
});
assertOk(res, `PUT /drives/${driveId}`);
}
export async function putNetworkInterface(
socketPath: string,
ifaceId: string,
hostDevName: string,
guestMac: string
) {
const res = await request(
socketPath,
"PUT",
`/network-interfaces/${ifaceId}`,
{
iface_id: ifaceId,
guest_mac: guestMac,
host_dev_name: hostDevName,
}
);
assertOk(res, `PUT /network-interfaces/${ifaceId}`);
}
export async function putMachineConfig(
socketPath: string,
vcpuCount: number,
memSizeMib: number
) {
const res = await request(socketPath, "PUT", "/machine-config", {
vcpu_count: vcpuCount,
mem_size_mib: memSizeMib,
});
assertOk(res, "PUT /machine-config");
}
export async function startInstance(socketPath: string) {
const res = await request(socketPath, "PUT", "/actions", {
action_type: "InstanceStart",
});
assertOk(res, "PUT /actions InstanceStart");
}
export async function patchVm(
socketPath: string,
state: "Paused" | "Resumed"
) {
const res = await request(socketPath, "PATCH", "/vm", { state });
assertOk(res, `PATCH /vm ${state}`);
}
export async function putSnapshotCreate(
socketPath: string,
snapshotPath: string,
memFilePath: string
) {
const res = await request(socketPath, "PUT", "/snapshot/create", {
snapshot_type: "Full",
snapshot_path: snapshotPath,
mem_file_path: memFilePath,
});
assertOk(res, "PUT /snapshot/create");
}
export async function putSnapshotLoad(
socketPath: string,
snapshotPath: string,
memFilePath: string
) {
const res = await request(socketPath, "PUT", "/snapshot/load", {
snapshot_path: snapshotPath,
mem_backend: {
backend_type: "File",
backend_path: memFilePath,
},
});
assertOk(res, "PUT /snapshot/load");
}

5
src/index.ts Normal file
View File

@@ -0,0 +1,5 @@
#!/usr/bin/env node
import { createCli } from "./cli.js";
const program = createCli();
program.parse();

14
src/irc-framework.d.ts vendored Normal file
View File

@@ -0,0 +1,14 @@
declare module "irc-framework" {
  class Client {
    connect(options: {
      host: string;
      port: number;
      nick: string;
    }): void;
    join(channel: string): void;
    say(target: string, message: string): void;
    quit(message?: string): void;
    on(event: string, handler: (...args: any[]) => void): void;
  }
  // An object literal is not valid after `export default` in an ambient
  // module, so expose Client through a named constant instead.
  const IRC: { Client: typeof Client };
  export default IRC;
}

165
src/network.ts Normal file
View File

@@ -0,0 +1,165 @@
import { execFileSync } from "node:child_process";
import { openSync, closeSync, readFileSync, writeFileSync } from "node:fs";
import { CONFIG } from "./config.js";
function run(cmd: string, args: string[]) {
execFileSync(cmd, args, { stdio: "pipe" });
}
function sudo(args: string[]) {
run("sudo", args);
}
export function ensureBridge() {
try {
execFileSync("ip", ["link", "show", CONFIG.bridge.name], {
stdio: "pipe",
});
} catch {
sudo(["ip", "link", "add", CONFIG.bridge.name, "type", "bridge"]);
sudo([
"ip",
"addr",
"add",
`${CONFIG.bridge.ip}/24`,
"dev",
CONFIG.bridge.name,
]);
sudo(["ip", "link", "set", CONFIG.bridge.name, "up"]);
sudo(["sysctl", "-w", "net.ipv4.ip_forward=1"]);
}
}
export function ensureNat() {
  // Find the default route interface first: the existence check below must
  // match the full rule spec (including -o), or `iptables -C` never matches
  // and we append a duplicate MASQUERADE rule on every call
  const routeOut = execFileSync("ip", ["route", "show", "default"], {
    encoding: "utf-8",
  });
  const extIface = routeOut.match(/dev\s+(\S+)/)?.[1] ?? "eno2";
  try {
    execFileSync(
      "sudo",
      [
        "iptables",
        "-t",
        "nat",
        "-C",
        "POSTROUTING",
        "-s",
        CONFIG.bridge.subnet,
        "-o",
        extIface,
        "-j",
        "MASQUERADE",
      ],
      { stdio: "pipe" }
    );
  } catch {
    sudo([
      "iptables",
      "-t",
      "nat",
      "-A",
      "POSTROUTING",
      "-s",
      CONFIG.bridge.subnet,
      "-o",
      extIface,
      "-j",
      "MASQUERADE",
    ]);
    sudo([
      "iptables",
      "-A",
      "FORWARD",
      "-i",
      CONFIG.bridge.name,
      "-o",
      extIface,
      "-j",
      "ACCEPT",
    ]);
    sudo([
      "iptables",
      "-A",
      "FORWARD",
      "-i",
      extIface,
      "-o",
      CONFIG.bridge.name,
      "-m",
      "state",
      "--state",
      "RELATED,ESTABLISHED",
      "-j",
      "ACCEPT",
    ]);
  }
}
export function createTap(tapName: string) {
sudo(["ip", "tuntap", "add", tapName, "mode", "tap"]);
sudo(["ip", "link", "set", tapName, "master", CONFIG.bridge.name]);
sudo(["ip", "link", "set", tapName, "up"]);
}
export function deleteTap(tapName: string) {
try {
sudo(["ip", "tuntap", "del", tapName, "mode", "tap"]);
} catch {
// Already gone
}
}
export function macFromOctet(octet: number): string {
return `AA:FC:00:00:00:${octet.toString(16).padStart(2, "0").toUpperCase()}`;
}
interface IpPool {
allocated: number[];
}
function readPool(): IpPool {
try {
return JSON.parse(readFileSync(CONFIG.ipPoolFile, "utf-8"));
} catch {
return { allocated: [] };
}
}
function writePool(pool: IpPool) {
writeFileSync(CONFIG.ipPoolFile, JSON.stringify(pool));
}
export function allocateIp(): { ip: string; octet: number } {
const fd = openSync(CONFIG.ipPoolLock, "w");
try {
    // NOTE: the lock file is opened but never actually flock()ed (Node has
    // no built-in flock), so concurrent allocations can still race
const pool = readPool();
for (
let octet = CONFIG.bridge.minHost;
octet <= CONFIG.bridge.maxHost;
octet++
) {
if (!pool.allocated.includes(octet)) {
pool.allocated.push(octet);
writePool(pool);
return { ip: `${CONFIG.bridge.prefix}.${octet}`, octet };
}
}
throw new Error("No free IPs in pool");
} finally {
closeSync(fd);
}
}
export function releaseIp(octet: number) {
const fd = openSync(CONFIG.ipPoolLock, "w");
try {
const pool = readPool();
pool.allocated = pool.allocated.filter((o) => o !== octet);
writePool(pool);
} finally {
closeSync(fd);
}
}

188
src/overseer.ts Normal file
View File

@@ -0,0 +1,188 @@
import IRC from "irc-framework";
import {
startAgent,
stopAgent,
listAgents,
stopAllAgents,
listTemplates,
reconcileAgents,
reloadAgent,
type AgentInfo,
} from "./agent-manager.js";
interface OverseerConfig {
server: string;
port: number;
nick: string;
channel: string;
}
function log(msg: string) {
process.stderr.write(`[overseer] ${msg}\n`);
}
function formatAgentList(agents: AgentInfo[]): string[] {
if (agents.length === 0) return ["No agents running."];
return agents.map(
(a) =>
`${a.name} (${a.template}) — ${a.nick} [${a.model}] ip=${a.ip} since ${a.startedAt.slice(11, 19)}`
);
}
export async function runOverseer(config: OverseerConfig) {
// Reconcile agent state on startup
log("Reconciling agent state...");
const { adopted, cleaned } = reconcileAgents();
if (adopted.length > 0) {
log(`Adopted ${adopted.length} running agent(s): ${adopted.join(", ")}`);
}
if (cleaned.length > 0) {
log(`Cleaned ${cleaned.length} dead agent(s): ${cleaned.join(", ")}`);
}
const bot = new IRC.Client();
bot.connect({
host: config.server,
port: config.port,
nick: config.nick,
});
bot.on("registered", () => {
log(`Connected to ${config.server}:${config.port} as ${config.nick}`);
bot.join(config.channel);
bot.join("#agents");
log(`Joined ${config.channel} and #agents`);
});
bot.on("message", async (event: { nick: string; target: string; message: string }) => {
// Only handle channel messages
if (!event.target.startsWith("#")) return;
const text = event.message.trim();
if (!text.startsWith("!")) return;
const parts = text.split(/\s+/);
const cmd = parts[0].toLowerCase();
try {
switch (cmd) {
case "!invoke": {
const template = parts[1];
if (!template) {
bot.say(event.target, "Usage: !invoke <template> [name]");
return;
}
const name = parts[2];
bot.say(event.target, `Invoking agent "${name ?? template}" from template "${template}"...`);
const info = await startAgent(template, { name });
bot.say(
event.target,
`Agent "${info.name}" started: ${info.nick} [${info.model}] (${info.ip})`
);
break;
}
case "!destroy": {
const name = parts[1];
if (!name) {
bot.say(event.target, "Usage: !destroy <name>");
return;
}
await stopAgent(name);
bot.say(event.target, `Agent "${name}" destroyed.`);
break;
}
case "!list": {
const agents = listAgents();
for (const line of formatAgentList(agents)) {
bot.say(event.target, line);
}
break;
}
case "!model": {
const name = parts[1];
const model = parts[2];
if (!name || !model) {
bot.say(event.target, "Usage: !model <name> <model>");
return;
}
await reloadAgent(name, { model });
bot.say(event.target, `Agent "${name}" hot-reloaded with model ${model}.`);
break;
}
case "!templates": {
const templates = listTemplates();
if (templates.length === 0) {
bot.say(event.target, "No templates found.");
} else {
bot.say(event.target, `Templates: ${templates.join(", ")}`);
}
break;
}
case "!models": {
try {
const http = await import("node:http");
const data = await new Promise<string>((resolve, reject) => {
http.get("http://localhost:11434/api/tags", (res) => {
const chunks: Buffer[] = [];
res.on("data", (c) => chunks.push(c));
res.on("end", () => resolve(Buffer.concat(chunks).toString()));
}).on("error", reject);
});
const models = JSON.parse(data).models ?? [];
if (models.length === 0) {
bot.say(event.target, "No models available.");
} else {
const lines = models.map(
(m: { name: string; size: number }) =>
`${m.name} (${(m.size / 1e9).toFixed(1)}GB)`
);
bot.say(event.target, `Models: ${lines.join(", ")}`);
}
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
bot.say(event.target, `Error fetching models from Ollama: ${msg}`);
}
break;
}
case "!help": {
bot.say(event.target, "Commands: !invoke <template> [name] | !destroy <name> | !list | !model <name> <model> | !models | !templates | !help");
break;
}
}
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
bot.say(event.target, `Error: ${msg}`);
log(`Error handling command "${text}": ${msg}`);
}
});
bot.on("close", () => {
log("Disconnected. Reconnecting in 5s...");
setTimeout(() => {
bot.connect({
host: config.server,
port: config.port,
nick: config.nick,
});
}, 5000);
});
// Graceful shutdown
const shutdown = async () => {
log("Shutting down, stopping all agents...");
await stopAllAgents();
bot.quit("Overseer shutting down");
process.exit(0);
};
process.on("SIGINT", shutdown);
process.on("SIGTERM", shutdown);
log("Overseer started. Waiting for commands...");
}

98
src/rootfs.ts Normal file

@@ -0,0 +1,98 @@
import { execFileSync } from "node:child_process";
import {
existsSync,
copyFileSync,
mkdirSync,
unlinkSync,
} from "node:fs";
import { join } from "node:path";
import { randomBytes } from "node:crypto";
import { CONFIG } from "./config.js";
export function ensureBaseImage() {
if (!existsSync(CONFIG.baseRootfs)) {
throw new Error(
`Base rootfs not found at ${CONFIG.baseRootfs}. Run 'fireclaw setup' first.`
);
}
if (!existsSync(CONFIG.kernelPath)) {
throw new Error(
`Kernel not found at ${CONFIG.kernelPath}. Run 'fireclaw setup' first.`
);
}
}
export function ensureSshKeypair() {
if (!existsSync(CONFIG.sshKeyPath)) {
execFileSync("ssh-keygen", [
"-t",
"ed25519",
"-f",
CONFIG.sshKeyPath,
"-N",
"",
"-C",
"fireclaw",
]);
}
}
export function createRunCopy(vmId: string): string {
mkdirSync(CONFIG.runsDir, { recursive: true });
const dest = join(CONFIG.runsDir, `${vmId}.ext4`);
copyFileSync(CONFIG.baseRootfs, dest);
return dest;
}
export function injectSshKey(rootfsPath: string) {
const mountPoint = `/tmp/fireclaw-mount-${randomBytes(4).toString("hex")}`;
mkdirSync(mountPoint, { recursive: true });
try {
execFileSync("sudo", ["mount", "-o", "loop", rootfsPath, mountPoint], {
stdio: "pipe",
});
execFileSync("sudo", ["mkdir", "-p", join(mountPoint, "root/.ssh")], {
stdio: "pipe",
});
execFileSync(
"sudo",
[
"cp",
CONFIG.sshPubKeyPath,
join(mountPoint, "root/.ssh/authorized_keys"),
],
{ stdio: "pipe" }
);
execFileSync(
"sudo",
["chmod", "600", join(mountPoint, "root/.ssh/authorized_keys")],
{ stdio: "pipe" }
);
execFileSync(
"sudo",
["chmod", "700", join(mountPoint, "root/.ssh")],
{ stdio: "pipe" }
);
} finally {
try {
execFileSync("sudo", ["umount", mountPoint], { stdio: "pipe" });
} catch {
// Best effort
}
try {
execFileSync("rmdir", [mountPoint], { stdio: "pipe" });
} catch {
// Best effort
}
}
}
export function deleteRunCopy(rootfsPath: string) {
try {
unlinkSync(rootfsPath);
} catch {
// Already gone
}
}

117
src/setup.ts Normal file

@@ -0,0 +1,117 @@
import { execFileSync, execSync } from "node:child_process";
import { existsSync, mkdirSync } from "node:fs";
import { CONFIG } from "./config.js";
import { ensureBridge, ensureNat } from "./network.js";
import { ensureSshKeypair } from "./rootfs.js";
function log(msg: string) {
process.stderr.write(`[setup] ${msg}\n`);
}
function download(url: string, dest: string) {
execFileSync("curl", ["-fSL", "-o", dest, url], {
stdio: ["pipe", "pipe", "inherit"],
timeout: 300_000,
});
}
export async function runSetup() {
log("Setting up fireclaw...");
// Create directories
mkdirSync(CONFIG.baseDir, { recursive: true });
mkdirSync(CONFIG.runsDir, { recursive: true });
mkdirSync(CONFIG.socketDir, { recursive: true });
// Download kernel
if (existsSync(CONFIG.kernelPath)) {
log("Kernel already exists, skipping download.");
} else {
log("Downloading kernel...");
download(CONFIG.assets.kernelUrl, CONFIG.kernelPath);
log("Kernel downloaded.");
}
// Download and convert rootfs
if (existsSync(CONFIG.baseRootfs)) {
log("Base rootfs already exists, skipping download.");
} else {
log("Downloading rootfs...");
// Find latest rootfs key from S3 listing
const listing = execFileSync(
"curl",
["-fsSL", CONFIG.assets.rootfsListUrl],
{ encoding: "utf-8", timeout: 30_000 }
);
const keys = [...listing.matchAll(/<Key>([^<]+)<\/Key>/g)].map(
(m) => m[1]
);
const rootfsKey = keys.sort().pop();
if (!rootfsKey) throw new Error("Could not find rootfs in S3 listing");
const squashfsPath = `${CONFIG.baseDir}/rootfs.squashfs`;
download(`${CONFIG.assets.rootfsBaseUrl}/${rootfsKey}`, squashfsPath);
log("Rootfs downloaded. Converting squashfs to ext4...");
// Convert squashfs to ext4
const squashMount = "/tmp/fireclaw-squash";
const ext4Mount = "/tmp/fireclaw-ext4";
mkdirSync(squashMount, { recursive: true });
mkdirSync(ext4Mount, { recursive: true });
try {
execFileSync(
"sudo",
["mount", "-t", "squashfs", squashfsPath, squashMount],
{ stdio: "pipe" }
);
execFileSync("truncate", ["-s", "1G", CONFIG.baseRootfs], {
stdio: "pipe",
});
execFileSync("sudo", ["/usr/sbin/mkfs.ext4", CONFIG.baseRootfs], {
stdio: "pipe",
});
execFileSync("sudo", ["mount", CONFIG.baseRootfs, ext4Mount], {
stdio: "pipe",
});
execFileSync("sudo", ["cp", "-a", `${squashMount}/.`, ext4Mount], {
stdio: "pipe",
});
// Bake in DNS config
execSync(
`echo "nameserver 8.8.8.8" | sudo tee ${ext4Mount}/etc/resolv.conf > /dev/null`
);
log("Rootfs converted.");
} finally {
try {
execFileSync("sudo", ["umount", squashMount], { stdio: "pipe" });
} catch {}
try {
execFileSync("sudo", ["umount", ext4Mount], { stdio: "pipe" });
} catch {}
try {
execFileSync("rmdir", [squashMount], { stdio: "pipe" });
} catch {}
try {
execFileSync("rmdir", [ext4Mount], { stdio: "pipe" });
} catch {}
try {
execFileSync("rm", ["-f", squashfsPath], { stdio: "pipe" });
} catch {}
}
}
// Generate SSH keypair
log("Ensuring SSH keypair...");
ensureSshKeypair();
// Set up bridge and NAT
log("Setting up network bridge...");
ensureBridge();
ensureNat();
log("Setup complete! Run 'fireclaw run \"uname -a\"' to test.");
}

128
src/snapshot.ts Normal file

@@ -0,0 +1,128 @@
import { spawn, type ChildProcess } from "node:child_process";
import { existsSync, mkdirSync, copyFileSync } from "node:fs";
import { join } from "node:path";
import { CONFIG } from "./config.js";
import * as api from "./firecracker-api.js";
import {
ensureBridge,
ensureNat,
createTap,
deleteTap,
macFromOctet,
} from "./network.js";
import { ensureBaseImage, ensureSshKeypair, injectSshKey } from "./rootfs.js";
import { waitForSsh } from "./ssh.js";
function log(msg: string) {
process.stderr.write(`[snapshot] ${msg}\n`);
}
function waitForSocket(socketPath: string): Promise<void> {
return new Promise((resolve, reject) => {
const deadline = Date.now() + 5_000;
const check = () => {
if (existsSync(socketPath)) {
setTimeout(resolve, 200);
return;
}
if (Date.now() > deadline) {
reject(new Error("Firecracker socket did not appear"));
return;
}
setTimeout(check, 50);
};
check();
});
}
export function snapshotExists(): boolean {
return (
existsSync(CONFIG.snapshot.statePath) &&
existsSync(CONFIG.snapshot.memPath) &&
existsSync(CONFIG.snapshot.rootfsPath)
);
}
export async function createSnapshot() {
ensureBaseImage();
ensureSshKeypair();
const snap = CONFIG.snapshot;
const socketPath = join(CONFIG.socketDir, "snapshot.sock");
log("Preparing snapshot rootfs...");
mkdirSync(CONFIG.socketDir, { recursive: true });
copyFileSync(CONFIG.baseRootfs, snap.rootfsPath);
injectSshKey(snap.rootfsPath);
log("Setting up network...");
ensureBridge();
ensureNat();
createTap(snap.tapDevice);
let proc: ChildProcess | null = null;
try {
log("Booting VM for snapshot...");
proc = spawn(CONFIG.firecrackerBin, ["--api-sock", socketPath], {
stdio: "pipe",
detached: false,
});
await waitForSocket(socketPath);
const bootArgs = [
"console=ttyS0",
"reboot=k",
"panic=1",
"pci=off",
"root=/dev/vda",
"rw",
`ip=${snap.ip}::${CONFIG.bridge.gateway}:${CONFIG.bridge.netmask}::eth0:off`,
].join(" ");
await api.putBootSource(socketPath, CONFIG.kernelPath, bootArgs);
await api.putDrive(socketPath, "rootfs", snap.rootfsPath);
await api.putNetworkInterface(
socketPath,
"eth0",
snap.tapDevice,
macFromOctet(snap.octet)
);
await api.putMachineConfig(
socketPath,
CONFIG.vm.vcpuCount,
CONFIG.vm.memSizeMib
);
await api.startInstance(socketPath);
log("Waiting for SSH...");
await waitForSsh(snap.ip);
log("Pausing VM...");
await api.patchVm(socketPath, "Paused");
log("Creating snapshot...");
await api.putSnapshotCreate(socketPath, snap.statePath, snap.memPath);
log("Snapshot created successfully.");
log(` State: ${snap.statePath}`);
log(` Memory: ${snap.memPath}`);
log(` Rootfs: ${snap.rootfsPath}`);
} finally {
if (proc && !proc.killed) {
proc.kill("SIGKILL");
}
try {
const { unlinkSync } = await import("node:fs");
unlinkSync(socketPath);
} catch {}
deleteTap(snap.tapDevice);
}
}

107
src/ssh.ts Normal file

@@ -0,0 +1,107 @@
import { Client } from "ssh2";
import { readFileSync } from "node:fs";
import { createConnection } from "node:net";
import { CONFIG } from "./config.js";
import type { RunResult } from "./types.js";
export function waitForSsh(
host: string,
port = 22,
timeoutMs = CONFIG.vm.bootTimeoutMs
): Promise<void> {
return new Promise((resolve, reject) => {
const deadline = Date.now() + timeoutMs;
function attempt() {
if (Date.now() > deadline) {
reject(new Error(`SSH not ready after ${timeoutMs}ms`));
return;
}
const sock = createConnection({ host, port, timeout: 500 });
sock.on("connect", () => {
sock.destroy();
resolve();
});
sock.on("error", () => {
sock.destroy();
setTimeout(attempt, CONFIG.vm.sshPollIntervalMs);
});
sock.on("timeout", () => {
sock.destroy();
setTimeout(attempt, CONFIG.vm.sshPollIntervalMs);
});
}
attempt();
});
}
export function execCommand(
host: string,
command: string,
timeoutMs: number,
verbose: boolean
): Promise<RunResult> {
return new Promise((resolve, reject) => {
const startTime = Date.now();
const privateKey = readFileSync(CONFIG.sshKeyPath);
const conn = new Client();
const timer = setTimeout(() => {
conn.end();
reject(new Error(`Command timed out after ${timeoutMs}ms`));
}, timeoutMs);
conn.on("ready", () => {
conn.exec(command, (err, stream) => {
if (err) {
clearTimeout(timer);
conn.end();
reject(err);
return;
}
const stdoutChunks: Buffer[] = [];
const stderrChunks: Buffer[] = [];
stream.on("data", (data: Buffer) => {
stdoutChunks.push(data);
if (verbose) process.stdout.write(data);
});
stream.stderr.on("data", (data: Buffer) => {
stderrChunks.push(data);
if (verbose) process.stderr.write(data);
});
stream.on("close", (code: number | null) => {
clearTimeout(timer);
conn.end();
resolve({
exitCode: code ?? 1,
stdout: Buffer.concat(stdoutChunks).toString(),
stderr: Buffer.concat(stderrChunks).toString(),
durationMs: Date.now() - startTime,
});
});
});
});
conn.on("error", (err) => {
clearTimeout(timer);
reject(err);
});
conn.connect({
host,
port: 22,
username: "root",
privateKey,
hostVerifier: () => true,
});
});
}

24
src/types.ts Normal file

@@ -0,0 +1,24 @@
export interface VMConfig {
id: string;
guestIp: string;
tapDevice: string;
socketPath: string;
rootfsPath: string;
timeoutMs: number;
verbose: boolean;
}
export interface RunResult {
exitCode: number;
stdout: string;
stderr: string;
durationMs: number;
}
export interface RunOptions {
timeout?: number;
verbose?: boolean;
mem?: number;
vcpu?: number;
noSnapshot?: boolean;
}

288
src/vm.ts Normal file

@@ -0,0 +1,288 @@
import { spawn, type ChildProcess } from "node:child_process";
import { existsSync, mkdirSync } from "node:fs";
import { join } from "node:path";
import { randomBytes } from "node:crypto";
import { CONFIG } from "./config.js";
import type { VMConfig, RunResult, RunOptions } from "./types.js";
import * as api from "./firecracker-api.js";
import {
ensureBridge,
ensureNat,
allocateIp,
releaseIp,
createTap,
deleteTap,
macFromOctet,
} from "./network.js";
import {
ensureBaseImage,
ensureSshKeypair,
createRunCopy,
injectSshKey,
deleteRunCopy,
} from "./rootfs.js";
import { waitForSsh, execCommand } from "./ssh.js";
import { registerVm, unregisterVm } from "./cleanup.js";
import { snapshotExists } from "./snapshot.js";
function log(verbose: boolean, msg: string) {
if (verbose) process.stderr.write(`[fireclaw] ${msg}\n`);
}
export class VMInstance {
private config: VMConfig;
private process: ChildProcess | null = null;
private octet = 0;
constructor(config: VMConfig) {
this.config = config;
}
static async run(
command: string,
opts: RunOptions = {}
): Promise<RunResult> {
// Try snapshot path first unless disabled
if (!opts.noSnapshot && snapshotExists()) {
return VMInstance.runFromSnapshot(command, opts);
}
return VMInstance.runColdBoot(command, opts);
}
private static async runFromSnapshot(
command: string,
opts: RunOptions
): Promise<RunResult> {
const id = `fc-snap-${randomBytes(3).toString("hex")}`;
const verbose = opts.verbose ?? false;
const timeoutMs = opts.timeout ?? CONFIG.vm.defaultTimeoutMs;
const snap = CONFIG.snapshot;
mkdirSync(CONFIG.socketDir, { recursive: true });
const config: VMConfig = {
id,
guestIp: snap.ip,
tapDevice: snap.tapDevice,
socketPath: join(CONFIG.socketDir, `${id}.sock`),
rootfsPath: "", // shared, not per-run
timeoutMs,
verbose,
};
const vm = new VMInstance(config);
vm.octet = 0; // no IP pool allocation for snapshot runs
registerVm(vm);
try {
log(verbose, `VM ${id}: restoring from snapshot...`);
ensureBridge();
ensureNat();
createTap(snap.tapDevice);
// Spawn firecracker and load snapshot
vm.process = spawn(
CONFIG.firecrackerBin,
["--api-sock", config.socketPath],
{ stdio: "pipe", detached: false }
);
vm.process.on("error", (err) => {
log(verbose, `Firecracker process error: ${err.message}`);
});
await vm.waitForSocket();
await api.putSnapshotLoad(
config.socketPath,
snap.statePath,
snap.memPath
);
await api.patchVm(config.socketPath, "Resumed");
log(verbose, `VM ${id}: resumed, waiting for SSH...`);
await waitForSsh(snap.ip);
log(verbose, `VM ${id}: executing command...`);
const result = await execCommand(snap.ip, command, timeoutMs, verbose);
log(
verbose,
`VM ${id}: done (exit=${result.exitCode}, ${result.durationMs}ms)`
);
return result;
} finally {
await vm.destroy();
unregisterVm(vm);
}
}
private static async runColdBoot(
command: string,
opts: RunOptions
): Promise<RunResult> {
const id = `fc-${randomBytes(3).toString("hex")}`;
const verbose = opts.verbose ?? false;
const timeoutMs = opts.timeout ?? CONFIG.vm.defaultTimeoutMs;
// Pre-flight checks
ensureBaseImage();
ensureSshKeypair();
// Allocate resources
const { ip, octet } = allocateIp();
const tapDevice = `fctap${octet}`;
mkdirSync(CONFIG.socketDir, { recursive: true });
const config: VMConfig = {
id,
guestIp: ip,
tapDevice,
socketPath: join(CONFIG.socketDir, `${id}.sock`),
rootfsPath: "",
timeoutMs,
verbose,
};
const vm = new VMInstance(config);
vm.octet = octet;
registerVm(vm);
try {
log(verbose, `VM ${id}: preparing rootfs...`);
config.rootfsPath = createRunCopy(id);
injectSshKey(config.rootfsPath);
log(verbose, `VM ${id}: creating tap ${tapDevice}...`);
ensureBridge();
ensureNat();
createTap(tapDevice);
log(verbose, `VM ${id}: booting...`);
await vm.boot(opts);
log(verbose, `VM ${id}: waiting for SSH at ${ip}...`);
await waitForSsh(ip);
log(verbose, `VM ${id}: executing command...`);
const result = await execCommand(ip, command, timeoutMs, verbose);
log(
verbose,
`VM ${id}: done (exit=${result.exitCode}, ${result.durationMs}ms)`
);
return result;
} finally {
await vm.destroy();
unregisterVm(vm);
}
}
private async boot(opts: RunOptions) {
const { config } = this;
const vcpu = opts.vcpu ?? CONFIG.vm.vcpuCount;
const mem = opts.mem ?? CONFIG.vm.memSizeMib;
// Spawn firecracker
this.process = spawn(
CONFIG.firecrackerBin,
["--api-sock", config.socketPath],
{
stdio: "pipe",
detached: false,
}
);
this.process.on("error", (err) => {
log(config.verbose, `Firecracker process error: ${err.message}`);
});
// Wait for socket
await this.waitForSocket();
// Configure via API
const bootArgs = [
"console=ttyS0",
"reboot=k",
"panic=1",
"pci=off",
"root=/dev/vda",
"rw",
`ip=${config.guestIp}::${CONFIG.bridge.gateway}:${CONFIG.bridge.netmask}::eth0:off`,
].join(" ");
await api.putBootSource(config.socketPath, CONFIG.kernelPath, bootArgs);
await api.putDrive(config.socketPath, "rootfs", config.rootfsPath);
await api.putNetworkInterface(
config.socketPath,
"eth0",
config.tapDevice,
macFromOctet(this.octet)
);
await api.putMachineConfig(config.socketPath, vcpu, mem);
await api.startInstance(config.socketPath);
}
private waitForSocket(): Promise<void> {
const socketPath = this.config.socketPath;
return new Promise((resolve, reject) => {
const deadline = Date.now() + 5_000;
const check = () => {
if (existsSync(socketPath)) {
setTimeout(resolve, 200);
return;
}
if (Date.now() > deadline) {
reject(new Error("Firecracker socket did not appear"));
return;
}
setTimeout(check, 50);
};
check();
});
}
async destroy() {
const { config } = this;
log(config.verbose, `VM ${config.id}: cleaning up...`);
// Kill firecracker
if (this.process && !this.process.killed) {
this.process.kill("SIGTERM");
await new Promise<void>((resolve) => {
const timer = setTimeout(() => {
if (this.process && !this.process.killed) {
this.process.kill("SIGKILL");
}
resolve();
}, 2_000);
this.process!.on("exit", () => {
clearTimeout(timer);
resolve();
});
});
}
// Clean up socket
try {
const { unlinkSync } = await import("node:fs");
unlinkSync(config.socketPath);
} catch {
// Already gone
}
// Clean up tap device
deleteTap(config.tapDevice);
// Release IP (skip for snapshot runs which don't allocate from pool)
if (this.octet > 0) {
releaseIp(this.octet);
}
// Delete rootfs copy (skip for snapshot runs which share rootfs)
if (config.rootfsPath) {
deleteRunCopy(config.rootfsPath);
}
}
}

305
tests/test-suite.sh Executable file

@@ -0,0 +1,305 @@
#!/bin/bash
# Fireclaw regression test suite
# Requires: overseer running, no agents active
# Usage: ./tests/test-suite.sh
set -uo pipefail
PASS=0
FAIL=0
SKIP=0
irc_cmd() {
local wait=${1:-3}
shift
{
echo -e "NICK fctest\r\nUSER fctest 0 * :test\r\n"
sleep 2
echo -e "JOIN #agents\r\nJOIN #dev\r\n"
sleep 1
for cmd in "$@"; do
echo -e "PRIVMSG #agents :${cmd}\r\n"
sleep "$wait"
done
echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 2>&1
}
assert_contains() {
local output="$1"
local expected="$2"
local test_name="$3"
if echo "$output" | grep -q "$expected"; then
echo " PASS: $test_name"
((PASS++))
else
echo " FAIL: $test_name (expected: $expected)"
((FAIL++))
fi
}
assert_not_contains() {
local output="$1"
local expected="$2"
local test_name="$3"
if echo "$output" | grep -q "$expected"; then
echo " FAIL: $test_name (unexpected: $expected)"
((FAIL++))
else
echo " PASS: $test_name"
((PASS++))
fi
}
cleanup_agent() {
local name="$1"
fireclaw agent stop "$name" 2>/dev/null || true
sleep 1
# Clean any leaked taps
for tap in $(ip link show 2>/dev/null | grep -oP 'fctap\d+' | sort -u); do
if [ "$tap" != "fctap200" ]; then
# Check if tap is used by a running agent
if ! fireclaw agent list 2>/dev/null | grep -q "$tap"; then
sudo ip tuntap del "$tap" mode tap 2>/dev/null || true
fi
fi
done
}
echo "========================================="
echo "Fireclaw Regression Test Suite"
echo "========================================="
echo ""
# Precondition: check overseer is running
echo "[Pre] Checking overseer..."
OUT=$(irc_cmd 2 "!help")
if echo "$OUT" | grep -q "overseer.*PRIVMSG.*Commands:"; then
echo " OK: Overseer is running"
else
echo " ERROR: Overseer not running. Start with: fireclaw overseer"
exit 1
fi
# Clean any leftover agents
echo "[Pre] Cleaning leftover agents..."
for agent in $(fireclaw agent list 2>/dev/null | awk '{print $1}'); do
cleanup_agent "$agent"
done
echo ""
# ==========================================
echo "--- Test 1: Overseer help command ---"
OUT=$(irc_cmd 2 "!help")
assert_contains "$OUT" "Commands:" "!help returns command list"
echo "--- Test 2: Overseer templates ---"
OUT=$(irc_cmd 2 "!templates")
assert_contains "$OUT" "worker" "templates includes worker"
assert_contains "$OUT" "coder" "templates includes coder"
assert_contains "$OUT" "researcher" "templates includes researcher"
echo "--- Test 3: Empty agent list ---"
OUT=$(irc_cmd 2 "!list")
assert_contains "$OUT" "No agents running" "no agents when clean"
echo "--- Test 4: Invalid template ---"
OUT=$(irc_cmd 2 "!invoke faketype")
assert_contains "$OUT" "not found" "invalid template rejected"
echo "--- Test 5: Missing arguments ---"
OUT=$(irc_cmd 2 "!invoke" "!destroy" "!model")
assert_contains "$OUT" "Usage:" "!invoke usage shown"
echo "--- Test 6: Destroy nonexistent ---"
OUT=$(irc_cmd 2 "!destroy ghost")
assert_contains "$OUT" "not running" "destroy nonexistent handled"
echo ""
echo "--- Test 7: Spawn worker agent ---"
OUT=$(irc_cmd 8 "!invoke worker")
assert_contains "$OUT" "Agent \"worker\" started" "worker spawned"
# Verify agent is running
AGENTS=$(fireclaw agent list 2>&1)
assert_contains "$AGENTS" "worker" "worker in agent list"
echo "--- Test 8: Worker appears in IRC ---"
OUT=$({
echo -e "NICK fctest2\r\nUSER fctest2 0 * :t\r\n"
sleep 2
echo -e "JOIN #agents\r\n"
sleep 1
echo -e "NAMES #agents\r\n"
sleep 1
echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 2>&1)
assert_contains "$OUT" "worker" "worker in NAMES list"
echo "--- Test 9: Worker responds to mention ---"
OUT=$({
echo -e "NICK fctest3\r\nUSER fctest3 0 * :t\r\n"
sleep 2
echo -e "JOIN #agents\r\n"
sleep 1
echo -e "PRIVMSG #agents :worker: say pong\r\n"
sleep 25
echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 2>&1)
assert_contains "$OUT" ":worker.*PRIVMSG" "worker responded"
echo "--- Test 10: Worker ignores non-mention ---"
OUT=$({
echo -e "NICK fctest4\r\nUSER fctest4 0 * :t\r\n"
sleep 2
echo -e "JOIN #agents\r\n"
sleep 1
echo -e "PRIVMSG #agents :hello everyone\r\n"
sleep 10
echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 2>&1)
assert_not_contains "$OUT" ":worker.*PRIVMSG" "worker stays quiet on non-mention"
echo "--- Test 11: Duplicate prevention ---"
OUT=$(irc_cmd 3 "!invoke worker")
assert_contains "$OUT" "already running" "duplicate rejected"
echo "--- Test 12: Named agent from template ---"
OUT=$(irc_cmd 8 "!invoke worker helper2")
assert_contains "$OUT" 'Agent "helper2" started' "named agent spawned"
sleep 2
echo "--- Test 13: Multiple agents listed ---"
OUT=$(irc_cmd 3 "!list")
assert_contains "$OUT" "worker" "worker in list"
assert_contains "$OUT" "helper2" "helper2 in list"
echo "--- Test 14: Destroy specific agent ---"
OUT=$(irc_cmd 3 "!destroy helper2")
assert_contains "$OUT" 'Agent "helper2" destroyed' "helper2 destroyed"
# Verify only worker remains
sleep 2
OUT=$({
echo -e "NICK fcverify\r\nUSER fcverify 0 * :t\r\n"
sleep 2
echo -e "JOIN #agents\r\n"
sleep 1
echo -e "PRIVMSG #agents :!list\r\n"
sleep 3
echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 2>&1 | grep ":overseer.*PRIVMSG.*#agents")
assert_contains "$OUT" "worker" "worker still running"
# Check that helper2 doesn't appear in the list output (exclude destroy confirmation)
LIST_LINE=$(echo "$OUT" | grep -v "destroyed" | grep -v "Invoking" || true)
assert_not_contains "$LIST_LINE" "helper2" "helper2 removed from list"
cleanup_agent "helper2"
echo "--- Test 15: Destroy worker ---"
OUT=$(irc_cmd 3 "!destroy worker")
assert_contains "$OUT" 'Agent "worker" destroyed' "worker destroyed"
cleanup_agent "worker"
echo "--- Test 16: Clean after destroy ---"
OUT=$(irc_cmd 2 "!list")
assert_contains "$OUT" "No agents running" "all agents cleaned"
echo ""
echo "--- Test 17: CLI agent start/list/stop ---"
fireclaw agent start worker 2>&1 | grep -q "started" && echo " PASS: CLI agent start" && ((PASS++)) || { echo " FAIL: CLI agent start"; ((FAIL++)); }
sleep 3
fireclaw agent list 2>&1 | grep -q "worker" && echo " PASS: CLI agent list" && ((PASS++)) || { echo " FAIL: CLI agent list"; ((FAIL++)); }
fireclaw agent stop worker 2>&1 | grep -q "stopped" && echo " PASS: CLI agent stop" && ((PASS++)) || { echo " FAIL: CLI agent stop"; ((FAIL++)); }
cleanup_agent "worker"
echo "--- Test 18: Ephemeral run still works ---"
OUT=$(fireclaw run "echo ephemeral-test" 2>&1)
assert_contains "$OUT" "ephemeral-test" "fireclaw run works"
echo ""
echo "--- Test 19: Overseer crash recovery ---"
# Spawn a worker
fireclaw agent start worker 2>&1 | grep -q "started" && echo " PASS: worker started for crash test" && ((PASS++)) || { echo " FAIL: start worker for crash test"; ((FAIL++)); }
sleep 5
WORKER_PID=$(python3 -c "import json; d=json.load(open('$HOME/.fireclaw/agents.json')); print(d.get('worker',{}).get('pid',''))" 2>/dev/null)
# SIGKILL the overseer (simulates crash, KillMode=process keeps worker alive)
OVERSEER_PID=$(systemctl show fireclaw-overseer -p MainPID --value 2>/dev/null)
if [ -n "$OVERSEER_PID" ] && [ "$OVERSEER_PID" != "0" ]; then
sudo kill -9 "$OVERSEER_PID" 2>/dev/null
sleep 8 # Wait for systemd to auto-restart (RestartSec=5 + buffer)
# Check worker survived
if kill -0 "$WORKER_PID" 2>/dev/null; then
echo " PASS: worker survived overseer crash" && ((PASS++))
else
echo " FAIL: worker died with overseer" && ((FAIL++))
fi
# Check overseer restarted and adopted
NEW_PID=$(systemctl show fireclaw-overseer -p MainPID --value 2>/dev/null)
if [ -n "$NEW_PID" ] && [ "$NEW_PID" != "0" ] && [ "$NEW_PID" != "$OVERSEER_PID" ]; then
echo " PASS: overseer auto-restarted" && ((PASS++))
else
echo " FAIL: overseer did not restart" && ((FAIL++))
fi
# Check adopted via !list
sleep 2
OUT=$({
echo -e "NICK fcrecov\r\nUSER fcrecov 0 * :t\r\n"
sleep 2
echo -e "JOIN #agents\r\n"
sleep 1
echo -e "PRIVMSG #agents :!list\r\n"
sleep 3
echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 2>&1)
assert_contains "$OUT" "worker" "overseer adopted worker after crash"
# Cleanup
{
echo -e "NICK fcrecov2\r\nUSER fcrecov2 0 * :t\r\n"
sleep 2
echo -e "JOIN #agents\r\n"
sleep 1
echo -e "PRIVMSG #agents :!destroy worker\r\n"
sleep 5
echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 > /dev/null 2>&1
else
echo " SKIP: overseer not running via systemd, skipping crash test" && ((SKIP++))
((SKIP++))
((SKIP++))
((SKIP++))
cleanup_agent "worker"
fi
echo ""
echo "--- Test 20: Graceful agent shutdown (IRC QUIT) ---"
# Spawn and destroy, check for clean QUIT
{
echo -e "NICK fcquit\r\nUSER fcquit 0 * :t\r\n"
sleep 2
echo -e "JOIN #agents\r\n"
sleep 1
echo -e "PRIVMSG #agents :!invoke worker\r\n"
sleep 8
echo -e "PRIVMSG #agents :!destroy worker\r\n"
sleep 5
echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 > /tmp/fc-quit-test.txt 2>&1
if grep -q "QUIT.*shutting down" /tmp/fc-quit-test.txt; then
echo " PASS: agent sent IRC QUIT on destroy" && ((PASS++))
else
echo " FAIL: agent did not send IRC QUIT" && ((FAIL++))
fi
rm -f /tmp/fc-quit-test.txt
cleanup_agent "worker"
echo ""
echo "========================================="
echo "Results: $PASS passed, $FAIL failed, $SKIP skipped"
echo "========================================="
[ "$FAIL" -eq 0 ] && exit 0 || exit 1

16
tsconfig.json Normal file

@@ -0,0 +1,16 @@
{
"compilerOptions": {
"target": "ES2022",
"module": "Node16",
"moduleResolution": "Node16",
"outDir": "./dist",
"rootDir": "./src",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"declaration": true,
"sourceMap": true
},
"include": ["src/**/*"]
}