Initial commit — fireclaw multi-agent system
Firecracker microVM-based multi-agent system with IRC orchestration and local LLMs.

Features:
- Ephemeral command runner with VM snapshots (~1.1s)
- Multi-agent orchestration via overseer IRC bot
- 5 agent templates (worker, coder, researcher, quick, creative)
- Tool access (shell + podman containers inside VMs)
- Persistent workspace + memory system (MEMORY.md pattern)
- Agent hot-reload (model/persona swap via SSH + SIGHUP)
- Non-root agents, graceful shutdown, crash recovery
- Agent-to-agent communication via IRC
- DM support, /invite support
- Systemd service, 20 regression tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
.gitignore (vendored, new file, 4 lines)
@@ -0,0 +1,4 @@
node_modules/
dist/
*.js.map
.DS_Store

IDEAS.md (new file, 142 lines)
@@ -0,0 +1,142 @@
# Fireclaw Ideas

Future features and experiments, loosely prioritized by usefulness.

## Operator Tools

### !status command

Quick dashboard in IRC: agent count, RAM/CPU per VM, Ollama model currently loaded, system uptime, disk free. One command to see the health of everything.

### !logs <agent> [n]

Tail the last N interactions an agent had. Stored in the agent's workspace. Useful to see what an agent's been doing while you were away.

### !persona <agent> [new persona]

View or live-edit an agent's persona via IRC. "Make the worker more sarcastic" without touching files or restarting. Uses hot-reload under the hood.

### !pause / !resume <agent>

Temporarily mute an agent without destroying it. The agent stays alive but stops responding. Useful when you need a channel to yourself.

## Agent Tools

### Web search

Agents can search via the searx instance on mymx. Either bake the searx CLI into the rootfs, or add a proper `web_search(query)` tool that calls the searx API from inside the VM. Agents could actually research topics instead of relying on training data.

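A minimal `web_search(query)` sketch against SearXNG's JSON API (assumes the instance has the `json` output format enabled; `SEARX_URL` and the helper names are illustrative, not part of this codebase):

```python
import json
import urllib.parse
import urllib.request

SEARX_URL = "https://searx.mymx.me"  # assumed instance URL

def build_search_url(base, query):
    """Construct a SearXNG JSON-API search URL."""
    return f"{base}/search?q={urllib.parse.quote(query)}&format=json"

def web_search(query, max_results=5):
    """Return [{title, url, snippet}, ...] for a query via SearXNG."""
    req = urllib.request.Request(
        build_search_url(SEARX_URL, query),
        headers={"User-Agent": "fireclaw-agent"},  # some instances reject blank UAs
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        data = json.loads(resp.read())
    return [
        {"title": r.get("title", ""), "url": r.get("url", ""),
         "snippet": r.get("content", "")}
        for r in data.get("results", [])[:max_results]
    ]
```
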
### Fetch URL

`fetch_url(url)` tool to grab a webpage, strip HTML, return text. Combined with web search, agents become genuine research assistants. Could use `curl | python3 -c "from html.parser import..."` or a lightweight readability script.

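A stdlib-only sketch of the `html.parser` route (no readability heuristics, it just drops tags plus script/style bodies; the function names are illustrative):

```python
import urllib.request
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style blocks."""
    def __init__(self):
        super().__init__()
        self._skip = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def strip_html(html):
    parser = _TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)

def fetch_url(url, max_bytes=200_000):
    """Grab a page and return its visible text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        html = resp.read(max_bytes).decode("utf-8", errors="replace")
    return strip_html(html)
```
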
### File sharing between agents

A shared `/shared` mount (third virtio drive, or a common ext4 image) that all agents can read/write. Drop a file from one agent, pick it up from another. Enables collaboration: researcher writes findings, coder reads and implements. Caveat: a plain ext4 image can't safely be mounted read-write by several VMs at once, so this likely needs read-only mounts with a single writer, or a host-mediated copy step.

### Code execution sandbox

A `run_python(code)` tool that's safer than `run_command`. Executes in a subprocess with resource limits (timeout, memory cap). Better for code agents that need to test their own output.

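A sketch of `run_python` with `resource`-based caps (Linux-only; the names and default limits are illustrative). Since the whole agent already sits inside a microVM, this only adds a second, softer layer:

```python
import resource
import subprocess
import sys

def run_python(code, timeout=10, mem_mb=256):
    """Run Python code in a subprocess with CPU and memory caps.

    Not a real sandbox: the child still sees the VM's filesystem,
    which is acceptable here because the agent is already isolated.
    """
    def _limits():
        # Cap address space and CPU seconds for the child process
        resource.setrlimit(resource.RLIMIT_AS, (mem_mb * 1024 * 1024,) * 2)
        resource.setrlimit(resource.RLIMIT_CPU, (timeout, timeout))

    try:
        result = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, no user site
            capture_output=True, text=True, timeout=timeout,
            preexec_fn=_limits,
        )
    except subprocess.TimeoutExpired:
        return "[timed out]"
    out = result.stdout + (f"\n[stderr] {result.stderr}" if result.stderr else "")
    return out.strip() or "[no output]"
```
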
## Automation

### Cron agents

Template gets an optional `schedule` field: `"schedule": "0 8 * * *"`. The overseer spawns the agent on schedule, it does its task, reports to #agents, and self-destructs. Use cases:

- Morning health check: "any disk/memory/service issues on grogbox?"
- Daily digest: "summarize what happened in #agents yesterday"
- Backup verification: "check that last night's backups completed"

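A template carrying the proposed field might look like this. Only `schedule` is the new key; the remaining field names are guesses at what existing templates carry, not the actual schema:

```json
{
  "nick": "checker",
  "model": "qwen2.5:7b",
  "tools": true,
  "schedule": "0 8 * * *",
  "task": "Check disk, memory, and failed services on grogbox; report to #agents, then exit."
}
```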
### Webhook triggers

HTTP endpoint on the host (e.g., `:8080/hook/<template>`) that spawns an ephemeral agent with the webhook payload as context. Examples:

- Gitea push webhook → coder agent reviews the commit in #dev
- Uptime monitor → agent investigates and reports
- RSS feed → researcher summarizes new articles

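A stdlib sketch of the hook endpoint; `spawn_ephemeral_agent` is a stand-in for whatever the overseer/agent-manager actually exposes:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def parse_hook_path(path):
    """Return the template name from '/hook/<template>', or None."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "hook" and parts[1]:
        return parts[1]
    return None

def spawn_ephemeral_agent(template, context):
    """Stand-in: would hand off to agent-manager (e.g. shell out to the CLI)."""
    print(f"[hook] spawn {template} with {len(context)} bytes of context")

class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        template = parse_hook_path(self.path)
        if template is None:
            self.send_response(404); self.end_headers(); return
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length).decode("utf-8", errors="replace")
        spawn_ephemeral_agent(template, payload)
        self.send_response(202)  # accepted; the agent runs asynchronously
        self.end_headers()

# To run on the host:
# HTTPServer(("0.0.0.0", 8080), HookHandler).serve_forever()
```
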
### Alert forwarding

Pipe system alerts (fail2ban, smartmontools, systemd failures, journal errors) into #agents via a simple bridge script. An always-on agent could triage: "fail2ban banned 3 IPs today, all SSH brute force from China, nothing to worry about."

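The bridge script can be tiny: follow the journal, filter, forward. A sketch, where the keyword list and the IRC delivery mechanism are placeholders:

```python
import subprocess

def interesting(line):
    """Crude triage filter: forward only error-ish journal lines."""
    keywords = ("fail", "error", "banned", "critical")
    return any(k in line.lower() for k in keywords)

def follow_journal():
    """Yield new journal lines at warning priority or worse."""
    proc = subprocess.Popen(
        ["journalctl", "-f", "-p", "warning", "--no-pager", "-o", "cat"],
        stdout=subprocess.PIPE, text=True,
    )
    for line in proc.stdout:
        yield line.rstrip("\n")

# Bridge loop (sketch): push matching lines into #agents via a raw IRC
# socket, or feed them to an existing bot.
# for line in follow_journal():
#     if interesting(line):
#         irc.say("#agents", f"[alert] {line}")
```
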
### Git integration

Agent can clone repos from Gitea (on mymx), read code, create branches, commit changes, open PRs. Would need git in the rootfs (already available via `apk add git`) and Gitea API access via the bridge network.

## Agent Personality & Memory

### Evolving personalities

Instruct agents to actively develop opinions, preferences, and communication styles over time. The memory system supports this — agents could save "I prefer concise answers" or "human likes dry humor" and adapt. Give them character arcs.

### Agent journals

Each agent maintains a daily journal in its workspace: an auto-saved summary of conversations, decisions made, things learned. Creates a narrative over time. Useful for debugging agent behavior and understanding their "thought process."

### Cross-agent memory

Agents can read (but not write) each other's MEMORY.md. A new agent spawned for a task can inherit context from an existing agent. "Spawn a coder that knows what the researcher found."

### Agent self-improvement

After each conversation, the agent reflects: "What could I have done better?" and saves lessons to memory. Over time, agents get better at their specific role. Needs a meta-prompt that triggers self-reflection.

## Multi-Agent Orchestration

### Task delegation

Human gives a complex task to one agent; it breaks it down and delegates subtasks to other agents via IRC. Researcher does the research, coder implements, worker tests. All visible in #agents.

### Agent voting

Multiple agents weigh in on a question. "Should we upgrade the kernel?" Each agent responds in #agents, and the human gets multiple perspectives. Could formalize with a `!poll` command.

### Agent debates

Two agents argue opposite sides of a technical decision. Useful for exploring trade-offs. "Should we use Rust or Go for this?" Coder argues one side, researcher the other.

## MCP Servers as Firecracker VMs

Run MCP tool servers in their own Firecracker VMs, with the same isolation model as agents. Managed by the overseer with the same lifecycle (!invoke, !destroy).

### Approach: single Firecracker VM with podman containers

```
Firecracker VMs (fcbr0, 172.16.0.x)
├── worker (agent VM)
├── coder (agent VM)
└── mcp-services (service VM, 172.16.0.10)
    └── podman
        ├── mcp-fs    (:8081)
        ├── mcp-git   (:8082)
        └── mcp-searx (:8083)
```

One VM hosts all MCP servers in separate containers. Firecracker isolates them from the host; podman separates the services from each other. Lightweight — MCP servers are just HTTP wrappers and don't need their own VMs.

Agents call them at `172.16.0.10:<port>`. The overseer manages the VM and lists available tools via `!services`.

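From an agent's point of view this is just HTTP to a fixed address. A sketch, where the `/call` path and payload shape are invented stand-ins for whatever framing the actual wrappers expose:

```python
import json
import urllib.request

MCP_HOST = "172.16.0.10"  # the mcp-services VM above

def mcp_url(port, path="call"):
    """Address a tool server running in the mcp-services VM."""
    return f"http://{MCP_HOST}:{port}/{path}"

def call_mcp(port, payload, timeout=60):
    """POST a JSON payload to an MCP HTTP wrapper and return the JSON reply."""
    req = urllib.request.Request(
        mcp_url(port),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())

# e.g. call_mcp(8083, {"tool": "search", "arguments": {"query": "firecracker vsock"}})
```
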
One-VM-per-service is overkill for trusted MCP servers but could be used for untrusted third-party tools.

### Why a Firecracker VM instead of host podman

- MCP servers can't access the host filesystem directly
- Consistent isolation model with agents
- The VM is independently restartable without affecting the host
- Podman-in-Firecracker is already working in the agent rootfs

### Candidate MCP servers

- **filesystem** — read/write to a shared volume (mounted as virtio drive)
- **git** — clone, read, diff, commit (Gitea on mymx accessible via bridge)
- **searx** — web search via searx.mymx.me
- **database** — SQLite or PostgreSQL query tool
- **fetch** — HTTP fetch + readability extraction

## Infrastructure

### Agent metrics dashboard

Simple HTML page served from the host showing: running agents, response times, model usage, memory contents, conversation history. No framework — just a static page with data from agents.json and workspace files.

### Agent backup/restore

Export an agent's complete state (workspace, config, rootfs diff) as a tarball. Import it on another machine. Portable agent identities.

### Multi-host agents

Run agents on multiple machines (grogbox + odin). The overseer manages VMs across hosts via SSH. Agents on different hosts communicate via IRC federation.

### GPU passthrough

When/if grogbox gets a GPU: pass it through to a single agent VM for fast inference. That agent becomes the "smart" one; the others stay on CPU. Or run Ollama with the GPU on the host, and all agents benefit.

## Fun & Experimental

### Agent challenges

Post a challenge in #agents: "shortest Python script that sorts a list." Agents compete, see each other's answers, iterate. Gamified agent development.

### Honeypot agents

An agent with fake credentials, fake services, fake data. See what it tries to do. Test agent safety before trusting it with real access. Could also test prompt injection resistance.

### Agent-written agents

An agent creates a new template (persona + config) and asks the overseer to spawn it. A self-replicating agent system. Needs careful guardrails.

### IRC games

Agents play text-based games with each other or with humans: trivia, 20 questions, collaborative storytelling. Tests agent personality and creativity in a low-stakes way.

### Dream mode

An agent left running overnight with `trigger: all` in an empty channel, talking to itself. Stream of consciousness. Review it in the morning. Probably nonsense, but occasionally insightful.

README.md (new file, 251 lines)
@@ -0,0 +1,251 @@
# Fireclaw

Multi-agent system powered by Firecracker microVMs. Each AI agent runs in its own hardware-isolated VM, connects to IRC, and responds via local LLMs.

## What it does

**Command runner:** Execute arbitrary commands in ephemeral microVMs with KVM isolation.

```
$ fireclaw run "uname -a"
Linux 172.16.0.200 5.10.225 #1 SMP x86_64 Linux
```

**Multi-agent system:** Spawn AI agents as long-running VMs. Each agent connects to IRC, responds via Ollama, has tool access, and remembers across restarts.

```
# In IRC (#control):
<human> !invoke worker
<overseer> Agent "worker" started: worker [qwen2.5:7b] (172.16.0.2)

# In IRC (#agents):
<human> worker: what's your uptime?
<worker> System has been up for 3 minutes with load average 0.00.

# Private message:
/msg worker hello
<worker> Hey! How can I help?

# Hot-reload model:
<human> !model worker phi4-mini
<worker> [reloaded: model=phi4-mini]
```

## Architecture

```
grogbox host
├── fireclaw overseer     systemd service, IRC bot in #control
├── Ollama                0.0.0.0:11434, 5 models available
└── nyx.fireclaw.local    ngircd IRC server (FireclawNet)

Firecracker VMs (fcbr0 bridge, 172.16.0.0/24)
├── worker   (172.16.0.2)  general assistant in #agents
├── coder    (172.16.0.3)  code-focused agent in #agents
└── research (172.16.0.4)  research agent in #agents
```

Each agent VM:
- Runs a Python IRC bot (stdlib only, zero deps)
- Connects to nyx.fireclaw.local at 172.16.0.1:6667
- Calls Ollama at 172.16.0.1:11434 for LLM responses
- Has tool access — shell commands + podman containers
- Has a persistent workspace at `/workspace` (survives restarts)
- Has persistent memory — saves/loads facts across restarts
- Accepts `/invite` to join any channel
- Responds to DMs without mention
- Runs as unprivileged `agent` user
- Is fully isolated from the host and other agents

## Requirements

- Linux with KVM (`/dev/kvm`)
- Firecracker v1.15+ at `/usr/local/bin/firecracker`
- `sudo` access (for tap devices, rootfs mounting)
- Node.js 20+
- Ollama (for LLM responses)
- ngircd (for IRC)

## Quick start

```bash
cd ~/projects/fireclaw
npm install
npm run build
sudo npm link

# One-time setup
fireclaw setup
fireclaw snapshot create

# Start the overseer (or use systemd)
sudo systemctl start fireclaw-overseer

# Connect to IRC and start spawning agents
# irssi -c localhost -n human
# /join #control
# !invoke worker
# /join #agents
# worker: hello!
```

## IRC Channel Layout

| Channel | Purpose |
|---|---|
| `#control` | Overseer commands only (!invoke, !destroy, !list, etc.) |
| `#agents` | Common room — all agents join here |
| `/msg <nick>` | Private DM with an agent (no mention needed) |
| `/invite <nick> #room` | Pull an agent into any channel |

## CLI Reference

```
fireclaw run [options] "<command>"   Run a command in an ephemeral microVM
  -t, --timeout <seconds>            Command timeout (default: 60)
  -v, --verbose                      Show boot/cleanup progress
  --no-snapshot                      Force cold boot

fireclaw overseer [options]          Start the overseer daemon
  --server <host>                    IRC server (default: localhost)
  --port <port>                      IRC port (default: 6667)
  --nick <nick>                      Bot nickname (default: overseer)
  --channel <chan>                   Control channel (default: #control)

fireclaw agent start <template>      Start an agent VM
  --name <name>                      Override agent name
  --model <model>                    Override LLM model

fireclaw agent stop <name>           Stop an agent VM
fireclaw agent list                  List running agents
fireclaw snapshot create             Create VM snapshot for fast restores
fireclaw setup                       One-time setup
```

## IRC Commands (via overseer in #control)

| Command | Description |
|---|---|
| `!invoke <template> [name]` | Spawn an agent VM |
| `!destroy <name>` | Kill an agent VM (graceful IRC QUIT) |
| `!list` | Show running agents |
| `!model <name> <model>` | Hot-reload an agent's LLM model |
| `!templates` | List available agent templates |
| `!help` | Show commands |

## Agent Templates

Templates live in `~/.fireclaw/templates/`:

| Template | Nick | Model | Tools | Role |
|---|---|---|---|---|
| worker | worker | qwen2.5:7b | yes | General purpose |
| coder | coder | qwen2.5-coder:7b | yes | Code-focused |
| researcher | research | llama3.1:8b | yes | Thorough research |
| quick | quick | phi4-mini | no | Fast one-liners (~5s) |
| creative | muse | gemma3:4b | no | Writing, brainstorming |

## Available Models

| Model | Size | Speed (CPU) | Tools | Best for |
|---|---|---|---|---|
| qwen2.5-coder:7b | 4.7 GB | ~15s | yes | Code tasks |
| qwen2.5:7b | 4.7 GB | ~15s | partial | General chat |
| llama3.1:8b | 4.9 GB | ~15s | partial | Instruction following |
| gemma3:4b | 3.3 GB | ~8s | no | Creative, balanced |
| phi4-mini | 2.5 GB | ~5s | no | Fast, simple answers |

## Performance

| Mode | Time |
|---|---|
| Snapshot restore (ephemeral run) | ~1.1s |
| Cold boot (ephemeral run) | ~2.9s |
| Agent VM boot to IRC connect | ~5s |

## Security Model

- Agent processes run as the unprivileged `agent` user inside VMs
- Root SSH retained for overseer management only
- Each VM gets hardware-level KVM isolation
- Persistent workspace is per-agent — no cross-agent access
- Overseer runs on the host (trusted), agents in VMs (untrusted)
- Agent-to-agent cooldown (10s) prevents infinite loops
- Graceful shutdown — agents send IRC QUIT before VM kill

## Source Files

```
src/
  index.ts            Entry point
  cli.ts              CLI commands (commander)
  vm.ts               Ephemeral VM lifecycle (cold boot + snapshot)
  firecracker-api.ts  Firecracker REST client
  snapshot.ts         Snapshot creation workflow
  overseer.ts         Overseer daemon (IRC + agent lifecycle)
  agent-manager.ts    Start/stop/list/reload agent VMs
  network.ts          Bridge, tap devices, IP allocation
  rootfs.ts           Image copy, SSH key injection
  ssh.ts              SSH execution (ssh2)
  cleanup.ts          Signal handlers
  config.ts           Constants, paths
  types.ts            Interfaces
  setup.ts            One-time setup

agent/
  agent.py            Python IRC bot (stdlib only, baked into rootfs)

scripts/
  setup-bridge.sh     Create fcbr0 bridge + NAT rules
  teardown-bridge.sh  Remove bridge + NAT rules

tests/
  test-suite.sh       Regression tests (20 tests)
```

## Data Directory

```
~/.fireclaw/
  vmlinux                Firecracker kernel (5.10)
  base-rootfs.ext4       Alpine base (ephemeral runs)
  agent-rootfs.ext4      Agent image (1 GiB sparse, Alpine + Python + podman)
  snapshot-rootfs.ext4   Snapshot rootfs
  snapshot.{state,mem}   VM snapshot
  id_ed25519[.pub]       SSH keypair
  agents.json            Running agent state
  templates/             Agent persona templates
  workspaces/            Persistent agent storage (64 MiB ext4 each)
  runs/                  Per-agent rootfs copies
  ip-pool.json           IP allocation
```

### Agent Workspace (inside VM at /workspace)

```
/workspace/
  MEMORY.md              Memory index — loaded into system prompt
  memory/
    user_prefs.md        Learned facts about users
    project_x.md         Ongoing project context
```

## Testing

```bash
# Requires overseer running via systemd
./tests/test-suite.sh
```

20 tests covering: overseer commands, agent lifecycle, IRC interaction, tool access, error handling, CLI operations, crash recovery, graceful shutdown.

## Known Limitations

- Snapshot mode doesn't support concurrent ephemeral runs (single fixed IP/tap)
- `sudo` required for tap device and rootfs mount operations
- CPU inference: 5-30s per response depending on model
- No thin provisioning — full rootfs copy per agent (~146 MiB)

## License

Apache-2.0

ROADMAP.md (new file, 65 lines)
@@ -0,0 +1,65 @@
# Fireclaw Roadmap

## Phase 1: Core CLI (done)

- [x] Firecracker microVM lifecycle (boot, exec, destroy)
- [x] SSH-based command execution
- [x] Network isolation (tap + bridge + NAT)
- [x] IP pool management for concurrent VMs
- [x] Signal handling and cleanup
- [x] CLI interface (`fireclaw run`, `fireclaw setup`)

## Phase 2: Fast & Useful (done)

- [x] Alpine Linux rootfs (1 GiB sparse, 146 MiB on disk)
- [x] Precompiled binary, global `fireclaw` command
- [x] Snapshot & restore (~1.1s vs ~2.9s cold boot)

## Phase 3: Multi-Agent System (done)

- [x] ngircd configured (`nyx.fireclaw.local`, FireclawNet)
- [x] Channel layout: #control (overseer), #agents (common room), DMs, /invite
- [x] Ollama with 5 models (qwen2.5-coder, qwen2.5, llama3.1, gemma3, phi4-mini)
- [x] Agent rootfs — Alpine + Python IRC bot + podman + tools
- [x] Agent manager — start/stop/list/reload long-running VMs
- [x] Overseer — host-side IRC bot, !invoke/!destroy/!list/!model/!templates
- [x] 5 agent templates — worker, coder, researcher, quick, creative
- [x] Agent tool access — shell commands + podman containers
- [x] Persistent workspace — 64 MiB ext4 as second virtio drive at /workspace
- [x] Agent memory system — MEMORY.md + save_memory tool, survives restarts
- [x] Agent hot-reload — SSH config update + SIGHUP, no VM restart
- [x] Non-root agents — unprivileged `agent` user
- [x] Agent-to-agent via IRC mentions, 10s cooldown
- [x] DM support — private messages without mention
- [x] /invite support — agents auto-join invited channels
- [x] Overseer resilience — crash recovery, agent adoption, KillMode=process
- [x] Graceful shutdown — SSH SIGTERM → IRC QUIT → kill VM
- [x] Systemd service — fireclaw-overseer.service
- [x] Regression test suite — 20 tests

## Phase 4: Hardening & Performance

- [ ] Network policies per agent — iptables rules per tap device
- [ ] Warm pool — pre-booted VMs from snapshots for instant spawns
- [ ] Concurrent snapshot runs via network namespaces
- [ ] Thin provisioning — device-mapper snapshots instead of full rootfs copies
- [ ] Thread safety — lock around IRC socket writes in agent.py
- [ ] Agent health checks — overseer monitors and restarts dead agents

## Phase 5: Advanced Features

- [ ] Persistent agent memory v2 — richer structure, auto-save from conversations
- [ ] Scheduled/cron tasks — agents that run on a timer
- [ ] Advanced tool use — MCP tools, multi-step execution, file I/O
- [ ] Cost tracking — log duration, model, tokens per interaction
- [ ] Execution recording — full audit trail of agent actions

## Phase 6: Ideas & Experiments

- [ ] vsock — replace SSH with virtio-vsock for lower overhead
- [ ] Web dashboard — status page for running agents
- [ ] Podman-in-Firecracker — double isolation for untrusted container images
- [ ] Honeypot mode — test agent safety with fake credentials/services
- [ ] Self-healing rootfs — agents evolve their own images
- [ ] Claude API backend — for tasks requiring deep reasoning
- [ ] IRC federation — link nyx.fireclaw.local ↔ odin for external access

TODO.md (new file, 38 lines)
@@ -0,0 +1,38 @@
# TODO

## Done

- [x] Firecracker CLI runner with snapshots (~1.1s)
- [x] Alpine rootfs with ca-certificates, podman, python3
- [x] Global `fireclaw` command
- [x] Multi-agent system — overseer + agent VMs + IRC + Ollama
- [x] 5 agent templates (worker, coder, researcher, quick, creative)
- [x] 5 Ollama models (qwen2.5-coder, qwen2.5, llama3.1, gemma3, phi4-mini)
- [x] Agent tool access — shell commands + podman containers
- [x] Persistent workspace + memory system (MEMORY.md pattern)
- [x] Agent hot-reload — model/persona swap via SSH + SIGHUP
- [x] Non-root agents — unprivileged `agent` user
- [x] Agent-to-agent via IRC mentions (10s cooldown)
- [x] DM support — private messages, no mention needed
- [x] /invite support — agents auto-join invited channels
- [x] Channel layout — #control (commands), #agents (common), DMs
- [x] Overseer resilience — crash recovery, agent adoption
- [x] Graceful shutdown — IRC QUIT before VM kill
- [x] Systemd service (KillMode=process)
- [x] Regression test suite (20 tests)

## Next up

- [ ] Network policies per agent — restrict internet access
- [ ] Warm pool — pre-booted VMs for instant agent spawns
- [ ] Persistent agent memory improvements — richer memory structure, auto-save from conversations
- [ ] Thin provisioning — device-mapper snapshots instead of full rootfs copies

## Polish

- [ ] Fix trigger matching — only trigger when the nick is at the start of the message, not anywhere in the text. Currently "say hi to worker" triggers worker even when addressed to another agent.
- [ ] Cost tracking per agent interaction
- [ ] Execution recording / audit trail
- [ ] Agent health checks — overseer pings agents, restarts dead ones
- [ ] Thread safety in agent.py — lock around IRC socket writes
- [ ] Update regression tests for new channel layout

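The trigger-matching fix in the Polish list is essentially one regex. A hypothetical helper (the current substring check it would replace isn't shown in this chunk):

```python
import re

def is_addressed(nick, text):
    """True only when the message starts with the nick
    ('worker: hi', 'worker, hi', bare 'worker hi'),
    not when it merely mentions it mid-sentence."""
    return re.match(rf"\s*{re.escape(nick)}\b[\s:,]*", text, re.IGNORECASE) is not None
```
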
agent/agent.py (new file, 528 lines)
@@ -0,0 +1,528 @@
#!/usr/bin/env python3
"""Fireclaw IRC agent — connects to IRC, responds via Ollama with tool access."""

import socket
import json
import sys
import time
import subprocess
import urllib.request
import urllib.error
import signal
import threading
from collections import deque

# Load config
with open("/etc/agent/config.json") as f:
    CONFIG = json.load(f)

PERSONA = ""
try:
    with open("/etc/agent/persona.md") as f:
        PERSONA = f.read().strip()
except FileNotFoundError:
    PERSONA = "You are a helpful assistant."

NICK = CONFIG.get("nick", "agent")
CHANNEL = CONFIG.get("channel", "#agents")
SERVER = CONFIG.get("server", "172.16.0.1")
PORT = CONFIG.get("port", 6667)
OLLAMA_URL = CONFIG.get("ollama_url", "http://172.16.0.1:11434")
CONTEXT_SIZE = CONFIG.get("context_size", 20)
MAX_RESPONSE_LINES = CONFIG.get("max_response_lines", 50)
TOOLS_ENABLED = CONFIG.get("tools", True)
MAX_TOOL_ROUNDS = CONFIG.get("max_tool_rounds", 5)
WORKSPACE = "/workspace"

# Mutable runtime config — can be hot-reloaded via SIGHUP
RUNTIME = {
    "model": CONFIG.get("model", "qwen2.5-coder:7b"),
    "trigger": CONFIG.get("trigger", "mention"),
    "persona": PERSONA,
}

# Recent messages for context
recent = deque(maxlen=CONTEXT_SIZE)

# Load persistent memory from workspace
AGENT_MEMORY = ""
try:
    import os
    with open(f"{WORKSPACE}/MEMORY.md") as f:
        AGENT_MEMORY = f.read().strip()
    # Also load all memory files referenced in the index
    mem_dir = f"{WORKSPACE}/memory"
    if os.path.isdir(mem_dir):
        for fname in sorted(os.listdir(mem_dir)):
            if fname.endswith(".md"):
                try:
                    with open(f"{mem_dir}/{fname}") as f:
                        topic = fname.replace(".md", "")
                        AGENT_MEMORY += f"\n\n## {topic}\n{f.read().strip()}"
                except Exception:
                    pass
except FileNotFoundError:
    pass

# Tool definitions for Ollama chat API
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "run_command",
            "description": "Execute a shell command on this system and return the output. Use this to check system info, run scripts, fetch URLs, process data, etc.",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {
                        "type": "string",
                        "description": "The shell command to execute (bash)",
                    }
                },
                "required": ["command"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "save_memory",
            "description": "Save something important to your persistent memory. Use this to remember facts about users, lessons learned, project context, or anything you want to recall in future conversations. Memories survive restarts.",
            "parameters": {
                "type": "object",
                "properties": {
                    "topic": {
                        "type": "string",
                        "description": "Short topic name for the memory file (e.g. 'user_prefs', 'project_x', 'lessons')",
                    },
                    "content": {
                        "type": "string",
                        "description": "The memory content to save",
                    },
                },
                "required": ["topic", "content"],
            },
        },
    },
]


def log(msg):
    print(f"[agent:{NICK}] {msg}", flush=True)


class IRCClient:
    def __init__(self, server, port, nick):
        self.server = server
        self.port = port
        self.nick = nick
        self.sock = None
        self.buf = ""

    def connect(self):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.settimeout(300)
        self.sock.connect((self.server, self.port))
        self.send(f"NICK {self.nick}")
        self.send(f"USER {self.nick} 0 * :Fireclaw Agent")

    def send(self, msg):
        self.sock.sendall(f"{msg}\r\n".encode("utf-8"))

    def join(self, channel):
        self.send(f"JOIN {channel}")

    def say(self, target, text):
        for line in text.split("\n"):
            line = line.strip()
            if line:
                while len(line) > 400:
                    self.send(f"PRIVMSG {target} :{line[:400]}")
                    line = line[400:]
                self.send(f"PRIVMSG {target} :{line}")

    def set_bot_mode(self):
        self.send(f"MODE {self.nick} +B")

    def recv_lines(self):
        try:
            data = self.sock.recv(4096)
        except socket.timeout:
            return []
        if not data:
            raise ConnectionError("Connection closed")
        self.buf += data.decode("utf-8", errors="replace")
        lines = self.buf.split("\r\n")
        self.buf = lines.pop()
        return lines


def run_command(command):
    """Execute a shell command and return output."""
    log(f"Running command: {command[:100]}")
    try:
        result = subprocess.run(
            ["bash", "-c", command],
            capture_output=True,
            text=True,
            timeout=120,
        )
        output = result.stdout
        if result.stderr:
            output += f"\n[stderr] {result.stderr}"
        if result.returncode != 0:
            output += f"\n[exit code: {result.returncode}]"
        # Truncate very long output
        if len(output) > 2000:
            output = output[:2000] + "\n[output truncated]"
        return output.strip() or "[no output]"
    except subprocess.TimeoutExpired:
        return "[command timed out after 120s]"
    except Exception as e:
        return f"[error: {e}]"


def save_memory(topic, content):
    """Save a memory to the persistent workspace."""
    import os
    mem_dir = f"{WORKSPACE}/memory"
    os.makedirs(mem_dir, exist_ok=True)

    # Write the memory file
    filepath = f"{mem_dir}/{topic}.md"
    with open(filepath, "w") as f:
        f.write(content + "\n")

    # Update MEMORY.md index
    index_path = f"{WORKSPACE}/MEMORY.md"
    existing = ""
    try:
        with open(index_path) as f:
            existing = f.read()
    except FileNotFoundError:
        # Create the index with its header so the append below extends it
        existing = "# Agent Memory\n"
        with open(index_path, "w") as f:
            f.write(existing)

    # Add entry if missing (match the full entry, not just the topic substring)
    entry = f"- [{topic}](memory/{topic}.md)"
    if entry not in existing:
        with open(index_path, "a") as f:
            f.write(f"\n{entry}")

    # Reload memory into global
    global AGENT_MEMORY
    with open(index_path) as f:
        AGENT_MEMORY = f.read().strip()

    log(f"Memory saved: {topic}")
    return f"Memory saved to {filepath}"


def try_parse_tool_call(text):
    """Try to parse a text-based tool call from model output.

    Handles formats like:
      {"name": "run_command", "arguments": {"command": "uptime"}}
      <tool_call>{"name": "run_command", ...}</tool_call>
    Returns (name, args) tuple or None.
    """
    import re
    # Strip tool_call tags if present
    text = re.sub(r"</?tool_call>", "", text).strip()
    # Try to find JSON in the text
    for start in range(len(text)):
        if text[start] == "{":
            for end in range(len(text), start, -1):
                if text[end - 1] == "}":
                    try:
                        obj = json.loads(text[start:end])
                        name = obj.get("name")
                        args = obj.get("arguments", {})
                        if name and isinstance(args, dict):
                            return (name, args)
                    except json.JSONDecodeError:
                        continue
    return None


def ollama_request(payload):
    """Make a request to the Ollama API."""
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/chat",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())


def query_ollama(messages):
    """Call the Ollama chat API with tool support. Returns the final response text."""
    payload = {
        "model": RUNTIME["model"],
        "messages": messages,
        "stream": False,
        "options": {"num_predict": 512},
    }

    if TOOLS_ENABLED:
        payload["tools"] = TOOLS

    for round_num in range(MAX_TOOL_ROUNDS):
        try:
            data = ollama_request(payload)
        except (urllib.error.URLError, TimeoutError) as e:
            return f"[error: {e}]"

        msg = data.get("message", {})

        # Check for structured tool calls from the API
        tool_calls = msg.get("tool_calls")
        if tool_calls:
            messages.append(msg)

            for tc in tool_calls:
                fn = tc.get("function", {})
                fn_name = fn.get("name", "")
                fn_args = fn.get("arguments", {})

                if fn_name == "run_command":
                    cmd = fn_args.get("command", "")
                    log(f"Tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: {cmd[:80]}")
                    result = run_command(cmd)
                    messages.append({"role": "tool", "content": result})
                elif fn_name == "save_memory":
                    topic = fn_args.get("topic", "note")
                    content = fn_args.get("content", "")
                    log(f"Tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: save_memory({topic})")
                    result = save_memory(topic, content)
                    messages.append({"role": "tool", "content": result})
                else:
                    messages.append({
                        "role": "tool",
                        "content": f"[unknown tool: {fn_name}]",
                    })

            payload["messages"] = messages
            continue

        # Check for text-based tool calls (model dumped JSON as text)
        content = msg.get("content", "").strip()
        parsed_tool = try_parse_tool_call(content)
        if parsed_tool:
            fn_name, fn_args = parsed_tool
            messages.append({"role": "assistant", "content": content})
            if fn_name == "run_command":
                cmd = fn_args.get("command", "")
                log(f"Text tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: {cmd[:80]}")
                result = run_command(cmd)
                messages.append({"role": "user", "content": f"Command output:\n{result}\n\nNow provide your response to the user based on this output."})
            elif fn_name == "save_memory":
                topic = fn_args.get("topic", "note")
                mem_content = fn_args.get("content", "")
                log(f"Text tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: save_memory({topic})")
                result = save_memory(topic, mem_content)
                messages.append({"role": "user", "content": f"{result}\n\nNow respond to the user."})
            payload["messages"] = messages
            continue

        # No tool calls — return the text response
        return content

    return "[max tool rounds reached]"

def build_messages(question, channel):
    """Build chat messages with system prompt and conversation history."""
    system = RUNTIME["persona"]
    if TOOLS_ENABLED:
        system += "\n\nYou have access to tools:"
        system += "\n- run_command: Execute shell commands on your system."
        system += "\n- save_memory: Save important information to your persistent workspace (/workspace/memory/). Use this to remember things across restarts — user preferences, learned facts, project context."
        system += "\nUse tools when needed rather than guessing. Your workspace at /workspace persists across restarts."
    if AGENT_MEMORY and AGENT_MEMORY != "# Agent Memory":
        system += f"\n\nIMPORTANT - Your persistent memory (facts you saved previously, use these to answer questions):\n{AGENT_MEMORY}"
    system += f"\n\nYou are in IRC channel {channel}. Your nick is {NICK}. Keep responses concise — this is IRC."

    messages = [{"role": "system", "content": system}]

    # Build conversation history as alternating user/assistant messages
    channel_msgs = [m for m in recent if m["channel"] == channel]
    for msg in channel_msgs[-CONTEXT_SIZE:]:
        if msg["nick"] == NICK:
            messages.append({"role": "assistant", "content": msg["text"]})
        else:
            messages.append({"role": "user", "content": f"<{msg['nick']}> {msg['text']}"})

    # Ensure the last message is from the user (the triggering question).
    # If the deque already captured it, don't double-add.
    last = messages[-1] if len(messages) > 1 else None
    if not last or last.get("role") != "user" or question not in last.get("content", ""):
        messages.append({"role": "user", "content": question})

    return messages


def should_trigger(text):
    """Check if this message should trigger a response."""
    if RUNTIME["trigger"] == "all":
        return True
    lower = text.lower()
    return NICK.lower() in lower or text.startswith("!ask ")


def extract_question(text):
    """Extract the actual question from the trigger."""
    lower = text.lower()
    for prefix in [
        f"{NICK.lower()}: ",
        f"{NICK.lower()}, ",
        f"@{NICK.lower()} ",
        f"{NICK.lower()} ",
    ]:
        if lower.startswith(prefix):
            return text[len(prefix):]
    if text.startswith("!ask "):
        return text[5:]
    return text


# Track last response time to prevent agent-to-agent loops
_last_response_time = 0
_AGENT_COOLDOWN = 10  # seconds between responses to prevent loops

def handle_message(irc, source_nick, target, text):
    """Process an incoming PRIVMSG."""
    global _last_response_time

    is_dm = not target.startswith("#")
    channel = source_nick if is_dm else target
    reply_to = source_nick if is_dm else target

    recent.append({"nick": source_nick, "text": text, "channel": channel})

    if source_nick == NICK:
        return

    # DMs always trigger; channel messages need a mention
    if not is_dm and not should_trigger(text):
        return

    # Cooldown to prevent agent-to-agent loops
    now = time.time()
    if now - _last_response_time < _AGENT_COOLDOWN:
        log(f"Cooldown active, ignoring trigger from {source_nick}")
        return
    _last_response_time = now

    question = extract_question(text) if not is_dm else text
    log(f"Triggered by {source_nick} in {channel}: {question[:80]}")

    def do_respond():
        try:
            messages = build_messages(question, channel)
            response = query_ollama(messages)

            if not response:
                return

            lines = response.split("\n")
            if len(lines) > MAX_RESPONSE_LINES:
                lines = lines[:MAX_RESPONSE_LINES]
                lines.append(f"[truncated, {MAX_RESPONSE_LINES} lines max]")

            irc.say(reply_to, "\n".join(lines))
            recent.append({"nick": NICK, "text": response[:200], "channel": channel})
        except Exception as e:
            log(f"Error handling message: {e}")
            try:
                irc.say(reply_to, f"[error: {e}]")
            except Exception:
                pass

    threading.Thread(target=do_respond, daemon=True).start()

def run():
    log(f"Starting agent: nick={NICK} channel={CHANNEL} model={RUNTIME['model']} tools={TOOLS_ENABLED}")

    while True:
        try:
            irc = IRCClient(SERVER, PORT, NICK)
            log(f"Connecting to {SERVER}:{PORT}...")
            irc.connect()

            # Hot-reload on SIGHUP — re-read config and persona
            def handle_sighup(signum, frame):
                log("SIGHUP received, reloading config...")
                try:
                    with open("/etc/agent/config.json") as f:
                        new_config = json.load(f)
                    RUNTIME["model"] = new_config.get("model", RUNTIME["model"])
                    RUNTIME["trigger"] = new_config.get("trigger", RUNTIME["trigger"])
                    try:
                        with open("/etc/agent/persona.md") as f:
                            RUNTIME["persona"] = f.read().strip()
                    except FileNotFoundError:
                        pass
                    log(f"Reloaded: model={RUNTIME['model']} trigger={RUNTIME['trigger']}")
                    irc.say(CHANNEL, f"[reloaded: model={RUNTIME['model']}]")
                except Exception as e:
                    log(f"Reload failed: {e}")

            signal.signal(signal.SIGHUP, handle_sighup)

            # Graceful shutdown on SIGTERM — send IRC QUIT
            def handle_sigterm(signum, frame):
                log("SIGTERM received, quitting IRC...")
                try:
                    irc.send("QUIT :Agent shutting down")
                except Exception:
                    pass
                time.sleep(0.5)
                sys.exit(0)

            signal.signal(signal.SIGTERM, handle_sigterm)

            registered = False
            while True:
                lines = irc.recv_lines()
                for line in lines:
                    if line.startswith("PING"):
                        irc.send(f"PONG {line.split(' ', 1)[1]}")
                        continue

                    parts = line.split(" ")
                    if len(parts) < 2:
                        continue

                    if parts[1] == "001" and not registered:
                        registered = True
                        log("Registered with server")
                        irc.set_bot_mode()
                        irc.join("#agents")
                        log("Joined #agents")

                    # Handle INVITE — auto-join invited channels
                    if parts[1] == "INVITE" and len(parts) >= 3:
                        invited_channel = parts[-1].lstrip(":")
                        inviter = parts[0].split("!")[0].lstrip(":")
                        log(f"Invited to {invited_channel} by {inviter}, joining...")
                        irc.join(invited_channel)

                    if parts[1] == "PRIVMSG" and len(parts) >= 4:
                        source_nick = parts[0].split("!")[0].lstrip(":")
                        target = parts[2]
                        text = " ".join(parts[3:]).lstrip(":")
                        handle_message(irc, source_nick, target, text)

        except (ConnectionError, OSError, socket.timeout) as e:
            log(f"Disconnected: {e}. Reconnecting in 5s...")
            time.sleep(5)
        except KeyboardInterrupt:
            log("Shutting down.")
            sys.exit(0)


if __name__ == "__main__":
    run()
25  eslint.config.js  Normal file
@@ -0,0 +1,25 @@
import tseslint from "@typescript-eslint/eslint-plugin";
import tsparser from "@typescript-eslint/parser";

export default [
  {
    files: ["src/**/*.ts"],
    languageOptions: {
      parser: tsparser,
      parserOptions: {
        projectService: true,
      },
    },
    plugins: {
      "@typescript-eslint": tseslint,
    },
    rules: {
      ...tseslint.configs.recommended.rules,
      "@typescript-eslint/no-unused-vars": [
        "error",
        { argsIgnorePattern: "^_" },
      ],
      "no-console": "warn",
    },
  },
];
2385  package-lock.json  generated  Normal file
File diff suppressed because it is too large
27  package.json  Normal file
@@ -0,0 +1,27 @@
{
  "name": "fireclaw",
  "version": "0.1.0",
  "description": "Run commands in ephemeral Firecracker microVMs",
  "type": "module",
  "bin": {
    "fireclaw": "./dist/index.js"
  },
  "scripts": {
    "build": "tsc",
    "dev": "tsx src/index.ts"
  },
  "dependencies": {
    "commander": "^13.0.0",
    "irc-framework": "^4.14.0",
    "ssh2": "^1.16.0"
  },
  "devDependencies": {
    "@types/node": "^22.0.0",
    "@types/ssh2": "^1.15.0",
    "@typescript-eslint/eslint-plugin": "^8.58.0",
    "@typescript-eslint/parser": "^8.58.0",
    "eslint": "^10.2.0",
    "tsx": "^4.0.0",
    "typescript": "^5.7.0"
  }
}
30  scripts/setup-bridge.sh  Executable file
@@ -0,0 +1,30 @@
#!/bin/bash
# Set up the fireclaw bridge and NAT rules.
# Run with sudo.

set -euo pipefail

BRIDGE="fcbr0"
BRIDGE_IP="172.16.0.1/24"
SUBNET="172.16.0.0/24"
EXT_IFACE=$(ip route show default | awk '{print $5; exit}')

echo "Creating bridge ${BRIDGE}..."
ip link add ${BRIDGE} type bridge 2>/dev/null || echo "Bridge already exists"
ip addr add ${BRIDGE_IP} dev ${BRIDGE} 2>/dev/null || echo "Address already set"
ip link set ${BRIDGE} up

echo "Enabling IP forwarding..."
sysctl -w net.ipv4.ip_forward=1

echo "Setting up NAT via ${EXT_IFACE}..."
iptables -t nat -C POSTROUTING -s ${SUBNET} -o ${EXT_IFACE} -j MASQUERADE 2>/dev/null || \
  iptables -t nat -A POSTROUTING -s ${SUBNET} -o ${EXT_IFACE} -j MASQUERADE

iptables -C FORWARD -i ${BRIDGE} -o ${EXT_IFACE} -j ACCEPT 2>/dev/null || \
  iptables -A FORWARD -i ${BRIDGE} -o ${EXT_IFACE} -j ACCEPT

iptables -C FORWARD -i ${EXT_IFACE} -o ${BRIDGE} -m state --state RELATED,ESTABLISHED -j ACCEPT 2>/dev/null || \
  iptables -A FORWARD -i ${EXT_IFACE} -o ${BRIDGE} -m state --state RELATED,ESTABLISHED -j ACCEPT

echo "Done. Bridge ${BRIDGE} ready."
20  scripts/teardown-bridge.sh  Executable file
@@ -0,0 +1,20 @@
#!/bin/bash
# Remove the fireclaw bridge and NAT rules.
# Run with sudo.

set -euo pipefail

BRIDGE="fcbr0"
SUBNET="172.16.0.0/24"
EXT_IFACE=$(ip route show default | awk '{print $5; exit}')

echo "Removing NAT rules..."
iptables -t nat -D POSTROUTING -s ${SUBNET} -o ${EXT_IFACE} -j MASQUERADE 2>/dev/null || true
iptables -D FORWARD -i ${BRIDGE} -o ${EXT_IFACE} -j ACCEPT 2>/dev/null || true
iptables -D FORWARD -i ${EXT_IFACE} -o ${BRIDGE} -m state --state RELATED,ESTABLISHED -j ACCEPT 2>/dev/null || true

echo "Removing bridge ${BRIDGE}..."
ip link set ${BRIDGE} down 2>/dev/null || true
ip link del ${BRIDGE} 2>/dev/null || true

echo "Done."
557  src/agent-manager.ts  Normal file
@@ -0,0 +1,557 @@
import { spawn, execFileSync, type ChildProcess } from "node:child_process";
import {
  existsSync,
  mkdirSync,
  readFileSync,
  writeFileSync,
  copyFileSync,
  unlinkSync,
  readdirSync,
} from "node:fs";
import { join } from "node:path";
import { CONFIG } from "./config.js";
import {
  ensureBridge,
  ensureNat,
  allocateIp,
  releaseIp,
  createTap,
  deleteTap,
  macFromOctet,
} from "./network.js";
import * as api from "./firecracker-api.js";

export interface AgentInfo {
  name: string;
  nick: string;
  model: string;
  template: string;
  ip: string;
  octet: number;
  tapDevice: string;
  socketPath: string;
  rootfsPath: string;
  pid: number;
  startedAt: string;
}

interface AgentTemplate {
  name: string;
  nick: string;
  model: string;
  trigger: string;
  persona: string;
}

const AGENTS_FILE = join(CONFIG.baseDir, "agents.json");
const TEMPLATES_DIR = join(CONFIG.baseDir, "templates");
const AGENT_ROOTFS = join(CONFIG.baseDir, "agent-rootfs.ext4");
const WORKSPACES_DIR = CONFIG.workspacesDir;

function log(msg: string) {
  process.stderr.write(`[agent-mgr] ${msg}\n`);
}

function loadAgents(): Record<string, AgentInfo> {
  try {
    return JSON.parse(readFileSync(AGENTS_FILE, "utf-8"));
  } catch {
    return {};
  }
}

function saveAgents(agents: Record<string, AgentInfo>) {
  writeFileSync(AGENTS_FILE, JSON.stringify(agents, null, 2));
}

export function loadTemplate(name: string): AgentTemplate {
  const path = join(TEMPLATES_DIR, `${name}.json`);
  if (!existsSync(path)) {
    throw new Error(`Template "${name}" not found at ${path}`);
  }
  return JSON.parse(readFileSync(path, "utf-8"));
}

export function listTemplates(): string[] {
  try {
    return readdirSync(TEMPLATES_DIR)
      .filter((f) => f.endsWith(".json"))
      .map((f) => f.replace(".json", ""));
  } catch {
    return [];
  }
}

function injectAgentConfig(
  rootfsPath: string,
  config: { nick: string; model: string; trigger: string },
  persona: string
) {
  const mountPoint = `/tmp/fireclaw-agent-${Date.now()}`;
  mkdirSync(mountPoint, { recursive: true });
  try {
    execFileSync("sudo", ["mount", "-o", "loop", rootfsPath, mountPoint], {
      stdio: "pipe",
    });
    execFileSync(
      "sudo",
      ["mkdir", "-p", join(mountPoint, "etc/agent")],
      { stdio: "pipe" }
    );

    // Write config
    const configJson = JSON.stringify({
      nick: config.nick,
      model: config.model,
      trigger: config.trigger,
      server: "172.16.0.1",
      port: 6667,
      ollama_url: "http://172.16.0.1:11434",
    });
    execFileSync(
      "sudo",
      [
        "bash",
        "-c",
        `echo '${configJson}' > ${join(mountPoint, "etc/agent/config.json")}`,
      ],
      { stdio: "pipe" }
    );

    // Write persona
    execFileSync(
      "sudo",
      [
        "bash",
        "-c",
        `cat > ${join(mountPoint, "etc/agent/persona.md")} << 'PERSONA_EOF'\n${persona}\nPERSONA_EOF`,
      ],
      { stdio: "pipe" }
    );

    // Inject SSH key for debugging access
    execFileSync("sudo", ["mkdir", "-p", join(mountPoint, "root/.ssh")], {
      stdio: "pipe",
    });
    if (existsSync(CONFIG.sshPubKeyPath)) {
      execFileSync(
        "sudo",
        [
          "cp",
          CONFIG.sshPubKeyPath,
          join(mountPoint, "root/.ssh/authorized_keys"),
        ],
        { stdio: "pipe" }
      );
      execFileSync(
        "sudo",
        ["chmod", "600", join(mountPoint, "root/.ssh/authorized_keys")],
        { stdio: "pipe" }
      );
    }
  } finally {
    try {
      execFileSync("sudo", ["umount", mountPoint], { stdio: "pipe" });
    } catch {}
    try {
      execFileSync("rmdir", [mountPoint], { stdio: "pipe" });
    } catch {}
  }
}

function ensureWorkspace(agentName: string): string {
  mkdirSync(WORKSPACES_DIR, { recursive: true });
  const imgPath = join(WORKSPACES_DIR, `${agentName}.ext4`);

  if (!existsSync(imgPath)) {
    log(`Creating workspace for "${agentName}" (${CONFIG.workspaceSizeMib} MiB)...`);
    execFileSync("truncate", ["-s", `${CONFIG.workspaceSizeMib}M`, imgPath], {
      stdio: "pipe",
    });
    execFileSync("sudo", ["/usr/sbin/mkfs.ext4", "-q", imgPath], {
      stdio: "pipe",
    });

    // Seed with the MEMORY.md template
    const mountPoint = `/tmp/fireclaw-ws-${Date.now()}`;
    mkdirSync(mountPoint, { recursive: true });
    try {
      execFileSync("sudo", ["mount", "-o", "loop", imgPath, mountPoint], {
        stdio: "pipe",
      });
      execFileSync(
        "sudo",
        ["bash", "-c", `mkdir -p ${mountPoint}/memory && echo '# Agent Memory' > ${mountPoint}/MEMORY.md`],
        { stdio: "pipe" }
      );
      execFileSync("sudo", ["chown", "-R", "0:0", mountPoint], {
        stdio: "pipe",
      });
    } finally {
      try { execFileSync("sudo", ["umount", mountPoint], { stdio: "pipe" }); } catch {}
      try { execFileSync("rmdir", [mountPoint], { stdio: "pipe" }); } catch {}
    }
  }

  return imgPath;
}

function waitForSocket(socketPath: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + 5_000;
    const check = () => {
      if (existsSync(socketPath)) {
        setTimeout(resolve, 200);
        return;
      }
      if (Date.now() > deadline) {
        reject(new Error("Firecracker socket did not appear"));
        return;
      }
      setTimeout(check, 50);
    };
    check();
  });
}

export async function startAgent(
  templateName: string,
  overrides?: { name?: string; model?: string }
): Promise<AgentInfo> {
  if (!existsSync(AGENT_ROOTFS)) {
    throw new Error(
      `Agent rootfs not found at ${AGENT_ROOTFS}. Build it first.`
    );
  }

  const template = loadTemplate(templateName);
  const name = overrides?.name ?? template.name;
  const nick = overrides?.name ?? template.nick;
  const model = overrides?.model ?? template.model;

  // Check that it is not already running
  const agents = loadAgents();
  if (agents[name]) {
    throw new Error(`Agent "${name}" is already running`);
  }

  log(`Starting agent "${name}" (template: ${templateName})...`);

  // Allocate resources
  const { ip, octet } = allocateIp();
  const tapDevice = `fctap${octet}`;
  const socketPath = join(CONFIG.socketDir, `agent-${name}.sock`);
  const rootfsPath = join(CONFIG.runsDir, `agent-${name}.ext4`);

  mkdirSync(CONFIG.socketDir, { recursive: true });
  mkdirSync(CONFIG.runsDir, { recursive: true });

  // Prepare rootfs
  copyFileSync(AGENT_ROOTFS, rootfsPath);
  injectAgentConfig(
    rootfsPath,
    { nick, model, trigger: template.trigger },
    template.persona
  );

  // Create or reuse the persistent workspace
  const workspacePath = ensureWorkspace(name);

  // Set up networking
  ensureBridge();
  ensureNat();
  createTap(tapDevice);

  // Boot VM
  const proc = spawn(
    CONFIG.firecrackerBin,
    ["--api-sock", socketPath],
    { stdio: "pipe", detached: true }
  );
  proc.unref();

  await waitForSocket(socketPath);

  const bootArgs = [
    "console=ttyS0",
    "reboot=k",
    "panic=1",
    "pci=off",
    "root=/dev/vda",
    "rw",
    `ip=${ip}::${CONFIG.bridge.gateway}:${CONFIG.bridge.netmask}::eth0:off`,
  ].join(" ");

  await api.putBootSource(socketPath, CONFIG.kernelPath, bootArgs);
  await api.putDrive(socketPath, "rootfs", rootfsPath);
  await api.putDrive(socketPath, "workspace", workspacePath, false, false);
  await api.putNetworkInterface(
    socketPath,
    "eth0",
    tapDevice,
    macFromOctet(octet)
  );
  await api.putMachineConfig(
    socketPath,
    CONFIG.vm.vcpuCount,
    CONFIG.vm.memSizeMib
  );
  await api.startInstance(socketPath);

  const info: AgentInfo = {
    name,
    nick,
    model,
    template: templateName,
    ip,
    octet,
    tapDevice,
    socketPath,
    rootfsPath,
    pid: proc.pid!,
    startedAt: new Date().toISOString(),
  };

  agents[name] = info;
  saveAgents(agents);

  log(`Agent "${name}" started: nick=${nick} ip=${ip}`);
  return info;
}

export async function stopAgent(name: string) {
  const agents = loadAgents();
  const info = agents[name];
  if (!info) {
    throw new Error(`Agent "${name}" is not running`);
  }

  log(`Stopping agent "${name}"...`);

  // Graceful shutdown: SSH in and kill the agent process so it sends IRC QUIT
  try {
    execFileSync(
      "ssh",
      [
        "-o", "StrictHostKeyChecking=no",
        "-o", "UserKnownHostsFile=/dev/null",
        "-o", "ConnectTimeout=3",
        "-i", CONFIG.sshKeyPath,
        `root@${info.ip}`,
        "killall python3 2>/dev/null; sleep 1",
      ],
      { stdio: "pipe", timeout: 5_000 }
    );
  } catch {
    // Best effort — the VM might already be unreachable
  }

  // Kill the firecracker process and wait for it to die
  try {
    process.kill(info.pid, "SIGKILL");
    // Wait for the process to actually exit before cleaning up resources
    for (let i = 0; i < 20; i++) {
      try {
        process.kill(info.pid, 0); // Check if alive
        await new Promise((r) => setTimeout(r, 200));
      } catch {
        break; // Process is gone
      }
    }
  } catch {
    // Already dead
  }

  // Small delay to let the kernel release the tap device
  await new Promise((r) => setTimeout(r, 500));

  // Cleanup, with retry for the tap device
  try {
    unlinkSync(info.socketPath);
  } catch {}
  for (let attempt = 0; attempt < 3; attempt++) {
    try {
      deleteTap(info.tapDevice);
      break;
    } catch {
      if (attempt < 2) await new Promise((r) => setTimeout(r, 1000));
    }
  }
  releaseIp(info.octet);
  try {
    unlinkSync(info.rootfsPath);
  } catch {}

  delete agents[name];
  saveAgents(agents);
  log(`Agent "${name}" stopped.`);
}

export function listAgents(): AgentInfo[] {
  const agents = loadAgents();
  // Verify processes are still alive
  for (const [name, info] of Object.entries(agents)) {
    try {
      process.kill(info.pid, 0);
    } catch {
      // Process is dead, clean up
      log(`Agent "${name}" is dead, cleaning up...`);
      try {
        deleteTap(info.tapDevice);
      } catch {}
      try {
        releaseIp(info.octet);
      } catch {}
      try {
        unlinkSync(info.rootfsPath);
      } catch {}
      try {
        unlinkSync(info.socketPath);
      } catch {}
      delete agents[name];
    }
  }
  saveAgents(agents);
  return Object.values(agents);
}

export async function reloadAgent(
  name: string,
  updates: { model?: string; persona?: string; trigger?: string }
) {
  const agents = loadAgents();
  const info = agents[name];
  if (!info) {
    throw new Error(`Agent "${name}" is not running`);
  }

  log(`Reloading agent "${name}"...`);

  // Build the updated config
  const configUpdates: Record<string, string> = {};
  if (updates.model) {
    configUpdates.model = updates.model;
    info.model = updates.model;
  }
  if (updates.trigger) configUpdates.trigger = updates.trigger;

  // Write the updated config to the VM via SSH
  const sshOpts = [
    "-o", "StrictHostKeyChecking=no",
    "-o", "UserKnownHostsFile=/dev/null",
    "-o", "ConnectTimeout=5",
    "-i", CONFIG.sshKeyPath,
  ];
  const sshTarget = `root@${info.ip}`;

  try {
    if (Object.keys(configUpdates).length > 0) {
      // Read the current config from the VM
      const currentRaw = execFileSync(
        "ssh",
        [...sshOpts, sshTarget, "cat /etc/agent/config.json"],
        { encoding: "utf-8", timeout: 10_000 }
      );
      const current = JSON.parse(currentRaw);
      Object.assign(current, configUpdates);
      const newConfig = JSON.stringify(current);

      // Write it back via stdin
      execFileSync(
        "ssh",
        [...sshOpts, sshTarget, `cat > /etc/agent/config.json`],
        { input: newConfig, timeout: 10_000 }
      );
    }

    if (updates.persona) {
      execFileSync(
        "ssh",
        [...sshOpts, sshTarget, `cat > /etc/agent/persona.md`],
        { input: updates.persona, timeout: 10_000 }
      );
    }

    // Signal the agent to reload
    execFileSync(
      "ssh",
      [...sshOpts, sshTarget, "killall -HUP python3"],
      { stdio: "pipe", timeout: 10_000 }
    );
  } catch (err) {
    throw new Error(`Failed to reload agent: ${err}`);
  }

  saveAgents(agents);
  log(`Agent "${name}" reloaded.`);
}

export function reconcileAgents(): { adopted: string[]; cleaned: string[] } {
  const agents = loadAgents();
  const adopted: string[] = [];
  const cleaned: string[] = [];

  for (const [name, info] of Object.entries(agents)) {
    let alive = false;
    try {
      process.kill(info.pid, 0);
      alive = true;
    } catch {
      // Process is dead
    }

    if (alive) {
      adopted.push(name);
      log(`Adopted running agent "${name}" (PID ${info.pid}, ${info.ip})`);
    } else {
      log(`Cleaning dead agent "${name}" (PID ${info.pid} gone)...`);
      // Clean up resources from the dead agent
      try { deleteTap(info.tapDevice); } catch {}
      try { releaseIp(info.octet); } catch {}
      try { unlinkSync(info.rootfsPath); } catch {}
      try { unlinkSync(info.socketPath); } catch {}
      delete agents[name];
      cleaned.push(name);
    }
  }

  // Scan for orphan firecracker processes not in agents.json
  try {
    const psOutput = execFileSync("pgrep", ["-a", "firecracker"], {
      encoding: "utf-8",
    });
    for (const line of psOutput.trim().split("\n")) {
      if (!line) continue;
      const match = line.match(/agent-(\S+)\.sock/);
      if (match) {
        const agentName = match[1];
        if (!agents[agentName]) {
          const pid = parseInt(line.split(/\s+/)[0]);
          log(`Found orphan firecracker process for "${agentName}" (PID ${pid}), killing...`);
          try { process.kill(pid, "SIGKILL"); } catch {}
          cleaned.push(`orphan:${agentName}`);
        }
      }
    }
  } catch {
    // No firecracker processes running — that's fine
  }

  saveAgents(agents);

  if (adopted.length === 0 && cleaned.length === 0) {
    log("No agents to reconcile.");
  } else {
    log(`Reconciled: ${adopted.length} adopted, ${cleaned.length} cleaned.`);
  }

  return { adopted, cleaned };
}

export async function stopAllAgents() {
  const agents = loadAgents();
  for (const name of Object.keys(agents)) {
    await stopAgent(name);
  }
}
47
src/cleanup.ts
Normal file
47
src/cleanup.ts
Normal file
@@ -0,0 +1,47 @@
import type { VMInstance } from "./vm.js";

const activeVms = new Set<VMInstance>();

export function registerVm(vm: VMInstance) {
  activeVms.add(vm);
}

export function unregisterVm(vm: VMInstance) {
  activeVms.delete(vm);
}

async function cleanupAll() {
  const vms = Array.from(activeVms);
  activeVms.clear();
  await Promise.allSettled(vms.map((vm) => vm.destroy()));
}

let registered = false;

export function installSignalHandlers() {
  if (registered) return;
  registered = true;

  const handler = async (signal: string) => {
    process.stderr.write(`\n[fireclaw] Caught ${signal}, cleaning up...\n`);
    await cleanupAll();
    process.exit(signal === "SIGINT" ? 130 : 143);
  };

  process.on("SIGINT", () => handler("SIGINT"));
  process.on("SIGTERM", () => handler("SIGTERM"));

  process.on("uncaughtException", async (err) => {
    process.stderr.write(`[fireclaw] Uncaught exception: ${err.message}\n`);
    await cleanupAll();
    process.exit(1);
  });

  process.on("unhandledRejection", async (reason) => {
    process.stderr.write(`[fireclaw] Unhandled rejection: ${reason}\n`);
    await cleanupAll();
    process.exit(1);
  });
}
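The register/unregister/reap flow in cleanup.ts is easiest to see with a toy registry. A minimal sketch, where `FakeVm` and `makeVm` are hypothetical stand-ins (not fireclaw exports): VMs register on boot, remove themselves on a normal destroy, and the signal path reaps whatever is still registered.

```typescript
// Toy model of the cleanup registry. destroy() mirrors unregisterVm();
// makeVm() mirrors registerVm(); cleanupAll() is the signal path.
type FakeVm = { id: string; destroy: () => Promise<void> };

const active = new Set<FakeVm>();
const destroyed: string[] = [];

function makeVm(id: string): FakeVm {
  const vm: FakeVm = {
    id,
    destroy: async () => {
      destroyed.push(id);
      active.delete(vm); // a VM that shuts down normally unregisters itself
    },
  };
  active.add(vm); // registration on boot
  return vm;
}

async function cleanupAll() {
  const vms = Array.from(active);
  active.clear();
  await Promise.allSettled(vms.map((vm) => vm.destroy()));
}

makeVm("a");
const b = makeVm("b");
void b.destroy();  // normal teardown: "b" removes itself
void cleanupAll(); // signal path: reaps the leftover "a"
console.log(destroyed.join(",")); // → b,a
```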
134 src/cli.ts Normal file
@@ -0,0 +1,134 @@
import { Command } from "commander";
import { VMInstance } from "./vm.js";
import { installSignalHandlers } from "./cleanup.js";
import { runSetup } from "./setup.js";
import { createSnapshot } from "./snapshot.js";
import { runOverseer } from "./overseer.js";
import {
  startAgent,
  stopAgent,
  listAgents,
} from "./agent-manager.js";

export function createCli() {
  const program = new Command();

  program
    .name("fireclaw")
    .description("Run commands in ephemeral Firecracker microVMs")
    .version("0.1.0");

  program
    .command("run")
    .description("Run a command inside a fresh microVM")
    .argument("<command>", "Command to execute inside the microVM")
    .option("-t, --timeout <seconds>", "Timeout in seconds", "60")
    .option("-v, --verbose", "Show detailed progress", false)
    .option("--mem <mib>", "Memory in MiB", "256")
    .option("--vcpu <count>", "Number of vCPUs", "1")
    .option("--no-snapshot", "Force cold boot, skip snapshot restore")
    .action(async (command: string, opts) => {
      installSignalHandlers();

      const result = await VMInstance.run(command, {
        timeout: parseInt(opts.timeout) * 1000,
        verbose: opts.verbose,
        mem: parseInt(opts.mem),
        vcpu: parseInt(opts.vcpu),
        noSnapshot: opts.snapshot === false,
      });

      if (!opts.verbose) {
        if (result.stdout) process.stdout.write(result.stdout);
        if (result.stderr) process.stderr.write(result.stderr);
      }

      process.exit(result.exitCode);
    });

  program
    .command("setup")
    .description("Download kernel, rootfs, and configure networking")
    .action(async () => {
      await runSetup();
    });

  const snapshot = program
    .command("snapshot")
    .description("Manage VM snapshots");

  snapshot
    .command("create")
    .description("Boot a VM and create a snapshot for fast restores")
    .action(async () => {
      installSignalHandlers();
      await createSnapshot();
    });

  // Overseer
  program
    .command("overseer")
    .description("Start the overseer daemon (IRC bot for agent management)")
    .option("--server <host>", "IRC server", "localhost")
    .option("--port <port>", "IRC port", "6667")
    .option("--nick <nick>", "Bot nickname", "overseer")
    .option("--channel <chan>", "Control channel", "#control")
    .action(async (opts) => {
      await runOverseer({
        server: opts.server,
        port: parseInt(opts.port),
        nick: opts.nick,
        channel: opts.channel,
      });
    });

  // Agent management
  const agent = program
    .command("agent")
    .description("Manage long-running agent VMs");

  agent
    .command("start")
    .description("Start an agent VM from a template")
    .argument("<template>", "Template name")
    .option("--name <name>", "Override agent name")
    .option("--model <model>", "Override LLM model")
    .action(async (template: string, opts) => {
      installSignalHandlers();
      const info = await startAgent(template, {
        name: opts.name,
        model: opts.model,
      });
      console.log(
        `Agent "${info.name}" started: ${info.nick} [${info.model}] (${info.ip})`
      );
      process.exit(0);
    });

  agent
    .command("stop")
    .description("Stop a running agent VM")
    .argument("<name>", "Agent name")
    .action(async (name: string) => {
      await stopAgent(name);
      console.log(`Agent "${name}" stopped.`);
    });

  agent
    .command("list")
    .description("List running agent VMs")
    .action(() => {
      const agents = listAgents();
      if (agents.length === 0) {
        console.log("No agents running.");
        return;
      }
      for (const a of agents) {
        console.log(
          `${a.name} (${a.template}) — ${a.nick} [${a.model}] ip=${a.ip} since ${a.startedAt}`
        );
      }
    });

  return program;
}
57 src/config.ts Normal file
@@ -0,0 +1,57 @@
import { homedir } from "node:os";
import { join } from "node:path";

const HOME = homedir();

export const CONFIG = {
  firecrackerBin: "/usr/local/bin/firecracker",
  baseDir: join(HOME, ".fireclaw"),
  kernelPath: join(HOME, ".fireclaw", "vmlinux"),
  baseRootfs: join(HOME, ".fireclaw", "base-rootfs.ext4"),
  runsDir: join(HOME, ".fireclaw", "runs"),
  sshKeyPath: join(HOME, ".fireclaw", "id_ed25519"),
  sshPubKeyPath: join(HOME, ".fireclaw", "id_ed25519.pub"),
  socketDir: "/tmp/fireclaw",
  ipPoolFile: join(HOME, ".fireclaw", "ip-pool.json"),
  ipPoolLock: join(HOME, ".fireclaw", "ip-pool.lock"),

  bridge: {
    name: "fcbr0",
    ip: "172.16.0.1",
    subnet: "172.16.0.0/24",
    netmask: "255.255.255.0",
    gateway: "172.16.0.1",
    prefix: "172.16.0",
    minHost: 2,
    maxHost: 254,
  },

  vm: {
    vcpuCount: 1,
    memSizeMib: 256,
    defaultTimeoutMs: 60_000,
    bootTimeoutMs: 15_000,
    sshPollIntervalMs: 100,
  },

  snapshot: {
    rootfsPath: join(HOME, ".fireclaw", "snapshot-rootfs.ext4"),
    statePath: join(HOME, ".fireclaw", "snapshot.state"),
    memPath: join(HOME, ".fireclaw", "snapshot.mem"),
    tapDevice: "fctap200",
    ip: "172.16.0.200",
    octet: 200,
  },

  workspacesDir: join(HOME, ".fireclaw", "workspaces"),
  workspaceSizeMib: 64,

  // S3 URLs for Firecracker CI assets
  assets: {
    kernelUrl:
      "https://s3.amazonaws.com/spec.ccfc.min/firecracker-ci/v1.11/x86_64/vmlinux-5.10.225",
    rootfsListUrl:
      "http://spec.ccfc.min.s3.amazonaws.com/?prefix=firecracker-ci/v1.11/x86_64/ubuntu",
    rootfsBaseUrl: "https://s3.amazonaws.com/spec.ccfc.min",
  },
} as const;
152 src/firecracker-api.ts Normal file
@@ -0,0 +1,152 @@
import http from "node:http";

function request(
  socketPath: string,
  method: string,
  path: string,
  body?: object
): Promise<{ status: number; body: string }> {
  return new Promise((resolve, reject) => {
    const payload = body ? JSON.stringify(body) : undefined;
    const headers: Record<string, string> = {};
    if (payload) {
      headers["Content-Type"] = "application/json";
      headers["Content-Length"] = Buffer.byteLength(payload).toString();
    }

    const opts: http.RequestOptions = {
      socketPath,
      path,
      method,
      headers,
    };

    const req = http.request(opts, (res) => {
      const chunks: Buffer[] = [];
      res.on("data", (chunk) => chunks.push(chunk));
      res.on("end", () => {
        resolve({
          status: res.statusCode ?? 0,
          body: Buffer.concat(chunks).toString(),
        });
      });
    });

    req.on("error", reject);

    if (payload) {
      req.write(payload);
    }
    req.end();
  });
}

function assertOk(res: { status: number; body: string }, action: string) {
  if (res.status < 200 || res.status >= 300) {
    throw new Error(
      `Firecracker API error (${action}): ${res.status} ${res.body}`
    );
  }
}

export async function putBootSource(
  socketPath: string,
  kernelPath: string,
  bootArgs: string
) {
  const res = await request(socketPath, "PUT", "/boot-source", {
    kernel_image_path: kernelPath,
    boot_args: bootArgs,
  });
  assertOk(res, "PUT /boot-source");
}

export async function putDrive(
  socketPath: string,
  driveId: string,
  path: string,
  readOnly = false,
  isRoot = true
) {
  const res = await request(socketPath, "PUT", `/drives/${driveId}`, {
    drive_id: driveId,
    path_on_host: path,
    is_root_device: isRoot,
    is_read_only: readOnly,
  });
  assertOk(res, `PUT /drives/${driveId}`);
}

export async function putNetworkInterface(
  socketPath: string,
  ifaceId: string,
  hostDevName: string,
  guestMac: string
) {
  const res = await request(
    socketPath,
    "PUT",
    `/network-interfaces/${ifaceId}`,
    {
      iface_id: ifaceId,
      guest_mac: guestMac,
      host_dev_name: hostDevName,
    }
  );
  assertOk(res, `PUT /network-interfaces/${ifaceId}`);
}

export async function putMachineConfig(
  socketPath: string,
  vcpuCount: number,
  memSizeMib: number
) {
  const res = await request(socketPath, "PUT", "/machine-config", {
    vcpu_count: vcpuCount,
    mem_size_mib: memSizeMib,
  });
  assertOk(res, "PUT /machine-config");
}

export async function startInstance(socketPath: string) {
  const res = await request(socketPath, "PUT", "/actions", {
    action_type: "InstanceStart",
  });
  assertOk(res, "PUT /actions InstanceStart");
}

export async function patchVm(
  socketPath: string,
  state: "Paused" | "Resumed"
) {
  const res = await request(socketPath, "PATCH", "/vm", { state });
  assertOk(res, `PATCH /vm ${state}`);
}

export async function putSnapshotCreate(
  socketPath: string,
  snapshotPath: string,
  memFilePath: string
) {
  const res = await request(socketPath, "PUT", "/snapshot/create", {
    snapshot_type: "Full",
    snapshot_path: snapshotPath,
    mem_file_path: memFilePath,
  });
  assertOk(res, "PUT /snapshot/create");
}

export async function putSnapshotLoad(
  socketPath: string,
  snapshotPath: string,
  memFilePath: string
) {
  const res = await request(socketPath, "PUT", "/snapshot/load", {
    snapshot_path: snapshotPath,
    mem_backend: {
      backend_type: "File",
      backend_path: memFilePath,
    },
  });
  assertOk(res, "PUT /snapshot/load");
}
5 src/index.ts Normal file
@@ -0,0 +1,5 @@
#!/usr/bin/env node
import { createCli } from "./cli.js";

const program = createCli();
program.parse();
14 src/irc-framework.d.ts vendored Normal file
@@ -0,0 +1,14 @@
declare module "irc-framework" {
  class Client {
    connect(options: {
      host: string;
      port: number;
      nick: string;
    }): void;
    join(channel: string): void;
    say(target: string, message: string): void;
    quit(message?: string): void;
    on(event: string, handler: (...args: any[]) => void): void;
  }
  // Ambient modules cannot default-export an object literal, so name it first.
  const IRC: { Client: typeof Client };
  export default IRC;
}
165 src/network.ts Normal file
@@ -0,0 +1,165 @@
import { execFileSync } from "node:child_process";
import { openSync, closeSync, readFileSync, writeFileSync } from "node:fs";
import { CONFIG } from "./config.js";

function run(cmd: string, args: string[]) {
  execFileSync(cmd, args, { stdio: "pipe" });
}

function sudo(args: string[]) {
  run("sudo", args);
}

export function ensureBridge() {
  try {
    execFileSync("ip", ["link", "show", CONFIG.bridge.name], {
      stdio: "pipe",
    });
  } catch {
    sudo(["ip", "link", "add", CONFIG.bridge.name, "type", "bridge"]);
    sudo([
      "ip",
      "addr",
      "add",
      `${CONFIG.bridge.ip}/24`,
      "dev",
      CONFIG.bridge.name,
    ]);
    sudo(["ip", "link", "set", CONFIG.bridge.name, "up"]);
    sudo(["sysctl", "-w", "net.ipv4.ip_forward=1"]);
  }
}

export function ensureNat() {
  // Check if rule already exists
  try {
    execFileSync(
      "sudo",
      [
        "iptables",
        "-t",
        "nat",
        "-C",
        "POSTROUTING",
        "-s",
        CONFIG.bridge.subnet,
        "-j",
        "MASQUERADE",
      ],
      { stdio: "pipe" }
    );
  } catch {
    // Find the default route interface
    const routeOut = execFileSync("ip", ["route", "show", "default"], {
      encoding: "utf-8",
    });
    const extIface = routeOut.match(/dev\s+(\S+)/)?.[1] ?? "eno2";

    sudo([
      "iptables",
      "-t",
      "nat",
      "-A",
      "POSTROUTING",
      "-s",
      CONFIG.bridge.subnet,
      "-o",
      extIface,
      "-j",
      "MASQUERADE",
    ]);
    sudo([
      "iptables",
      "-A",
      "FORWARD",
      "-i",
      CONFIG.bridge.name,
      "-o",
      extIface,
      "-j",
      "ACCEPT",
    ]);
    sudo([
      "iptables",
      "-A",
      "FORWARD",
      "-i",
      extIface,
      "-o",
      CONFIG.bridge.name,
      "-m",
      "state",
      "--state",
      "RELATED,ESTABLISHED",
      "-j",
      "ACCEPT",
    ]);
  }
}

export function createTap(tapName: string) {
  sudo(["ip", "tuntap", "add", tapName, "mode", "tap"]);
  sudo(["ip", "link", "set", tapName, "master", CONFIG.bridge.name]);
  sudo(["ip", "link", "set", tapName, "up"]);
}

export function deleteTap(tapName: string) {
  try {
    sudo(["ip", "tuntap", "del", tapName, "mode", "tap"]);
  } catch {
    // Already gone
  }
}

export function macFromOctet(octet: number): string {
  return `AA:FC:00:00:00:${octet.toString(16).padStart(2, "0").toUpperCase()}`;
}

interface IpPool {
  allocated: number[];
}

function readPool(): IpPool {
  try {
    return JSON.parse(readFileSync(CONFIG.ipPoolFile, "utf-8"));
  } catch {
    return { allocated: [] };
  }
}

function writePool(pool: IpPool) {
  writeFileSync(CONFIG.ipPoolFile, JSON.stringify(pool));
}

export function allocateIp(): { ip: string; octet: number } {
  const fd = openSync(CONFIG.ipPoolLock, "w");
  try {
    // NOTE: the lock file is opened but no flock(2) is actually taken, so
    // this is best-effort serialization for a single host process.
    const pool = readPool();
    for (
      let octet = CONFIG.bridge.minHost;
      octet <= CONFIG.bridge.maxHost;
      octet++
    ) {
      if (!pool.allocated.includes(octet)) {
        pool.allocated.push(octet);
        writePool(pool);
        return { ip: `${CONFIG.bridge.prefix}.${octet}`, octet };
      }
    }
    throw new Error("No free IPs in pool");
  } finally {
    closeSync(fd);
  }
}

export function releaseIp(octet: number) {
  const fd = openSync(CONFIG.ipPoolLock, "w");
  try {
    const pool = readPool();
    pool.allocated = pool.allocated.filter((o) => o !== octet);
    writePool(pool);
  } finally {
    closeSync(fd);
  }
}
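The allocator in network.ts is a first-fit scan over host octets, with freed octets reused before higher ones. The same behavior in miniature (in-memory here; the real pool persists `allocated` to ip-pool.json, and the bounds mirror `CONFIG.bridge.minHost`/`maxHost`):

```typescript
// In-memory sketch of the IP pool's first-fit allocation.
const MIN_HOST = 2;
const MAX_HOST = 254;
const allocated: number[] = [];

function allocateOctet(): number {
  // Scan for the lowest free octet and claim it.
  for (let octet = MIN_HOST; octet <= MAX_HOST; octet++) {
    if (!allocated.includes(octet)) {
      allocated.push(octet);
      return octet;
    }
  }
  throw new Error("No free IPs in pool");
}

function releaseOctet(octet: number) {
  const i = allocated.indexOf(octet);
  if (i !== -1) allocated.splice(i, 1);
}

const first = allocateOctet();  // 2
const second = allocateOctet(); // 3
releaseOctet(first);
const reused = allocateOctet(); // 2 again: freed octets come back first
console.log(first, second, reused); // → 2 3 2
```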
188 src/overseer.ts Normal file
@@ -0,0 +1,188 @@
import IRC from "irc-framework";
import {
  startAgent,
  stopAgent,
  listAgents,
  stopAllAgents,
  listTemplates,
  reconcileAgents,
  reloadAgent,
  type AgentInfo,
} from "./agent-manager.js";

interface OverseerConfig {
  server: string;
  port: number;
  nick: string;
  channel: string;
}

function log(msg: string) {
  process.stderr.write(`[overseer] ${msg}\n`);
}

function formatAgentList(agents: AgentInfo[]): string[] {
  if (agents.length === 0) return ["No agents running."];
  return agents.map(
    (a) =>
      `${a.name} (${a.template}) — ${a.nick} [${a.model}] ip=${a.ip} since ${a.startedAt.slice(11, 19)}`
  );
}

export async function runOverseer(config: OverseerConfig) {
  // Reconcile agent state on startup
  log("Reconciling agent state...");
  const { adopted, cleaned } = reconcileAgents();
  if (adopted.length > 0) {
    log(`Adopted ${adopted.length} running agent(s): ${adopted.join(", ")}`);
  }
  if (cleaned.length > 0) {
    log(`Cleaned ${cleaned.length} dead agent(s): ${cleaned.join(", ")}`);
  }

  const bot = new IRC.Client();

  bot.connect({
    host: config.server,
    port: config.port,
    nick: config.nick,
  });

  bot.on("registered", () => {
    log(`Connected to ${config.server}:${config.port} as ${config.nick}`);
    bot.join(config.channel);
    bot.join("#agents");
    log(`Joined ${config.channel} and #agents`);
  });

  bot.on("message", async (event: { nick: string; target: string; message: string }) => {
    // Only handle channel messages
    if (!event.target.startsWith("#")) return;

    const text = event.message.trim();
    if (!text.startsWith("!")) return;

    const parts = text.split(/\s+/);
    const cmd = parts[0].toLowerCase();

    try {
      switch (cmd) {
        case "!invoke": {
          const template = parts[1];
          if (!template) {
            bot.say(event.target, "Usage: !invoke <template> [name]");
            return;
          }
          const name = parts[2];
          bot.say(event.target, `Invoking agent "${name ?? template}" from template "${template}"...`);
          const info = await startAgent(template, { name });
          bot.say(
            event.target,
            `Agent "${info.name}" started: ${info.nick} [${info.model}] (${info.ip})`
          );
          break;
        }

        case "!destroy": {
          const name = parts[1];
          if (!name) {
            bot.say(event.target, "Usage: !destroy <name>");
            return;
          }
          await stopAgent(name);
          bot.say(event.target, `Agent "${name}" destroyed.`);
          break;
        }

        case "!list": {
          const agents = listAgents();
          for (const line of formatAgentList(agents)) {
            bot.say(event.target, line);
          }
          break;
        }

        case "!model": {
          const name = parts[1];
          const model = parts[2];
          if (!name || !model) {
            bot.say(event.target, "Usage: !model <name> <model>");
            return;
          }
          await reloadAgent(name, { model });
          bot.say(event.target, `Agent "${name}" hot-reloaded with model ${model}.`);
          break;
        }

        case "!templates": {
          const templates = listTemplates();
          if (templates.length === 0) {
            bot.say(event.target, "No templates found.");
          } else {
            bot.say(event.target, `Templates: ${templates.join(", ")}`);
          }
          break;
        }

        case "!models": {
          try {
            const http = await import("node:http");
            const data = await new Promise<string>((resolve, reject) => {
              http.get("http://localhost:11434/api/tags", (res) => {
                const chunks: Buffer[] = [];
                res.on("data", (c) => chunks.push(c));
                res.on("end", () => resolve(Buffer.concat(chunks).toString()));
              }).on("error", reject);
            });
            const models = JSON.parse(data).models;
            if (models.length === 0) {
              bot.say(event.target, "No models available.");
            } else {
              const lines = models.map(
                (m: { name: string; size: number }) =>
                  `${m.name} (${(m.size / 1e9).toFixed(1)}GB)`
              );
              bot.say(event.target, `Models: ${lines.join(", ")}`);
            }
          } catch {
            bot.say(event.target, "Error fetching models from Ollama.");
          }
          break;
        }

        case "!help": {
          bot.say(event.target, "Commands: !invoke <template> [name] | !destroy <name> | !list | !model <name> <model> | !models | !templates | !help");
          break;
        }
      }
    } catch (err) {
      const msg = err instanceof Error ? err.message : String(err);
      bot.say(event.target, `Error: ${msg}`);
      log(`Error handling command "${text}": ${msg}`);
    }
  });

  bot.on("close", () => {
    log("Disconnected. Reconnecting in 5s...");
    setTimeout(() => {
      bot.connect({
        host: config.server,
        port: config.port,
        nick: config.nick,
      });
    }, 5000);
  });

  // Graceful shutdown
  const shutdown = async () => {
    log("Shutting down, stopping all agents...");
    await stopAllAgents();
    bot.quit("Overseer shutting down");
    process.exit(0);
  };

  process.on("SIGINT", shutdown);
  process.on("SIGTERM", shutdown);

  log("Overseer started. Waiting for commands...");
}
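The overseer's dispatch loop reduces to: trim, require a leading `!`, split on whitespace, lowercase the first token. That parsing step as a standalone sketch (`parseCommand` is an illustrative helper, not a fireclaw export):

```typescript
// Mirrors the message handler: non-"!" lines are ignored, the first
// token picks the switch case, the remaining tokens are arguments.
function parseCommand(text: string): { cmd: string; args: string[] } | null {
  const t = text.trim();
  if (!t.startsWith("!")) return null;
  const parts = t.split(/\s+/);
  return { cmd: parts[0].toLowerCase(), args: parts.slice(1) };
}

const parsed = parseCommand("  !Invoke worker bob ");
console.log(JSON.stringify(parsed));
// → {"cmd":"!invoke","args":["worker","bob"]}
console.log(parseCommand("hello overseer")); // → null
```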
98 src/rootfs.ts Normal file
@@ -0,0 +1,98 @@
import { execFileSync } from "node:child_process";
import {
  existsSync,
  copyFileSync,
  mkdirSync,
  unlinkSync,
} from "node:fs";
import { join } from "node:path";
import { randomBytes } from "node:crypto";
import { CONFIG } from "./config.js";

export function ensureBaseImage() {
  if (!existsSync(CONFIG.baseRootfs)) {
    throw new Error(
      `Base rootfs not found at ${CONFIG.baseRootfs}. Run 'fireclaw setup' first.`
    );
  }
  if (!existsSync(CONFIG.kernelPath)) {
    throw new Error(
      `Kernel not found at ${CONFIG.kernelPath}. Run 'fireclaw setup' first.`
    );
  }
}

export function ensureSshKeypair() {
  if (!existsSync(CONFIG.sshKeyPath)) {
    execFileSync("ssh-keygen", [
      "-t",
      "ed25519",
      "-f",
      CONFIG.sshKeyPath,
      "-N",
      "",
      "-C",
      "fireclaw",
    ]);
  }
}

export function createRunCopy(vmId: string): string {
  mkdirSync(CONFIG.runsDir, { recursive: true });
  const dest = join(CONFIG.runsDir, `${vmId}.ext4`);
  copyFileSync(CONFIG.baseRootfs, dest);
  return dest;
}

export function injectSshKey(rootfsPath: string) {
  const mountPoint = `/tmp/fireclaw-mount-${randomBytes(4).toString("hex")}`;
  mkdirSync(mountPoint, { recursive: true });

  try {
    execFileSync("sudo", ["mount", "-o", "loop", rootfsPath, mountPoint], {
      stdio: "pipe",
    });

    execFileSync("sudo", ["mkdir", "-p", join(mountPoint, "root/.ssh")], {
      stdio: "pipe",
    });
    execFileSync(
      "sudo",
      [
        "cp",
        CONFIG.sshPubKeyPath,
        join(mountPoint, "root/.ssh/authorized_keys"),
      ],
      { stdio: "pipe" }
    );
    execFileSync(
      "sudo",
      ["chmod", "600", join(mountPoint, "root/.ssh/authorized_keys")],
      { stdio: "pipe" }
    );
    execFileSync(
      "sudo",
      ["chmod", "700", join(mountPoint, "root/.ssh")],
      { stdio: "pipe" }
    );
  } finally {
    try {
      execFileSync("sudo", ["umount", mountPoint], { stdio: "pipe" });
    } catch {
      // Best effort
    }
    try {
      execFileSync("rmdir", [mountPoint], { stdio: "pipe" });
    } catch {
      // Best effort
    }
  }
}

export function deleteRunCopy(rootfsPath: string) {
  try {
    unlinkSync(rootfsPath);
  } catch {
    // Already gone
  }
}
117 src/setup.ts Normal file
@@ -0,0 +1,117 @@
import { execFileSync, execSync } from "node:child_process";
import { existsSync, mkdirSync } from "node:fs";
import { CONFIG } from "./config.js";
import { ensureBridge, ensureNat } from "./network.js";
import { ensureSshKeypair } from "./rootfs.js";

function log(msg: string) {
  process.stderr.write(`[setup] ${msg}\n`);
}

function download(url: string, dest: string) {
  execFileSync("curl", ["-fSL", "-o", dest, url], {
    stdio: ["pipe", "pipe", "inherit"],
    timeout: 300_000,
  });
}

export async function runSetup() {
  log("Setting up fireclaw...");

  // Create directories
  mkdirSync(CONFIG.baseDir, { recursive: true });
  mkdirSync(CONFIG.runsDir, { recursive: true });
  mkdirSync(CONFIG.socketDir, { recursive: true });

  // Download kernel
  if (existsSync(CONFIG.kernelPath)) {
    log("Kernel already exists, skipping download.");
  } else {
    log("Downloading kernel...");
    download(CONFIG.assets.kernelUrl, CONFIG.kernelPath);
    log("Kernel downloaded.");
  }

  // Download and convert rootfs
  if (existsSync(CONFIG.baseRootfs)) {
    log("Base rootfs already exists, skipping download.");
  } else {
    log("Downloading rootfs...");

    // Find latest rootfs key from S3 listing
    const listing = execFileSync(
      "curl",
      ["-fsSL", CONFIG.assets.rootfsListUrl],
      { encoding: "utf-8", timeout: 30_000 }
    );
    const keys = [...listing.matchAll(/<Key>([^<]+)<\/Key>/g)].map(
      (m) => m[1]
    );
    const rootfsKey = keys.sort().pop();
    if (!rootfsKey) throw new Error("Could not find rootfs in S3 listing");

    const squashfsPath = `${CONFIG.baseDir}/rootfs.squashfs`;
    download(`${CONFIG.assets.rootfsBaseUrl}/${rootfsKey}`, squashfsPath);
    log("Rootfs downloaded. Converting squashfs to ext4...");

    // Convert squashfs to ext4
    const squashMount = "/tmp/fireclaw-squash";
    const ext4Mount = "/tmp/fireclaw-ext4";
    mkdirSync(squashMount, { recursive: true });
    mkdirSync(ext4Mount, { recursive: true });

    try {
      execFileSync(
        "sudo",
        ["mount", "-t", "squashfs", squashfsPath, squashMount],
        { stdio: "pipe" }
      );
      execFileSync("truncate", ["-s", "1G", CONFIG.baseRootfs], {
        stdio: "pipe",
      });
      execFileSync("sudo", ["/usr/sbin/mkfs.ext4", CONFIG.baseRootfs], {
        stdio: "pipe",
      });
      execFileSync("sudo", ["mount", CONFIG.baseRootfs, ext4Mount], {
        stdio: "pipe",
      });
      execFileSync("sudo", ["cp", "-a", `${squashMount}/.`, ext4Mount], {
        stdio: "pipe",
      });

      // Bake in DNS config
      execSync(
        `echo "nameserver 8.8.8.8" | sudo tee ${ext4Mount}/etc/resolv.conf > /dev/null`
      );

      log("Rootfs converted.");
    } finally {
      try {
        execFileSync("sudo", ["umount", squashMount], { stdio: "pipe" });
      } catch {}
      try {
        execFileSync("sudo", ["umount", ext4Mount], { stdio: "pipe" });
      } catch {}
      try {
        execFileSync("rmdir", [squashMount], { stdio: "pipe" });
      } catch {}
      try {
        execFileSync("rmdir", [ext4Mount], { stdio: "pipe" });
      } catch {}
      try {
        execFileSync("rm", ["-f", squashfsPath], { stdio: "pipe" });
      } catch {}
    }
  }

  // Generate SSH keypair
  log("Ensuring SSH keypair...");
  ensureSshKeypair();

  // Set up bridge and NAT
  log("Setting up network bridge...");
  ensureBridge();
  ensureNat();

  log("Setup complete! Run 'fireclaw run \"uname -a\"' to test.");
}
128 src/snapshot.ts Normal file
@@ -0,0 +1,128 @@
import { spawn, type ChildProcess } from "node:child_process";
import { existsSync, mkdirSync, copyFileSync, unlinkSync } from "node:fs";
import { join } from "node:path";
import { CONFIG } from "./config.js";
import * as api from "./firecracker-api.js";
import {
  ensureBridge,
  ensureNat,
  createTap,
  deleteTap,
  macFromOctet,
} from "./network.js";
import {
  ensureBaseImage,
  ensureSshKeypair,
  createRunCopy,
  injectSshKey,
} from "./rootfs.js";
import { waitForSsh } from "./ssh.js";

function log(msg: string) {
  process.stderr.write(`[snapshot] ${msg}\n`);
}

function waitForSocket(socketPath: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + 5_000;
    const check = () => {
      if (existsSync(socketPath)) {
        setTimeout(resolve, 200);
        return;
      }
      if (Date.now() > deadline) {
        reject(new Error("Firecracker socket did not appear"));
        return;
      }
      setTimeout(check, 50);
    };
    check();
  });
}

export function snapshotExists(): boolean {
  return (
    existsSync(CONFIG.snapshot.statePath) &&
    existsSync(CONFIG.snapshot.memPath) &&
    existsSync(CONFIG.snapshot.rootfsPath)
  );
}

export async function createSnapshot() {
  ensureBaseImage();
  ensureSshKeypair();

  const snap = CONFIG.snapshot;
  const socketPath = join(CONFIG.socketDir, "snapshot.sock");

  log("Preparing snapshot rootfs...");
  mkdirSync(CONFIG.socketDir, { recursive: true });
  copyFileSync(CONFIG.baseRootfs, snap.rootfsPath);
  injectSshKey(snap.rootfsPath);

  log("Setting up network...");
  ensureBridge();
  ensureNat();
  createTap(snap.tapDevice);

  let proc: ChildProcess | null = null;

  try {
    log("Booting VM for snapshot...");
    proc = spawn(CONFIG.firecrackerBin, ["--api-sock", socketPath], {
      stdio: "pipe",
      detached: false,
    });

    await waitForSocket(socketPath);

    const bootArgs = [
      "console=ttyS0",
      "reboot=k",
      "panic=1",
      "pci=off",
      "root=/dev/vda",
      "rw",
      `ip=${snap.ip}::${CONFIG.bridge.gateway}:${CONFIG.bridge.netmask}::eth0:off`,
    ].join(" ");

    await api.putBootSource(socketPath, CONFIG.kernelPath, bootArgs);
    await api.putDrive(socketPath, "rootfs", snap.rootfsPath);
    await api.putNetworkInterface(
      socketPath,
      "eth0",
      snap.tapDevice,
      macFromOctet(snap.octet)
    );
    await api.putMachineConfig(
      socketPath,
      CONFIG.vm.vcpuCount,
      CONFIG.vm.memSizeMib
    );
    await api.startInstance(socketPath);

    log("Waiting for SSH...");
    await waitForSsh(snap.ip);

    log("Pausing VM...");
    await api.patchVm(socketPath, "Paused");

    log("Creating snapshot...");
    await api.putSnapshotCreate(socketPath, snap.statePath, snap.memPath);

    log("Snapshot created successfully.");
    log(`  State: ${snap.statePath}`);
    log(`  Memory: ${snap.memPath}`);
    log(`  Rootfs: ${snap.rootfsPath}`);
  } finally {
    if (proc && !proc.killed) {
      proc.kill("SIGKILL");
    }
    try {
      unlinkSync(socketPath);
    } catch {}
    deleteTap(snap.tapDevice);
  }
}
107 src/ssh.ts Normal file
@@ -0,0 +1,107 @@
import { Client } from "ssh2";
import { readFileSync } from "node:fs";
import { createConnection } from "node:net";
import { CONFIG } from "./config.js";
import type { RunResult } from "./types.js";

export function waitForSsh(
  host: string,
  port = 22,
  timeoutMs = CONFIG.vm.bootTimeoutMs
): Promise<void> {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;

    function attempt() {
      if (Date.now() > deadline) {
        reject(new Error(`SSH not ready after ${timeoutMs}ms`));
        return;
      }

      const sock = createConnection({ host, port, timeout: 500 });

      sock.on("connect", () => {
        sock.destroy();
        resolve();
      });

      sock.on("error", () => {
        sock.destroy();
        setTimeout(attempt, CONFIG.vm.sshPollIntervalMs);
      });

      sock.on("timeout", () => {
        sock.destroy();
        setTimeout(attempt, CONFIG.vm.sshPollIntervalMs);
      });
    }

    attempt();
  });
}

export function execCommand(
  host: string,
  command: string,
  timeoutMs: number,
  verbose: boolean
): Promise<RunResult> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();
    const privateKey = readFileSync(CONFIG.sshKeyPath);
    const conn = new Client();

    const timer = setTimeout(() => {
      conn.end();
      reject(new Error(`Command timed out after ${timeoutMs}ms`));
    }, timeoutMs);

    conn.on("ready", () => {
      conn.exec(command, (err, stream) => {
        if (err) {
          clearTimeout(timer);
          conn.end();
          reject(err);
          return;
        }

        const stdoutChunks: Buffer[] = [];
        const stderrChunks: Buffer[] = [];

        stream.on("data", (data: Buffer) => {
          stdoutChunks.push(data);
          if (verbose) process.stdout.write(data);
        });

        stream.stderr.on("data", (data: Buffer) => {
          stderrChunks.push(data);
          if (verbose) process.stderr.write(data);
        });

        stream.on("close", (code: number | null) => {
          clearTimeout(timer);
          conn.end();
          resolve({
            exitCode: code ?? 1,
            stdout: Buffer.concat(stdoutChunks).toString(),
            stderr: Buffer.concat(stderrChunks).toString(),
            durationMs: Date.now() - startTime,
          });
        });
      });
    });

    conn.on("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });

    conn.connect({
      host,
      port: 22,
      username: "root",
      privateKey,
      hostVerifier: () => true,
    });
  });
}
24 src/types.ts Normal file
@@ -0,0 +1,24 @@
export interface VMConfig {
  id: string;
  guestIp: string;
  tapDevice: string;
  socketPath: string;
  rootfsPath: string;
  timeoutMs: number;
  verbose: boolean;
}

export interface RunResult {
  exitCode: number;
  stdout: string;
  stderr: string;
  durationMs: number;
}

export interface RunOptions {
  timeout?: number;
  verbose?: boolean;
  mem?: number;
  vcpu?: number;
  noSnapshot?: boolean;
}
288 src/vm.ts Normal file
@@ -0,0 +1,288 @@
import { spawn, type ChildProcess } from "node:child_process";
import { existsSync, mkdirSync } from "node:fs";
import { join } from "node:path";
import { randomBytes } from "node:crypto";
import { CONFIG } from "./config.js";
import type { VMConfig, RunResult, RunOptions } from "./types.js";
import * as api from "./firecracker-api.js";
import {
  ensureBridge,
  ensureNat,
  allocateIp,
  releaseIp,
  createTap,
  deleteTap,
  macFromOctet,
} from "./network.js";
import {
  ensureBaseImage,
  ensureSshKeypair,
  createRunCopy,
  injectSshKey,
  deleteRunCopy,
} from "./rootfs.js";
import { waitForSsh, execCommand } from "./ssh.js";
import { registerVm, unregisterVm } from "./cleanup.js";
import { snapshotExists } from "./snapshot.js";

function log(verbose: boolean, msg: string) {
  if (verbose) process.stderr.write(`[fireclaw] ${msg}\n`);
}

export class VMInstance {
  private config: VMConfig;
  private process: ChildProcess | null = null;
  private octet = 0;

  constructor(config: VMConfig) {
    this.config = config;
  }

  static async run(
    command: string,
    opts: RunOptions = {}
  ): Promise<RunResult> {
    // Try the snapshot path first unless disabled
    if (!opts.noSnapshot && snapshotExists()) {
      return VMInstance.runFromSnapshot(command, opts);
    }
    return VMInstance.runColdBoot(command, opts);
  }

  private static async runFromSnapshot(
    command: string,
    opts: RunOptions
  ): Promise<RunResult> {
    const id = `fc-snap-${randomBytes(3).toString("hex")}`;
    const verbose = opts.verbose ?? false;
    const timeoutMs = opts.timeout ?? CONFIG.vm.defaultTimeoutMs;
    const snap = CONFIG.snapshot;

    mkdirSync(CONFIG.socketDir, { recursive: true });

    const config: VMConfig = {
      id,
      guestIp: snap.ip,
      tapDevice: snap.tapDevice,
      socketPath: join(CONFIG.socketDir, `${id}.sock`),
      rootfsPath: "", // shared, not per-run
      timeoutMs,
      verbose,
    };

    const vm = new VMInstance(config);
    vm.octet = 0; // no IP pool allocation for snapshot runs
    registerVm(vm);

    try {
      log(verbose, `VM ${id}: restoring from snapshot...`);
      ensureBridge();
      ensureNat();
      createTap(snap.tapDevice);

      // Spawn firecracker and load the snapshot
      vm.process = spawn(
        CONFIG.firecrackerBin,
        ["--api-sock", config.socketPath],
        { stdio: "pipe", detached: false }
      );
      vm.process.on("error", (err) => {
        log(verbose, `Firecracker process error: ${err.message}`);
      });

      await vm.waitForSocket();
      await api.putSnapshotLoad(
        config.socketPath,
        snap.statePath,
        snap.memPath
      );
      await api.patchVm(config.socketPath, "Resumed");

      log(verbose, `VM ${id}: resumed, waiting for SSH...`);
      await waitForSsh(snap.ip);

      log(verbose, `VM ${id}: executing command...`);
      const result = await execCommand(snap.ip, command, timeoutMs, verbose);

      log(
        verbose,
        `VM ${id}: done (exit=${result.exitCode}, ${result.durationMs}ms)`
      );
      return result;
    } finally {
      await vm.destroy();
      unregisterVm(vm);
    }
  }

  private static async runColdBoot(
    command: string,
    opts: RunOptions
  ): Promise<RunResult> {
    const id = `fc-${randomBytes(3).toString("hex")}`;
    const verbose = opts.verbose ?? false;
    const timeoutMs = opts.timeout ?? CONFIG.vm.defaultTimeoutMs;

    // Pre-flight checks
    ensureBaseImage();
    ensureSshKeypair();

    // Allocate resources
    const { ip, octet } = allocateIp();
    const tapDevice = `fctap${octet}`;

    mkdirSync(CONFIG.socketDir, { recursive: true });

    const config: VMConfig = {
      id,
      guestIp: ip,
      tapDevice,
      socketPath: join(CONFIG.socketDir, `${id}.sock`),
      rootfsPath: "",
      timeoutMs,
      verbose,
    };

    const vm = new VMInstance(config);
    vm.octet = octet;
    registerVm(vm);

    try {
      log(verbose, `VM ${id}: preparing rootfs...`);
      config.rootfsPath = createRunCopy(id);
      injectSshKey(config.rootfsPath);

      log(verbose, `VM ${id}: creating tap ${tapDevice}...`);
      ensureBridge();
      ensureNat();
      createTap(tapDevice);

      log(verbose, `VM ${id}: booting...`);
      await vm.boot(opts);

      log(verbose, `VM ${id}: waiting for SSH at ${ip}...`);
      await waitForSsh(ip);

      log(verbose, `VM ${id}: executing command...`);
      const result = await execCommand(ip, command, timeoutMs, verbose);

      log(
        verbose,
        `VM ${id}: done (exit=${result.exitCode}, ${result.durationMs}ms)`
      );
      return result;
    } finally {
      await vm.destroy();
      unregisterVm(vm);
    }
  }

  private async boot(opts: RunOptions) {
    const { config } = this;
    const vcpu = opts.vcpu ?? CONFIG.vm.vcpuCount;
    const mem = opts.mem ?? CONFIG.vm.memSizeMib;

    // Spawn firecracker
    this.process = spawn(
      CONFIG.firecrackerBin,
      ["--api-sock", config.socketPath],
      {
        stdio: "pipe",
        detached: false,
      }
    );

    this.process.on("error", (err) => {
      log(config.verbose, `Firecracker process error: ${err.message}`);
    });

    // Wait for the API socket
    await this.waitForSocket();

    // Configure via the API
    const bootArgs = [
      "console=ttyS0",
      "reboot=k",
      "panic=1",
      "pci=off",
      "root=/dev/vda",
      "rw",
      `ip=${config.guestIp}::${CONFIG.bridge.gateway}:${CONFIG.bridge.netmask}::eth0:off`,
    ].join(" ");

    await api.putBootSource(config.socketPath, CONFIG.kernelPath, bootArgs);
    await api.putDrive(config.socketPath, "rootfs", config.rootfsPath);
    await api.putNetworkInterface(
      config.socketPath,
      "eth0",
      config.tapDevice,
      macFromOctet(this.octet)
    );
    await api.putMachineConfig(config.socketPath, vcpu, mem);
    await api.startInstance(config.socketPath);
  }

  private waitForSocket(): Promise<void> {
    const socketPath = this.config.socketPath;
    return new Promise((resolve, reject) => {
      const deadline = Date.now() + 5_000;

      const check = () => {
        if (existsSync(socketPath)) {
          setTimeout(resolve, 200);
          return;
        }
        if (Date.now() > deadline) {
          reject(new Error("Firecracker socket did not appear"));
          return;
        }
        setTimeout(check, 50);
      };

      check();
    });
  }

  async destroy() {
    const { config } = this;
    log(config.verbose, `VM ${config.id}: cleaning up...`);

    // Kill firecracker
    if (this.process && !this.process.killed) {
      this.process.kill("SIGTERM");
      await new Promise<void>((resolve) => {
        const timer = setTimeout(() => {
          if (this.process && !this.process.killed) {
            this.process.kill("SIGKILL");
          }
          resolve();
        }, 2_000);
        this.process!.on("exit", () => {
          clearTimeout(timer);
          resolve();
        });
      });
    }

    // Clean up the socket
    try {
      const { unlinkSync } = await import("node:fs");
      unlinkSync(config.socketPath);
    } catch {
      // Already gone
    }

    // Clean up the tap device
    deleteTap(config.tapDevice);

    // Release the IP (skip for snapshot runs, which don't allocate from the pool)
    if (this.octet > 0) {
      releaseIp(this.octet);
    }

    // Delete the rootfs copy (skip for snapshot runs, which share a rootfs)
    if (config.rootfsPath) {
      deleteRunCopy(config.rootfsPath);
    }
  }
}
305 tests/test-suite.sh Executable file
@@ -0,0 +1,305 @@
#!/bin/bash
# Fireclaw regression test suite
# Requires: overseer running, no agents active
# Usage: ./tests/test-suite.sh

set -uo pipefail

PASS=0
FAIL=0
SKIP=0

irc_cmd() {
  local wait=${1:-3}
  shift
  {
    echo -e "NICK fctest\r\nUSER fctest 0 * :test\r\n"
    sleep 2
    echo -e "JOIN #agents\r\nJOIN #dev\r\n"
    sleep 1
    for cmd in "$@"; do
      echo -e "PRIVMSG #agents :${cmd}\r\n"
      sleep "$wait"
    done
    echo -e "QUIT\r\n"
  } | nc -q 2 127.0.0.1 6667 2>&1
}

assert_contains() {
  local output="$1"
  local expected="$2"
  local test_name="$3"
  if echo "$output" | grep -q "$expected"; then
    echo " PASS: $test_name"
    ((PASS++))
  else
    echo " FAIL: $test_name (expected: $expected)"
    ((FAIL++))
  fi
}

assert_not_contains() {
  local output="$1"
  local expected="$2"
  local test_name="$3"
  if echo "$output" | grep -q "$expected"; then
    echo " FAIL: $test_name (unexpected: $expected)"
    ((FAIL++))
  else
    echo " PASS: $test_name"
    ((PASS++))
  fi
}

cleanup_agent() {
  local name="$1"
  fireclaw agent stop "$name" 2>/dev/null || true
  sleep 1
  # Clean any leaked taps
  for tap in $(ip link show 2>/dev/null | grep -oP 'fctap\d+' | sort -u); do
    if [ "$tap" != "fctap200" ]; then
      # Check whether the tap is used by a running agent
      if ! fireclaw agent list 2>/dev/null | grep -q "$tap"; then
        sudo ip tuntap del "$tap" mode tap 2>/dev/null || true
      fi
    fi
  done
}

echo "========================================="
echo "Fireclaw Regression Test Suite"
echo "========================================="
echo ""

# Precondition: check the overseer is running
echo "[Pre] Checking overseer..."
OUT=$(irc_cmd 2 "!help")
if echo "$OUT" | grep -q "overseer.*PRIVMSG.*Commands:"; then
  echo " OK: Overseer is running"
else
  echo " ERROR: Overseer not running. Start with: fireclaw overseer"
  exit 1
fi

# Clean any leftover agents
echo "[Pre] Cleaning leftover agents..."
for agent in $(fireclaw agent list 2>/dev/null | awk '{print $1}'); do
  cleanup_agent "$agent"
done
echo ""

# ==========================================
echo "--- Test 1: Overseer help command ---"
OUT=$(irc_cmd 2 "!help")
assert_contains "$OUT" "Commands:" "!help returns command list"

echo "--- Test 2: Overseer templates ---"
OUT=$(irc_cmd 2 "!templates")
assert_contains "$OUT" "worker" "templates includes worker"
assert_contains "$OUT" "coder" "templates includes coder"
assert_contains "$OUT" "researcher" "templates includes researcher"

echo "--- Test 3: Empty agent list ---"
OUT=$(irc_cmd 2 "!list")
assert_contains "$OUT" "No agents running" "no agents when clean"

echo "--- Test 4: Invalid template ---"
OUT=$(irc_cmd 2 "!invoke faketype")
assert_contains "$OUT" "not found" "invalid template rejected"

echo "--- Test 5: Missing arguments ---"
OUT=$(irc_cmd 2 "!invoke" "!destroy" "!model")
assert_contains "$OUT" "Usage:" "!invoke usage shown"

echo "--- Test 6: Destroy nonexistent ---"
OUT=$(irc_cmd 2 "!destroy ghost")
assert_contains "$OUT" "not running" "destroy nonexistent handled"

echo ""
echo "--- Test 7: Spawn worker agent ---"
OUT=$(irc_cmd 8 "!invoke worker")
assert_contains "$OUT" "Agent \"worker\" started" "worker spawned"

# Verify the agent is running
AGENTS=$(fireclaw agent list 2>&1)
assert_contains "$AGENTS" "worker" "worker in agent list"

echo "--- Test 8: Worker appears in IRC ---"
OUT=$({
  echo -e "NICK fctest2\r\nUSER fctest2 0 * :t\r\n"
  sleep 2
  echo -e "JOIN #agents\r\n"
  sleep 1
  echo -e "NAMES #agents\r\n"
  sleep 1
  echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 2>&1)
assert_contains "$OUT" "worker" "worker in NAMES list"

echo "--- Test 9: Worker responds to mention ---"
OUT=$({
  echo -e "NICK fctest3\r\nUSER fctest3 0 * :t\r\n"
  sleep 2
  echo -e "JOIN #agents\r\n"
  sleep 1
  echo -e "PRIVMSG #agents :worker: say pong\r\n"
  sleep 25
  echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 2>&1)
assert_contains "$OUT" ":worker.*PRIVMSG" "worker responded"

echo "--- Test 10: Worker ignores non-mention ---"
OUT=$({
  echo -e "NICK fctest4\r\nUSER fctest4 0 * :t\r\n"
  sleep 2
  echo -e "JOIN #agents\r\n"
  sleep 1
  echo -e "PRIVMSG #agents :hello everyone\r\n"
  sleep 10
  echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 2>&1)
assert_not_contains "$OUT" ":worker.*PRIVMSG" "worker stays quiet on non-mention"

echo "--- Test 11: Duplicate prevention ---"
OUT=$(irc_cmd 3 "!invoke worker")
assert_contains "$OUT" "already running" "duplicate rejected"

echo "--- Test 12: Named agent from template ---"
OUT=$(irc_cmd 8 "!invoke worker helper2")
assert_contains "$OUT" 'Agent "helper2" started' "named agent spawned"
sleep 2

echo "--- Test 13: Multiple agents listed ---"
OUT=$(irc_cmd 3 "!list")
assert_contains "$OUT" "worker" "worker in list"
assert_contains "$OUT" "helper2" "helper2 in list"

echo "--- Test 14: Destroy specific agent ---"
OUT=$(irc_cmd 3 "!destroy helper2")
assert_contains "$OUT" 'Agent "helper2" destroyed' "helper2 destroyed"

# Verify only the worker remains
sleep 2
OUT=$({
  echo -e "NICK fcverify\r\nUSER fcverify 0 * :t\r\n"
  sleep 2
  echo -e "JOIN #agents\r\n"
  sleep 1
  echo -e "PRIVMSG #agents :!list\r\n"
  sleep 3
  echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 2>&1 | grep ":overseer.*PRIVMSG.*#agents")
assert_contains "$OUT" "worker" "worker still running"
# Check that helper2 doesn't appear in the list output (exclude the destroy confirmation)
LIST_LINE=$(echo "$OUT" | grep -v "destroyed" | grep -v "Invoking" || true)
assert_not_contains "$LIST_LINE" "helper2" "helper2 removed from list"
cleanup_agent "helper2"

echo "--- Test 15: Destroy worker ---"
OUT=$(irc_cmd 3 "!destroy worker")
assert_contains "$OUT" 'Agent "worker" destroyed' "worker destroyed"
cleanup_agent "worker"

echo "--- Test 16: Clean after destroy ---"
OUT=$(irc_cmd 2 "!list")
assert_contains "$OUT" "No agents running" "all agents cleaned"

echo ""
echo "--- Test 17: CLI agent start/list/stop ---"
fireclaw agent start worker 2>&1 | grep -q "started" && echo " PASS: CLI agent start" && ((PASS++)) || { echo " FAIL: CLI agent start"; ((FAIL++)); }
sleep 3
fireclaw agent list 2>&1 | grep -q "worker" && echo " PASS: CLI agent list" && ((PASS++)) || { echo " FAIL: CLI agent list"; ((FAIL++)); }
fireclaw agent stop worker 2>&1 | grep -q "stopped" && echo " PASS: CLI agent stop" && ((PASS++)) || { echo " FAIL: CLI agent stop"; ((FAIL++)); }
cleanup_agent "worker"

echo "--- Test 18: Ephemeral run still works ---"
OUT=$(fireclaw run "echo ephemeral-test" 2>&1)
assert_contains "$OUT" "ephemeral-test" "fireclaw run works"

echo ""
echo "--- Test 19: Overseer crash recovery ---"
# Spawn a worker
fireclaw agent start worker 2>&1 | grep -q "started" && echo " PASS: worker started for crash test" && ((PASS++)) || { echo " FAIL: start worker for crash test"; ((FAIL++)); }
sleep 5
WORKER_PID=$(python3 -c "import json; d=json.load(open('$HOME/.fireclaw/agents.json')); print(d.get('worker',{}).get('pid',''))" 2>/dev/null)
# SIGKILL the overseer (simulates a crash; KillMode=process keeps the worker alive)
OVERSEER_PID=$(systemctl show fireclaw-overseer -p MainPID --value 2>/dev/null)
if [ -n "$OVERSEER_PID" ] && [ "$OVERSEER_PID" != "0" ]; then
  sudo kill -9 "$OVERSEER_PID" 2>/dev/null
  sleep 8 # Wait for systemd to auto-restart (RestartSec=5 + buffer)
  # Check the worker survived
  if kill -0 "$WORKER_PID" 2>/dev/null; then
    echo " PASS: worker survived overseer crash" && ((PASS++))
  else
    echo " FAIL: worker died with overseer" && ((FAIL++))
  fi
  # Check the overseer restarted and adopted
  NEW_PID=$(systemctl show fireclaw-overseer -p MainPID --value 2>/dev/null)
  if [ -n "$NEW_PID" ] && [ "$NEW_PID" != "0" ] && [ "$NEW_PID" != "$OVERSEER_PID" ]; then
    echo " PASS: overseer auto-restarted" && ((PASS++))
  else
    echo " FAIL: overseer did not restart" && ((FAIL++))
  fi
  # Check adoption via !list
  sleep 2
  OUT=$({
    echo -e "NICK fcrecov\r\nUSER fcrecov 0 * :t\r\n"
    sleep 2
    echo -e "JOIN #agents\r\n"
    sleep 1
    echo -e "PRIVMSG #agents :!list\r\n"
    sleep 3
    echo -e "QUIT\r\n"
  } | nc -q 2 127.0.0.1 6667 2>&1)
  assert_contains "$OUT" "worker" "overseer adopted worker after crash"
  # Cleanup
  OUT2=$({
    echo -e "NICK fcrecov2\r\nUSER fcrecov2 0 * :t\r\n"
    sleep 2
    echo -e "JOIN #agents\r\n"
    sleep 1
    echo -e "PRIVMSG #agents :!destroy worker\r\n"
    sleep 5
    echo -e "QUIT\r\n"
  } | nc -q 2 127.0.0.1 6667 2>&1)
else
  echo " SKIP: overseer not running via systemd, skipping crash test"
  # Three in-branch checks skipped: worker survival, overseer restart, adoption
  ((SKIP += 3))
  cleanup_agent "worker"
fi

echo ""
echo "--- Test 20: Graceful agent shutdown (IRC QUIT) ---"
# Spawn and destroy, checking for a clean QUIT
{
  echo -e "NICK fcquit\r\nUSER fcquit 0 * :t\r\n"
  sleep 2
  echo -e "JOIN #agents\r\n"
  sleep 1
  echo -e "PRIVMSG #agents :!invoke worker\r\n"
  sleep 8
  echo -e "PRIVMSG #agents :!destroy worker\r\n"
  sleep 5
  echo -e "QUIT\r\n"
} | nc -q 2 127.0.0.1 6667 > /tmp/fc-quit-test.txt 2>&1
if grep -q "QUIT.*shutting down" /tmp/fc-quit-test.txt; then
  echo " PASS: agent sent IRC QUIT on destroy" && ((PASS++))
else
  echo " FAIL: agent did not send IRC QUIT" && ((FAIL++))
fi
rm -f /tmp/fc-quit-test.txt
cleanup_agent "worker"

echo ""
echo "========================================="
echo "Results: $PASS passed, $FAIL failed, $SKIP skipped"
echo "========================================="

[ "$FAIL" -eq 0 ] && exit 0 || exit 1
16 tsconfig.json Normal file
@@ -0,0 +1,16 @@
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true,
    "declaration": true,
    "sourceMap": true
  },
  "include": ["src/**/*"]
}