Compare commits
37 Commits
| Author | SHA1 | Date |
|---|---|---|
| | abc91bc149 | |
| | c827d341ab | |
| | 6c4ad47b09 | |
| | 5b312e34de | |
| | 6673210ff0 | |
| | d6a737fbad | |
| | fdfa468960 | |
| | ca224a0ae9 | |
| | 383113126f | |
| | 2d371ceb86 | |
| | d838fe08cf | |
| | deca7228c7 | |
| | e685a2a7ba | |
| | f4643b8c59 | |
| | 0ab04e1964 | |
| | 100bb98e62 | |
| | fe162d11f7 | |
| | 96dfa63c39 | |
| | 74e79f2870 | |
| | d9b695d5a0 | |
| | c363f45ffc | |
| | 426ca8f1c1 | |
| | e1f1a24a37 | |
| | 185cda575e | |
| | 9f624e9497 | |
| | 3083b5d9d7 | |
| | 3c00de75d1 | |
| | 9a06bbf5ea | |
| | a604d73340 | |
| | 2d42d498b3 | |
| | 4483b585a7 | |
| | 42870c7c1f | |
| | 590c88ecef | |
| | d3ed3619c2 | |
| | d299e394f0 | |
| | 2e5912e73c | |
| | 27cb6508dc | |
37
IDEAS.md
@@ -16,6 +16,30 @@ View or live-edit an agent's persona via IRC. "Make the worker more sarcastic" w

### !pause / !resume <agent>

Temporarily mute an agent without destroying it. Agent stays alive but stops responding. Useful when you need a channel to yourself.

## Skill System (inspired by mitsuhiko/agent-stuff)

### SKILL.md pattern for agent tools

Replace hardcoded tools in agent.py with a discoverable skill directory. Each skill is a folder with a SKILL.md (description, parameters, examples) and a script (run.sh/run.py).

```
~/.fireclaw/skills/
  web_search/
    SKILL.md    # name, description, parameters — parsed into tool definition
    run.py      # actual implementation
  fetch_url/
    SKILL.md
    run.py
  git_diff/
    SKILL.md
    run.sh
```

Agent discovers skills at boot, loads SKILL.md into Ollama tool definitions, invokes scripts on tool call. Adding a new tool = drop a folder. No agent.py changes needed.

Could also support per-template skill selection — coder gets git/code skills, researcher gets search/fetch skills, worker gets everything.

Reference: https://github.com/mitsuhiko/agent-stuff — Pi Coding Agent skill/extension architecture.
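The boot-time discovery step could be sketched as follows. For brevity the sketch stores the SKILL.md metadata as JSON; the real parser would read the markdown format described above, and `discover_skills` here is an illustrative function, not the actual implementation.

```python
import json
from pathlib import Path

def discover_skills(skill_dir):
    """Scan a skills directory: each subfolder with a SKILL.md and a
    run.py/run.sh becomes one Ollama tool definition.
    Sketch only — assumes SKILL.md holds JSON metadata like
    {"name": ..., "description": ..., "parameters": {...}}."""
    tools, scripts = [], {}
    for folder in sorted(Path(skill_dir).iterdir()):
        meta_file = folder / "SKILL.md"
        if not meta_file.is_file():
            continue
        meta = json.loads(meta_file.read_text())
        # Prefer run.py, fall back to run.sh
        script = next((folder / s for s in ("run.py", "run.sh")
                       if (folder / s).exists()), None)
        if script is None:
            continue
        tools.append({"type": "function", "function": meta})
        scripts[meta["name"]] = str(script)
    return tools, scripts
```

The returned `tools` list can go straight into the `tools` field of an Ollama `/api/chat` payload, while `scripts` maps tool names back to the executables to invoke.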
## Agent Tools

### Web search

@@ -110,6 +134,19 @@ One-VM-per-service is overkill for trusted MCP servers but could be used for unt

- **database** — SQLite or PostgreSQL query tool
- **fetch** — HTTP fetch + readability extraction

### Cron / scheduled agents

Add a `schedule` field to templates (cron syntax). Overseer checks every minute, spawns matching agents, they do their task, report to #agents, and self-destruct after a timeout. Use cases: daily health checks, backup verification, digest summaries.
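The once-a-minute check reduces to matching a 5-field cron expression against the current time. A minimal sketch (the function name is hypothetical; a real implementation also needs ranges, steps, and the cron 0=Sunday weekday convention):

```python
from datetime import datetime

def cron_matches(expr, now):
    """Match a 5-field cron expression (minute hour dom month dow)
    against a datetime. Supports '*', single numbers, and comma lists —
    not the full cron grammar. Note: uses Python's weekday() numbering
    (0=Monday), whereas real cron uses 0=Sunday."""
    fields = expr.split()
    values = [now.minute, now.hour, now.day, now.month, now.weekday()]
    for spec, val in zip(fields, values):
        if spec == "*":
            continue
        if str(val) not in spec.split(","):
            return False
    return True
```

The overseer would call this every 60 seconds for each template carrying a `schedule` field and spawn the agent on a match.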
## Logging

### Centralized log viewer

Agent logs go to /workspace/agent.log inside each VM. For a centralized web UI:

- rsyslog on host (agents send to 172.16.0.1:514) for aggregation
- frontail (`npx frontail /var/log/fireclaw/*.log --port 9001`) for browser-based real-time viewing
- or GoTTY (`gotty tail -f ...`) for a zero-config web terminal

Start simple (plain files + !logs), add rsyslog + frontail when needed.
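On the agent side, forwarding to the host's rsyslog needs nothing beyond the standard library — `SysLogHandler` speaks the syslog UDP protocol. A sketch, assuming the bridge address above; the logger naming is illustrative:

```python
import logging
from logging.handlers import SysLogHandler

def make_agent_logger(nick, host="172.16.0.1", port=514):
    """Forward agent log lines to the host's rsyslog over UDP, tagged
    with the agent nick so the host can split per-agent files.
    Sketch only — the real agent.py logs to /workspace/agent.log."""
    logger = logging.getLogger(f"fireclaw.{nick}")
    logger.setLevel(logging.INFO)
    handler = SysLogHandler(address=(host, port))  # UDP datagrams, fire-and-forget
    handler.setFormatter(logging.Formatter(f"agent-{nick}: %(message)s"))
    logger.addHandler(handler)
    return logger
```

An rsyslog template on the host can then route `agent-*` tags into `/var/log/fireclaw/` for frontail to tail.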
## Infrastructure

### Agent metrics dashboard
97
REPORT.md
Normal file
@@ -0,0 +1,97 @@
# Fireclaw Code Review Report

Generated 2026-04-08. Full codebase analysis.

## Critical

### 1. Shell injection in agent-manager.ts — FIXED

**File:** `src/agent-manager.ts:120-131`

Config JSON and persona text were interpolated directly into shell commands via `echo '${configJson}'`. If the persona contains single quotes, the shell breaks — or injects arbitrary commands.

**Fix:** Replaced with `tee` via stdin. No shell interpolation.

### 2. IP pool has no real locking — FIXED

**File:** `src/network.ts:202-222`

`openSync` created a lock file but never acquired an actual `flock`. Under concurrency, two agents could allocate the same IP.

**Fix:** Atomic writes via `writeFileSync` + `renameSync`. Removed the fake lock.

### 3. --no-snapshot flag broken — FALSE ALARM

**File:** `src/cli.ts:38`

Commander parses `--no-snapshot` as `{ snapshot: false }`. Verified: `opts.snapshot === false` is correct. No fix needed.

## High

### 4. SKILL.md parser fragile — FIXED

**File:** `agent/agent.py:69-138`

**Fix:** Added error logging on parse failures, flexible indent detection (2+ spaces), CRLF normalization, case-insensitive booleans (`true`/`True`/`yes`/`1`), and parameter type validation with warnings.

### 5. SSH host key verification disabled — DOCUMENTED

**Files:** `src/ssh.ts:104`, `src/agent-manager.ts`, `src/overseer.ts`

`hostVerifier: () => true` and `StrictHostKeyChecking=no` everywhere. Acceptable on the private bridge network (172.16.0.0/24) — VMs are ephemeral and host keys change on every boot. A conscious design decision, not an oversight.

### 6. Memory not fully reloaded after save_memory — FIXED

**File:** `agent/agent.py:319-326`

After save_memory, only the MEMORY.md index was reloaded. Individual memory files were not re-read into the system prompt.

**Fix:** Extracted a `reload_memory()` function that reloads the index plus all memory/*.md files.

## Medium

### 7. SSH options duplicated — FIXED

**Files:** `src/agent-manager.ts`, `src/overseer.ts`

The same SSH options array was repeated 4+ times.

**Fix:** Extracted an `SSH_OPTS` constant in both files.

### 8. Process termination inconsistent — OPEN (low risk)

**Files:** `src/firecracker-vm.ts:140-164`, `src/agent-manager.ts:319-332`

Two different implementations: the ChildProcess version uses SIGTERM→SIGKILL, the PID version uses polling. Both work, in different contexts (owned vs adopted processes).

### 9. killall python3 hardcoded — FIXED

**Files:** `src/agent-manager.ts`

**Fix:** Replaced `killall python3` with `pkill -f 'agent.py'`. Targets the specific script, not all python3 processes.

### 10. Test suite expects researcher template — FIXED

**File:** `tests/test-suite.sh:105`

The test asserts that a `researcher` template exists, but the install script only created worker, coder, and quick.

**Fix:** Added researcher and creative templates to `scripts/install.sh`.

### 11. Bare exception handlers — FIXED

**File:** `src/agent-manager.ts`

**Fix:** Added error logging to the cleanup catch blocks in listAgents and reconcileAgents. The remaining bare catches are intentional best-effort cleanup (umount, rmdir, unlink).

### 12. agent.py monolithic (598 lines) — OPEN

**File:** `agent/agent.py`

Handles IRC, skill discovery, tool execution, memory, and config reload in one file. Functional, but would benefit from splitting.

### 13. Unused writePool function — FIXED

**File:** `src/network.ts:198`

Left over after switching to `atomicWritePool`. Removed.

## Low

### 14. Hardcoded network interface fallback

**File:** `src/network.ts:56` — defaults to `"eno2"` if route parsing fails.

### 15. Predictable mount point names

**File:** `src/agent-manager.ts:94` — uses `Date.now()` instead of crypto random.

### 16. No Firecracker binary hash verification

**File:** `scripts/install.sh:115-124` — downloads the binary without a SHA256 check.

### 17. Ollama response size unbounded

**File:** `agent/agent.py:338` — `resp.read()` with no size limit.

### 18. IRC message splitting at 400 chars — FIXED

**File:** `agent/agent.py:266`

**Fix:** Reduced to 380 chars to stay within IRC's 512-byte line limit.

### 19. Thread safety on _last_response_time — FIXED

**File:** `agent/agent.py:485`

**Fix:** Added a `_cooldown_lock` (threading.Lock) around the cooldown check-and-set.
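The pattern behind this fix — making the read-compare-write on the cooldown timestamp atomic — can be sketched as below. The class and names are illustrative, not the exact agent.py code:

```python
import threading
import time

class Cooldown:
    """Thread-safe check-and-set around a response cooldown.
    Without the lock, two IRC threads could both read a stale
    last-response time and both decide to respond."""
    def __init__(self, seconds):
        self.seconds = seconds
        self._last = 0.0
        self._lock = threading.Lock()

    def try_acquire(self, now=None):
        """Return True (and record the time) if the cooldown elapsed;
        the lock makes the read-compare-write atomic."""
        now = time.monotonic() if now is None else now
        with self._lock:
            if now - self._last < self.seconds:
                return False
            self._last = now
            return True
```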
## Summary

| Status | Count |
|---|---|
| Fixed | 13 |
| False alarm | 1 |
| Open (medium) | 2 |
| Open (low) | 4 |
79
ROADMAP.md
@@ -19,47 +19,60 @@

- [x] ngircd configured (`nyx.fireclaw.local`, FireclawNet)
- [x] Channel layout: #control (overseer), #agents (common room), DMs, /invite
- [x] Ollama with 5 models (qwen2.5-coder, qwen2.5, llama3.1, gemma3, phi4-mini)
- [x] Ollama with 5+ models, hot-swappable per agent
- [x] Agent rootfs — Alpine + Python IRC bot + podman + tools
- [x] Agent manager — start/stop/list/reload long-running VMs
- [x] Overseer — host-side IRC bot, !invoke/!destroy/!list/!model/!templates
- [x] Overseer — !invoke, !destroy, !list, !model, !models, !templates, !persona, !status, !help
- [x] 5 agent templates — worker, coder, researcher, quick, creative
- [x] Agent tool access — shell commands + podman containers
- [x] Persistent workspace — 64 MiB ext4 as second virtio drive at /workspace
- [x] Agent memory system — MEMORY.md + save_memory tool, survives restarts
- [x] Agent hot-reload — SSH config update + SIGHUP, no VM restart
- [x] Non-root agents — unprivileged `agent` user
- [x] Agent-to-agent via IRC mentions, 10s cooldown
- [x] DM support — private messages without mention
- [x] /invite support — agents auto-join invited channels
- [x] Overseer resilience — crash recovery, agent adoption, KillMode=process
- [x] Graceful shutdown — SSH SIGTERM → IRC QUIT → kill VM
- [x] Systemd service — fireclaw-overseer.service
- [x] Regression test suite — 20 tests
- [x] Discoverable skill system — SKILL.md + run.py per tool, auto-loaded at boot
- [x] Agent tools — run_command, web_search, fetch_url, save_memory
- [x] Persistent workspace + memory system (MEMORY.md pattern)
- [x] Agent hot-reload, non-root agents, agent-to-agent, DMs, /invite
- [x] Overseer resilience, health checks, graceful shutdown, systemd

## Phase 4: Hardening & Performance
## Phase 4: Hardening & Deployment (done)

- [ ] Network policies per agent — iptables rules per tap device
- [ ] Warm pool — pre-booted VMs from snapshots for instant spawns
- [x] Network policies, thread safety, trigger fix, race condition fix
- [x] Install/uninstall scripts, deployed on Debian + Ubuntu + GPU server
- [x] Refactor — shared firecracker-vm.ts, skill system extraction

### Remaining

- [ ] Warm pool — pre-booted VMs from snapshots
- [ ] Concurrent snapshot runs via network namespaces
- [ ] Thin provisioning — device-mapper snapshots instead of full rootfs copies
- [ ] Thread safety — lock around IRC socket writes in agent.py
- [ ] Agent health checks — overseer monitors and restarts dead agents
- [ ] Thin provisioning — device-mapper snapshots

## Phase 5: Advanced Features
## Phase 5: Agent Intelligence

- [ ] Persistent agent memory v2 — richer structure, auto-save from conversations
- [ ] Scheduled/cron tasks — agents that run on a timer
- [ ] Advanced tool use — MCP tools, multi-step execution, file I/O
- [ ] Cost tracking — log duration, model, tokens per interaction
- [ ] Execution recording — full audit trail of agent actions

Priority order by gain/complexity ratio.

## Phase 6: Ideas & Experiments
### High priority (high gain, low-medium complexity)

- [ ] vsock — replace SSH with virtio-vsock for lower overhead
- [ ] **Large output handling** — tool results >2K chars saved to workspace file, agent gets preview + can read the rest. Prevents context explosion. Simple, high impact.
- [ ] **Iteration budget** — shared token/round budget across tool calls. Prevents runaway loops, especially with the GPU server running faster models that chain more aggressively. Add per-template configurable limits.
- [ ] **Skill registry as git repo** — separate git repo for community/shared skills. Clone into agent rootfs. `fireclaw skills pull` to update. Like agentskills.io but self-hosted on Gitea.
- [ ] **Session persistence** — SQLite in workspace for conversation history. FTS5 full-text search over past sessions. Agents can search their own history.
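The session-persistence idea maps almost directly onto an FTS5 virtual table — a sketch under assumed schema and function names (the real `sessions.py` module may differ):

```python
import sqlite3

def init_session_db(path=":memory:"):
    """Conversation history with FTS5 full-text search.
    Requires an SQLite build with the FTS5 extension (standard in
    modern CPython builds)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages "
        "USING fts5(nick, channel, text)"
    )
    return db

def save_message(db, nick, channel, text):
    db.execute(
        "INSERT INTO messages (nick, channel, text) VALUES (?, ?, ?)",
        (nick, channel, text),
    )
    db.commit()

def search_history(db, query, limit=5):
    # MATCH searches all FTS5 columns; rank orders by relevance
    cur = db.execute(
        "SELECT nick, text FROM messages WHERE messages MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()
```

Pointing `path` at `/workspace/sessions.db` makes the history survive VM restarts for free.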
### Medium priority (medium gain, medium complexity)

- [ ] **Context compression** — when conversation history exceeds a threshold, LLM-summarize the middle turns. Protect the head (system prompt) and tail (recent messages). Keeps agents coherent in long conversations.
- [ ] **Skill learning** — after complex multi-tool tasks, the agent creates a new SKILL.md + run.py in workspace/skills. Next boot, the new skill is available. Self-improving agents.
- [x] **Scheduled/cron agents** — templates support `schedule` (5-field cron) and `schedule_timeout` fields. Overseer checks every 60s, spawns and auto-destroys.
- [ ] **!logs command** — tail agent interaction history from workspace.

### Lower priority (good ideas, higher complexity or less immediate need)

- [x] **Dangerous command approval** — pattern-based detection (rm -rf, dd, mkfs, fork bombs, shutdown, etc.) blocks execution in the run_command skill.
- [ ] **Parallel tool execution** — detect independent tool calls, run them concurrently. Needs safety heuristics (read-only, non-overlapping paths).
- [ ] **Cost tracking** — Ollama returns token counts. Log per interaction: duration, model, tokens, skill used.
- [ ] **Execution recording** — full audit trail of all tool calls and results.

## Phase 6: Infrastructure

- [ ] MCP servers in Firecracker VM with podman containers
- [ ] Webhook triggers — HTTP endpoint that spawns ephemeral agents
- [ ] Alert forwarding — pipe system alerts into #agents
- [ ] Web dashboard — status page for running agents
- [ ] Podman-in-Firecracker — double isolation for untrusted container images
- [ ] Honeypot mode — test agent safety with fake credentials/services
- [ ] Self-healing rootfs — agents evolve their own images
- [ ] Claude API backend — for tasks requiring deep reasoning
- [ ] IRC federation — link nyx.fireclaw.local ↔ odin for external access

## Phase 7: Ideas & Experiments

See IDEAS.md for the full list.
63
TODO.md
@@ -3,36 +3,47 @@

## Done

- [x] Firecracker CLI runner with snapshots (~1.1s)
- [x] Alpine rootfs with ca-certificates, podman, python3
- [x] Global `fireclaw` command
- [x] Multi-agent system — overseer + agent VMs + IRC + Ollama
- [x] 5 agent templates (worker, coder, researcher, quick, creative)
- [x] 5 Ollama models (qwen2.5-coder, qwen2.5, llama3.1, gemma3, phi4-mini)
- [x] Agent tool access — shell commands + podman containers
- [x] Persistent workspace + memory system (MEMORY.md pattern)
- [x] Agent hot-reload — model/persona swap via SSH + SIGHUP
- [x] Non-root agents — unprivileged `agent` user
- [x] Agent-to-agent via IRC mentions (10s cooldown)
- [x] DM support — private messages, no mention needed
- [x] /invite support — agents auto-join invited channels
- [x] Channel layout — #control (commands), #agents (common), DMs
- [x] Overseer resilience — crash recovery, agent adoption
- [x] Graceful shutdown — IRC QUIT before VM kill
- [x] Systemd service (KillMode=process)
- [x] Regression test suite (20 tests)
- [x] 5 templates, 5+ models, hot-reload, non-root agents
- [x] Tools: run_command, web_search, fetch_url, save_memory
- [x] Discoverable skill system — SKILL.md + run.py, auto-loaded
- [x] Persistent workspace + memory (MEMORY.md pattern)
- [x] Overseer: !invoke, !destroy, !list, !model, !models, !templates, !persona, !status, !help
- [x] Health checks, crash recovery, graceful shutdown, systemd
- [x] Network policies, thread safety, trigger fix, race condition fix
- [x] Install/uninstall scripts, deployed on 2 machines
- [x] Refactor: firecracker-vm.ts shared helpers, skill extraction
- [x] Large output handling — save >2K results to file, preview + read_file skill
- [x] Session persistence — SQLite + FTS5, conversation history survives restarts
- [x] !logs — tail agent history from workspace
- [x] Context compression — cached summaries, configurable threshold/keep
- [x] write_file skill — agents can create and modify workspace files
- [x] Structured system prompt — explicit tool descriptions, multi-agent awareness
- [x] Per-template config — temperature, num_predict, context_size, compress settings
- [x] Response quality — 500-char deque storage, 1024 default output tokens, 250-token summaries
- [x] update.sh script — one-command rootfs patching and snapshot rebuild

## Next up
## Next up (Phase 5 — by priority)

- [ ] Network policies per agent — restrict internet access
- [ ] Warm pool — pre-booted VMs for instant agent spawns
- [ ] Persistent agent memory improvements — richer memory structure, auto-save from conversations
- [ ] Thin provisioning — device-mapper snapshots instead of full rootfs copies

### Medium effort

- [ ] Skill registry git repo — shared skills on Gitea, `fireclaw skills pull`

### Bigger items

- [ ] Skill learning — agents create new skills from experience
- [x] Cron agents — scheduled agent spawns (5-field cron in templates, auto-destroy timeout)
- [x] Dangerous command approval — pattern detection blocks rm -rf /, dd, mkfs, fork bombs, etc.
- [ ] Parallel tool execution — concurrent independent tool calls
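The dangerous-command detection mentioned above amounts to a regex deny-list checked before the run_command skill executes anything. An illustrative sketch — the actual patterns in the shipped skill may differ:

```python
import re

# Illustrative deny-list, not the exact patterns used by the run_command skill
DANGEROUS = [
    r"\brm\s+-rf\s+/(\s|$)",        # rm -rf / (root wipe)
    r"\bdd\s+if=",                  # raw disk reads/writes
    r"\bmkfs(\.\w+)?\b",            # filesystem creation
    r":\(\)\s*\{.*\};\s*:",         # classic shell fork bomb :(){ :|:& };:
    r"\b(shutdown|reboot|halt)\b",  # power-state changes
]

def is_dangerous(command):
    """Return True if the shell command matches any deny-list pattern."""
    return any(re.search(p, command) for p in DANGEROUS)
```

On a match, the skill would return a refusal message to the model instead of running the command.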
## Polish

- [ ] Agent-to-agent response quality — small models (7B) parrot messages instead of answering. Needs better prompting ("don't repeat the question, answer it") or larger models (14B+). Claude API would help here.
- [ ] Cost tracking per agent interaction
- [ ] Cost tracking per interaction
- [ ] Execution recording / audit trail
- [ ] Agent health checks — overseer pings agents, restarts dead ones
- [ ] Thread safety in agent.py — lock around IRC socket writes
- [ ] Update regression tests for new channel layout
- [ ] Update regression tests for skill system + channel layout

## Low priority (from REPORT.md)

- [ ] Hardcoded network interface fallback — `src/network.ts:56` defaults to `"eno2"` if route parsing fails
- [ ] Predictable mount point names — `src/agent-manager.ts:94` uses `Date.now()` instead of crypto random
- [ ] No Firecracker binary hash verification — `scripts/install.sh` downloads without SHA256 check
- [ ] Ollama response size unbounded — `agent/tools.py` should limit `resp.read()` size
- [ ] Process termination inconsistent — two patterns (ChildProcess vs PID polling), works but could consolidate
492
agent/agent.py
@@ -1,18 +1,22 @@
#!/usr/bin/env python3
"""Fireclaw IRC agent — connects to IRC, responds via Ollama with tool access."""
"""Fireclaw IRC agent — connects to IRC, responds via Ollama with discoverable skills."""

import os
import socket
import json
import sys
import time
import subprocess
import urllib.request
import urllib.error
import signal
import threading
import urllib.request
from collections import deque

# Load config
from skills import discover_skills, execute_skill, set_logger as set_skills_logger
from tools import load_memory, query_ollama, set_logger as set_tools_logger
from sessions import init_db, save_message, load_recent, set_logger as set_sessions_logger

# ─── Config ──────────────────────────────────────────────────────────

with open("/etc/agent/config.json") as f:
    CONFIG = json.load(f)

@@ -24,114 +28,80 @@ except FileNotFoundError:
    PERSONA = "You are a helpful assistant."

NICK = CONFIG.get("nick", "agent")
CHANNEL = CONFIG.get("channel", "#agents")
SERVER = CONFIG.get("server", "172.16.0.1")
PORT = CONFIG.get("port", 6667)
OLLAMA_URL = CONFIG.get("ollama_url", "http://172.16.0.1:11434")
CONTEXT_SIZE = CONFIG.get("context_size", 20)
MAX_RESPONSE_LINES = CONFIG.get("max_response_lines", 50)
TOOLS_ENABLED = CONFIG.get("tools", True)
MAX_TOOL_ROUNDS = CONFIG.get("max_tool_rounds", 5)
MAX_TOOL_ROUNDS = CONFIG.get("max_tool_rounds", 10)
NUM_PREDICT = CONFIG.get("num_predict", 1024)
TEMPERATURE = CONFIG.get("temperature", 0.7)
WORKSPACE = "/workspace"
SKILL_DIRS = ["/opt/skills", f"{WORKSPACE}/skills"]
COMPRESS_ENABLED = CONFIG.get("compress", True)
COMPRESS_THRESHOLD = CONFIG.get("compress_threshold", 16)
COMPRESS_KEEP = CONFIG.get("compress_keep", 8)

# Mutable runtime config — can be hot-reloaded via SIGHUP
RUNTIME = {
    "model": CONFIG.get("model", "qwen2.5-coder:7b"),
    "trigger": CONFIG.get("trigger", "mention"),
    "persona": PERSONA,
}

# Recent messages for context
recent = deque(maxlen=CONTEXT_SIZE)

# Load persistent memory from workspace
AGENT_MEMORY = ""
try:
    import os
    with open(f"{WORKSPACE}/MEMORY.md") as f:
        AGENT_MEMORY = f.read().strip()
    # Also load all memory files referenced in the index
    mem_dir = f"{WORKSPACE}/memory"
    if os.path.isdir(mem_dir):
        for fname in sorted(os.listdir(mem_dir)):
            if fname.endswith(".md"):
                try:
                    with open(f"{mem_dir}/{fname}") as f:
                        topic = fname.replace(".md", "")
                        AGENT_MEMORY += f"\n\n## {topic}\n{f.read().strip()}"
                except Exception:
                    pass
except FileNotFoundError:
    pass
# ─── Logging ─────────────────────────────────────────────────────────

# Tool definitions for Ollama chat API
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "run_command",
            "description": "Execute a shell command on this system and return the output. Use this to check system info, run scripts, fetch URLs, process data, etc.",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {
                        "type": "string",
                        "description": "The shell command to execute (bash)",
                    }
                },
                "required": ["command"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "save_memory",
            "description": "Save something important to your persistent memory. Use this to remember facts about users, lessons learned, project context, or anything you want to recall in future conversations. Memories survive restarts.",
            "parameters": {
                "type": "object",
                "properties": {
                    "topic": {
                        "type": "string",
                        "description": "Short topic name for the memory file (e.g. 'user_prefs', 'project_x', 'lessons')",
                    },
                    "content": {
                        "type": "string",
                        "description": "The memory content to save",
                    },
                },
                "required": ["topic", "content"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web using SearXNG. Returns titles, URLs, and snippets for the top results. Use this when you need current information or facts you're unsure about.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query",
                    },
                    "num_results": {
                        "type": "integer",
                        "description": "Number of results to return (default 5)",
                    },
                },
                "required": ["query"],
            },
        },
    },
]

SEARX_URL = CONFIG.get("searx_url", "https://searx.mymx.me")
LOG_FILE = f"{WORKSPACE}/agent.log" if os.path.isdir(WORKSPACE) else None

def log(msg):
    print(f"[agent:{NICK}] {msg}", flush=True)
    line = f"[{time.strftime('%H:%M:%S')}] {msg}"
    print(f"[agent:{NICK}] {line}", flush=True)
    if LOG_FILE:
        try:
            with open(LOG_FILE, "a") as f:
                f.write(line + "\n")
        except Exception:
            pass


# Inject logger into submodules
set_skills_logger(log)
set_tools_logger(log)
set_sessions_logger(log)

# ─── Init ────────────────────────────────────────────────────────────

AGENT_MEMORY = load_memory(WORKSPACE)
TOOLS, SKILL_SCRIPTS = discover_skills(SKILL_DIRS)
log(f"Loaded {len(TOOLS)} skills: {', '.join(SKILL_SCRIPTS.keys())}")

db_conn = init_db(f"{WORKSPACE}/sessions.db")
for msg in load_recent(db_conn, CONTEXT_SIZE):
    recent.append(msg)
log(f"Session: restored {len(recent)} messages")

def reload_memory():
    global AGENT_MEMORY
    AGENT_MEMORY = load_memory(WORKSPACE)


def dispatch_tool(fn_name, fn_args, round_num):
    """Execute a tool call via the skill system."""
    script = SKILL_SCRIPTS.get(fn_name)
    if not script:
        return f"[unknown tool: {fn_name}]"
    log(f"Skill [{round_num}]: {fn_name}({str(fn_args)[:60]})")
    result = execute_skill(script, fn_args, WORKSPACE, CONFIG)
    if fn_name == "save_memory":
        reload_memory()
    return result


# ─── IRC Client ──────────────────────────────────────────────────────

class IRCClient:
@@ -161,9 +131,9 @@ class IRCClient:
        for line in text.split("\n"):
            line = line.strip()
            if line:
                while len(line) > 400:
                    self.send(f"PRIVMSG {target} :{line[:400]}")
                    line = line[400:]
                while len(line) > 380:
                    self.send(f"PRIVMSG {target} :{line[:380]}")
                    line = line[380:]
                self.send(f"PRIVMSG {target} :{line}")

    def set_bot_mode(self):
@@ -182,244 +152,83 @@ class IRCClient:
        return lines

def run_command(command):
    """Execute a shell command and return output."""
    log(f"Running command: {command[:100]}")
# ─── Message Handling ────────────────────────────────────────────────

_compression_cache = {"hash": None, "summary": None}


def compress_messages(channel_msgs):
    """Summarize older messages, keep recent ones intact. Caches summary."""
    if not COMPRESS_ENABLED or len(channel_msgs) <= COMPRESS_THRESHOLD:
        return channel_msgs

    older = channel_msgs[:-COMPRESS_KEEP]
    keep = channel_msgs[-COMPRESS_KEEP:]

    lines = [f"<{m['nick']}> {m['text']}" for m in older]
    conversation = "\n".join(lines)
    conv_hash = hash(conversation)

    # Return cached summary if older messages haven't changed
    if _compression_cache["hash"] == conv_hash and _compression_cache["summary"]:
        return [{"nick": "_summary", "text": _compression_cache["summary"], "channel": channel_msgs[0]["channel"]}] + keep

    try:
        result = subprocess.run(
            ["bash", "-c", command],
            capture_output=True,
            text=True,
            timeout=120,
        )
        output = result.stdout
        if result.stderr:
            output += f"\n[stderr] {result.stderr}"
        if result.returncode != 0:
            output += f"\n[exit code: {result.returncode}]"
        # Truncate very long output
        if len(output) > 2000:
            output = output[:2000] + "\n[output truncated]"
        return output.strip() or "[no output]"
    except subprocess.TimeoutExpired:
        return "[command timed out after 120s]"
        payload = json.dumps({
            "model": RUNTIME["model"],
            "messages": [
                {"role": "system", "content": "Summarize this IRC conversation in 3-5 sentences. Preserve key facts, decisions, questions, and any specific data mentioned. Be thorough but concise."},
                {"role": "user", "content": conversation},
            ],
            "stream": False,
            "options": {"num_predict": 250},
        }).encode()
        req = urllib.request.Request(f"{OLLAMA_URL}/api/chat", data=payload, headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req, timeout=30) as resp:
            data = json.loads(resp.read(500_000))
        summary = data.get("message", {}).get("content", "").strip()
        if summary:
            _compression_cache["hash"] = conv_hash
            _compression_cache["summary"] = summary
            log(f"Compressed {len(older)} messages into summary")
            return [{"nick": "_summary", "text": summary, "channel": channel_msgs[0]["channel"]}] + keep
    except Exception as e:
        return f"[error: {e}]"
        log(f"Compression failed: {e}")

def save_memory(topic, content):
    """Save a memory to the persistent workspace."""
    import os
    mem_dir = f"{WORKSPACE}/memory"
    os.makedirs(mem_dir, exist_ok=True)

    # Write the memory file
    filepath = f"{mem_dir}/{topic}.md"
    with open(filepath, "w") as f:
        f.write(content + "\n")

    # Update MEMORY.md index
    index_path = f"{WORKSPACE}/MEMORY.md"
    existing = ""
    try:
        with open(index_path) as f:
            existing = f.read()
    except FileNotFoundError:
        existing = "# Agent Memory\n"

    # Add or update entry
    entry = f"- [{topic}](memory/{topic}.md)"
    if topic not in existing:
        with open(index_path, "a") as f:
            f.write(f"\n{entry}")

    # Reload memory into global
    global AGENT_MEMORY
    with open(index_path) as f:
        AGENT_MEMORY = f.read().strip()

    log(f"Memory saved: {topic}")
    return f"Memory saved to {filepath}"


def web_search(query, num_results=5):
    """Search the web via SearXNG."""
    log(f"Web search: {query[:60]}")
    try:
        import urllib.parse
        params = urllib.parse.urlencode({"q": query, "format": "json"})
        req = urllib.request.Request(
            f"{SEARX_URL}/search?{params}",
            headers={"User-Agent": "fireclaw-agent"},
        )
        with urllib.request.urlopen(req, timeout=15) as resp:
            data = json.loads(resp.read())
        results = data.get("results", [])[:num_results]
        if not results:
            return "No results found."
        lines = []
        for r in results:
            title = r.get("title", "")
            url = r.get("url", "")
            snippet = r.get("content", "")[:150]
            lines.append(f"- {title}\n  {url}\n  {snippet}")
        return "\n".join(lines)
    except Exception as e:
        return f"[search error: {e}]"


def try_parse_tool_call(text):
    """Try to parse a text-based tool call from model output.
    Handles formats like:
      {"name": "run_command", "arguments": {"command": "uptime"}}
      <tool_call>{"name": "run_command", ...}</tool_call>
    Returns (name, args) tuple or None.
    """
    import re
    # Strip tool_call tags if present
    text = re.sub(r"</?tool_call>", "", text).strip()
    # Try to find JSON in the text
    for start in range(len(text)):
        if text[start] == "{":
            for end in range(len(text), start, -1):
                if text[end - 1] == "}":
                    try:
                        obj = json.loads(text[start:end])
                        name = obj.get("name")
                        args = obj.get("arguments", {})
                        if name and isinstance(args, dict):
                            return (name, args)
                    except json.JSONDecodeError:
                        continue
    return None


def ollama_request(payload):
    """Make a request to Ollama API."""
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/chat",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())


def query_ollama(messages):
    """Call Ollama chat API with tool support. Returns final response text."""
    payload = {
        "model": RUNTIME["model"],
        "messages": messages,
        "stream": False,
        "options": {"num_predict": 512},
    }

    if TOOLS_ENABLED:
        payload["tools"] = TOOLS

    for round_num in range(MAX_TOOL_ROUNDS):
        try:
            data = ollama_request(payload)
        except (urllib.error.URLError, TimeoutError) as e:
            return f"[error: {e}]"

        msg = data.get("message", {})

        # Check for structured tool calls from API
        tool_calls = msg.get("tool_calls")
        if tool_calls:
            messages.append(msg)

            for tc in tool_calls:
                fn = tc.get("function", {})
                fn_name = fn.get("name", "")
                fn_args = fn.get("arguments", {})

                if fn_name == "run_command":
                    cmd = fn_args.get("command", "")
                    log(f"Tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: {cmd[:80]}")
                    result = run_command(cmd)
                    messages.append({"role": "tool", "content": result})
                elif fn_name == "save_memory":
                    topic = fn_args.get("topic", "note")
|
||||
content = fn_args.get("content", "")
|
||||
log(f"Tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: save_memory({topic})")
|
||||
result = save_memory(topic, content)
|
||||
messages.append({"role": "tool", "content": result})
|
||||
elif fn_name == "web_search":
|
||||
query = fn_args.get("query", "")
|
||||
num = fn_args.get("num_results", 5)
|
||||
log(f"Tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: web_search({query[:60]})")
|
||||
result = web_search(query, num)
|
||||
messages.append({"role": "tool", "content": result})
|
||||
else:
|
||||
messages.append({
|
||||
"role": "tool",
|
||||
"content": f"[unknown tool: {fn_name}]",
|
||||
})
|
||||
|
||||
payload["messages"] = messages
|
||||
continue
|
||||
|
||||
# Check for text-based tool calls (model dumped JSON as text)
|
||||
content = msg.get("content", "").strip()
|
||||
parsed_tool = try_parse_tool_call(content)
|
||||
if parsed_tool:
|
||||
fn_name, fn_args = parsed_tool
|
||||
messages.append({"role": "assistant", "content": content})
|
||||
if fn_name == "run_command":
|
||||
cmd = fn_args.get("command", "")
|
||||
log(f"Text tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: {cmd[:80]}")
|
||||
result = run_command(cmd)
|
||||
messages.append({"role": "user", "content": f"Command output:\n{result}\n\nNow provide your response to the user based on this output."})
|
||||
elif fn_name == "save_memory":
|
||||
topic = fn_args.get("topic", "note")
|
||||
mem_content = fn_args.get("content", "")
|
||||
log(f"Text tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: save_memory({topic})")
|
||||
result = save_memory(topic, mem_content)
|
||||
messages.append({"role": "user", "content": f"{result}\n\nNow respond to the user."})
|
||||
elif fn_name == "web_search":
|
||||
query = fn_args.get("query", "")
|
||||
num = fn_args.get("num_results", 5)
|
||||
log(f"Text tool call [{round_num+1}/{MAX_TOOL_ROUNDS}]: web_search({query[:60]})")
|
||||
result = web_search(query, num)
|
||||
messages.append({"role": "user", "content": f"Search results:\n{result}\n\nNow respond to the user based on these results."})
|
||||
payload["messages"] = messages
|
||||
continue
|
||||
|
||||
# No tool calls — return the text response
|
||||
return content
|
||||
|
||||
return "[max tool rounds reached]"
|
||||
return channel_msgs
|
||||
|
||||
|
||||
def build_messages(question, channel):
    """Build chat messages with system prompt and conversation history."""
    system = RUNTIME["persona"]
    if TOOLS_ENABLED:
        system += "\n\nYou have access to tools:"
        system += "\n- run_command: Execute shell commands on your system."
        system += "\n- web_search: Search the web for current information."
        system += "\n- save_memory: Save important information to your persistent workspace."
        system += "\nUse tools when needed rather than guessing. Your workspace at /workspace persists across restarts."

    # Environment
    system += f"\n\n## Environment\nYou are {NICK} in IRC channel {channel}. This is a multi-agent system — other nicks may be AI agents with their own tools. Keep responses concise (this is IRC). To address someone, prefix with their nick: 'coder: can you review this?'"

    # Tools
    if TOOLS_ENABLED and TOOLS:
        system += "\n\n## Tools\nYou have tools — use them proactively instead of guessing or apologizing. If asked to do something, DO it with your tools."
        for t in TOOLS:
            fn = t["function"]
            system += f"\n- **{fn['name']}**: {fn.get('description', '')}"
        system += "\n\nYour workspace at /workspace persists across restarts. Write files, save results, read them back."

    # Memory
    if AGENT_MEMORY and AGENT_MEMORY != "# Agent Memory":
        system += f"\n\n## Your Memory\n{AGENT_MEMORY}"

    messages = [{"role": "system", "content": system}]

    # Build conversation history as alternating user/assistant messages
    channel_msgs = [m for m in recent if m["channel"] == channel]
    channel_msgs = compress_messages(channel_msgs[-CONTEXT_SIZE:])
    for msg in channel_msgs:
        if msg["nick"] == "_summary":
            messages.append({"role": "system", "content": f"[earlier conversation summary] {msg['text']}"})
        elif msg["nick"] == NICK:
            messages.append({"role": "assistant", "content": msg["text"]})
        else:
            messages.append({"role": "user", "content": f"<{msg['nick']}> {msg['text']}"})

    # Ensure the last message is from the user (the triggering question).
    # If the deque already captured it, don't double-add.
    last = messages[-1] if len(messages) > 1 else None
    if not last or last.get("role") != "user" or question not in last.get("content", ""):
        messages.append({"role": "user", "content": question})

@@ -428,14 +237,10 @@ def build_messages(question, channel):
def should_trigger(text):
    """Check if this message should trigger a response.

    Only triggers when the nick is at the start of the message (e.g. 'worker: hello'),
    not when the nick appears elsewhere (e.g. 'coder: say hi to worker')."""
    if RUNTIME["trigger"] == "all":
        return True
    lower = text.lower()
    nick = NICK.lower()
    # Match: "nick: ...", "nick, ...", "nick ...", "@nick ..."
    return (
        lower.startswith(f"{nick}:") or
        lower.startswith(f"{nick},") or
@@ -447,7 +252,6 @@ def should_trigger(text):
def extract_question(text):
    """Extract the actual question from the trigger."""
    lower = text.lower()
    for prefix in [
        f"{NICK.lower()}: ",
@@ -462,13 +266,12 @@ def extract_question(text):
    return text


# Track last response time to prevent agent-to-agent loops
_last_response_time = 0
_AGENT_COOLDOWN = 10  # seconds between responses to prevent loops
_cooldown_lock = threading.Lock()
def handle_message(irc, source_nick, target, text):
    """Process an incoming PRIVMSG."""
    global _last_response_time

    is_dm = not target.startswith("#")
@@ -476,20 +279,20 @@ def handle_message(irc, source_nick, target, text):
    reply_to = source_nick if is_dm else target

    recent.append({"nick": source_nick, "text": text, "channel": channel})
    save_message(db_conn, source_nick, channel, text)

    if source_nick == NICK:
        return

    # DMs always trigger, channel messages need mention
    if not is_dm and not should_trigger(text):
        return

    # Cooldown to prevent agent-to-agent loops
    with _cooldown_lock:
        now = time.time()
        if now - _last_response_time < _AGENT_COOLDOWN:
            log(f"Cooldown active, ignoring trigger from {source_nick}")
            return
        _last_response_time = now

    question = extract_question(text) if not is_dm else text
    log(f"Triggered by {source_nick} in {channel}: {question[:80]}")
@@ -497,7 +300,13 @@ def handle_message(irc, source_nick, target, text):
    def do_respond():
        try:
            messages = build_messages(question, channel)
            response = query_ollama(
                messages, RUNTIME,
                TOOLS if TOOLS_ENABLED else [],
                SKILL_SCRIPTS, dispatch_tool,
                OLLAMA_URL, MAX_TOOL_ROUNDS,
                num_predict=NUM_PREDICT, temperature=TEMPERATURE,
            )

            if not response:
                return
@@ -508,7 +317,8 @@ def handle_message(irc, source_nick, target, text):
                lines.append(f"[truncated, {MAX_RESPONSE_LINES} lines max]")

            irc.say(reply_to, "\n".join(lines))
            recent.append({"nick": NICK, "text": response[:500], "channel": channel})
            save_message(db_conn, NICK, channel, response[:500], full_text=response)
        except Exception as e:
            log(f"Error handling message: {e}")
            try:
@@ -519,8 +329,11 @@ def handle_message(irc, source_nick, target, text):
    threading.Thread(target=do_respond, daemon=True).start()
# ─── Main Loop ───────────────────────────────────────────────────────


def run():
    log(f"Starting agent: nick={NICK} model={RUNTIME['model']} tools={TOOLS_ENABLED}")

    while True:
        try:
@@ -528,7 +341,6 @@ def run():
            log(f"Connecting to {SERVER}:{PORT}...")
            irc.connect()

            # Hot-reload on SIGHUP — re-read config and persona
            def handle_sighup(signum, frame):
                log("SIGHUP received, reloading config...")
                try:
@@ -542,13 +354,12 @@ def run():
                    except FileNotFoundError:
                        pass
                    log(f"Reloaded: model={RUNTIME['model']} trigger={RUNTIME['trigger']}")
                    irc.say("#agents", f"[reloaded: model={RUNTIME['model']}]")
                except Exception as e:
                    log(f"Reload failed: {e}")

            signal.signal(signal.SIGHUP, handle_sighup)

            # Graceful shutdown on SIGTERM — send IRC QUIT
            def handle_sigterm(signum, frame):
                log("SIGTERM received, quitting IRC...")
                try:
@@ -577,9 +388,8 @@ def run():
            log("Registered with server")
            irc.set_bot_mode()
            irc.join("#agents")
            log("Joined #agents")

            # Handle INVITE — auto-join invited channels
            if parts[1] == "INVITE" and len(parts) >= 3:
                invited_channel = parts[-1].lstrip(":")
                inviter = parts[0].split("!")[0].lstrip(":")
agent/sessions.py (new file, 87 lines)
@@ -0,0 +1,87 @@
"""Session persistence — SQLite + FTS5 for conversation history."""

import sqlite3
import threading
import time


def log(msg):
    print(f"[sessions] {msg}", flush=True)


def set_logger(fn):
    global log
    log = fn


_write_lock = threading.Lock()


def init_db(db_path):
    """Create/open the session database. Returns a connection."""
    conn = sqlite3.connect(db_path, check_same_thread=False)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            ts REAL NOT NULL,
            nick TEXT NOT NULL,
            channel TEXT NOT NULL,
            text TEXT NOT NULL,
            full_text TEXT
        )
    """)
    conn.execute("""
        CREATE INDEX IF NOT EXISTS idx_messages_channel_ts
        ON messages(channel, ts)
    """)
    # FTS5 virtual table for full-text search
    conn.execute("""
        CREATE VIRTUAL TABLE IF NOT EXISTS messages_fts
        USING fts5(text, content=messages, content_rowid=id)
    """)
    conn.commit()
    _prune(conn)
    log(f"Session DB ready: {db_path}")
    return conn


def save_message(conn, nick, channel, text, full_text=None):
    """Persist a message to the session database."""
    with _write_lock:
        cur = conn.execute(
            "INSERT INTO messages (ts, nick, channel, text, full_text) VALUES (?, ?, ?, ?, ?)",
            (time.time(), nick, channel, text, full_text),
        )
        conn.execute(
            "INSERT INTO messages_fts(rowid, text) VALUES (?, ?)",
            (cur.lastrowid, text),
        )
        conn.commit()


def load_recent(conn, limit=20):
    """Load the last N messages for boot recovery.

    Returns a list of {"nick", "text", "channel"} dicts in chronological order."""
    rows = conn.execute(
        "SELECT nick, text, channel FROM messages ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return [{"nick": r[0], "text": r[1], "channel": r[2]} for r in reversed(rows)]


def _prune(conn, keep=1000):
    """Delete old messages beyond the last `keep`. Runs once at init."""
    count = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
    if count <= keep:
        return
    deleted = count - keep
    conn.execute("""
        DELETE FROM messages WHERE id NOT IN (
            SELECT id FROM messages ORDER BY id DESC LIMIT ?
        )
    """, (keep,))
    # Rebuild FTS index after bulk delete
    conn.execute("INSERT INTO messages_fts(messages_fts) VALUES('rebuild')")
    conn.commit()
    log(f"Pruned {deleted} old messages (kept {keep})")
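Because `messages_fts` is an external-content table (`content=messages`), it stores only the index and joins back to `messages` via `rowid` for metadata. A minimal, self-contained sketch of how a search helper could query it — the `search_messages` function here is hypothetical (the diff only builds the index, it does not ship a search tool), but the schema matches `init_db` above:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        ts REAL NOT NULL, nick TEXT NOT NULL,
        channel TEXT NOT NULL, text TEXT NOT NULL, full_text TEXT
    )
""")
conn.execute("""
    CREATE VIRTUAL TABLE messages_fts
    USING fts5(text, content=messages, content_rowid=id)
""")

def save(nick, channel, text):
    # Mirror of save_message: insert the row, then index its text.
    cur = conn.execute(
        "INSERT INTO messages (ts, nick, channel, text) VALUES (?, ?, ?, ?)",
        (time.time(), nick, channel, text),
    )
    conn.execute("INSERT INTO messages_fts(rowid, text) VALUES (?, ?)",
                 (cur.lastrowid, text))

def search_messages(query, limit=5):
    # Join back to the content table to recover nick/channel metadata.
    return conn.execute("""
        SELECT messages.nick, messages.channel, messages.text
        FROM messages_fts JOIN messages ON messages.id = messages_fts.rowid
        WHERE messages_fts MATCH ? ORDER BY messages_fts.rank LIMIT ?
    """, (query, limit)).fetchall()

save("alice", "#agents", "the firecracker snapshot rebuild worked")
save("bob", "#agents", "lunch plans anyone")
print(search_messages("snapshot"))
```

The `'rebuild'` command used in `_prune` is the standard FTS5 way to resync an external-content index after rows are deleted directly from the content table.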
agent/skills.py (new file, 175 lines)
@@ -0,0 +1,175 @@
"""Skill discovery, parsing, and execution."""

import os
import re
import json
import subprocess
import time


def log(msg):
    """Import-safe logging — overridden by agent.py at init."""
    print(f"[skills] {msg}", flush=True)


def set_logger(fn):
    """Allow agent.py to inject its logger."""
    global log
    log = fn


LARGE_OUTPUT_THRESHOLD = 2000
_output_counter = 0


def parse_skill_md(path):
    """Parse a SKILL.md frontmatter into a tool definition.

    Returns a tool definition dict, or None on failure."""
    try:
        with open(path) as f:
            content = f.read()
    except Exception as e:
        log(f"Cannot read {path}: {e}")
        return None

    content = content.replace("\r\n", "\n")

    match = re.match(r"^---\n(.*?)\n---", content, re.DOTALL)
    if not match:
        log(f"No frontmatter in {path}")
        return None

    fm = {}
    current_key = None
    current_param = None
    params = {}

    for line in match.group(1).split("\n"):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue

        indent = len(line) - len(line.lstrip())

        if indent >= 2 and current_key == "parameters":
            if indent >= 4 and current_param:
                k, _, v = stripped.partition(":")
                k = k.strip()
                v = v.strip().strip('"').strip("'")
                if k == "required":
                    v = v.lower() in ("true", "yes", "1")
                params[current_param][k] = v
            elif ":" in stripped:
                param_name = stripped.rstrip(":").strip()
                current_param = param_name
                params[param_name] = {}
        elif ":" in line and indent == 0:
            k, _, v = line.partition(":")
            k = k.strip()
            v = v.strip().strip('"').strip("'")
            fm[k] = v
            current_key = k
            if k == "parameters":
                current_param = None

    if "name" not in fm:
        log(f"No 'name' field in {path}")
        return None

    if "description" not in fm:
        log(f"Warning: no 'description' in {path}")

    properties = {}
    required = []
    for pname, pdata in params.items():
        ptype = pdata.get("type", "string")
        if ptype not in ("string", "integer", "number", "boolean", "array", "object"):
            log(f"Warning: unknown type '{ptype}' for param '{pname}' in {path}")
        properties[pname] = {
            "type": ptype,
            "description": pdata.get("description", ""),
        }
        if pdata.get("required", False):
            required.append(pname)

    return {
        "type": "function",
        "function": {
            "name": fm["name"],
            "description": fm.get("description", ""),
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }
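For a concrete sense of the output shape: given the `fetch_url` frontmatter that appears later in this diff, the parser produces an Ollama-style tool definition like the one below (hand-assembled here for illustration, with the description abbreviated — not generated output):

```python
# Shape of the dict parse_skill_md returns for the fetch_url skill.
tool_def = {
    "type": "function",
    "function": {
        "name": "fetch_url",
        "description": "Fetch a URL and return its text content.",
        "parameters": {
            "type": "object",
            "properties": {
                # One entry per frontmatter parameter.
                "url": {"type": "string", "description": "The URL to fetch"},
            },
            # Parameters whose frontmatter has `required: true`.
            "required": ["url"],
        },
    },
}
print(tool_def["function"]["name"])
```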
def discover_skills(skill_dirs):
    """Scan skill directories and return tool definitions + script paths."""
    tools = []
    scripts = {}

    for skill_dir in skill_dirs:
        if not os.path.isdir(skill_dir):
            continue
        for name in sorted(os.listdir(skill_dir)):
            skill_path = os.path.join(skill_dir, name)
            skill_md = os.path.join(skill_path, "SKILL.md")
            if not os.path.isfile(skill_md):
                continue

            tool_def = parse_skill_md(skill_md)
            if not tool_def:
                continue

            for ext in ("run.py", "run.sh"):
                script = os.path.join(skill_path, ext)
                if os.path.isfile(script):
                    scripts[tool_def["function"]["name"]] = script
                    break

            if tool_def["function"]["name"] in scripts:
                tools.append(tool_def)

    return tools, scripts


def execute_skill(script_path, args, workspace, config):
    """Execute a skill script with args as JSON on stdin.

    Large outputs are saved to a file with a preview returned."""
    global _output_counter
    env = os.environ.copy()
    env["WORKSPACE"] = workspace
    env["SEARX_URL"] = config.get("searx_url", "https://searx.mymx.me")

    try:
        result = subprocess.run(
            ["python3" if script_path.endswith(".py") else "bash", script_path],
            input=json.dumps(args),
            capture_output=True,
            text=True,
            timeout=120,
            env=env,
        )
        output = result.stdout
        if result.stderr:
            output += f"\n[stderr] {result.stderr}"
        output = output.strip() or "[no output]"

        if len(output) > LARGE_OUTPUT_THRESHOLD:
            output_dir = f"{workspace}/tool_outputs"
            os.makedirs(output_dir, exist_ok=True)
            _output_counter += 1
            filepath = f"{output_dir}/output_{_output_counter}.txt"
            with open(filepath, "w") as f:
                f.write(output)
            preview = output[:1500]
            return f"{preview}\n\n[output truncated — full result ({len(output)} chars) saved to {filepath}. Use read_file to view it.]"

        return output
    except subprocess.TimeoutExpired:
        return "[skill timed out after 120s]"
    except Exception as e:
        return f"[skill error: {e}]"
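The contract between agent and skill is deliberately thin: arguments as JSON on stdin, result as text on stdout. A minimal self-contained sketch of that round trip, using a throwaway script written to a temp dir (this demo uses `sys.executable` rather than the hard-coded `"python3"` from `execute_skill`, purely for portability):

```python
import json
import os
import subprocess
import sys
import tempfile

# A throwaway skill script following the same contract as run.py skills:
# read a JSON object from stdin, print the result to stdout.
script = (
    "import sys, json\n"
    "args = json.loads(sys.stdin.read())\n"
    "print('hello ' + args.get('who', 'world'))\n"
)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "run.py")
    with open(path, "w") as f:
        f.write(script)
    # Same invocation shape as execute_skill: args serialized to stdin,
    # output captured as text, bounded by a timeout.
    result = subprocess.run(
        [sys.executable, path],
        input=json.dumps({"who": "fireclaw"}),
        capture_output=True, text=True, timeout=30,
    )

print(result.stdout.strip())  # → hello fireclaw
```

Because the protocol is just stdin/stdout, a skill can be written in any language that can parse JSON, which is why `discover_skills` accepts both `run.py` and `run.sh`.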
agent/tools.py (new file, 132 lines)
@@ -0,0 +1,132 @@
"""LLM interaction, tool dispatch, and memory management."""

import os
import re
import json
import urllib.request
import urllib.error


def log(msg):
    print(f"[tools] {msg}", flush=True)


def set_logger(fn):
    global log
    log = fn


# ─── Memory ──────────────────────────────────────────────────────────

def load_memory(workspace):
    """Load all memory files from the workspace."""
    memory = ""
    try:
        with open(f"{workspace}/MEMORY.md") as f:
            memory = f.read().strip()
        mem_dir = f"{workspace}/memory"
        if os.path.isdir(mem_dir):
            for fname in sorted(os.listdir(mem_dir)):
                if fname.endswith(".md"):
                    try:
                        with open(f"{mem_dir}/{fname}") as f:
                            topic = fname.replace(".md", "")
                            memory += f"\n\n## {topic}\n{f.read().strip()}"
                    except Exception:
                        pass
    except FileNotFoundError:
        pass
    return memory


# ─── Tool Call Parsing ───────────────────────────────────────────────

def try_parse_tool_call(text):
    """Parse text-based tool calls (model dumps JSON as text)."""
    text = re.sub(r"</?tool_call>", "", text).strip()
    for start in range(len(text)):
        if text[start] == "{":
            for end in range(len(text), start, -1):
                if text[end - 1] == "}":
                    try:
                        obj = json.loads(text[start:end])
                        name = obj.get("name")
                        args = obj.get("arguments", {})
                        if name and isinstance(args, dict):
                            return (name, args)
                    except json.JSONDecodeError:
                        continue
    return None
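The brace scan above recovers the first valid JSON object even when the model wraps the call in `<tool_call>` tags or surrounds it with chatter. A quick standalone check of that logic (the function is copied verbatim from above so the snippet runs on its own):

```python
import json
import re

def try_parse_tool_call(text):
    """Parse text-based tool calls (model dumps JSON as text)."""
    text = re.sub(r"</?tool_call>", "", text).strip()
    # Scan for the longest valid JSON object starting at each "{".
    for start in range(len(text)):
        if text[start] == "{":
            for end in range(len(text), start, -1):
                if text[end - 1] == "}":
                    try:
                        obj = json.loads(text[start:end])
                        name = obj.get("name")
                        args = obj.get("arguments", {})
                        if name and isinstance(args, dict):
                            return (name, args)
                    except json.JSONDecodeError:
                        continue
    return None

# Bare JSON, tagged JSON, and plain prose all behave as expected:
print(try_parse_tool_call('{"name": "run_command", "arguments": {"command": "uptime"}}'))
print(try_parse_tool_call('<tool_call>{"name": "web_search", "arguments": {"query": "irc"}}</tool_call>'))
print(try_parse_tool_call("Sure, no tools needed here."))
```

Note the nested scan is quadratic in the message length; that is fine for IRC-sized responses but would be worth revisiting if messages grew large.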
# ─── LLM Interaction ────────────────────────────────────────────────

def ollama_request(ollama_url, payload):
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        f"{ollama_url}/api/chat",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read(2_000_000))


def query_ollama(messages, runtime, tools, skill_scripts, dispatch_fn,
                 ollama_url, max_rounds, num_predict=1024, temperature=0.7):
    """Call Ollama chat API with skill-based tool support."""
    payload = {
        "model": runtime["model"],
        "messages": messages,
        "stream": False,
        "options": {"num_predict": num_predict, "temperature": temperature},
    }

    if tools:
        payload["tools"] = tools

    for round_num in range(max_rounds):
        remaining = max_rounds - round_num
        try:
            data = ollama_request(ollama_url, payload)
        except (urllib.error.URLError, TimeoutError) as e:
            return f"[error: {e}]"

        msg = data.get("message", {})

        # Structured tool calls
        tool_calls = msg.get("tool_calls")
        if tool_calls:
            messages.append(msg)
            for tc in tool_calls:
                fn = tc.get("function", {})
                result = dispatch_fn(
                    fn.get("name", ""),
                    fn.get("arguments", {}),
                    round_num + 1,
                )
                if remaining <= 2:
                    result += f"\n[warning: {remaining - 1} tool rounds remaining — wrap up]"
                messages.append({"role": "tool", "content": result})
            payload["messages"] = messages
            continue

        # Text-based tool calls
        content = msg.get("content", "").strip()
        parsed_tool = try_parse_tool_call(content)
        if parsed_tool:
            fn_name, fn_args = parsed_tool
            if fn_name in skill_scripts:
                messages.append({"role": "assistant", "content": content})
                result = dispatch_fn(fn_name, fn_args, round_num + 1)
                if remaining <= 2:
                    result += f"\n[warning: {remaining - 1} tool rounds remaining — wrap up]"
                messages.append({
                    "role": "user",
                    "content": f"Tool result:\n{result}\n\nNow respond to the user based on this result.",
                })
                payload["messages"] = messages
                continue

        return content

    return "[max tool rounds reached]"
@@ -306,10 +306,12 @@ else
    ' || err "Failed to install packages in chroot"
    ok "Alpine packages installed"

    log "Installing agent script, skills, and config..."
    sudo mkdir -p /tmp/agent-build-mnt/opt/agent /tmp/agent-build-mnt/opt/skills /tmp/agent-build-mnt/etc/agent
    sudo cp "$SCRIPT_DIR/agent/"*.py /tmp/agent-build-mnt/opt/agent/
    sudo chmod +x /tmp/agent-build-mnt/opt/agent/agent.py
    sudo cp -r "$SCRIPT_DIR/skills/"* /tmp/agent-build-mnt/opt/skills/
    sudo chmod +x /tmp/agent-build-mnt/opt/skills/*/run.*

    echo '{"nick":"agent","model":"qwen2.5-coder:7b","trigger":"mention","server":"172.16.0.1","port":6667,"ollama_url":"http://172.16.0.1:11434"}' | \
        sudo tee /tmp/agent-build-mnt/etc/agent/config.json > /dev/null
@@ -422,14 +424,16 @@ step "Agent templates"
TMPL_DIR="$FIRECLAW_DIR/templates"
mkdir -p "$TMPL_DIR"

for tmpl in worker coder researcher quick creative; do
    if [[ -f "$TMPL_DIR/$tmpl.json" ]]; then
        skip "$tmpl"
    else
        case $tmpl in
            worker) echo '{"name":"worker","nick":"worker","model":"qwen2.5:7b","trigger":"mention","persona":"You are a general-purpose assistant on IRC. Keep responses concise."}' > "$TMPL_DIR/$tmpl.json" ;;
            coder) echo '{"name":"coder","nick":"coder","model":"qwen2.5-coder:7b","trigger":"mention","persona":"You are a code-focused assistant on IRC. Be direct and technical."}' > "$TMPL_DIR/$tmpl.json" ;;
            researcher) echo '{"name":"researcher","nick":"research","model":"llama3.1:8b","trigger":"mention","persona":"You are a research assistant on IRC. Use numbered points for complex topics. Keep responses to 5-10 lines."}' > "$TMPL_DIR/$tmpl.json" ;;
            quick) echo '{"name":"quick","nick":"quick","model":"phi4-mini","trigger":"mention","tools":false,"network":"none","persona":"You are a fast assistant on IRC. One sentence answers."}' > "$TMPL_DIR/$tmpl.json" ;;
            creative) echo '{"name":"creative","nick":"muse","model":"gemma3:4b","trigger":"mention","tools":false,"persona":"You are a creative assistant on IRC. Help with writing, brainstorming, ideas."}' > "$TMPL_DIR/$tmpl.json" ;;
        esac
        ok "$tmpl template created"
    fi
scripts/update.sh (new executable file, 83 lines)
@@ -0,0 +1,83 @@
#!/bin/bash
# Update fireclaw agent code and skills in the rootfs.
# Stops the overseer, patches the rootfs, rebuilds the snapshot, restarts.
#
# Usage: ./scripts/update.sh

set -euo pipefail

log()  { echo -e "\033[1;34m[fireclaw]\033[0m $*"; }
step() { echo -e "\n\033[1;32m━━━ $* ━━━\033[0m"; }
ok()   { echo -e "  \033[0;32m✓\033[0m $*"; }
err()  { echo -e "\033[1;31m[error]\033[0m $*" >&2; exit 1; }

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
FIRECLAW_DIR="$HOME/.fireclaw"
ROOTFS="$FIRECLAW_DIR/agent-rootfs.ext4"
MNT="/tmp/fireclaw-update-mnt"

[[ ! -f "$ROOTFS" ]] && err "No rootfs found at $ROOTFS — run install.sh first."

# ─── Stop overseer ──────────────────────────────────────────────────

step "Stop overseer"
if systemctl is-active --quiet fireclaw-overseer 2>/dev/null; then
    sudo systemctl stop fireclaw-overseer
    ok "Overseer stopped"
else
    ok "Overseer not running"
fi

# Wait for any firecracker processes to exit
sleep 1

# ─── Build TypeScript ───────────────────────────────────────────────

step "Build TypeScript"
cd "$SCRIPT_DIR"
npm run build
ok "TypeScript compiled"

# ─── Patch rootfs ───────────────────────────────────────────────────

step "Patch rootfs"

sudo mkdir -p "$MNT"
sudo mount "$ROOTFS" "$MNT" || err "Failed to mount rootfs"

trap 'sudo umount "$MNT" 2>/dev/null; sudo rmdir "$MNT" 2>/dev/null' EXIT

sudo mkdir -p "$MNT/opt/agent" "$MNT/opt/skills"

sudo cp "$SCRIPT_DIR/agent/"*.py "$MNT/opt/agent/"
sudo chmod +x "$MNT/opt/agent/agent.py"

sudo rm -rf "$MNT/opt/skills/"*
sudo cp -r "$SCRIPT_DIR/skills/"* "$MNT/opt/skills/"
sudo chmod +x "$MNT/opt/skills/"*/run.*

sudo umount "$MNT"
sudo rmdir "$MNT"
trap - EXIT

ok "Agent + skills updated in rootfs"

# ─── Rebuild snapshot ───────────────────────────────────────────────

step "Rebuild snapshot"

rm -f "$FIRECLAW_DIR/snapshot.state" \
      "$FIRECLAW_DIR/snapshot.mem" \
      "$FIRECLAW_DIR/snapshot-rootfs.ext4"

fireclaw snapshot create
ok "Snapshot rebuilt"

# ─── Restart overseer ──────────────────────────────────────────────

step "Restart overseer"
sudo systemctl start fireclaw-overseer
ok "Overseer started"

echo ""
log "Update complete. Use IRC to test."
skills/fetch_url/SKILL.md (new file, 9 lines)
@@ -0,0 +1,9 @@
---
name: fetch_url
description: Fetch a URL and return its text content. HTML is stripped to plain text. Use this to read web pages, documentation, articles, etc.
parameters:
  url:
    type: string
    description: The URL to fetch
    required: true
---
skills/fetch_url/run.py (new file, 49 lines)
@@ -0,0 +1,49 @@
#!/usr/bin/env python3
import sys
import json
import re
import urllib.request
from html.parser import HTMLParser

args = json.loads(sys.stdin.read())
url = args.get("url", "")


class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.text = []
        self._skip = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style", "noscript"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag in ("script", "style", "noscript"):
            self._skip = False
        if tag in ("p", "br", "div", "h1", "h2", "h3", "h4", "li", "tr"):
            self.text.append("\n")

    def handle_data(self, data):
        if not self._skip:
            self.text.append(data)


try:
    req = urllib.request.Request(url, headers={"User-Agent": "fireclaw-agent"})
    with urllib.request.urlopen(req, timeout=15) as resp:
        content_type = resp.headers.get("Content-Type", "")
        raw = resp.read(50_000).decode("utf-8", errors="replace")

    if "html" in content_type:
        parser = TextExtractor()
        parser.feed(raw)
        text = "".join(parser.text)
    else:
        text = raw

    text = re.sub(r"\n{3,}", "\n\n", text).strip()
    print(text or "[empty page]")
except Exception as e:
    print(f"[fetch error: {e}]")
|
||||
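Every skill script in this series follows the same contract: JSON arguments on stdin, plain text on stdout, exit code 0 even on errors. On the agent side, dispatch to a skill could be sketched like this (`invoke_skill` and the 130s timeout are assumptions for illustration, not code from this repo):

```python
import json
import subprocess
from pathlib import Path

def invoke_skill(skills_dir: str, name: str, arguments: dict) -> str:
    """Run a skill's script with JSON args on stdin and return its stdout."""
    skill_dir = Path(skills_dir) / name
    # Each skill folder ships run.py or run.sh next to its SKILL.md
    script = next(p for p in (skill_dir / "run.py", skill_dir / "run.sh") if p.exists())
    cmd = ["bash", str(script)] if script.suffix == ".sh" else ["python3", str(script)]
    result = subprocess.run(
        cmd,
        input=json.dumps(arguments),
        capture_output=True,
        text=True,
        timeout=130,  # slightly above the 120s budget run_command gives itself
    )
    return result.stdout.strip()
```

Since the scripts never raise to the caller (errors are printed as `[error: ...]` text), the dispatcher can feed whatever comes back straight into the model's tool-result message.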
17
skills/read_file/SKILL.md
Normal file
@@ -0,0 +1,17 @@
---
name: read_file
description: Read a file from the workspace with optional line range. Use this to view large tool outputs, logs, or any file in /workspace.
parameters:
  path:
    type: string
    description: Path to the file to read (must be under /workspace)
    required: true
  offset:
    type: integer
    description: Start reading from this line number (default 1)
    required: false
  limit:
    type: integer
    description: Maximum number of lines to return (default 200)
    required: false
---
67
skills/read_file/run.py
Normal file
@@ -0,0 +1,67 @@
#!/usr/bin/env python3
"""Read a file from /workspace with optional line range."""
import json
import os
import sys

args = json.loads(sys.stdin.read())
path = args.get("path", "")
offset = max(int(args.get("offset", 1)), 1)
limit = max(int(args.get("limit", 200)), 1)

WORKSPACE = os.environ.get("WORKSPACE", "/workspace")

# Resolve to absolute and ensure it stays under /workspace
resolved = os.path.realpath(path)
if not resolved.startswith(WORKSPACE + "/") and resolved != WORKSPACE:
    print(f"[error: path must be under {WORKSPACE}]")
    sys.exit(0)

if not os.path.exists(resolved):
    print(f"[error: file not found: {path}]")
    sys.exit(0)

if os.path.isdir(resolved):
    entries = sorted(os.listdir(resolved))
    print(f"Directory listing of {path} ({len(entries)} entries):")
    for entry in entries[:100]:
        full = os.path.join(resolved, entry)
        kind = "dir" if os.path.isdir(full) else "file"
        print(f"  {entry} ({kind})")
    if len(entries) > 100:
        print(f"  ... and {len(entries) - 100} more")
    sys.exit(0)

# Binary detection — check first 512 bytes
try:
    with open(resolved, "rb") as f:
        chunk = f.read(512)
    if b"\x00" in chunk:
        size = os.path.getsize(resolved)
        print(f"[binary file: {path} ({size} bytes)]")
        sys.exit(0)
except Exception as e:
    print(f"[error reading file: {e}]")
    sys.exit(0)

# Read with line range
try:
    with open(resolved) as f:
        lines = f.readlines()
except Exception as e:
    print(f"[error reading file: {e}]")
    sys.exit(0)

total = len(lines)
start_idx = offset - 1  # 0-based
end_idx = min(start_idx + limit, total)

if start_idx >= total:
    print(f"[file has {total} lines — offset {offset} is beyond end of file]")
    sys.exit(0)

for i in range(start_idx, end_idx):
    print(f"{i + 1}\t{lines[i]}", end="")

if end_idx < total:
    print(f"\n[showing lines {offset}-{end_idx} of {total} — use offset={end_idx + 1} to read more]")
9
skills/run_command/SKILL.md
Normal file
@@ -0,0 +1,9 @@
---
name: run_command
description: Execute a shell command on this system and return the output. Use this to check system info, run scripts, fetch URLs, process data, etc.
parameters:
  command:
    type: string
    description: The shell command to execute (bash)
    required: true
---
51
skills/run_command/run.py
Normal file
@@ -0,0 +1,51 @@
#!/usr/bin/env python3
import subprocess
import sys
import json
import re

DANGEROUS_PATTERNS = [
    (re.compile(r'\brm\s+(-[a-zA-Z]*f[a-zA-Z]*\s+)?-[a-zA-Z]*r[a-zA-Z]*\s+(/|~|\.)(\s|$)'), "recursive delete of critical path"),
    (re.compile(r'\brm\s+(-[a-zA-Z]*r[a-zA-Z]*\s+)?-[a-zA-Z]*f[a-zA-Z]*\s+(/|~|\.)(\s|$)'), "recursive delete of critical path"),
    (re.compile(r'\bdd\s+if='), "raw disk write (dd)"),
    (re.compile(r'\bmkfs\b'), "filesystem format"),
    (re.compile(r':\(\)\s*\{[^}]*:\s*\|\s*:'), "fork bomb"),
    (re.compile(r'>\s*/dev/[sh]d[a-z]'), "device write"),
    (re.compile(r'\bchmod\s+(-[a-zA-Z]*R[a-zA-Z]*\s+)?777\s+/(\s|$)'), "recursive chmod 777 on /"),
    (re.compile(r'\b(shutdown|reboot|halt|poweroff)\b'), "system shutdown/reboot"),
    (re.compile(r'\bkill\s+-9\s+(-1|1)\b'), "kill init or all processes"),
]


def check_dangerous(cmd):
    for pattern, desc in DANGEROUS_PATTERNS:
        if pattern.search(cmd):
            return desc
    return None


args = json.loads(sys.stdin.read())
command = args.get("command", "")

blocked = check_dangerous(command)
if blocked:
    print(f'[blocked: command matches dangerous pattern "{blocked}". This command was not executed.]')
    sys.exit(0)

try:
    result = subprocess.run(
        ["bash", "-c", command],
        capture_output=True,
        text=True,
        timeout=120,
    )
    output = result.stdout
    if result.stderr:
        output += f"\n[stderr] {result.stderr}"
    if result.returncode != 0:
        output += f"\n[exit code: {result.returncode}]"
    print(output.strip() or "[no output]")
except subprocess.TimeoutExpired:
    print("[command timed out after 120s]")
except Exception as e:
    print(f"[error: {e}]")
13
skills/save_memory/SKILL.md
Normal file
@@ -0,0 +1,13 @@
---
name: save_memory
description: Save something important to your persistent memory. Use this to remember facts about users, lessons learned, project context, or anything you want to recall in future conversations. Memories survive restarts.
parameters:
  topic:
    type: string
    description: "Short topic name for the memory file (e.g. 'user_prefs', 'project_x', 'lessons')"
    required: true
  content:
    type: string
    description: The memory content to save
    required: true
---
33
skills/save_memory/run.py
Normal file
@@ -0,0 +1,33 @@
#!/usr/bin/env python3
import sys
import json
import os

args = json.loads(sys.stdin.read())
topic = args.get("topic", "note")
content = args.get("content", "")
workspace = os.environ.get("WORKSPACE", "/workspace")

mem_dir = f"{workspace}/memory"
os.makedirs(mem_dir, exist_ok=True)

# Write the memory file
filepath = f"{mem_dir}/{topic}.md"
with open(filepath, "w") as f:
    f.write(content + "\n")

# Update MEMORY.md index
index_path = f"{workspace}/MEMORY.md"
existing = ""
try:
    with open(index_path) as f:
        existing = f.read()
except FileNotFoundError:
    existing = "# Agent Memory\n"

entry = f"- [{topic}](memory/{topic}.md)"
if topic not in existing:
    with open(index_path, "a") as f:
        f.write(f"\n{entry}")

print(f"Memory saved to {filepath}")
13
skills/web_search/SKILL.md
Normal file
@@ -0,0 +1,13 @@
---
name: web_search
description: Search the web using SearXNG. Returns titles, URLs, and snippets for the top results. Use this when you need current information or facts you're unsure about.
parameters:
  query:
    type: string
    description: The search query
    required: true
  num_results:
    type: integer
    description: Number of results to return (default 5)
    required: false
---
32
skills/web_search/run.py
Normal file
@@ -0,0 +1,32 @@
#!/usr/bin/env python3
import sys
import json
import urllib.request
import urllib.parse

args = json.loads(sys.stdin.read())
query = args.get("query", "")
num_results = args.get("num_results", 5)
searx_url = args.get("_searx_url", "https://searx.mymx.me")

try:
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    req = urllib.request.Request(
        f"{searx_url}/search?{params}",
        headers={"User-Agent": "fireclaw-agent"},
    )
    with urllib.request.urlopen(req, timeout=15) as resp:
        data = json.loads(resp.read())
    results = data.get("results", [])[:num_results]
    if not results:
        print("No results found.")
    else:
        lines = []
        for r in results:
            title = r.get("title", "")
            url = r.get("url", "")
            snippet = r.get("content", "")[:150]
            lines.append(f"- {title}\n  {url}\n  {snippet}")
        print("\n".join(lines))
except Exception as e:
    print(f"[search error: {e}]")
17
skills/write_file/SKILL.md
Normal file
@@ -0,0 +1,17 @@
---
name: write_file
description: Write content to a file in /workspace. Creates parent directories if needed. Use this to save scripts, reports, data, or any output you want to persist.
parameters:
  path:
    type: string
    description: Path to write (must be under /workspace)
    required: true
  content:
    type: string
    description: Content to write to the file
    required: true
  append:
    type: boolean
    description: If true, append to existing file instead of overwriting (default false)
    required: false
---
32
skills/write_file/run.py
Normal file
@@ -0,0 +1,32 @@
#!/usr/bin/env python3
"""Write content to a file under /workspace."""
import json
import os
import sys

args = json.loads(sys.stdin.read())
path = args.get("path", "")
content = args.get("content", "")
append = args.get("append", False)

WORKSPACE = os.environ.get("WORKSPACE", "/workspace")

resolved = os.path.realpath(path)
if not resolved.startswith(WORKSPACE + "/") and resolved != WORKSPACE:
    print(f"[error: path must be under {WORKSPACE}]")
    sys.exit(0)

if os.path.isdir(resolved):
    print(f"[error: {path} is a directory]")
    sys.exit(0)

try:
    os.makedirs(os.path.dirname(resolved), exist_ok=True)
    mode = "a" if append else "w"
    with open(resolved, mode) as f:
        f.write(content)
    size = os.path.getsize(resolved)
    action = "appended to" if append else "wrote"
    print(f"[{action} {path} ({size} bytes)]")
except Exception as e:
    print(f"[error: {e}]")
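Each SKILL.md above carries YAML frontmatter that the agent loads into a tool definition at boot. A minimal sketch of that step follows; `parse_frontmatter` and `skill_to_tool` are hypothetical names, and the output shape assumes the OpenAI-style function schema that Ollama's tool calling accepts:

```python
import re

def parse_frontmatter(text: str) -> dict:
    """Tiny parser for the two-level YAML frontmatter used by SKILL.md."""
    body = re.search(r"^---\n(.*?)\n---", text, re.S).group(1)
    meta, params, current = {}, {}, None
    for line in body.splitlines():
        indent = len(line) - len(line.lstrip())
        key, _, value = line.strip().partition(":")
        value = value.strip()
        if indent == 0:
            if value:
                meta[key] = value
            # a bare "parameters:" line just opens the nested section
        elif indent == 2:
            current = key
            params[current] = {}
        elif current is not None and value:
            params[current][key] = value
    meta["parameters"] = params
    return meta

def skill_to_tool(meta: dict) -> dict:
    """Map parsed frontmatter onto an OpenAI-style tool schema."""
    props = {
        name: {"type": p.get("type", "string"), "description": p.get("description", "")}
        for name, p in meta["parameters"].items()
    }
    required = [n for n, p in meta["parameters"].items() if p.get("required") == "true"]
    return {
        "type": "function",
        "function": {
            "name": meta["name"],
            "description": meta["description"],
            "parameters": {"type": "object", "properties": props, "required": required},
        },
    }
```

A real implementation would likely use a proper YAML library inside the VM; this sketch only covers the exact shape the six SKILL.md files above use.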
@@ -1,4 +1,3 @@
import { spawn } from "node:child_process";
import {
  existsSync,
  mkdirSync,
@@ -11,19 +10,26 @@ import {
import { join } from "node:path";
import { execFileSync } from "node:child_process";
import { CONFIG } from "./config.js";

const SSH_OPTS = [
  "-o", "StrictHostKeyChecking=no",
  "-o", "UserKnownHostsFile=/dev/null",
  "-o", "ConnectTimeout=5",
  "-i", CONFIG.sshKeyPath,
];
import {
  ensureBridge,
  ensureNat,
  allocateIp,
  releaseIp,
  createTap,
  deleteTap,
  macFromOctet,
  applyNetworkPolicy,
  removeNetworkPolicy,
  type NetworkPolicy,
} from "./network.js";
import * as api from "./firecracker-api.js";
import {
  setupNetwork,
  spawnFirecracker,
  bootVM,
} from "./firecracker-vm.js";

export interface AgentInfo {
  name: string;
@@ -39,13 +45,26 @@ export interface AgentInfo {
  startedAt: string;
}

interface AgentTemplate {
export interface AgentTemplate {
  name: string;
  nick: string;
  model: string;
  trigger: string;
  persona: string;
  network?: NetworkPolicy;
  // Agent runtime settings (passed to config.json)
  tools?: boolean;
  context_size?: number;
  num_predict?: number;
  temperature?: number;
  compress?: boolean;
  compress_threshold?: number;
  compress_keep?: number;
  max_tool_rounds?: number;
  max_response_lines?: number;
  // Cron scheduling
  schedule?: string; // 5-field cron expression, e.g. "0 8 * * *"
  schedule_timeout?: number; // seconds before auto-destroy (default 300)
}

const AGENTS_FILE = join(CONFIG.baseDir, "agents.json");
@@ -89,7 +108,7 @@ export function listTemplates(): string[] {

function injectAgentConfig(
  rootfsPath: string,
  config: { nick: string; model: string; trigger: string },
  config: Record<string, unknown>,
  persona: string
) {
  const mountPoint = `/tmp/fireclaw-agent-${Date.now()}`;
@@ -104,35 +123,25 @@ function injectAgentConfig(
    { stdio: "pipe" }
  );

  // Write config
  // Write config (via stdin to avoid shell injection)
  const configJson = JSON.stringify({
    nick: config.nick,
    model: config.model,
    trigger: config.trigger,
    server: "172.16.0.1",
    port: 6667,
    ollama_url: "http://172.16.0.1:11434",
    ...config,
  });
  const configPath = join(mountPoint, "etc/agent/config.json");
  execFileSync("sudo", ["tee", configPath], {
    input: configJson,
    stdio: ["pipe", "pipe", "pipe"],
  });
  execFileSync(
    "sudo",
    [
      "bash",
      "-c",
      `echo '${configJson}' > ${join(mountPoint, "etc/agent/config.json")}`,
    ],
    { stdio: "pipe" }
  );

  // Write persona
  execFileSync(
    "sudo",
    [
      "bash",
      "-c",
      `cat > ${join(mountPoint, "etc/agent/persona.md")} << 'PERSONA_EOF'\n${persona}\nPERSONA_EOF`,
    ],
    { stdio: "pipe" }
  );
  // Write persona (via stdin to avoid shell injection)
  const personaPath = join(mountPoint, "etc/agent/persona.md");
  execFileSync("sudo", ["tee", personaPath], {
    input: persona,
    stdio: ["pipe", "pipe", "pipe"],
  });

  // Inject SSH key for debugging access
  execFileSync("sudo", ["mkdir", "-p", join(mountPoint, "root/.ssh")], {
@@ -201,24 +210,6 @@ function ensureWorkspace(agentName: string): string {
  return imgPath;
}

function waitForSocket(socketPath: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + 5_000;
    const check = () => {
      if (existsSync(socketPath)) {
        setTimeout(resolve, 200);
        return;
      }
      if (Date.now() > deadline) {
        reject(new Error("Firecracker socket did not appear"));
        return;
      }
      setTimeout(check, 50);
    };
    check();
  });
}

export async function startAgent(
  templateName: string,
  overrides?: { name?: string; model?: string }
@@ -254,58 +245,41 @@
  // Clean stale socket from previous run
  try { unlinkSync(socketPath); } catch {}

  // Prepare rootfs
  // Prepare rootfs — pass all template settings to agent config
  copyFileSync(AGENT_ROOTFS, rootfsPath);
  injectAgentConfig(
    rootfsPath,
    { nick, model, trigger: template.trigger },
    template.persona
  );
  const agentConfig: Record<string, unknown> = {
    nick,
    model,
    trigger: template.trigger,
  };
  // Forward optional template settings
  for (const key of [
    "tools", "context_size", "num_predict", "temperature",
    "compress", "compress_threshold", "compress_keep",
    "max_tool_rounds", "max_response_lines",
  ] as const) {
    if ((template as unknown as Record<string, unknown>)[key] !== undefined) {
      agentConfig[key] = (template as unknown as Record<string, unknown>)[key];
    }
  }
  injectAgentConfig(rootfsPath, agentConfig, template.persona);

  // Create/get persistent workspace
  const workspacePath = ensureWorkspace(name);

  // Setup network
  ensureBridge();
  ensureNat();
  deleteTap(tapDevice); // clean stale tap from previous run
  createTap(tapDevice);
  setupNetwork(tapDevice);

  // Boot VM
  const proc = spawn(
    CONFIG.firecrackerBin,
    ["--api-sock", socketPath],
    { stdio: "pipe", detached: true }
  );
  proc.unref();

  await waitForSocket(socketPath);

  const bootArgs = [
    "console=ttyS0",
    "reboot=k",
    "panic=1",
    "pci=off",
    "root=/dev/vda",
    "rw",
    `ip=${ip}::${CONFIG.bridge.gateway}:${CONFIG.bridge.netmask}::eth0:off`,
  ].join(" ");

  await api.putBootSource(socketPath, CONFIG.kernelPath, bootArgs);
  await api.putDrive(socketPath, "rootfs", rootfsPath);
  await api.putDrive(socketPath, "workspace", workspacePath, false, false);
  await api.putNetworkInterface(
  const proc = await spawnFirecracker(socketPath, { detached: true });
  await bootVM({
    socketPath,
    "eth0",
    rootfsPath,
    extraDrives: [{ id: "workspace", path: workspacePath }],
    tapDevice,
    macFromOctet(octet)
  );
  await api.putMachineConfig(
    socketPath,
    CONFIG.vm.vcpuCount,
    CONFIG.vm.memSizeMib
  );
  await api.startInstance(socketPath);
    ip,
    octet,
  });

  // Apply network policy
  const networkPolicy: NetworkPolicy = template.network ?? "full";
@@ -348,14 +322,7 @@ export async function stopAgent(name: string) {
  try {
    execFileSync(
      "ssh",
      [
        "-o", "StrictHostKeyChecking=no",
        "-o", "UserKnownHostsFile=/dev/null",
        "-o", "ConnectTimeout=3",
        "-i", CONFIG.sshKeyPath,
        `root@${info.ip}`,
        "killall python3 2>/dev/null; sleep 1",
      ],
      [...SSH_OPTS, `root@${info.ip}`, "pkill -f 'agent.py' 2>/dev/null; sleep 1"],
      { stdio: "pipe", timeout: 5_000 }
    );
  } catch {
@@ -413,18 +380,10 @@ export function listAgents(): AgentInfo[] {
    } catch {
      // Process is dead, clean up
      log(`Agent "${name}" is dead, cleaning up...`);
      try {
        deleteTap(info.tapDevice);
      } catch {}
      try {
        releaseIp(info.octet);
      } catch {}
      try {
        unlinkSync(info.rootfsPath);
      } catch {}
      try {
        unlinkSync(info.socketPath);
      } catch {}
      try { deleteTap(info.tapDevice); } catch (e) { log(` tap cleanup: ${e}`); }
      try { releaseIp(info.octet); } catch (e) { log(` ip cleanup: ${e}`); }
      try { unlinkSync(info.rootfsPath); } catch (e) { log(` rootfs cleanup: ${e}`); }
      try { unlinkSync(info.socketPath); } catch (e) { log(` socket cleanup: ${e}`); }
      delete agents[name];
    }
  }
@@ -452,13 +411,6 @@ export async function reloadAgent(
  }
  if (updates.trigger) configUpdates.trigger = updates.trigger;

  // Write updated config as a temp file on the VM via SSH
  const sshOpts = [
    "-o", "StrictHostKeyChecking=no",
    "-o", "UserKnownHostsFile=/dev/null",
    "-o", "ConnectTimeout=5",
    "-i", CONFIG.sshKeyPath,
  ];
  const sshTarget = `root@${info.ip}`;

  try {
@@ -466,7 +418,7 @@
    // Read current config from VM
    const currentRaw = execFileSync(
      "ssh",
      [...sshOpts, sshTarget, "cat /etc/agent/config.json"],
      [...SSH_OPTS, sshTarget, "cat /etc/agent/config.json"],
      { encoding: "utf-8", timeout: 10_000 }
    );
    const current = JSON.parse(currentRaw);
@@ -476,7 +428,7 @@
    // Write back via stdin
    execFileSync(
      "ssh",
      [...sshOpts, sshTarget, `cat > /etc/agent/config.json`],
      [...SSH_OPTS, sshTarget, `cat > /etc/agent/config.json`],
      { input: newConfig, timeout: 10_000 }
    );
  }
@@ -484,7 +436,7 @@
  if (updates.persona) {
    execFileSync(
      "ssh",
      [...sshOpts, sshTarget, `cat > /etc/agent/persona.md`],
      [...SSH_OPTS, sshTarget, `cat > /etc/agent/persona.md`],
      { input: updates.persona, timeout: 10_000 }
    );
  }
@@ -492,7 +444,7 @@
  // Signal agent to reload
  execFileSync(
    "ssh",
    [...sshOpts, sshTarget, "killall -HUP python3"],
    [...SSH_OPTS, sshTarget, "pkill -HUP -f 'agent.py'"],
    { stdio: "pipe", timeout: 10_000 }
  );
} catch (err) {
@@ -522,11 +474,10 @@ export function reconcileAgents(): { adopted: string[]; cleaned: string[] } {
      log(`Adopted running agent "${name}" (PID ${info.pid}, ${info.ip})`);
    } else {
      log(`Cleaning dead agent "${name}" (PID ${info.pid} gone)...`);
      // Clean up resources from dead agent
      try { deleteTap(info.tapDevice); } catch {}
      try { releaseIp(info.octet); } catch {}
      try { unlinkSync(info.rootfsPath); } catch {}
      try { unlinkSync(info.socketPath); } catch {}
      try { deleteTap(info.tapDevice); } catch (e) { log(` tap: ${e}`); }
      try { releaseIp(info.octet); } catch (e) { log(` ip: ${e}`); }
      try { unlinkSync(info.rootfsPath); } catch (e) { log(` rootfs: ${e}`); }
      try { unlinkSync(info.socketPath); } catch (e) { log(` socket: ${e}`); }
      delete agents[name];
      cleaned.push(name);
    }
164
src/firecracker-vm.ts
Normal file
@@ -0,0 +1,164 @@
/**
 * Shared Firecracker VM lifecycle helpers.
 * Used by vm.ts, snapshot.ts, and agent-manager.ts.
 */

import { spawn, type ChildProcess } from "node:child_process";
import { existsSync, unlinkSync, mkdirSync } from "node:fs";
import { CONFIG } from "./config.js";
import * as api from "./firecracker-api.js";
import {
  ensureBridge,
  ensureNat,
  createTap,
  deleteTap,
  macFromOctet,
} from "./network.js";

export interface BootOptions {
  socketPath: string;
  kernelPath?: string;
  rootfsPath: string;
  extraDrives?: { id: string; path: string; readOnly?: boolean }[];
  tapDevice: string;
  ip: string;
  octet: number;
  vcpu?: number;
  mem?: number;
}

/**
 * Wait for a Firecracker API socket to appear.
 */
export function waitForSocket(
  socketPath: string,
  timeoutMs = 5_000
): Promise<void> {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    const check = () => {
      if (existsSync(socketPath)) {
        setTimeout(resolve, 200);
        return;
      }
      if (Date.now() > deadline) {
        reject(new Error("Firecracker socket did not appear"));
        return;
      }
      setTimeout(check, 50);
    };
    check();
  });
}

/**
 * Set up network for a VM: ensure bridge, NAT, and create tap device.
 * Cleans stale tap first.
 */
export function setupNetwork(tapDevice: string) {
  ensureBridge();
  ensureNat();
  deleteTap(tapDevice);
  createTap(tapDevice);
}

/**
 * Spawn a Firecracker process and wait for the API socket.
 */
export async function spawnFirecracker(
  socketPath: string,
  opts?: { detached?: boolean }
): Promise<ChildProcess> {
  // Clean stale socket
  try {
    unlinkSync(socketPath);
  } catch {}

  mkdirSync(CONFIG.socketDir, { recursive: true });

  const proc = spawn(
    CONFIG.firecrackerBin,
    ["--api-sock", socketPath],
    {
      stdio: "pipe",
      detached: opts?.detached ?? false,
    }
  );

  if (opts?.detached) proc.unref();

  await waitForSocket(socketPath);
  return proc;
}

/**
 * Configure and start a Firecracker VM via its API.
 */
export async function bootVM(opts: BootOptions) {
  const kernel = opts.kernelPath ?? CONFIG.kernelPath;
  const vcpu = opts.vcpu ?? CONFIG.vm.vcpuCount;
  const mem = opts.mem ?? CONFIG.vm.memSizeMib;

  const bootArgs = [
    "console=ttyS0",
    "reboot=k",
    "panic=1",
    "pci=off",
    "root=/dev/vda",
    "rw",
    `ip=${opts.ip}::${CONFIG.bridge.gateway}:${CONFIG.bridge.netmask}::eth0:off`,
  ].join(" ");

  await api.putBootSource(opts.socketPath, kernel, bootArgs);
  await api.putDrive(opts.socketPath, "rootfs", opts.rootfsPath);

  if (opts.extraDrives) {
    for (const drive of opts.extraDrives) {
      await api.putDrive(
        opts.socketPath,
        drive.id,
        drive.path,
        drive.readOnly ?? false,
        false
      );
    }
  }

  await api.putNetworkInterface(
    opts.socketPath,
    "eth0",
    opts.tapDevice,
    macFromOctet(opts.octet)
  );
  await api.putMachineConfig(opts.socketPath, vcpu, mem);
  await api.startInstance(opts.socketPath);
}

/**
 * Kill a Firecracker process and clean up its socket.
 */
export async function killFirecracker(
  proc: ChildProcess | null,
  socketPath: string,
  signal: NodeJS.Signals = "SIGTERM"
) {
  if (proc && !proc.killed) {
    proc.kill(signal);
    await new Promise<void>((resolve) => {
      const timer = setTimeout(() => {
        if (proc && !proc.killed) {
          proc.kill("SIGKILL");
        }
        resolve();
      }, 2_000);
      proc.on("exit", () => {
        clearTimeout(timer);
        resolve();
      });
    });
  }

  try {
    unlinkSync(socketPath);
  } catch {}
}
@@ -1,5 +1,5 @@
import { execFileSync } from "node:child_process";
import { openSync, closeSync, readFileSync, writeFileSync } from "node:fs";
import { readFileSync, writeFileSync, renameSync } from "node:fs";
import { CONFIG } from "./config.js";

function run(cmd: string, args: string[]) {
@@ -195,39 +195,35 @@ function readPool(): IpPool {
  }
}

function writePool(pool: IpPool) {
  writeFileSync(CONFIG.ipPoolFile, JSON.stringify(pool));
function atomicWritePool(pool: IpPool) {
  const tmp = CONFIG.ipPoolFile + ".tmp";
  writeFileSync(tmp, JSON.stringify(pool));
  renameSync(tmp, CONFIG.ipPoolFile);
}

export function allocateIp(): { ip: string; octet: number } {
  const fd = openSync(CONFIG.ipPoolLock, "w");
  try {
    // Simple flock via child process
    const pool = readPool();
    for (
      let octet = CONFIG.bridge.minHost;
      octet <= CONFIG.bridge.maxHost;
      octet++
    ) {
      if (!pool.allocated.includes(octet)) {
        pool.allocated.push(octet);
        writePool(pool);
        return { ip: `${CONFIG.bridge.prefix}.${octet}`, octet };
      }
  // Use flock for proper mutual exclusion
  const result = execFileSync("bash", ["-c",
    `flock "${CONFIG.ipPoolLock}" cat "${CONFIG.ipPoolFile}" 2>/dev/null || echo '{"allocated":[]}'`
  ], { encoding: "utf-8" });
  const pool: IpPool = JSON.parse(result.trim());

  for (
    let octet = CONFIG.bridge.minHost;
    octet <= CONFIG.bridge.maxHost;
    octet++
  ) {
    if (!pool.allocated.includes(octet)) {
      pool.allocated.push(octet);
      atomicWritePool(pool);
      return { ip: `${CONFIG.bridge.prefix}.${octet}`, octet };
    }
    throw new Error("No free IPs in pool");
  } finally {
    closeSync(fd);
  }
  throw new Error("No free IPs in pool");
}

export function releaseIp(octet: number) {
  const fd = openSync(CONFIG.ipPoolLock, "w");
  try {
    const pool = readPool();
    pool.allocated = pool.allocated.filter((o) => o !== octet);
    writePool(pool);
  } finally {
    closeSync(fd);
  }
  const pool = readPool();
  pool.allocated = pool.allocated.filter((o) => o !== octet);
  atomicWritePool(pool);
}
175
src/overseer.ts
@@ -7,8 +7,18 @@ import {
  listTemplates,
  reconcileAgents,
  reloadAgent,
  loadTemplate,
  type AgentInfo,
  type AgentTemplate,
} from "./agent-manager.js";
import { CONFIG } from "./config.js";

const SSH_OPTS = [
  "-o", "StrictHostKeyChecking=no",
  "-o", "UserKnownHostsFile=/dev/null",
  "-o", "ConnectTimeout=3",
  "-i", CONFIG.sshKeyPath,
];

interface OverseerConfig {
  server: string;
@@ -29,6 +39,37 @@ function formatAgentList(agents: AgentInfo[]): string[] {
  );
}

function fieldMatches(field: string, value: number): boolean {
  if (field === "*") return true;
  return field.split(",").some((part) => {
    if (part.includes("/")) {
      const [range, stepStr] = part.split("/");
      const step = parseInt(stepStr);
      if (range === "*") return value % step === 0;
      const [min, max] = range.split("-").map(Number);
      return value >= min && value <= max && (value - min) % step === 0;
    }
    if (part.includes("-")) {
      const [min, max] = part.split("-").map(Number);
      return value >= min && value <= max;
    }
    return parseInt(part) === value;
  });
}

function matchesCron(expr: string, date: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) return false;
  const checks: [string, number][] = [
    [fields[0], date.getMinutes()],
    [fields[1], date.getHours()],
    [fields[2], date.getDate()],
    [fields[3], date.getMonth() + 1],
    [fields[4], date.getDay()],
  ];
  return checks.every(([field, value]) => fieldMatches(field, value));
}
|
||||
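Lifted out of the hunk above, the matcher can be exercised standalone; for example `*/15 * * * *` fires at minutes 0, 15, 30 and 45:

```typescript
// fieldMatches / matchesCron copied from the diff, plus a few example calls.
function fieldMatches(field: string, value: number): boolean {
  if (field === "*") return true;
  return field.split(",").some((part) => {
    if (part.includes("/")) {
      const [range, stepStr] = part.split("/");
      const step = parseInt(stepStr);
      if (range === "*") return value % step === 0;
      const [min, max] = range.split("-").map(Number);
      return value >= min && value <= max && (value - min) % step === 0;
    }
    if (part.includes("-")) {
      const [min, max] = part.split("-").map(Number);
      return value >= min && value <= max;
    }
    return parseInt(part) === value;
  });
}

function matchesCron(expr: string, date: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) return false;
  const checks: [string, number][] = [
    [fields[0], date.getMinutes()],
    [fields[1], date.getHours()],
    [fields[2], date.getDate()],
    [fields[3], date.getMonth() + 1],
    [fields[4], date.getDay()],
  ];
  return checks.every(([field, value]) => fieldMatches(field, value));
}

// 12:30 on any day: the "*/15" step matches minute 30.
console.log(matchesCron("*/15 * * * *", new Date(2024, 0, 1, 12, 30))); // true
// 09:00 on a Monday (2024-01-01): weekday range 1-5 matches.
console.log(matchesCron("0 9 * * 1-5", new Date(2024, 0, 1, 9, 0))); // true
// 09:01 misses the minute field.
console.log(matchesCron("0 9 * * 1-5", new Date(2024, 0, 1, 9, 1))); // false
```

Since the scheduler below only evaluates this once per 60-second tick, an expression can trigger at most once per matching minute.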
export async function runOverseer(config: OverseerConfig) {
  // Reconcile agent state on startup
  log("Reconciling agent state...");
@@ -195,8 +236,91 @@ export async function runOverseer(config: OverseerConfig) {
        break;
      }

      case "!logs": {
        const name = parts[1];
        const n = parseInt(parts[2] || "10");
        if (!name) {
          bot.say(event.target, "Usage: !logs <name> [lines]");
          return;
        }
        const agents = listAgents();
        const agent = agents.find((a) => a.name === name);
        if (!agent) {
          bot.say(event.target, `Agent "${name}" not found.`);
          return;
        }
        try {
          const { execFileSync } = await import("node:child_process");
          const logs = execFileSync("ssh", [
            ...SSH_OPTS,
            `root@${agent.ip}`,
            `tail -n ${n} /workspace/agent.log 2>/dev/null || echo '[no logs yet]'`,
          ], { encoding: "utf-8", timeout: 5_000 }).trim();
          for (const line of logs.split("\n")) {
            bot.say(event.target, line);
          }
        } catch {
          bot.say(event.target, `Could not read logs for "${name}".`);
        }
        break;
      }

      case "!persona": {
        const name = parts[1];
        if (!name) {
          bot.say(event.target, "Usage: !persona <name> [new persona text]");
          return;
        }
        const newPersona = parts.slice(2).join(" ");
        if (newPersona) {
          await reloadAgent(name, { persona: newPersona });
          bot.say(event.target, `Agent "${name}" persona updated.`);
        } else {
          // View current persona — read from agent config via SSH
          const agents = listAgents();
          const agent = agents.find((a) => a.name === name);
          if (!agent) {
            bot.say(event.target, `Agent "${name}" not found.`);
            return;
          }
          try {
            const { execFileSync } = await import("node:child_process");
            const persona = execFileSync("ssh", [
              ...SSH_OPTS,
              `root@${agent.ip}`,
              "cat /etc/agent/persona.md",
            ], { encoding: "utf-8", timeout: 5_000 }).trim();
            bot.say(event.target, `${name}: ${persona}`);
          } catch {
            bot.say(event.target, `Could not read persona for "${name}".`);
          }
        }
        break;
      }

      case "!version": {
        try {
          const { readFileSync } = await import("node:fs");
          const { execFileSync } = await import("node:child_process");
          const { join, dirname } = await import("node:path");
          const { fileURLToPath } = await import("node:url");
          const pkgDir = join(dirname(fileURLToPath(import.meta.url)), "..");
          const pkg = JSON.parse(readFileSync(join(pkgDir, "package.json"), "utf-8"));
          let gitHash = "";
          try {
            gitHash = execFileSync("git", ["rev-parse", "--short", "HEAD"], {
              encoding: "utf-8", cwd: pkgDir, timeout: 3_000,
            }).trim();
          } catch {}
          bot.say(event.target, `fireclaw v${pkg.version}${gitHash ? ` (${gitHash})` : ""}`);
        } catch {
          bot.say(event.target, "fireclaw (version unknown)");
        }
        break;
      }

      case "!help": {
        bot.say(event.target, "Commands: !invoke <template> [name] | !destroy <name> | !list | !model <name> <model> | !models | !templates | !status | !help");
        bot.say(event.target, "Commands: !invoke !destroy !list !model !models !templates !persona !logs !status !version !help");
        break;
      }
    }
@@ -247,5 +371,54 @@ export async function runOverseer(config: OverseerConfig) {

  setInterval(healthCheck, HEALTH_CHECK_INTERVAL);

  // Cron agent scheduler — check every 60s
  const CRON_CHECK_INTERVAL = 60_000;
  const cronCheck = async () => {
    const templates = listTemplates();
    for (const tmplName of templates) {
      try {
        const template = loadTemplate(tmplName);
        if (!template.schedule) continue;

        const now = new Date();
        if (!matchesCron(template.schedule, now)) continue;

        const cronName = `${template.name}-cron`;

        // Skip if already running
        const running = listAgents();
        if (running.some((a) => a.name === cronName)) continue;

        log(`Cron trigger: spawning "${cronName}" from template "${tmplName}"`);
        bot.say(config.channel, `Cron: spawning "${cronName}" from template "${tmplName}"`);

        const info = await startAgent(tmplName, { name: cronName });
        knownAgents.add(cronName);

        // Schedule auto-destroy
        const timeout = (template.schedule_timeout ?? 300) * 1000;
        setTimeout(async () => {
          try {
            const current = listAgents();
            if (current.some((a) => a.name === cronName)) {
              log(`Cron timeout: destroying "${cronName}" after ${timeout / 1000}s`);
              bot.say(config.channel, `Cron: destroying "${cronName}" (timeout ${timeout / 1000}s)`);
              await stopAgent(cronName);
              knownAgents.delete(cronName);
            }
          } catch (err) {
            const msg = err instanceof Error ? err.message : String(err);
            log(`Error destroying cron agent "${cronName}": ${msg}`);
          }
        }, timeout);
      } catch (err) {
        const msg = err instanceof Error ? err.message : String(err);
        log(`Cron error for template "${tmplName}": ${msg}`);
      }
    }
  };

  setInterval(cronCheck, CRON_CHECK_INTERVAL);

  log("Overseer started. Waiting for commands...");
}
@@ -1,45 +1,22 @@
import { spawn, type ChildProcess } from "node:child_process";
import { existsSync, mkdirSync } from "node:fs";
import { type ChildProcess } from "node:child_process";
import { existsSync, mkdirSync, copyFileSync } from "node:fs";
import { join } from "node:path";
import { CONFIG } from "./config.js";
import * as api from "./firecracker-api.js";
import {
  ensureBridge,
  ensureNat,
  createTap,
  deleteTap,
  macFromOctet,
} from "./network.js";
import {
  ensureBaseImage,
  ensureSshKeypair,
  injectSshKey,
} from "./rootfs.js";
import { deleteTap } from "./network.js";
import { ensureBaseImage, ensureSshKeypair, injectSshKey } from "./rootfs.js";
import { waitForSsh } from "./ssh.js";
import { copyFileSync } from "node:fs";
import {
  setupNetwork,
  spawnFirecracker,
  bootVM,
  killFirecracker,
} from "./firecracker-vm.js";

function log(msg: string) {
  process.stderr.write(`[snapshot] ${msg}\n`);
}

function waitForSocket(socketPath: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + 5_000;
    const check = () => {
      if (existsSync(socketPath)) {
        setTimeout(resolve, 200);
        return;
      }
      if (Date.now() > deadline) {
        reject(new Error("Firecracker socket did not appear"));
        return;
      }
      setTimeout(check, 50);
    };
    check();
  });
}

export function snapshotExists(): boolean {
  return (
    existsSync(CONFIG.snapshot.statePath) &&
@@ -61,47 +38,21 @@ export async function createSnapshot() {
  injectSshKey(snap.rootfsPath);

  log("Setting up network...");
  ensureBridge();
  ensureNat();
  deleteTap(snap.tapDevice); // clean stale tap from previous run
  createTap(snap.tapDevice);
  setupNetwork(snap.tapDevice);

  let proc: ChildProcess | null = null;

  try {
    log("Booting VM for snapshot...");
    proc = spawn(CONFIG.firecrackerBin, ["--api-sock", socketPath], {
      stdio: "pipe",
      detached: false,
    proc = await spawnFirecracker(socketPath);
    await bootVM({
      socketPath,
      rootfsPath: snap.rootfsPath,
      tapDevice: snap.tapDevice,
      ip: snap.ip,
      octet: snap.octet,
    });

    await waitForSocket(socketPath);

    const bootArgs = [
      "console=ttyS0",
      "reboot=k",
      "panic=1",
      "pci=off",
      "root=/dev/vda",
      "rw",
      `ip=${snap.ip}::${CONFIG.bridge.gateway}:${CONFIG.bridge.netmask}::eth0:off`,
    ].join(" ");

    await api.putBootSource(socketPath, CONFIG.kernelPath, bootArgs);
    await api.putDrive(socketPath, "rootfs", snap.rootfsPath);
    await api.putNetworkInterface(
      socketPath,
      "eth0",
      snap.tapDevice,
      macFromOctet(snap.octet)
    );
    await api.putMachineConfig(
      socketPath,
      CONFIG.vm.vcpuCount,
      CONFIG.vm.memSizeMib
    );
    await api.startInstance(socketPath);

    log("Waiting for SSH...");
    await waitForSsh(snap.ip);

@@ -116,13 +67,7 @@ export async function createSnapshot() {
    log(`  Memory: ${snap.memPath}`);
    log(`  Rootfs: ${snap.rootfsPath}`);
  } finally {
    if (proc && !proc.killed) {
      proc.kill("SIGKILL");
    }
    try {
      const { unlinkSync } = await import("node:fs");
      unlinkSync(socketPath);
    } catch {}
    await killFirecracker(proc, socketPath, "SIGKILL");
    deleteTap(snap.tapDevice);
  }
}
157
src/vm.ts
@@ -1,19 +1,11 @@
import { spawn, type ChildProcess } from "node:child_process";
import { existsSync, mkdirSync } from "node:fs";
import { type ChildProcess } from "node:child_process";
import { mkdirSync } from "node:fs";
import { join } from "node:path";
import { randomBytes } from "node:crypto";
import { CONFIG } from "./config.js";
import type { VMConfig, RunResult, RunOptions } from "./types.js";
import * as api from "./firecracker-api.js";
import {
  ensureBridge,
  ensureNat,
  allocateIp,
  releaseIp,
  createTap,
  deleteTap,
  macFromOctet,
} from "./network.js";
import { allocateIp, releaseIp, deleteTap } from "./network.js";
import {
  ensureBaseImage,
  ensureSshKeypair,
@@ -24,6 +16,12 @@ import {
import { waitForSsh, execCommand } from "./ssh.js";
import { registerVm, unregisterVm } from "./cleanup.js";
import { snapshotExists } from "./snapshot.js";
import {
  setupNetwork,
  spawnFirecracker,
  bootVM,
  killFirecracker,
} from "./firecracker-vm.js";

function log(verbose: boolean, msg: string) {
  if (verbose) process.stderr.write(`[fireclaw] ${msg}\n`);
@@ -42,7 +40,6 @@ export class VMInstance {
    command: string,
    opts: RunOptions = {}
  ): Promise<RunResult> {
    // Try snapshot path first unless disabled
    if (!opts.noSnapshot && snapshotExists()) {
      return VMInstance.runFromSnapshot(command, opts);
    }
@@ -65,33 +62,20 @@ export class VMInstance {
      guestIp: snap.ip,
      tapDevice: snap.tapDevice,
      socketPath: join(CONFIG.socketDir, `${id}.sock`),
      rootfsPath: "", // shared, not per-run
      rootfsPath: "",
      timeoutMs,
      verbose,
    };

    const vm = new VMInstance(config);
    vm.octet = 0; // no IP pool allocation for snapshot runs
    vm.octet = 0;
    registerVm(vm);

    try {
      log(verbose, `VM ${id}: restoring from snapshot...`);
      ensureBridge();
      ensureNat();
      deleteTap(snap.tapDevice); // clean stale tap from previous run
      createTap(snap.tapDevice);
      setupNetwork(snap.tapDevice);

      // Spawn firecracker and load snapshot
      vm.process = spawn(
        CONFIG.firecrackerBin,
        ["--api-sock", config.socketPath],
        { stdio: "pipe", detached: false }
      );
      vm.process.on("error", (err) => {
        log(verbose, `Firecracker process error: ${err.message}`);
      });

      await vm.waitForSocket();
      vm.process = await spawnFirecracker(config.socketPath);
      await api.putSnapshotLoad(
        config.socketPath,
        snap.statePath,
@@ -124,16 +108,12 @@ export class VMInstance {
    const verbose = opts.verbose ?? false;
    const timeoutMs = opts.timeout ?? CONFIG.vm.defaultTimeoutMs;

    // Pre-flight checks
    ensureBaseImage();
    ensureSshKeypair();

    // Allocate resources
    const { ip, octet } = allocateIp();
    const tapDevice = `fctap${octet}`;

    mkdirSync(CONFIG.socketDir, { recursive: true });

    const config: VMConfig = {
      id,
      guestIp: ip,
@@ -154,13 +134,19 @@ export class VMInstance {
    injectSshKey(config.rootfsPath);

    log(verbose, `VM ${id}: creating tap ${tapDevice}...`);
    ensureBridge();
    ensureNat();
    deleteTap(tapDevice); // clean stale tap from previous run
    createTap(tapDevice);
    setupNetwork(tapDevice);

    log(verbose, `VM ${id}: booting...`);
    await vm.boot(opts);
    vm.process = await spawnFirecracker(config.socketPath);
    await bootVM({
      socketPath: config.socketPath,
      rootfsPath: config.rootfsPath,
      tapDevice,
      ip,
      octet,
      vcpu: opts.vcpu,
      mem: opts.mem,
    });

    log(verbose, `VM ${id}: waiting for SSH at ${ip}...`);
    await waitForSsh(ip);
@@ -179,110 +165,17 @@ export class VMInstance {
    }
  }

  private async boot(opts: RunOptions) {
    const { config } = this;
    const vcpu = opts.vcpu ?? CONFIG.vm.vcpuCount;
    const mem = opts.mem ?? CONFIG.vm.memSizeMib;

    // Spawn firecracker
    this.process = spawn(
      CONFIG.firecrackerBin,
      ["--api-sock", config.socketPath],
      {
        stdio: "pipe",
        detached: false,
      }
    );

    this.process.on("error", (err) => {
      log(config.verbose, `Firecracker process error: ${err.message}`);
    });

    // Wait for socket
    await this.waitForSocket();

    // Configure via API
    const bootArgs = [
      "console=ttyS0",
      "reboot=k",
      "panic=1",
      "pci=off",
      "root=/dev/vda",
      "rw",
      `ip=${config.guestIp}::${CONFIG.bridge.gateway}:${CONFIG.bridge.netmask}::eth0:off`,
    ].join(" ");

    await api.putBootSource(config.socketPath, CONFIG.kernelPath, bootArgs);
    await api.putDrive(config.socketPath, "rootfs", config.rootfsPath);
    await api.putNetworkInterface(
      config.socketPath,
      "eth0",
      config.tapDevice,
      macFromOctet(this.octet)
    );
    await api.putMachineConfig(config.socketPath, vcpu, mem);
    await api.startInstance(config.socketPath);
  }

  private waitForSocket(): Promise<void> {
    const socketPath = this.config.socketPath;
    return new Promise((resolve, reject) => {
      const deadline = Date.now() + 5_000;

      const check = () => {
        if (existsSync(socketPath)) {
          setTimeout(resolve, 200);
          return;
        }
        if (Date.now() > deadline) {
          reject(new Error("Firecracker socket did not appear"));
          return;
        }
        setTimeout(check, 50);
      };

      check();
    });
  }

  async destroy() {
    const { config } = this;
    log(config.verbose, `VM ${config.id}: cleaning up...`);

    // Kill firecracker
    if (this.process && !this.process.killed) {
      this.process.kill("SIGTERM");
      await new Promise<void>((resolve) => {
        const timer = setTimeout(() => {
          if (this.process && !this.process.killed) {
            this.process.kill("SIGKILL");
          }
          resolve();
        }, 2_000);
        this.process!.on("exit", () => {
          clearTimeout(timer);
          resolve();
        });
      });
    }

    // Clean up socket
    try {
      const { unlinkSync } = await import("node:fs");
      unlinkSync(config.socketPath);
    } catch {
      // Already gone
    }

    // Clean up tap device
    await killFirecracker(this.process, config.socketPath);
    deleteTap(config.tapDevice);

    // Release IP (skip for snapshot runs which don't allocate from pool)
    if (this.octet > 0) {
      releaseIp(this.octet);
    }

    // Delete rootfs copy (skip for snapshot runs which share rootfs)
    if (config.rootfsPath) {
      deleteRunCopy(config.rootfsPath);
    }
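The destroy() path above folds the old SIGTERM-then-SIGKILL-after-2s dance into killFirecracker. A minimal sketch of that pattern (hypothetical helper; the actual firecracker-vm.ts implementation is not shown in this diff):

```typescript
import { type ChildProcess } from "node:child_process";
import { unlinkSync } from "node:fs";

// Sketch of a graceful-kill helper (hypothetical; not the repo's
// killFirecracker). Send the requested signal, escalate to SIGKILL
// after 2s if the process has not exited, then remove the stale socket.
async function killChild(
  proc: ChildProcess | null,
  socketPath: string,
  signal: "SIGTERM" | "SIGKILL" = "SIGTERM"
): Promise<void> {
  if (proc && !proc.killed) {
    proc.kill(signal);
    await new Promise<void>((resolve) => {
      const timer = setTimeout(() => {
        if (proc && !proc.killed) proc.kill("SIGKILL");
        resolve();
      }, 2_000);
      proc.on("exit", () => {
        clearTimeout(timer);
        resolve();
      });
    });
  }
  try {
    unlinkSync(socketPath); // socket may already be gone
  } catch {}
}
```

Centralizing this means the snapshot path (which wants an immediate SIGKILL) and the normal destroy path share one shutdown routine instead of two hand-rolled copies.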