Dangerous command approval: run_command skill now checks commands
against 9 regex patterns (rm -rf /, dd, mkfs, fork bombs, shutdown,
device writes, etc.) and blocks execution with a clear message.
Defense-in-depth layer on top of VM isolation.
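
A minimal sketch of this kind of pattern check, written in Python on the assumption that the skill's run.py does the matching. The six regexes and the names `DANGEROUS_PATTERNS` and `check_command` are illustrative stand-ins, not the nine patterns actually shipped:

```python
import re
from typing import Optional

# Illustrative patterns only; the real skill checks 9 patterns that are not
# listed in this message.
DANGEROUS_PATTERNS = [
    (re.compile(r"\brm\s+-\w*r\w*\s+/(?:\s|\*|$)"), "recursive delete of /"),
    (re.compile(r"\bdd\b.*\bof=/dev/"), "dd writing to a device"),
    (re.compile(r"\bmkfs(\.\w+)?\b"), "filesystem creation"),
    (re.compile(r":\(\)\s*\{.*\|.*&\s*\}\s*;\s*:"), "fork bomb"),
    (re.compile(r"\b(shutdown|reboot|halt|poweroff)\b"), "shutdown or reboot"),
    (re.compile(r">\s*/dev/(sd|nvme|vd)\w*"), "redirect to a block device"),
]


def check_command(command: str) -> Optional[str]:
    """Return a block message if the command looks dangerous, else None."""
    for pattern, reason in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return f"Blocked: command matches dangerous pattern ({reason})."
    return None


# Usage: call before executing; a non-None result means "do not run".
message = check_command("rm -rf /")
```

Regex matching of this sort is easy to evade (quoting, variables, aliases), which is why it is only a second layer behind VM isolation.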
Cron agents: templates support schedule (5-field cron) and
schedule_timeout (seconds, default 300) fields. Overseer checks
every 60s, spawns {name}-cron agents on match, auto-destroys after
timeout. Inline cron parser supports *, ranges, lists, and steps.
No npm dependencies added.
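
A rough Python sketch of a dependency-free 5-field matcher supporting *, ranges, lists, and steps. The overseer's own parser is not shown here and may differ; `parse_field` and `cron_matches` are hypothetical names, and this version ANDs day-of-month with day-of-week and accepts only 0-6 for weekdays, which is simpler than classic cron:

```python
from datetime import datetime
from typing import Set

# Allowed (min, max) per field: minute, hour, day of month, month, day of week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def parse_field(field: str, lo: int, hi: int) -> Set[int]:
    """Expand one cron field into the set of values it matches."""
    values: Set[int] = set()
    for part in field.split(","):          # lists: "1,5,10-12"
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1   # steps: "*/15"
        if step < 1:
            raise ValueError(f"invalid step in {part!r}")
        if base == "*":
            start, end = lo, hi
        elif "-" in base:                  # ranges: "9-17"
            a, b = base.split("-", 1)
            start, end = int(a), int(b)
        else:
            start = int(base)
            # Assumption: "5/15" means "from 5 up to the field maximum".
            end = hi if step_text else start
        if not (lo <= start <= end <= hi):
            raise ValueError(f"value out of range in {part!r}")
        values.update(range(start, end + 1, step))
    return values


def cron_matches(expr: str, when: datetime) -> bool:
    """True if the 5-field expression matches the given minute."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields")
    minute, hour, dom, month, dow = (
        parse_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    cron_dow = (when.weekday() + 1) % 7    # Python: Monday=0; cron: Sunday=0
    return (
        when.minute in minute
        and when.hour in hour
        and when.day in dom
        and when.month in month
        and cron_dow in dow
    )
```

A 60-second check loop would call `cron_matches(schedule, datetime.now())` once per tick and spawn the agent on a True result.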
TODO
Done
- Firecracker CLI runner with snapshots (~1.1s)
- Multi-agent system — overseer + agent VMs + IRC + Ollama
- 5 templates, 5+ models, hot-reload, non-root agents
- Tools: run_command, web_search, fetch_url, save_memory
- Discoverable skill system — SKILL.md + run.py, auto-loaded
- Persistent workspace + memory (MEMORY.md pattern)
- Overseer: !invoke, !destroy, !list, !model, !models, !templates, !persona, !status, !help
- Health checks, crash recovery, graceful shutdown, systemd
- Network policies, thread safety, trigger fix, race condition fix
- Install/uninstall scripts, deployed on 2 machines
- Refactor: firecracker-vm.ts shared helpers, skill extraction
- Large output handling — save >2K results to file, preview + read_file skill
- Session persistence — SQLite + FTS5, conversation history survives restarts
- !logs — tail agent history from workspace
- Context compression — cached summaries, configurable threshold/keep
- write_file skill — agents can create and modify workspace files
- Structured system prompt — explicit tool descriptions, multi-agent awareness
- Per-template config — temperature, num_predict, context_size, compress settings
- Response quality — 500-char deque storage, 1024 default output tokens, 250-token summaries
- update.sh script — one-command rootfs patching and snapshot rebuild
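
The large output handling item above could be sketched roughly as follows. This assumes ">2K" means more than 2048 characters; the `outputs` directory, the file naming, and the function name `handle_output` are hypothetical, not taken from the project:

```python
import tempfile
import uuid
from pathlib import Path

PREVIEW_LIMIT = 2048  # assumed reading of ">2K"; the project may count bytes


def handle_output(result: str, workspace: Path, limit: int = PREVIEW_LIMIT) -> str:
    """Return small results as-is; save large ones and return a preview."""
    if len(result) <= limit:
        return result
    out_dir = workspace / "outputs"        # hypothetical location
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"result-{uuid.uuid4().hex[:8]}.txt"
    path.write_text(result, encoding="utf-8")
    preview = result[:limit]
    return (
        f"{preview}\n"
        f"[output truncated: {len(result)} chars total, "
        f"full result saved to {path}; use read_file to view it]"
    )


if __name__ == "__main__":
    # Demo against a throwaway directory standing in for the workspace.
    with tempfile.TemporaryDirectory() as tmp:
        print(handle_output("x" * 5000, Path(tmp))[-120:])
```

Keeping only a preview in the conversation stops one large tool result from crowding out the rest of the model's context.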
Next up (Phase 5 — by priority)
Medium effort
- Skill registry git repo — shared skills on Gitea, fireclaw skills pull
Bigger items
- Skill learning — agents create new skills from experience
- Cron agents (done) — scheduled agent spawns (5-field cron in templates, auto-destroy timeout)
- Dangerous command approval (done) — pattern detection blocks rm -rf /, dd, mkfs, fork bombs, etc.
- Parallel tool execution — concurrent independent tool calls
Polish
- Cost tracking per interaction
- Execution recording / audit trail
- Update regression tests for skill system + channel layout
Low priority (from REPORT.md)
- Hardcoded network interface fallback — src/network.ts:56 defaults to "eno2" if route parsing fails
- Predictable mount point names — src/agent-manager.ts:94 uses Date.now() instead of crypto random
- No Firecracker binary hash verification — scripts/install.sh downloads without SHA256 check
- Ollama response size unbounded — agent/tools.py should limit resp.read() size
- Process termination inconsistent — two patterns (ChildProcess vs PID polling), works but could consolidate
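
For the unbounded Ollama response item, one possible shape of the fix is a capped read loop. This is a sketch only: the 8 MiB cap and the helper name `read_limited` are assumptions, and `resp` is taken to be any object with a file-like `read(n)` method, such as the response returned by `urllib.request.urlopen`:

```python
import io

MAX_RESPONSE_BYTES = 8 * 1024 * 1024  # hypothetical cap; tune per deployment


def read_limited(resp, max_bytes: int = MAX_RESPONSE_BYTES,
                 chunk_size: int = 65536) -> bytes:
    """Read the full body, raising ValueError once it exceeds max_bytes."""
    chunks = []
    total = 0
    while True:
        # Never request more than one byte past the cap, so an oversized
        # body is detected without buffering much beyond the limit.
        chunk = resp.read(min(chunk_size, max_bytes + 1 - total))
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ValueError(f"response exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


# Usage with an in-memory stream standing in for the HTTP response:
body = read_limited(io.BytesIO(b'{"response": "ok"}'), max_bytes=1024)
```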