2.3 KiB
2.3 KiB
TODO
Done
- Firecracker CLI runner with snapshots (~1.1s)
- Multi-agent system — overseer + agent VMs + IRC + Ollama
- 5 templates, 5+ models, hot-reload, non-root agents
- Tools: run_command, web_search, fetch_url, save_memory
- Discoverable skill system — SKILL.md + run.py, auto-loaded
- Persistent workspace + memory (MEMORY.md pattern)
- Overseer: !invoke, !destroy, !list, !model, !models, !templates, !persona, !status, !help
- Health checks, crash recovery, graceful shutdown, systemd
- Network policies, thread safety, trigger fix, race condition fix
- Install/uninstall scripts, deployed on 2 machines
- Refactor: firecracker-vm.ts shared helpers, skill extraction
Next up (Phase 5 — by priority)
Quick wins
- Large output handling — save >2K results to file, preview + read_file
- Iteration budget — configurable max rounds per template, prevent runaway loops
Medium effort
- Skill registry git repo — shared skills on Gitea,
fireclaw skills pull - Session persistence — SQLite + FTS5 in workspace
- Context compression — summarize old turns when context gets long
- !logs — tail agent history from workspace
Bigger items
- Skill learning — agents create new skills from experience
- Cron agents — scheduled agent spawns
- Dangerous command approval — pattern detection + allowlist
- Parallel tool execution — concurrent independent tool calls
Polish
- Agent-to-agent response quality — 7B models parrot, needs better prompting or larger models
- Cost tracking per interaction
- Execution recording / audit trail
- Update regression tests for skill system + channel layout
Low priority (from REPORT.md)
- Hardcoded network interface fallback —
src/network.ts:56defaults to"eno2"if route parsing fails - Predictable mount point names —
src/agent-manager.ts:94usesDate.now()instead of crypto random - No Firecracker binary hash verification —
scripts/install.shdownloads without SHA256 check - Ollama response size unbounded —
agent/tools.pyshould limitresp.read()size - Process termination inconsistent — two patterns (ChildProcess vs PID polling), works but could consolidate