- Rewrite system prompt: structured sections, explicit tool descriptions with full SKILL.md descriptions, multi-agent awareness - Add write_file skill for creating/modifying workspace files - Per-template config passthrough: temperature, num_predict, context_size, compress settings, max_tool_rounds, max_response_lines - Bump defaults: 1024 output tokens (was 512), 500-char deque (was 200), 250-token summaries (was 150), compress threshold 16 (was 12), keep 8 (was 4) - Cache compression by content hash — no redundant summarization - Update all 5 templates with tuned settings per role
2.5 KiB
2.5 KiB
TODO
Done
- Firecracker CLI runner with snapshots (~1.1s)
- Multi-agent system — overseer + agent VMs + IRC + Ollama
- 5 templates, 5+ models, hot-reload, non-root agents
- Tools: run_command, web_search, fetch_url, save_memory
- Discoverable skill system — SKILL.md + run.py, auto-loaded
- Persistent workspace + memory (MEMORY.md pattern)
- Overseer: !invoke, !destroy, !list, !model, !models, !templates, !persona, !status, !help
- Health checks, crash recovery, graceful shutdown, systemd
- Network policies, thread safety, trigger fix, race condition fix
- Install/uninstall scripts, deployed on 2 machines
- Refactor: firecracker-vm.ts shared helpers, skill extraction
- Large output handling — save >2K results to file, preview + read_file skill
- Session persistence — SQLite + FTS5, conversation history survives restarts
- !logs — tail agent history from workspace
- Context compression — cached summaries, configurable threshold/keep
- write_file skill — agents can create and modify workspace files
- Structured system prompt — explicit tool descriptions, multi-agent awareness
- Per-template config — temperature, num_predict, context_size, compress settings
- Response quality — 500-char deque storage, 1024 default output tokens, 250-token summaries
- update.sh script — one-command rootfs patching and snapshot rebuild
Next up (Phase 5 — by priority)
Medium effort
- Skill registry git repo — shared skills on Gitea,
fireclaw skills pull
Bigger items
- Skill learning — agents create new skills from experience
- Cron agents — scheduled agent spawns
- Dangerous command approval — pattern detection + allowlist
- Parallel tool execution — concurrent independent tool calls
Polish
- Cost tracking per interaction
- Execution recording / audit trail
- Update regression tests for skill system + channel layout
Low priority (from REPORT.md)
- Hardcoded network interface fallback —
src/network.ts:56defaults to"eno2"if route parsing fails - Predictable mount point names —
src/agent-manager.ts:94usesDate.now()instead of crypto random - No Firecracker binary hash verification —
scripts/install.shdownloads without SHA256 check - Ollama response size unbounded —
agent/tools.pyshould limitresp.read()size - Process termination inconsistent — two patterns (ChildProcess vs PID polling), works but could consolidate