fireclaw/TODO.md
ansible c827d341ab Overhaul agent quality — prompts, tools, config, compression
- Rewrite system prompt: structured sections, explicit tool descriptions
  with full SKILL.md descriptions, multi-agent awareness
- Add write_file skill for creating/modifying workspace files
- Per-template config passthrough: temperature, num_predict, context_size,
  compress settings, max_tool_rounds, max_response_lines
- Bump defaults: 1024 output tokens (was 512), 500-char deque (was 200),
  250-token summaries (was 150), compress threshold 16 (was 12), keep 8 (was 4)
- Cache compression by content hash — no redundant summarization
- Update all 5 templates with tuned settings per role
2026-04-08 18:28:26 +00:00

# TODO
## Done
- [x] Firecracker CLI runner with snapshots (~1.1s)
- [x] Multi-agent system — overseer + agent VMs + IRC + Ollama
- [x] 5 templates, 5+ models, hot-reload, non-root agents
- [x] Tools: run_command, web_search, fetch_url, save_memory
- [x] Discoverable skill system — SKILL.md + run.py, auto-loaded
- [x] Persistent workspace + memory (MEMORY.md pattern)
- [x] Overseer: !invoke, !destroy, !list, !model, !models, !templates, !persona, !status, !help
- [x] Health checks, crash recovery, graceful shutdown, systemd
- [x] Network policies, thread safety, trigger fix, race condition fix
- [x] Install/uninstall scripts, deployed on 2 machines
- [x] Refactor: firecracker-vm.ts shared helpers, skill extraction
- [x] Large output handling — save >2K results to file, preview + read_file skill
- [x] Session persistence — SQLite + FTS5, conversation history survives restarts
- [x] !logs — tail agent history from workspace
- [x] Context compression — cached summaries, configurable threshold/keep
- [x] write_file skill — agents can create and modify workspace files
- [x] Structured system prompt — explicit tool descriptions, multi-agent awareness
- [x] Per-template config — temperature, num_predict, context_size, compress settings
- [x] Response quality — 500-char deque storage, 1024 default output tokens, 250-token summaries
- [x] update.sh script — one-command rootfs patching and snapshot rebuild
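The cached context compression listed above can be illustrated with a small sketch. It assumes that "threshold" and "keep" are message counts (16 and 8 are the defaults named in the latest commit) and that summaries are keyed by a SHA-256 hash of the text being compressed. `SummaryCache`, `compress`, and the `summarize` callback are hypothetical names for illustration, not fireclaw's actual code.

```python
import hashlib
from typing import Callable, Dict, List


class SummaryCache:
    """Memoize summaries by content hash (hypothetical helper, not fireclaw's API)."""

    def __init__(self, summarize: Callable[[str], str]) -> None:
        self._summarize = summarize      # e.g. a call into the model; assumed
        self._cache: Dict[str, str] = {}
        self.calls = 0                   # how many times the summarizer really ran

    def get(self, messages: List[str]) -> str:
        text = "\n".join(messages)
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            # Only summarize content we have not seen before.
            self.calls += 1
            self._cache[key] = self._summarize(text)
        return self._cache[key]


def compress(history: List[str], cache: SummaryCache,
             threshold: int = 16, keep: int = 8) -> List[str]:
    """Replace older messages with one summary once history exceeds threshold."""
    if len(history) <= threshold:
        return list(history)
    older, recent = history[:-keep], history[-keep:]
    return ["[summary] " + cache.get(older)] + recent
```

Compressing the same history twice calls the summarizer only once, because the second call hits the hash-keyed cache.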
## Next up (Phase 5 — by priority)
### Medium effort
- [ ] Skill registry git repo — shared skills on Gitea, `fireclaw skills pull`
### Bigger items
- [ ] Skill learning — agents create new skills from experience
- [ ] Cron agents — scheduled agent spawns
- [ ] Dangerous command approval — pattern detection + allowlist
- [ ] Parallel tool execution — concurrent independent tool calls
## Polish
- [ ] Cost tracking per interaction
- [ ] Execution recording / audit trail
- [ ] Update regression tests for skill system + channel layout
## Low priority (from REPORT.md)
- [ ] Hardcoded network interface fallback: `src/network.ts:56` defaults to `"eno2"` if route parsing fails
- [ ] Predictable mount point names: `src/agent-manager.ts:94` uses `Date.now()` instead of a cryptographically secure random value
- [ ] No Firecracker binary hash verification: `scripts/install.sh` downloads the binary without a SHA256 check
- [ ] Ollama response size unbounded: `agent/tools.py` should limit the size of `resp.read()`
- [ ] Process termination inconsistent: two patterns are in use (ChildProcess vs PID polling); this works, but they could be consolidated
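For the unbounded `resp.read()` item, one possible shape of the fix is sketched below. It assumes the response is a file-like object whose `read(n)` returns at most `n` bytes (as `urllib` responses do). `read_limited`, `ResponseTooLarge`, and the 1 MB cap are hypothetical, chosen only for illustration.

```python
import io
from typing import BinaryIO

MAX_RESPONSE_BYTES = 1_000_000  # hypothetical cap; tune per deployment


class ResponseTooLarge(Exception):
    """Raised when a response body exceeds the configured cap."""


def read_limited(resp: BinaryIO, max_bytes: int = MAX_RESPONSE_BYTES) -> bytes:
    """Read a response body in chunks, failing once it exceeds max_bytes."""
    chunks = []
    total = 0
    while True:
        # Ask for at most one byte past the cap so an oversized body is detected
        # without buffering all of it.
        chunk = resp.read(min(65536, max_bytes + 1 - total))
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ResponseTooLarge(f"response exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # Stand-in for a real HTTP response object.
    body = read_limited(io.BytesIO(b'{"response": "ok"}'), max_bytes=1024)
    print(len(body))
```

Raising instead of silently truncating keeps a partial JSON body from reaching the parser; truncation would be the alternative if partial output is acceptable.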