fireclaw/TODO.md
ansible 50b8c7464b v0.1.2 — multi-agent system with hardening
Bump version to 0.1.2. Update description and polish TODO.

Since v0.1.0:
- Multi-agent IRC orchestration (overseer + agent VMs)
- 5 agent templates, 5 Ollama models
- Tool access (shell + podman containers)
- Persistent workspace + memory system
- Agent hot-reload, non-root agents
- Thread safety, health checks, network policies
- Trigger matching fix, race condition fix
- Graceful shutdown, crash recovery, systemd service
- DM support, /invite, agent-to-agent communication

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 14:31:27 +00:00

TODO

Done

  • Firecracker CLI runner with snapshots (~1.1s)
  • Alpine rootfs with ca-certificates, podman, python3
  • Global fireclaw command
  • Multi-agent system — overseer + agent VMs + IRC + Ollama
  • 5 agent templates (worker, coder, researcher, quick, creative)
  • 5 Ollama models (qwen2.5-coder, qwen2.5, llama3.1, gemma3, phi4-mini)
  • Agent tool access — shell commands + podman containers
  • Persistent workspace + memory system (MEMORY.md pattern)
  • Agent hot-reload — model/persona swap via SSH + SIGHUP
  • Non-root agents — unprivileged agent user
  • Agent-to-agent via IRC mentions (10s cooldown)
  • DM support — private messages, no mention needed
  • /invite support — agents auto-join invited channels
  • Channel layout — #control (commands), #agents (common), DMs
  • Overseer resilience — crash recovery, agent adoption
  • Graceful shutdown — IRC QUIT before VM kill
  • Systemd service (KillMode=process)
  • Regression test suite (20 tests)
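
The 10s cooldown on agent-to-agent mentions could be enforced with a timestamp check before replying. The following is a minimal sketch, not the project's actual code: the `MentionCooldown` class, its `allow` method, and the per-sender scope of the cooldown are all assumptions made for illustration.

```python
import time


class MentionCooldown:
    """Hypothetical helper: limits how often an agent replies to each sender."""

    def __init__(self, seconds=10.0, clock=time.monotonic):
        self.seconds = seconds
        self.clock = clock            # injectable so the logic can be tested
        self._last_reply = {}         # sender nick -> time of last accepted mention

    def allow(self, sender):
        """Return True if a reply to `sender` is allowed now, and record it."""
        now = self.clock()
        last = self._last_reply.get(sender)
        if last is not None and now - last < self.seconds:
            return False              # still cooling down, ignore this mention
        self._last_reply[sender] = now
        return True
```

Using a monotonic clock avoids surprises when the VM's wall clock is adjusted, and tracking the cooldown per sender keeps two chatty agents from looping without silencing everyone else.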

Next up

  • Network policies per agent — restrict internet access
  • Warm pool — pre-booted VMs for instant agent spawns
  • Persistent agent memory improvements — richer memory structure, auto-save from conversations
  • Thin provisioning — device-mapper snapshots instead of full rootfs copies
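
One possible shape for the warm pool item is a small queue of pre-booted VMs that the overseer draws from, falling back to a normal boot when the queue is empty. This is only a sketch under stated assumptions: `WarmPool`, `boot_vm`, and `target_size` are hypothetical names, and `boot_vm` stands in for whatever the overseer really does to start a VM.

```python
from collections import deque


class WarmPool:
    """Hypothetical pool of pre-booted VMs; boot_vm is a caller-supplied callable."""

    def __init__(self, boot_vm, target_size=2):
        self._boot_vm = boot_vm
        self._target_size = target_size
        self._ready = deque()
        self.refill()

    def refill(self):
        """Boot VMs until the pool is back at its target size."""
        while len(self._ready) < self._target_size:
            self._ready.append(self._boot_vm())

    def acquire(self):
        """Return a pre-booted VM if one is ready, else boot one on demand."""
        if self._ready:
            return self._ready.popleft()
        return self._boot_vm()

    def ready_count(self):
        """Number of VMs currently waiting in the pool."""
        return len(self._ready)
```

In this sketch `refill()` is meant to be called off the spawn path (for example from a background task) so that `acquire()` stays fast. The class is not thread-safe as written and would need a lock if spawns can happen concurrently.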

Polish

  • Agent-to-agent response quality — small models (7B) parrot incoming messages instead of answering them. Needs better prompting ("don't repeat the question, answer it") or larger models (14B+). Routing these replies through the Claude API is another option.
  • Cost tracking per agent interaction
  • Execution recording / audit trail
  • Agent health checks — overseer pings agents, restarts dead ones
  • Thread safety in agent.py — lock around IRC socket writes
  • Update regression tests for new channel layout
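
The thread-safety item amounts to serializing every write to the shared IRC socket. A minimal sketch of that idea follows, assuming a hypothetical `LockedIRCWriter` wrapper; the actual structure of agent.py is not shown in this file and may differ.

```python
import threading


class LockedIRCWriter:
    """Hypothetical wrapper that serializes writes to one IRC socket."""

    def __init__(self, sock):
        self._sock = sock
        self._lock = threading.Lock()

    def send_line(self, line):
        """Send one IRC line; concurrent callers cannot interleave bytes."""
        # Drop embedded CR/LF so a single call always emits a single IRC line.
        clean = line.replace("\r", "").replace("\n", "")
        data = (clean + "\r\n").encode("utf-8")
        with self._lock:
            # sendall may loop over several send() calls, so hold the lock
            # for the whole write.
            self._sock.sendall(data)
```

Any thread that needs to talk to the server (message handler, keepalive, tool output) would call `send_line` instead of touching the socket directly.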