79 lines
4.2 KiB
Markdown
79 lines
4.2 KiB
Markdown
# Fireclaw Roadmap
|
|
|
|
## Phase 1: Core CLI (done)
|
|
|
|
- [x] Firecracker microVM lifecycle (boot, exec, destroy)
|
|
- [x] SSH-based command execution
|
|
- [x] Network isolation (tap + bridge + NAT)
|
|
- [x] IP pool management for concurrent VMs
|
|
- [x] Signal handling and cleanup
|
|
- [x] CLI interface (`fireclaw run`, `fireclaw setup`)
|
|
|
|
## Phase 2: Fast & Useful (done)
|
|
|
|
- [x] Alpine Linux rootfs (1 GiB sparse, 146 MiB on disk)
|
|
- [x] Precompiled binary, global `fireclaw` command
|
|
- [x] Snapshot & restore (~1.1s vs ~2.9s cold boot)
|
|
|
|
## Phase 3: Multi-Agent System (done)
|
|
|
|
- [x] ngircd configured (`nyx.fireclaw.local`, FireclawNet)
|
|
- [x] Channel layout: #control (overseer), #agents (common room), DMs, /invite
|
|
- [x] Ollama with 5+ models, hot-swappable per agent
|
|
- [x] Agent rootfs — Alpine + Python IRC bot + podman + tools
|
|
- [x] Agent manager — start/stop/list/reload long-running VMs
|
|
- [x] Overseer — !invoke, !destroy, !list, !model, !models, !templates, !persona, !status, !help
|
|
- [x] 5 agent templates — worker, coder, researcher, quick, creative
|
|
- [x] Discoverable skill system — SKILL.md + run.py per tool, auto-loaded at boot
|
|
- [x] Agent tools — run_command, web_search, fetch_url, save_memory
|
|
- [x] Persistent workspace + memory system (MEMORY.md pattern)
|
|
- [x] Agent hot-reload, non-root agents, agent-to-agent, DMs, /invite
|
|
- [x] Overseer resilience, health checks, graceful shutdown, systemd
|
|
|
|
## Phase 4: Hardening & Deployment (done)
|
|
|
|
- [x] Network policies, thread safety, trigger fix, race condition fix
|
|
- [x] Install/uninstall scripts, deployed on Debian + Ubuntu + GPU server
|
|
- [x] Refactor — shared firecracker-vm.ts, skill system extraction
|
|
|
|
### Remaining
|
|
- [ ] Warm pool — pre-booted VMs from snapshots
|
|
- [ ] Concurrent snapshot runs via network namespaces
|
|
- [ ] Thin provisioning — device-mapper snapshots
|
|
|
|
## Phase 5: Agent Intelligence
|
|
|
|
Priority order by gain/complexity ratio.
|
|
|
|
### High priority (high gain, low-medium complexity)
|
|
|
|
- [ ] **Large output handling** — tool results >2K chars saved to workspace file, agent gets preview + can read the rest. Prevents context explosion. Simple, high impact.
|
|
- [ ] **Iteration budget** — shared token/round budget across tool calls. Prevents runaway loops, especially with GPU server running faster models that chain more aggressively. Add per-template configurable limits.
|
|
- [ ] **Skill registry as git repo** — separate git repo for community/shared skills. Clone into agent rootfs. `fireclaw skills pull` to update. Like agentskills.io but self-hosted on Gitea.
|
|
- [ ] **Session persistence** — SQLite in workspace for conversation history. FTS5 full-text search over past sessions. Agents can search their own history.
|
|
|
|
### Medium priority (medium gain, medium complexity)
|
|
|
|
- [ ] **Context compression** — when conversation history exceeds threshold, LLM-summarize middle turns. Protect head (system prompt) and tail (recent messages). Keeps agents coherent in long conversations.
|
|
- [ ] **Skill learning** — after complex multi-tool tasks, agent creates a new SKILL.md + run.py in workspace/skills. Next boot, new skill is available. Self-improving agents.
|
|
- [ ] **Scheduled/cron agents** — template gets a `schedule` field. Overseer spawns agent on schedule, agent does its task, reports to #agents, self-destructs.
|
|
- [ ] **!logs command** — tail agent interaction history from workspace.
|
|
|
|
### Lower priority (good ideas, higher complexity or less immediate need)
|
|
|
|
- [ ] **Dangerous command approval** — pattern-based detection (rm -rf, git reset, etc.) with allowlist. Agent asks for confirmation before destructive commands.
|
|
- [ ] **Parallel tool execution** — detect independent tool calls, run concurrently. Needs safety heuristics (read-only, non-overlapping paths).
|
|
- [ ] **Cost tracking** — Ollama returns token counts. Log per-interaction: duration, model, tokens, skill used.
|
|
- [ ] **Execution recording** — full audit trail of all tool calls and results.
|
|
|
|
## Phase 6: Infrastructure
|
|
|
|
- [ ] MCP servers in Firecracker VM with podman containers
|
|
- [ ] Webhook triggers — HTTP endpoint that spawns ephemeral agents
|
|
- [ ] Alert forwarding — pipe system alerts into #agents
|
|
- [ ] Web dashboard — status page for running agents
|
|
|
|
## Phase 7: Ideas & Experiments
|
|
|
|
See IDEAS.md for the full list.
|