From 9a06bbf5ea89a79ef7f092cb56a17e91ac8f7ab0 Mon Sep 17 00:00:00 2001
From: ansible
Date: Tue, 7 Apr 2026 20:50:19 +0000
Subject: [PATCH] Update roadmap with Phase 5 priorities from Hermes analysis

---
 ROADMAP.md | 78 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 43 insertions(+), 35 deletions(-)

diff --git a/ROADMAP.md b/ROADMAP.md
index 9dcda46..76caf74 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -22,49 +22,57 @@
 - [x] Ollama with 5+ models, hot-swappable per agent
 - [x] Agent rootfs — Alpine + Python IRC bot + podman + tools
 - [x] Agent manager — start/stop/list/reload long-running VMs
-- [x] Overseer — !invoke, !destroy, !list, !model, !models, !templates, !status, !help
+- [x] Overseer — !invoke, !destroy, !list, !model, !models, !templates, !persona, !status, !help
 - [x] 5 agent templates — worker, coder, researcher, quick, creative
-- [x] Agent tools — run_command, web_search (searx), save_memory
-- [x] Persistent workspace — 64 MiB ext4 as second virtio drive
-- [x] Agent memory system — MEMORY.md pattern, survives restarts
-- [x] Agent hot-reload — model/persona swap via SSH + SIGHUP
-- [x] Non-root agents — unprivileged `agent` user
-- [x] Agent-to-agent via IRC, DMs, /invite
-- [x] Overseer resilience — crash recovery, health checks, KillMode=process
-- [x] Graceful shutdown — IRC QUIT before VM kill
-- [x] Systemd service, regression tests
+- [x] Discoverable skill system — SKILL.md + run.py per tool, auto-loaded at boot
+- [x] Agent tools — run_command, web_search, fetch_url, save_memory
+- [x] Persistent workspace + memory system (MEMORY.md pattern)
+- [x] Agent hot-reload, non-root agents, agent-to-agent, DMs, /invite
+- [x] Overseer resilience, health checks, graceful shutdown, systemd
 
 ## Phase 4: Hardening & Deployment (done)
 
-- [x] Network policies per agent — full/local/none via iptables
-- [x] Thread safety — lock around IRC socket writes
-- [x] Agent health checks — 30s interval, announces deaths in #control
-- [x] Trigger matching fix — start-of-message only
-- [x] agents.json race condition fix
-- [x] Install script — one-command deployment, battle-tested on Debian + Ubuntu
-- [x] Uninstall script
-- [x] Deployed on GPU server (Xeon + Quadro P5000)
-- [x] Refactor — shared firecracker-vm.ts helpers, -43 lines
+- [x] Network policies, thread safety, trigger fix, race condition fix
+- [x] Install/uninstall scripts, deployed on Debian + Ubuntu + GPU server
+- [x] Refactor — shared firecracker-vm.ts, skill system extraction
 
 ### Remaining
-- [ ] Warm pool — pre-booted VMs from snapshots for instant spawns
+- [ ] Warm pool — pre-booted VMs from snapshots
 - [ ] Concurrent snapshot runs via network namespaces
-- [ ] Thin provisioning — device-mapper snapshots instead of full rootfs copies
+- [ ] Thin provisioning — device-mapper snapshots
 
-## Phase 5: Advanced Features
+## Phase 5: Agent Intelligence
 
-- [ ] Scheduled/cron tasks — agents that run on a timer
-- [ ] !logs command — tail agent interaction history
-- [ ] Persistent agent memory v2 — richer structure, auto-save
-- [ ] Advanced tool use — MCP servers in Firecracker VMs
-- [ ] Cost tracking — duration, model, tokens per interaction
-- [ ] Execution recording — audit trail
+Priority order by gain/complexity ratio.
 
-## Phase 6: Ideas & Experiments
+### High priority (high gain, low-medium complexity)
 
-See IDEAS.md for the full list. Highlights:
-- MCP servers as a single Firecracker VM with podman containers
-- Cron agents, webhook triggers, alert forwarding
-- Agent-written agents, agent debates, dream mode
-- Web dashboard, install script dry-run
-- Persistent agent memory with CLAUDE.md pattern (v2)
+- [ ] **Large output handling** — tool results >2K chars saved to workspace file, agent gets preview + can read the rest. Prevents context explosion. Simple, high impact.
+- [ ] **Iteration budget** — shared token/round budget across tool calls. Prevents runaway loops, especially with GPU server running faster models that chain more aggressively. Add per-template configurable limits.
+- [ ] **Skill registry as git repo** — separate git repo for community/shared skills. Clone into agent rootfs. `fireclaw skills pull` to update. Like agentskills.io but self-hosted on Gitea.
+- [ ] **Session persistence** — SQLite in workspace for conversation history. FTS5 full-text search over past sessions. Agents can search their own history.
+
+### Medium priority (medium gain, medium complexity)
+
+- [ ] **Context compression** — when conversation history exceeds threshold, LLM-summarize middle turns. Protect head (system prompt) and tail (recent messages). Keeps agents coherent in long conversations.
+- [ ] **Skill learning** — after complex multi-tool tasks, agent creates a new SKILL.md + run.py in workspace/skills. Next boot, new skill is available. Self-improving agents.
+- [ ] **Scheduled/cron agents** — template gets a `schedule` field. Overseer spawns agent on schedule, agent does its task, reports to #agents, self-destructs.
+- [ ] **!logs command** — tail agent interaction history from workspace.
+
+### Lower priority (good ideas, higher complexity or less immediate need)
+
+- [ ] **Dangerous command approval** — pattern-based detection (rm -rf, git reset, etc.) with allowlist. Agent asks for confirmation before destructive commands.
+- [ ] **Parallel tool execution** — detect independent tool calls, run concurrently. Needs safety heuristics (read-only, non-overlapping paths).
+- [ ] **Cost tracking** — Ollama returns token counts. Log per-interaction: duration, model, tokens, skill used.
+- [ ] **Execution recording** — full audit trail of all tool calls and results.
+
+## Phase 6: Infrastructure
+
+- [ ] MCP servers in Firecracker VM with podman containers
+- [ ] Webhook triggers — HTTP endpoint that spawns ephemeral agents
+- [ ] Alert forwarding — pipe system alerts into #agents
+- [ ] Web dashboard — status page for running agents
+
+## Phase 7: Ideas & Experiments
+
+See IDEAS.md for the full list.