fireclaw/ROADMAP.md
ansible abc91bc149 Add dangerous command blocking and cron agent scheduling
Dangerous command approval: run_command skill now checks commands
against 9 regex patterns (rm -rf /, dd, mkfs, fork bombs, shutdown,
device writes, etc.) and blocks execution with a clear message.
Defense-in-depth layer on top of VM isolation.

Cron agents: templates support schedule (5-field cron) and
schedule_timeout (seconds, default 300) fields. Overseer checks
every 60s, spawns {name}-cron agents on match, auto-destroys after
timeout. Inline cron parser supports *, ranges, lists, and steps.
No npm dependencies added.
2026-04-08 19:26:23 +00:00


Fireclaw Roadmap

Phase 1: Core CLI (done)

  • Firecracker microVM lifecycle (boot, exec, destroy)
  • SSH-based command execution
  • Network isolation (tap + bridge + NAT)
  • IP pool management for concurrent VMs
  • Signal handling and cleanup
  • CLI interface (fireclaw run, fireclaw setup)

Phase 2: Fast & Useful (done)

  • Alpine Linux rootfs (1 GiB sparse, 146 MiB on disk)
  • Precompiled binary, global fireclaw command
  • Snapshot & restore (~1.1s vs ~2.9s cold boot)

Phase 3: Multi-Agent System (done)

  • ngircd configured (nyx.fireclaw.local, FireclawNet)
  • Channel layout: #control (overseer), #agents (common room), DMs, /invite
  • Ollama with 5+ models, hot-swappable per agent
  • Agent rootfs — Alpine + Python IRC bot + podman + tools
  • Agent manager — start/stop/list/reload long-running VMs
  • Overseer — !invoke, !destroy, !list, !model, !models, !templates, !persona, !status, !help
  • 5 agent templates — worker, coder, researcher, quick, creative
  • Discoverable skill system — SKILL.md + run.py per tool, auto-loaded at boot
  • Agent tools — run_command, web_search, fetch_url, save_memory
  • Persistent workspace + memory system (MEMORY.md pattern)
  • Agent hot-reload, non-root agents, agent-to-agent, DMs, /invite
  • Overseer resilience, health checks, graceful shutdown, systemd

Phase 4: Hardening & Deployment (done)

  • Network policies, thread safety, trigger fix, race condition fix
  • Install/uninstall scripts, deployed on Debian + Ubuntu + GPU server
  • Refactor — shared firecracker-vm.ts, skill system extraction

Remaining

  • Warm pool — pre-booted VMs from snapshots
  • Concurrent snapshot runs via network namespaces
  • Thin provisioning — device-mapper snapshots

Phase 5: Agent Intelligence

Priority order by gain/complexity ratio.

High priority (high gain, low-medium complexity)

  • Large output handling — tool results over 2K chars are saved to a workspace file; the agent gets a preview and can read the rest. Prevents context explosion. Simple, high impact.
  • Iteration budget — shared token/round budget across tool calls. Prevents runaway loops, especially on the GPU server, where faster models chain tool calls more aggressively. Add per-template configurable limits.
  • Skill registry as git repo — separate git repo for community/shared skills. Clone into agent rootfs. fireclaw skills pull to update. Like agentskills.io but self-hosted on Gitea.
  • Session persistence — SQLite in workspace for conversation history. FTS5 full-text search over past sessions. Agents can search their own history.
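
The large output handling item above could look roughly like the following. This is a minimal sketch, not the project's implementation: the function name `handle_tool_output`, the `tool_outputs` directory, and the hash-based file naming are all assumptions made for illustration. Only the 2K-character threshold and the "preview plus saved file" behavior come from the roadmap.

```python
import hashlib
from pathlib import Path

# Threshold taken from the roadmap item ("tool results >2K chars").
PREVIEW_LIMIT = 2000


def handle_tool_output(output: str, workspace: Path, tool_name: str) -> str:
    """Return small outputs unchanged; save large ones and return a preview.

    Hypothetical helper: the name, signature, and file layout are assumptions.
    """
    if len(output) <= PREVIEW_LIMIT:
        return output

    # Assumed location inside the agent's persistent workspace.
    out_dir = workspace / "tool_outputs"
    out_dir.mkdir(parents=True, exist_ok=True)

    # A content hash keeps file names unique without needing a counter.
    digest = hashlib.sha256(output.encode("utf-8")).hexdigest()[:12]
    path = out_dir / f"{tool_name}-{digest}.txt"
    path.write_text(output, encoding="utf-8")

    # The agent sees the head of the output plus a pointer to the full file.
    preview = output[:PREVIEW_LIMIT]
    return (
        f"{preview}\n"
        f"[truncated: {len(output)} chars total, full output saved to {path}]"
    )
```

Returning the file path inside the preview lets the agent fetch the remainder with an ordinary file read, so no extra tool is needed for the "read the rest" step.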

Medium priority (medium gain, medium complexity)

  • Context compression — when conversation history exceeds threshold, LLM-summarize middle turns. Protect head (system prompt) and tail (recent messages). Keeps agents coherent in long conversations.
  • Skill learning — after complex multi-tool tasks, agent creates a new SKILL.md + run.py in workspace/skills. Next boot, new skill is available. Self-improving agents.
  • Scheduled/cron agents (done) — templates support schedule (5-field cron) and schedule_timeout (seconds, default 300) fields. Overseer checks every 60s, spawns {name}-cron agents on match, and auto-destroys them after the timeout.
  • !logs command — tail agent interaction history from workspace.
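
The 5-field schedule matching in the cron agents item can be sketched as below. This is an illustrative Python version, not the overseer's actual inline parser: the function names are hypothetical, and it makes two simplifying assumptions that may differ from the real code. Day-of-month and day-of-week are combined with AND (classic cron uses OR when both are restricted), and Sunday is `0` only, not `7`.

```python
from datetime import datetime

# Allowed (min, max) per field: minute, hour, day-of-month, month, day-of-week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field: str, lo: int, hi: int) -> set[int]:
    """Expand one cron field (*, ranges, lists, steps) into a set of ints."""
    values: set[int] = set()
    for part in field.split(","):          # lists: "1,5,10-12"
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if step < 1:
            raise ValueError(f"invalid step in {part!r}")
        if base == "*":                    # wildcard covers the full range
            start, end = lo, hi
        elif "-" in base:                  # range: "10-20"
            a, b = base.split("-", 1)
            start, end = int(a), int(b)
        else:                              # single value, or "N/step" from N up
            start = int(base)
            end = hi if step_text else start
        if not (lo <= start <= end <= hi):
            raise ValueError(f"{part!r} out of range {lo}-{hi}")
        values.update(range(start, end + 1, step))
    return values


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if the 5-field expression matches the given minute."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields: minute hour dom month dow")
    minute, hour, dom, month, dow = (
        expand_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    # Python uses Monday=0; cron uses Sunday=0.
    cron_dow = (when.weekday() + 1) % 7
    return (
        when.minute in minute
        and when.hour in hour
        and when.day in dom
        and when.month in month
        and cron_dow in dow
    )
```

With a 60-second check interval, a caller would evaluate `cron_matches(template_schedule, now)` once per tick and spawn the agent on a `True` result. Guarding against double-firing within the same minute is left out of this sketch.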

Lower priority (good ideas, higher complexity or less immediate need)

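  • Dangerous command approval (done) — run_command skill checks commands against 9 regex patterns (rm -rf /, dd, mkfs, fork bombs, shutdown, device writes, etc.) and blocks execution with a clear message. Defense-in-depth layer on top of VM isolation.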
  • Parallel tool execution — detect independent tool calls, run concurrently. Needs safety heuristics (read-only, non-overlapping paths).
  • Cost tracking — Ollama returns token counts. Log per-interaction: duration, model, tokens, skill used.
  • Execution recording — full audit trail of all tool calls and results.
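
The pattern-based detection in the dangerous command item can be illustrated with a short sketch. The patterns below are examples written for this illustration, not the project's actual pattern set, and `check_command` is a hypothetical name. Some are deliberately narrower than the categories the roadmap names (for example, `dd` is flagged only when it writes to a device).

```python
import re
from typing import Optional

# Illustrative patterns only (assumed, not the real pattern list).
DANGEROUS_PATTERNS = [
    (re.compile(r"\brm\s+-\w*[rR]\w*\s+(?:-\w+\s+)*/(?:\*|\s|$)"),
     "recursive delete of /"),
    (re.compile(r"\bdd\s+.*\bof=/dev/"),
     "dd writing to a device"),
    (re.compile(r"\bmkfs(\.\w+)?\b"),
     "filesystem creation"),
    (re.compile(r":\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:"),
     "fork bomb"),
    (re.compile(r"\b(shutdown|reboot|halt|poweroff)\b"),
     "shutdown or reboot"),
    (re.compile(r">\s*/dev/(sd|nvme|vd)\w*"),
     "redirect to a block device"),
]


def check_command(command: str) -> Optional[str]:
    """Return a block message if the command looks dangerous, else None."""
    for pattern, reason in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return f"Blocked: {reason}"
    return None
```

A skill would call `check_command` before executing and return the message to the agent instead of running the command. Regex matching of this kind is easy to evade (quoting, variables, encoded payloads), which is why it only makes sense as an extra layer on top of VM isolation.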

Phase 6: Infrastructure

  • MCP servers in Firecracker VM with podman containers
  • Webhook triggers — HTTP endpoint that spawns ephemeral agents
  • Alert forwarding — pipe system alerts into #agents
  • Web dashboard — status page for running agents

Phase 7: Ideas & Experiments

See IDEAS.md for the full list.