Dangerous command approval: run_command skill now checks commands
against 9 regex patterns (rm -rf /, dd, mkfs, fork bombs, shutdown,
device writes, etc.) and blocks execution with a clear message.
Defense-in-depth layer on top of VM isolation.
Cron agents: templates support schedule (5-field cron) and
schedule_timeout (seconds, default 300) fields. Overseer checks
every 60s, spawns {name}-cron agents on match, auto-destroys after
timeout. Inline cron parser supports *, ranges, lists, and steps.
No npm dependencies added.
4.2 KiB
4.2 KiB
Fireclaw Roadmap
Phase 1: Core CLI (done)
- Firecracker microVM lifecycle (boot, exec, destroy)
- SSH-based command execution
- Network isolation (tap + bridge + NAT)
- IP pool management for concurrent VMs
- Signal handling and cleanup
- CLI interface (
fireclaw run,fireclaw setup)
Phase 2: Fast & Useful (done)
- Alpine Linux rootfs (1 GiB sparse, 146 MiB on disk)
- Precompiled binary, global
fireclawcommand - Snapshot & restore (~1.1s vs ~2.9s cold boot)
Phase 3: Multi-Agent System (done)
- ngircd configured (
nyx.fireclaw.local, FireclawNet) - Channel layout: #control (overseer), #agents (common room), DMs, /invite
- Ollama with 5+ models, hot-swappable per agent
- Agent rootfs — Alpine + Python IRC bot + podman + tools
- Agent manager — start/stop/list/reload long-running VMs
- Overseer — !invoke, !destroy, !list, !model, !models, !templates, !persona, !status, !help
- 5 agent templates — worker, coder, researcher, quick, creative
- Discoverable skill system — SKILL.md + run.py per tool, auto-loaded at boot
- Agent tools — run_command, web_search, fetch_url, save_memory
- Persistent workspace + memory system (MEMORY.md pattern)
- Agent hot-reload, non-root agents, agent-to-agent, DMs, /invite
- Overseer resilience, health checks, graceful shutdown, systemd
Phase 4: Hardening & Deployment (done)
- Network policies, thread safety, trigger fix, race condition fix
- Install/uninstall scripts, deployed on Debian + Ubuntu + GPU server
- Refactor — shared firecracker-vm.ts, skill system extraction
Remaining
- Warm pool — pre-booted VMs from snapshots
- Concurrent snapshot runs via network namespaces
- Thin provisioning — device-mapper snapshots
Phase 5: Agent Intelligence
Priority order by gain/complexity ratio.
High priority (high gain, low-medium complexity)
- Large output handling — tool results >2K chars saved to workspace file, agent gets preview + can read the rest. Prevents context explosion. Simple, high impact.
- Iteration budget — shared token/round budget across tool calls. Prevents runaway loops, especially with GPU server running faster models that chain more aggressively. Add per-template configurable limits.
- Skill registry as git repo — separate git repo for community/shared skills. Clone into agent rootfs.
fireclaw skills pullto update. Like agentskills.io but self-hosted on Gitea. - Session persistence — SQLite in workspace for conversation history. FTS5 full-text search over past sessions. Agents can search their own history.
Medium priority (medium gain, medium complexity)
- Context compression — when conversation history exceeds threshold, LLM-summarize middle turns. Protect head (system prompt) and tail (recent messages). Keeps agents coherent in long conversations.
- Skill learning — after complex multi-tool tasks, agent creates a new SKILL.md + run.py in workspace/skills. Next boot, new skill is available. Self-improving agents.
- Scheduled/cron agents — templates support
schedule(5-field cron) andschedule_timeoutfields. Overseer checks every 60s, spawns and auto-destroys. - !logs command — tail agent interaction history from workspace.
Lower priority (good ideas, higher complexity or less immediate need)
- Dangerous command approval — pattern-based detection (rm -rf, dd, mkfs, fork bombs, shutdown, etc.) blocks execution in run_command skill.
- Parallel tool execution — detect independent tool calls, run concurrently. Needs safety heuristics (read-only, non-overlapping paths).
- Cost tracking — Ollama returns token counts. Log per-interaction: duration, model, tokens, skill used.
- Execution recording — full audit trail of all tool calls and results.
Phase 6: Infrastructure
- MCP servers in Firecracker VM with podman containers
- Webhook triggers — HTTP endpoint that spawns ephemeral agents
- Alert forwarding — pipe system alerts into #agents
- Web dashboard — status page for running agents
Phase 7: Ideas & Experiments
See IDEAS.md for the full list.