# Fireclaw Ideas

Future features and experiments, loosely prioritized by usefulness.

## Operator Tools

### !status command
Quick dashboard in IRC: agent count, RAM/CPU per VM, currently loaded Ollama model, system uptime, disk free. One command to see the health of everything.

### !logs <agent> [n]
Tail the last N interactions an agent had, read from the log stored in the agent's workspace. Useful to see what an agent has been doing while you were away.
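
A minimal sketch of the tailing step, assuming each interaction is stored as one line in a log file inside the agent's workspace. The file layout and the name `tail_interactions` are hypothetical, not part of the current code.

```python
from collections import deque
from pathlib import Path


def tail_interactions(log_path: Path, n: int = 10) -> list[str]:
    """Return the last n non-empty lines of an agent log.

    Assumes (hypothetically) one interaction per line.
    """
    if n <= 0 or not log_path.exists():
        return []
    with log_path.open("r", encoding="utf-8", errors="replace") as fh:
        # A deque with maxlen keeps only the newest n lines while streaming,
        # so large logs are never held in memory in full.
        lines = (line.rstrip("\n") for line in fh if line.strip())
        return list(deque(lines, maxlen=n))
```

The overseer could join the returned lines and send them back to the requesting channel, truncating as needed to stay within IRC message limits.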

### !persona <agent> [new persona]
View or live-edit an agent's persona via IRC. "Make the worker more sarcastic" without touching files or restarting. Uses hot-reload under the hood.

### !pause / !resume <agent>
Temporarily mute an agent without destroying it. Agent stays alive but stops responding. Useful when you need a channel to yourself.

## Skill System (inspired by mitsuhiko/agent-stuff)

### SKILL.md pattern for agent tools
Replace hardcoded tools in agent.py with a discoverable skill directory. Each skill is a folder with a SKILL.md (description, parameters, examples) and a script (run.sh/run.py).

```
~/.fireclaw/skills/
  web_search/
    SKILL.md    # name, description, parameters; parsed into tool definition
    run.py      # actual implementation
  fetch_url/
    SKILL.md
    run.py
  git_diff/
    SKILL.md
    run.sh
```

Agent discovers skills at boot, loads SKILL.md into Ollama tool definitions, invokes scripts on tool call. Adding a new tool = drop a folder. No agent.py changes needed.
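
The discovery step could look roughly like this. The SKILL.md layout used here is an assumption made for illustration: a `---` delimited header with `name:`, `description:`, and `param.<name>:` lines. The helper names `parse_skill_md`, `skill_to_tool`, and `discover_skills` are hypothetical. The output follows the function-style tool schema that Ollama's chat API accepts.

```python
from pathlib import Path


def parse_skill_md(text: str) -> dict:
    """Parse an assumed header: 'key: value' lines between '---' markers."""
    meta: dict[str, str] = {}
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return meta
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def skill_to_tool(meta: dict) -> dict:
    """Build a tool definition; every parameter is treated as a required string."""
    params = {
        key[len("param."):]: {"type": "string", "description": desc}
        for key, desc in meta.items()
        if key.startswith("param.")
    }
    return {
        "type": "function",
        "function": {
            "name": meta["name"],
            "description": meta.get("description", ""),
            "parameters": {
                "type": "object",
                "properties": params,
                "required": sorted(params),
            },
        },
    }


def discover_skills(root: Path) -> dict[str, tuple[dict, Path]]:
    """Map skill name -> (tool definition, script path) for each valid skill folder."""
    skills: dict[str, tuple[dict, Path]] = {}
    if not root.is_dir():
        return skills
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        skill_md = folder / "SKILL.md"
        scripts = sorted(folder.glob("run.*"))
        if not skill_md.is_file() or not scripts:
            continue  # not a skill folder: needs both SKILL.md and run.*
        meta = parse_skill_md(skill_md.read_text(encoding="utf-8"))
        meta.setdefault("name", folder.name)
        skills[meta["name"]] = (skill_to_tool(meta), scripts[0])
    return skills
```

On a tool call, the agent would look up the script path by name and run it with the arguments the model supplied. How arguments reach the script (argv, stdin JSON, environment) is left open here.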

Could also support per-template skill selection: coder gets git/code skills, researcher gets search/fetch skills, worker gets everything.

Reference: https://github.com/mitsuhiko/agent-stuff (Pi Coding Agent skill/extension architecture).

## Agent Tools

### Web search
Agents can search via the searx instance on mymx. Either bake the searx CLI into the rootfs, or add a proper `web_search(query)` tool that calls the searx API from inside the VM. Agents could actually research topics instead of relying on training data.
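
A sketch of the API-based variant, assuming the instance exposes the SearXNG-style JSON endpoint (`/search?q=...&format=json`, which has to be enabled in the instance settings). The `base_url` argument, the result formatting, and the helper names are illustrative choices, not existing Fireclaw code.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url: str, query: str) -> str:
    """Build the JSON search URL for a searx instance."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def summarize_results(payload: dict, limit: int = 5) -> str:
    """Turn the 'results' list of a searx JSON response into compact text for the model."""
    lines = []
    for item in payload.get("results", [])[:limit]:
        title = (item.get("title") or "").strip()
        url = item.get("url") or ""
        snippet = (item.get("content") or "").strip()
        lines.append(f"- {title} <{url}>: {snippet}")
    return "\n".join(lines) or "no results"


def web_search(query: str, base_url: str, limit: int = 5, timeout: float = 10.0) -> str:
    """Query the instance and return a short text summary (performs a network call)."""
    with urllib.request.urlopen(build_search_url(base_url, query), timeout=timeout) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    return summarize_results(payload, limit)
```

Keeping the URL building and result formatting separate from the network call makes the tool easy to test inside a VM that has no route to the search instance.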

### Fetch URL
`fetch_url(url)` tool to grab a webpage, strip HTML, return text. Combined with web search, agents become genuine research assistants. Could use `curl | python3 -c "from html.parser import..."` or a lightweight readability script.
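
A minimal sketch of the `html.parser` route using only the standard library. It drops `script`, `style`, and `noscript` content and keeps visible text. It is a crude stripper, not a readability extractor, and the `max_chars` cap is an assumed safeguard against flooding the model's context.

```python
from html.parser import HTMLParser
import urllib.request


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of script/style/noscript."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    """Strip tags from an HTML string and return one text chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def fetch_url(url: str, timeout: float = 10.0, max_chars: int = 8000) -> str:
    """Download a page and return its text, truncated to max_chars (network call)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        html = resp.read().decode(charset, errors="replace")
    return html_to_text(html)[:max_chars]
```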

### File sharing between agents
A shared `/shared` mount (third virtio drive, or a common ext4 image) that all agents can read/write. Drop a file from one agent, pick it up from another. Enables collaboration: researcher writes findings, coder reads and implements.

### Code execution sandbox
A `run_python(code)` tool that's safer than `run_command`. Executes in a subprocess with resource limits (timeout, memory cap). Better for code agents that need to test their own output.
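
One way the subprocess approach might look on Linux, using `resource.setrlimit` for the memory cap and `subprocess.run` for the wall-clock timeout. The default limits and the returned dictionary shape are assumptions. This only bounds resource use; the VM remains the actual isolation boundary.

```python
import resource
import subprocess
import sys


def run_python(code: str, timeout: float = 5.0, memory_mb: int = 512) -> dict:
    """Run a code snippet in a child interpreter with a timeout and memory cap (POSIX only)."""
    mem_limit = memory_mb * 1024 * 1024
    cpu_limit = int(timeout) + 1

    def apply_limits() -> None:
        # Runs in the child process just before the interpreter is exec'd.
        resource.setrlimit(resource.RLIMIT_AS, (mem_limit, mem_limit))
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_limit, cpu_limit))

    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores env and user site
            capture_output=True,
            text=True,
            timeout=timeout,
            preexec_fn=apply_limits,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timed out", "returncode": None}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "returncode": proc.returncode,
    }
```

For example, `run_python("print(1 + 1)")` returns a dictionary whose `stdout` is `"2\n"`, while an infinite loop comes back with `ok` set to `False` once the timeout expires.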

## Automation

### Cron agents
Template gets an optional `schedule` field: `"schedule": "0 8 * * *"`. The overseer spawns the agent on schedule, it does its task, reports to #agents, and self-destructs. Use cases:
- Morning health check: "any disk/memory/service issues on grogbox?"
- Daily digest: "summarize what happened in #agents yesterday"
- Backup verification: "check that last night's backups completed"
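
To illustrate how a `schedule` value could be checked once a minute, here is a simplified five-field cron matcher. It is a sketch, not the overseer's implementation: it supports `*`, single values, ranges, lists, and `/n` steps, requires all five fields to match (real cron ORs day-of-month and day-of-week when both are restricted), and does not accept names or `7` for Sunday.

```python
from datetime import datetime


def _field_matches(expr: str, value: int, lo: int, hi: int) -> bool:
    """Match one cron field: '*', 'a', 'a-b', comma lists, and '/n' steps."""
    for part in expr.split(","):
        rng, _, step_s = part.partition("/")
        step = int(step_s) if step_s else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-", 1)
            start, end = int(a), int(b)
        else:
            start = int(rng)
            end = hi if step_s else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(schedule: str, when: datetime) -> bool:
    """Return True if the five-field schedule matches the given minute."""
    minute, hour, dom, month, dow = schedule.split()
    cron_dow = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (
        _field_matches(minute, when.minute, 0, 59)
        and _field_matches(hour, when.hour, 0, 23)
        and _field_matches(dom, when.day, 1, 31)
        and _field_matches(month, when.month, 1, 12)
        and _field_matches(dow, cron_dow, 0, 6)
    )
```

A scheduler loop would call `cron_matches(template_schedule, now)` once per minute for each template and spawn the agents whose schedules match.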

### Webhook triggers
HTTP endpoint on the host (e.g., `:8080/hook/<template>`) that spawns an ephemeral agent with the webhook payload as context. Examples:
- Gitea push webhook → coder agent reviews the commit in #dev
- Uptime monitor → agent investigates and reports
- RSS feed → researcher summarizes new articles
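
A rough sketch of such an endpoint with the standard library's `http.server`. The `spawn_agent` function is a hypothetical placeholder for whatever call asks the overseer to start an agent, and the restriction of template names to alphanumerics, `-`, and `_` is an assumed safety check.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Optional


def parse_hook_path(path: str) -> Optional[str]:
    """Return the template name from '/hook/<template>', or None if the path does not match."""
    parts = path.split("?", 1)[0].strip("/").split("/")
    if len(parts) != 2 or parts[0] != "hook":
        return None
    name = parts[1]
    # Accept only simple names so the value is safe to use as a template lookup key.
    if name and name.replace("-", "").replace("_", "").isalnum():
        return name
    return None


def spawn_agent(template: str, context: dict) -> None:
    """Hypothetical placeholder: a real bridge would hand this to the overseer."""
    print(f"would spawn {template!r} with payload keys {sorted(context)}")


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        template = parse_hook_path(self.path)
        if template is None:
            self.send_error(404, "unknown hook")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_error(400, "payload must be JSON")
            return
        context = payload if isinstance(payload, dict) else {"payload": payload}
        spawn_agent(template, context)
        self.send_response(202)  # accepted: the agent runs asynchronously
        self.end_headers()


# To serve on the example port:
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8080), HookHandler).serve_forever()
```

A real deployment would also want some form of authentication, such as a shared secret checked against a request header, before spawning anything.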

### Alert forwarding
Pipe system alerts (fail2ban, smartmontools, systemd failures, journal errors) into #agents via a simple bridge script. An always-on agent could triage: "fail2ban banned 3 IPs today, all SSH brute force from China, nothing to worry about."

### Git integration
Agent can clone repos from Gitea (on mymx), read code, create branches, commit changes, open PRs. Would need git in the rootfs (installable via `apk add git`) and Gitea API access via the bridge network.

## Agent Personality & Memory

### Evolving personalities
Instruct agents to actively develop opinions, preferences, and communication styles over time. The memory system supports this: agents could save "I prefer concise answers" or "human likes dry humor" and adapt. Give them character arcs.

### Agent journals
Each agent maintains a daily journal in its workspace. Auto-saved summary of conversations, decisions made, things learned. Creates a narrative over time. Useful for debugging agent behavior and understanding their "thought process."

### Cross-agent memory
Agents can read (but not write) each other's MEMORY.md. A new agent spawned for a task can inherit context from an existing agent. "Spawn a coder that knows what the researcher found."

### Agent self-improvement
After each conversation, the agent reflects: "What could I have done better?" Saves lessons to memory. Over time, agents get better at their specific role. Needs a meta-prompt that triggers self-reflection.

## Multi-Agent Orchestration

### Task delegation
Human gives a complex task to one agent, it breaks it down and delegates subtasks to other agents via IRC. Researcher does the research, coder implements, worker tests. All visible in #agents.

### Agent voting
Multiple agents weigh in on a question. "Should we upgrade the kernel?" Each agent responds in #agents, human gets multiple perspectives. Could formalize with a `!poll` command.

### Agent debates
Two agents argue opposite sides of a technical decision. Useful for exploring trade-offs. "Should we use Rust or Go for this?" Coder argues one side, researcher the other.

## MCP Servers as Firecracker VMs

Run MCP tool servers inside Firecracker rather than on the host, with the same isolation model as agents. Managed by the overseer with the same lifecycle (!invoke, !destroy).

### Approach: single Firecracker VM with podman containers
```
Firecracker VMs (fcbr0, 172.16.0.x)
├── worker (agent VM)
├── coder (agent VM)
└── mcp-services (service VM, 172.16.0.10)
    └── podman
        ├── mcp-fs (:8081)
        ├── mcp-git (:8082)
        └── mcp-searx (:8083)
```

One VM hosts all MCP servers in separate containers. Firecracker isolates from the host, podman separates services from each other. Lightweight: MCP servers are just HTTP wrappers and don't need their own VMs.

Agents call them at `172.16.0.10:<port>`. Overseer manages the VM and lists available tools via `!services`.

One-VM-per-service is overkill for trusted MCP servers but could be used for untrusted third-party tools.

### Why a Firecracker VM instead of host podman
- MCP servers can't access the host filesystem directly
- Consistent isolation model with agents
- The VM is independently restartable without affecting the host
- Podman-in-Firecracker is already working in the agent rootfs

### Candidate MCP servers
- **filesystem**: read/write to a shared volume (mounted as virtio drive)
- **git**: clone, read, diff, commit (Gitea on mymx accessible via bridge)
- **searx**: web search via searx.mymx.me
- **database**: SQLite or PostgreSQL query tool
- **fetch**: HTTP fetch + readability extraction

### Cron / scheduled agents
Add `schedule` field to templates (cron syntax). Overseer checks every minute, spawns matching agents, they do their task, report to #agents, self-destruct after timeout. Use cases: daily health checks, backup verification, digest summaries.

## Logging

### Centralized log viewer
Agent logs go to /workspace/agent.log inside each VM. For a centralized web UI:
- rsyslog on host (agents send to 172.16.0.1:514) for aggregation
- frontail (`npx frontail /var/log/fireclaw/*.log --port 9001`) for browser-based real-time viewing
- Or GoTTY (`gotty tail -f ...`) for zero-config web terminal

Start simple (plain files + !logs), add rsyslog + frontail when needed.
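
If the rsyslog route is taken, the agent side could be as small as a standard `logging.handlers.SysLogHandler` pointed at the host. This is a sketch under assumptions: the logger naming, the `fireclaw-<agent>` message tag, and the helper name `make_agent_logger` are made up for illustration, and the handler sends over UDP by default, so rsyslog on the host would need a UDP listener on that port.

```python
import logging
import logging.handlers


def make_agent_logger(agent: str, host: str = "172.16.0.1", port: int = 514) -> logging.Logger:
    """Return a logger that forwards records to a syslog daemon over UDP."""
    logger = logging.getLogger(f"fireclaw.{agent}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.handlers.SysLogHandler(address=(host, port))
        # Tag each message with the agent name so the host can split logs per agent.
        handler.setFormatter(logging.Formatter(f"fireclaw-{agent}: %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

The agent can keep writing to /workspace/agent.log with a second file handler, so `!logs` continues to work even when the host is not collecting.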

## Infrastructure

### Agent metrics dashboard
Simple HTML page served from the host showing: running agents, response times, model usage, memory contents, conversation history. No framework, just a static page with data from agents.json and workspace files.

### Agent backup/restore
Export an agent's complete state (workspace, config, rootfs diff) as a tarball. Import on another machine. Portable agent identities.
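
A sketch of the workspace part of export and import using `tarfile`. It covers only a workspace directory; capturing the config and a rootfs diff would need additional steps that are not shown. The function names are hypothetical, and the path check on import is a basic guard against archives that try to write outside the destination.

```python
import tarfile
from pathlib import Path


def export_agent(workspace: Path, archive: Path) -> Path:
    """Pack an agent's workspace directory into a gzip-compressed tarball."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(workspace, arcname=workspace.name)
    return archive


def import_agent(archive: Path, dest: Path) -> Path:
    """Unpack an exported workspace under dest, rejecting paths that escape it."""
    dest.mkdir(parents=True, exist_ok=True)
    root = dest.resolve()
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            target = (root / member.name).resolve()
            if target != root and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return dest
```

Copying the resulting tarball to another host and calling `import_agent` there restores the workspace under the chosen directory.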

### Multi-host agents
Run agents on multiple machines (grogbox + odin). Overseer manages VMs across hosts via SSH. Agents on different hosts communicate via IRC federation.

### GPU deployment
Remote machine available: Xeon E5-1620 v4, 32GB RAM, Quadro P5000 (16GB VRAM). Enough for 14B-30B models at 2-5s inference. Standalone fireclaw deployment: its own ngircd, its own agents, completely independent from grogbox.

### Install script
`scripts/install.sh`: one-command deployment to new machines. Installs firecracker, ollama (with GPU if available), ngircd, Node.js, builds rootfs, configures everything. `curl -fsSL .../install.sh | bash` or just `./scripts/install.sh`. No Ansible dependency, plain bash.

## Fun & Experimental

### Agent challenges
Post a challenge in #agents: "shortest Python script that sorts a list." Agents compete, see each other's answers, iterate. Gamified agent development.

### Honeypot agents
Agent with fake credentials, fake services, fake data. See what it tries to do. Test agent safety before trusting it with real access. Could also test prompt injection resistance.

### Agent-written agents
An agent creates a new template (persona + config) and asks the overseer to spawn it. Self-replicating agent system. Needs careful guardrails.

### IRC games
Agents play text-based games with each other or with humans. Trivia, 20 questions, collaborative storytelling. Tests agent personality and creativity in a low-stakes way.

### Dream mode
An agent left running overnight with `trigger: all` in an empty channel, talking to itself. Stream of consciousness. Review in the morning. Probably nonsense, but occasionally insightful.