Overhaul agent quality — prompts, tools, config, compression

- Rewrite system prompt: structured sections, explicit tool listing
  with full SKILL.md descriptions, multi-agent awareness
- Add write_file skill for creating/modifying workspace files
- Per-template config passthrough: temperature, num_predict, context_size,
  compress settings, max_tool_rounds, max_response_lines
- Bump defaults: 1024 output tokens (was 512), 500-char deque (was 200),
  250-token summaries (was 150), compress threshold 16 (was 12), keep 8 (was 4)
- Cache compression by content hash — no redundant summarization
- Update all 5 templates with tuned settings per role
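The content-hash compression cache mentioned above could look like the sketch below. Names (`content_hash`, `compress_cached`, `_summary_cache`) are illustrative, not taken from the diff; the idea is simply that identical message spans hash to the same key, so the summarizer runs at most once per span.

```python
import hashlib
import json

# Illustrative cache: content hash -> summary. Re-summarizing an
# identical span of messages becomes a dict lookup.
_summary_cache = {}

def content_hash(messages):
    """Stable hash of a list of message dicts (order-sensitive)."""
    blob = json.dumps(messages, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

def compress_cached(messages, summarize_fn):
    """Summarize messages, reusing a cached summary when the content is unchanged."""
    key = content_hash(messages)
    if key not in _summary_cache:
        _summary_cache[key] = summarize_fn(messages)
    return _summary_cache[key]
```

Because the key is derived from the serialized content, re-running compression over an unchanged history window performs no redundant summarization calls.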
2026-04-08 18:28:26 +00:00
parent 6c4ad47b09
commit c827d341ab
6 changed files with 160 additions and 30 deletions


@@ -72,13 +72,13 @@ def ollama_request(ollama_url, payload):
     return json.loads(resp.read(2_000_000))
-def query_ollama(messages, runtime, tools, skill_scripts, dispatch_fn, ollama_url, max_rounds):
+def query_ollama(messages, runtime, tools, skill_scripts, dispatch_fn, ollama_url, max_rounds, num_predict=1024, temperature=0.7):
     """Call Ollama chat API with skill-based tool support."""
     payload = {
         "model": runtime["model"],
         "messages": messages,
         "stream": False,
-        "options": {"num_predict": 512},
+        "options": {"num_predict": num_predict, "temperature": temperature},
     }
     if tools:
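The per-template passthrough from the commit message could be wired up roughly as below. The template names and tuning values are illustrative assumptions, not the repository's actual templates; the pattern is a defaults dict overridden by per-template keys, whose merged values feed `num_predict` and `temperature` into `query_ollama`.

```python
# Illustrative defaults mirroring the bumped values in this commit.
DEFAULT_OPTIONS = {"num_predict": 1024, "temperature": 0.7}

# Hypothetical per-template overrides; each role tunes only what it needs.
TEMPLATES = {
    "researcher": {"temperature": 0.3},
    "writer": {"temperature": 0.9, "num_predict": 2048},
}

def options_for(template_name):
    """Merge a template's tuning keys over the global defaults."""
    opts = dict(DEFAULT_OPTIONS)
    opts.update(TEMPLATES.get(template_name, {}))
    return opts
```

An unknown template name falls back to the defaults, so every call site can pass `**options_for(name)` without guarding against missing config.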