
[Bug]: Subagent prompts exceed 24K tokens, breaking local models with limited context windows #951

@KnottyDyes

Description


Environment

  • opencode version: 1.1.27
  • oh-my-opencode version: latest (installed via plugin)
  • OS: Linux
  • Local LLM: Ollama via Harbor (qwen2.5:14b, llama3.1:8b-instruct-q8_0)

Configuration

oh-my-opencode.json:

{
  "agents": {
    "oracle": {
      "model": "harbor-ollama/qwen2.5:14b"
    },
    "explore": {
      "model": "harbor-ollama/llama3.1:8b-instruct-q8_0"
    },
    "librarian": {
      "model": "harbor-ollama/llama3.1:8b-instruct-q8_0"
    }
  }
}

opencode.json provider:

"harbor-ollama": {
  "npm": "@ai-sdk/openai-compatible",
  "name": "Harbor/Ollama (overlord)",
  "options": {
    "baseURL": "http://overlord:33821/v1"
  },
  "models": {
    "qwen2.5:14b": {
      "name": "Qwen2.5 14B",
      "tools": true,
      "options": { "num_ctx": 16384 }
    }
  }
}

Issue

Subagent calls (oracle, explore, librarian) send prompts of ~24-25K tokens to local models configured with 16K context windows. Ollama truncates the prompt, causing:

  1. Empty responses (oracle returns (No text output))
  2. Corrupted/garbage output when truncation breaks mid-instruction

Evidence

Ollama logs show consistent truncation warnings:

level=WARN msg="truncating input prompt" limit=16535 prompt=24775 keep=4 new=16535
level=WARN msg="truncating input prompt" limit=16535 prompt=25835 keep=4 new=16535
level=WARN msg="truncating input prompt" limit=16535 prompt=24357 keep=5 new=16535
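The warnings above can be quantified with a small parsing sketch (plain regex over the logged fields; the log lines are copied verbatim from the evidence above). Each request loses roughly 8-9K tokens, i.e. about a third of the prompt:

```python
import re

# The three WARN lines from the Ollama log above.
LOG_LINES = [
    'level=WARN msg="truncating input prompt" limit=16535 prompt=24775 keep=4 new=16535',
    'level=WARN msg="truncating input prompt" limit=16535 prompt=25835 keep=4 new=16535',
    'level=WARN msg="truncating input prompt" limit=16535 prompt=24357 keep=5 new=16535',
]

PATTERN = re.compile(r"limit=(\d+) prompt=(\d+) keep=(\d+) new=(\d+)")

def dropped_tokens(line: str) -> int:
    """Tokens silently discarded = original prompt size minus what survives."""
    m = PATTERN.search(line)
    limit, prompt, keep, new = map(int, m.groups())
    return prompt - new

for line in LOG_LINES:
    print(dropped_tokens(line))
```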

Analysis

The stock oracle system prompt from oh-my-opencode is only ~700 tokens. The 24K+ tokens appear to come from:

  1. The full Sisyphus main agent system prompt being passed to subagents
  2. Tool definitions
  3. Possibly conversation context

This works fine for API models (Claude, GPT, Gemini) with 128K+ context, but breaks local models.
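A back-of-envelope check makes the point concrete. Assuming the stock oracle system prompt really is ~700 tokens (the estimate from this report, not a measured value), subtracting it from the logged prompt sizes leaves ~23-25K tokens per request that must come from somewhere else:

```python
# Arithmetic from the logged prompt= values above. The 700-token figure
# for the stock oracle prompt is this report's estimate, not measured.
ORACLE_PROMPT_TOKENS = 700
logged_prompt_sizes = [24775, 25835, 24357]  # prompt= values from the WARN lines

for total in logged_prompt_sizes:
    overhead = total - ORACLE_PROMPT_TOKENS
    print(f"{total} tokens total, {overhead} beyond the oracle prompt")
```

That overhead is far too large to be tool definitions alone, which points at the full orchestrator prompt and conversation context being forwarded.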

Verification

  1. Direct Ollama API calls work perfectly:

    curl http://overlord:33821/v1/chat/completions \
      -d '{"model": "qwen2.5:14b", "messages": [{"role": "user", "content": "What is 2+2?"}]}'
    # Returns: {"content": "4"}
  2. Other agents using the same model (explore, librarian) work; they may have shorter prompts or different handling

  3. Oracle consistently fails with empty output

Expected Behavior

Subagents should:

  1. Use their own lightweight system prompts (e.g., the ~700 token oracle prompt)
  2. NOT include the full Sisyphus orchestrator prompt
  3. Or, at minimum, respect the model's context limit and truncate intelligently
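A minimal sketch of what "truncate intelligently" could look like, assuming a crude ~4 chars/token heuristic (a real implementation would use the model's tokenizer) and hypothetical helper names — `count_tokens` and `fit_to_context` are illustrative, not opencode APIs. Keep the subagent's own system prompt and the newest messages, and drop the oldest context first:

```python
# Sketch of context-budget trimming: the system prompt is always kept,
# conversation context is walked newest-to-oldest, and whatever does not
# fit the model window is dropped from the old end.
# count_tokens is a crude ~4 chars/token placeholder, not a tokenizer.

def count_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fit_to_context(system_prompt: str, messages: list[dict],
                   num_ctx: int, reserve_for_output: int = 1024) -> list[dict]:
    """Return a message list whose estimated size fits within num_ctx."""
    budget = num_ctx - count_tokens(system_prompt) - reserve_for_output
    kept: list[dict] = []
    used = 0
    for msg in reversed(messages):  # walk newest -> oldest
        cost = count_tokens(msg["content"])
        if used + cost > budget:
            break  # oldest remaining context is dropped
        kept.append(msg)
        used += cost
    return [{"role": "system", "content": system_prompt}] + list(reversed(kept))
```

With a 16K window this would shed the oldest orchestrator backlog instead of letting Ollama cut the prompt mid-instruction.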

Workaround

Currently using API models (google/gemini-2.5-pro) for oracle, which have sufficient context windows.
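A second partial mitigation, assuming the model and available VRAM tolerate a longer context, is raising `num_ctx` in the provider config so the ~25K-token prompts fit. A sketch against the same provider block as above (32768 is an illustrative value, not a recommendation):

```json
"models": {
  "qwen2.5:14b": {
    "name": "Qwen2.5 14B",
    "tools": true,
    "options": { "num_ctx": 32768 }
  }
}
```

This only papers over the underlying problem of oversized subagent prompts, and roughly doubles the KV-cache memory cost.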

Related Issues

Labels: bug (Something isn't working), triage:bug (Confirmed bug with repro steps)
