Environment
- opencode version: 1.1.27
- oh-my-opencode version: latest (installed via plugin)
- OS: Linux
- Local LLM: Ollama via Harbor (qwen2.5:14b, llama3.1:8b-instruct-q8_0)
Configuration
oh-my-opencode.json:
{
  "agents": {
    "oracle": {
      "model": "harbor-ollama/qwen2.5:14b"
    },
    "explore": {
      "model": "harbor-ollama/llama3.1:8b-instruct-q8_0"
    },
    "librarian": {
      "model": "harbor-ollama/llama3.1:8b-instruct-q8_0"
    }
  }
}
opencode.json provider:
"harbor-ollama": {
"npm": "@ai-sdk/openai-compatible",
"name": "Harbor/Ollama (overlord)",
"options": {
"baseURL": "http://overlord:33821/v1"
},
"models": {
"qwen2.5:14b": {
"name": "Qwen2.5 14B",
"tools": true,
"options": { "num_ctx": 16384 }
}
}
}
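One mitigation sketch, given that `options.num_ctx` is already forwarded per-model above: raise it past the ~25K-token prompts the subagents actually send (at the cost of additional VRAM on the Ollama host):

```json
"models": {
  "qwen2.5:14b": {
    "name": "Qwen2.5 14B",
    "tools": true,
    "options": { "num_ctx": 32768 }
  }
}
```

This only papers over the problem; the oversized prompt itself is the bug.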
Issue
Subagent calls (oracle, explore, librarian) send prompts of ~24-25K tokens to local models configured with 16K context windows. Ollama truncates the prompt, causing:
- Empty responses (oracle returns "(No text output)")
- Corrupted/garbage output when truncation breaks mid-instruction
Evidence
Ollama logs show consistent truncation warnings:
level=WARN msg="truncating input prompt" limit=16535 prompt=24775 keep=4 new=16535
level=WARN msg="truncating input prompt" limit=16535 prompt=25835 keep=4 new=16535
level=WARN msg="truncating input prompt" limit=16535 prompt=24357 keep=5 new=16535
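The overflow is easy to quantify from those warnings. A minimal sketch that parses the log format shown above and reports how many tokens were dropped:

```python
import re

# Field names match the Ollama warning format shown above
WARN_RE = re.compile(r'limit=(\d+) prompt=(\d+) keep=(\d+) new=(\d+)')

def truncation_report(log_line: str) -> dict:
    """Parse an Ollama 'truncating input prompt' warning line."""
    m = WARN_RE.search(log_line)
    if not m:
        raise ValueError("not a truncation warning")
    limit, prompt, keep, new = map(int, m.groups())
    return {"limit": limit, "prompt": prompt, "keep": keep,
            "dropped": prompt - new}

line = 'level=WARN msg="truncating input prompt" limit=16535 prompt=24775 keep=4 new=16535'
print(truncation_report(line))  # dropped=8240: a third of the prompt is lost
```

For the first log line above, 8,240 of 24,775 tokens are silently discarded, which explains both the empty and the garbage outputs.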
Analysis
The stock oracle system prompt from oh-my-opencode is only ~700 tokens. The 24K+ tokens appear to come from:
- The full Sisyphus main agent system prompt being passed to subagents
- Tool definitions
- Possibly conversation context
This works fine for API models (Claude, GPT, Gemini) with 128K+ context, but breaks local models.
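One quick way to sanity-check this hypothesis, assuming the assembled subagent prompt can be captured to a file, is a rough chars-per-token estimate:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English prose and code.
    return len(text) // 4

# e.g. a ~100 KB captured prompt estimates at ~25K tokens,
# in line with the prompt=24775 figure in the Ollama logs
print(estimate_tokens("x" * 100_000))  # 25000
```

If the captured prompt is ~100 KB, the orchestrator prompt and tool definitions, not the ~700-token oracle prompt, dominate the budget.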
Verification
- Direct Ollama API calls work perfectly:
  curl http://overlord:33821/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "qwen2.5:14b", "messages": [{"role": "user", "content": "What is 2+2?"}]}'
  # Returns: {"content": "4"}
- Other agents using the same model work (explore, librarian); they may have shorter prompts or different handling
- Oracle consistently fails with empty output
Expected Behavior
Subagents should:
- Use their own lightweight system prompts (e.g., the ~700 token oracle prompt)
- NOT include the full Sisyphus orchestrator prompt
- Or, at minimum, respect the model's context limit and truncate intelligently
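The "truncate intelligently" fallback could look roughly like this sketch (assuming OpenAI-style chat messages and the chars/4 heuristic; `fit_to_context` is a hypothetical helper, not an existing opencode API), which preserves the system prompt and newest turns while dropping the oldest history:

```python
def fit_to_context(messages, limit_tokens=16384, chars_per_token=4):
    """Drop oldest non-system messages until the estimated size fits the limit."""
    def est(msgs):
        return sum(len(m["content"]) for m in msgs) // chars_per_token
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and est(system + rest) > limit_tokens:
        rest.pop(0)  # drop the oldest turn first, keep the newest context
    return system + rest
```

Unlike Ollama's current behavior (keep=4 tokens from the head, chop the rest), this keeps whole messages, so no instruction is cut mid-sentence.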
Workaround
Currently using API models (google/gemini-2.5-pro) for oracle, which have sufficient context windows.
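For reference, the workaround amounts to this change in oh-my-opencode.json (assuming a google provider is already configured in opencode.json):

```json
"agents": {
  "oracle": { "model": "google/gemini-2.5-pro" }
}
```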
Related Issues