Summary
A user-reported session export contains an assistant message with a negative output token count (-555). Root cause is an unclamped subtraction in getUsage combined with inconsistent usage accounting from the Kilo gateway for Moonshot kimi-k2.5 (thinking variant).
Repro data
Session: ses_2612c7860ffeWRHlgvs83y9pV2 (export provided by user)
Offending message: msg_d9ed3e053001D3EyHWNlRsnlI4
Provider: kilo (Kilo gateway)
Model: moonshotai/kimi-k2.5, variant thinking
Agent: code-review
Stored tokens on the message:
| field | value |
| --- | --- |
| total | 55,963 |
| input | 3,746 |
| output | -555 |
| reasoning | 3,364 |
| cache.read | 49,408 |
| cache.write | 0 |
Reconstructing the raw gateway response (raw outputTokens = stored.output + stored.reasoning):
- raw outputTokens = 2,809
- raw reasoningTokens = 3,364
- totalTokens = 55,963 ✓ (matches input + rawOutput + cacheRead)
So the Kilo gateway reported reasoningTokens (3364) > outputTokens (2809) for this turn. Every other assistant message in the same session had reasoning ≤ rawOutput and rendered fine.
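The reconstruction above can be checked mechanically. The following is a minimal sketch, not code from the repository: the `StoredTokens` interface and `reconstructRawOutput` helper are hypothetical names, and the only inputs are the stored values from the table.

```typescript
// Stored token buckets for msg_d9ed3e053001D3EyHWNlRsnlI4, copied from the table.
// The interface name and shape are illustrative, not the real message schema.
interface StoredTokens {
  total: number
  input: number
  output: number
  reasoning: number
  cache: { read: number; write: number }
}

const stored: StoredTokens = {
  total: 55_963,
  input: 3_746,
  output: -555,
  reasoning: 3_364,
  cache: { read: 49_408, write: 0 },
}

// getUsage stored output = rawOutput - reasoning, so adding reasoning back
// recovers the outputTokens value the gateway reported.
function reconstructRawOutput(t: StoredTokens): number {
  return t.output + t.reasoning
}

const rawOutput = reconstructRawOutput(stored)
const expectedTotal = stored.input + rawOutput + stored.cache.read

console.log({
  rawOutput,
  expectedTotal,
  totalMatches: expectedTotal === stored.total,
  reasoningExceedsOutput: stored.reasoning > rawOutput,
})
```

The total only reconciles when the raw outputTokens value is counted once, which suggests the gateway's totalTokens is internally consistent and the anomaly is confined to the reasoningTokens figure.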
Root cause
packages/opencode/src/session/index.ts:295:
output: outputTokens - reasoningTokens,
This assumes the Vercel AI SDK v6 convention that outputTokens includes reasoningTokens, so subtraction yields "visible output". The gateway violated that invariant here, producing a negative value that then propagates into step-finish.tokens, the UI, and session exports.
Impact
- Display: negative numbers surface in the TUI, VS Code extension, and session export JSON.
- Stats: packages/opencode/src/cli/cmd/stats.ts:214 and :274 sum tokens.output across messages, so lifetime stats are biased low whenever an affected message is included.
- Cost: not affected for the kilo provider in this session, because KiloSession.providerCost returns the gateway-reported cost before the token-based formula runs (session/index.ts:304-309). If the calculation fell through to the formula, the negative output * rate would be offset by reasoning * rate (both use costInfo.output, at :320 and :325), so the net cost would stay correct but the per-bucket breakdown would go negative.
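To make the cost offset concrete, here is a small worked example. It is a sketch under stated assumptions: the rate is a made-up number (not Kilo or Moonshot pricing), `bucketCost` is a hypothetical helper, and the formula is assumed to be tokens times a per-million rate applied identically to the output and reasoning buckets.

```typescript
// Hypothetical price in dollars per million output tokens. NOT real pricing.
const outputRatePerMillion = 2.5

// Assumed shape of the per-bucket formula: tokens * rate / 1e6.
function bucketCost(tokens: number, ratePerMillion: number): number {
  return (tokens * ratePerMillion) / 1_000_000
}

// Buckets as stored on the offending message.
const visibleOutputCost = bucketCost(-555, outputRatePerMillion) // negative bucket
const reasoningCost = bucketCost(3_364, outputRatePerMillion)

// What the gateway's raw outputTokens (2,809) would cost at the same rate.
const rawOutputCost = bucketCost(2_809, outputRatePerMillion)

const netCost = visibleOutputCost + reasoningCost

console.log({ visibleOutputCost, reasoningCost, netCost, rawOutputCost })
```

Because both buckets share one rate, the sum collapses to (output + reasoning) * rate, which is the raw outputTokens cost. The individual output bucket is still negative, which is what would leak into any per-bucket cost display.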
Two things to investigate
- Gateway side: why does Moonshot kimi-k2.5 (thinking) via the Kilo gateway report reasoningTokens > outputTokens for some tool-call turns? Likely double-counting or truncation in the thinking-variant usage mapping.
- CLI side: getUsage should be defensive against this regardless. Suggested fix at session/index.ts:295:
  output: Math.max(0, outputTokens - reasoningTokens),
  Also worth logging a warning when reasoningTokens > outputTokens so we can measure how often this occurs and across which providers/models.
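A minimal sketch of the clamp plus warning, written as a standalone helper so it can be tested in isolation. The names `RawUsage` and `visibleOutput` and the injectable `warn` callback are hypothetical; the real getUsage has a different shape and would use the project's own logger.

```typescript
// Illustrative subset of the usage object; field names follow the issue text.
interface RawUsage {
  outputTokens: number
  reasoningTokens: number
}

// Returns the "visible output" bucket, clamped at zero.
// Emits a warning when the provider breaks the assumption that
// outputTokens includes reasoningTokens.
function visibleOutput(
  usage: RawUsage,
  warn: (message: string) => void = console.warn,
): number {
  const diff = usage.outputTokens - usage.reasoningTokens
  if (diff < 0) {
    warn(
      `reasoningTokens (${usage.reasoningTokens}) > outputTokens (${usage.outputTokens}); clamping output to 0`,
    )
  }
  return Math.max(0, diff)
}

// Usage: the offending turn clamps to 0 and warns; a well-formed turn passes through.
const warnings: string[] = []
const clamped = visibleOutput({ outputTokens: 2_809, reasoningTokens: 3_364 }, (m) => warnings.push(m))
const normal = visibleOutput({ outputTokens: 5_000, reasoningTokens: 1_000 }, (m) => warnings.push(m))
console.log({ clamped, normal, warnings })
```

One caveat, assuming the formula path bills both buckets at costInfo.output as described under Impact: once output is clamped to 0, that formula would bill 0 + 3,364 tokens rather than the 2,809 the gateway reported, so the cost calculation may need to work from the raw outputTokens rather than the clamped bucket.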
Files
packages/opencode/src/session/index.ts:258-329 — getUsage
packages/opencode/src/cli/cmd/stats.ts:214, :274 — downstream aggregation