Negative output token count when reasoning > outputTokens (Kilo gateway / Moonshot kimi-k2.5 thinking) #9168

## Summary

A user-reported session export contains an assistant message with a **negative** `output` token count (`-555`). The root cause is an unclamped subtraction in `getUsage`, triggered when the Kilo gateway reported more reasoning tokens than output tokens for Moonshot `kimi-k2.5` (thinking variant).

## Repro data

- Session: `ses_2612c7860ffeWRHlgvs83y9pV2` (export provided by user)
- Offending message: `msg_d9ed3e053001D3EyHWNlRsnlI4`
- Provider: `kilo` (Kilo gateway)
- Model: `moonshotai/kimi-k2.5`, variant `thinking`
- Agent: `code-review`

Stored tokens on the message:

| field       |  value |
| ----------- | -----: |
| total       | 55,963 |
| input       |  3,746 |
| **output**  | **-555** |
| reasoning   |  3,364 |
| cache.read  | 49,408 |
| cache.write |      0 |

Reconstructing the raw gateway response (`raw outputTokens = stored.output + stored.reasoning`):

- raw `outputTokens` = 2,809
- raw `reasoningTokens` = 3,364
- `totalTokens` = 55,963 ✓ (matches `input + rawOutput + cacheRead`)

So the Kilo gateway reported `reasoningTokens (3364) > outputTokens (2809)` for this turn. Every other assistant message in the same session had `reasoning ≤ rawOutput` and rendered fine.
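The reconstruction above can be checked with a few lines of arithmetic. This is only a sketch: the `stored` object and its field names are illustrative stand-ins for the exported message, not the actual export schema.

```ts
// Token buckets copied from the table above (field names are illustrative).
const stored = {
  total: 55963,
  input: 3746,
  output: -555,
  reasoning: 3364,
  cacheRead: 49408,
  cacheWrite: 0,
};

// Undo the subtraction applied at storage time:
// stored.output = rawOutputTokens - reasoningTokens
const rawOutputTokens = stored.output + stored.reasoning;

// Cross-check against the total reported for the turn.
const reconstructedTotal = stored.input + rawOutputTokens + stored.cacheRead;

console.log(rawOutputTokens); // 2809
console.log(reconstructedTotal === stored.total); // true
console.log(stored.reasoning - rawOutputTokens); // 555, the size of the overshoot
```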

## Root cause

`packages/opencode/src/session/index.ts:295`:

```ts
output: outputTokens - reasoningTokens,
```

This assumes the Vercel AI SDK v6 convention that `outputTokens` includes `reasoningTokens`, so subtraction yields "visible output". The gateway violated that invariant here, producing a negative value that then propagates into `step-finish.tokens`, the UI, and session exports.

## Impact

- **Display:** negative numbers surface in the TUI, VS Code extension, and session export JSON.
- **Stats:** `packages/opencode/src/cli/cmd/stats.ts:214` and `:274` sum `tokens.output` across messages, so lifetime stats are biased low when this quirk hits.
- **Cost:** not affected for the `kilo` provider in this session, because `KiloSession.providerCost` returns the gateway-reported cost before the token-based formula runs (`session/index.ts:304-309`). If the calculation fell through to the formula, the negative `output * rate` term would be offset by `reasoning * rate` (both use `costInfo.output`, at `:320` and `:325`), so the net cost would stay correct but the per-bucket breakdown would go negative.
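To illustrate the cost point, here is a minimal sketch of the offset, assuming both buckets are billed at one flat output rate. The rate below is a made-up integer in micro-dollars per token, chosen only to keep the arithmetic exact; it is not a real price and this is not the code in `getUsage`.

```ts
// Hypothetical flat output rate in micro-dollars per token (not a real price).
const outputRateMicroUsd = 2;

// Buckets as stored on the offending message.
const storedOutput = -555;
const storedReasoning = 3364;
const rawOutput = storedOutput + storedReasoning; // 2809

// Per-bucket cost, assuming output and reasoning share the same rate.
const outputCost = storedOutput * outputRateMicroUsd; // -1110 (negative bucket)
const reasoningCost = storedReasoning * outputRateMicroUsd; // 6728

// The negative bucket cancels against the reasoning bucket.
const netCost = outputCost + reasoningCost; // 5618
const expectedCost = rawOutput * outputRateMicroUsd; // 5618

console.log({ outputCost, reasoningCost, netCost, expectedCost });
```

The net matches `rawOutput * rate` only because the two buckets share a rate. If a provider priced reasoning tokens differently from visible output, the negative bucket would skew the total as well.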

## Two things to investigate

1. **Gateway side:** why does Moonshot `kimi-k2.5` (thinking) via the Kilo gateway report `reasoningTokens > outputTokens` for some tool-call turns? Likely double-counting or truncation in the thinking-variant usage mapping.
2. **CLI side:** `getUsage` should be defensive against this regardless. Suggested fix at `session/index.ts:295`:

    ```ts
    output: Math.max(0, outputTokens - reasoningTokens),
    ```

   It is also worth logging a warning when `reasoningTokens > outputTokens`, so we can measure how often this occurs and for which providers and models.
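The CLI-side suggestion could look roughly like the sketch below. `splitOutputTokens`, `OutputSplit`, and the injected `warn` callback are hypothetical names used for illustration; the real `getUsage` has a different shape and its own logger.

```ts
// Hypothetical helper, not the actual getUsage implementation.
type OutputSplit = { output: number; reasoning: number };

function splitOutputTokens(
  outputTokens: number,
  reasoningTokens: number,
  warn: (message: string) => void = console.warn,
): OutputSplit {
  // Surface the broken invariant so its frequency can be measured.
  if (reasoningTokens > outputTokens) {
    warn(
      `reasoningTokens (${reasoningTokens}) > outputTokens (${outputTokens}); clamping output to 0`,
    );
  }
  return {
    // Clamp so the stored "visible output" bucket is never negative.
    output: Math.max(0, outputTokens - reasoningTokens),
    reasoning: reasoningTokens,
  };
}

// Values from the offending message; warnings are collected instead of printed.
const warnings: string[] = [];
const split = splitOutputTokens(2809, 3364, (message) => {
  warnings.push(message);
});

console.log(split.output); // 0
console.log(warnings.length); // 1
```

Note that clamping hides the inconsistency rather than repairing it: for this message the stored `output + reasoning` would be 3,364, which still exceeds the raw `outputTokens` of 2,809. The warning is what makes the gateway-side problem measurable.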

## Files

- `packages/opencode/src/session/index.ts:258-329` — `getUsage`
- `packages/opencode/src/cli/cmd/stats.ts:214`, `:274` — downstream aggregation

