Commit 44d5ffa

Merge pull request #108 from getagentseal/fix/menubar-all-tab-stale-refresh
Release 0.8.0: model comparison, auto-refresh, menubar fix
2 parents bd43b15 + 6e4db43 commit 44d5ffa

6 files changed

Lines changed: 59 additions & 13 deletions

CHANGELOG.md

Lines changed: 16 additions & 0 deletions
@@ -2,6 +2,22 @@
 
 ## Unreleased
 
+## 0.8.0 - 2026-04-19
+
+### Added
+- **`codeburn compare` command.** Side-by-side model comparison across any two models in your session data. Interactive model picker, period switching, and provider filtering.
+- **Compare view in dashboard.** Press `c` in the TUI to enter compare mode. Arrow keys switch periods, `b` to return.
+- **Performance metrics.** One-shot rate, retry rate, and self-correction detection per model. Self-corrections are detected by scanning JSONL transcripts for tool error followed by retry patterns.
+- **Efficiency metrics.** Cost per call, cost per edit turn, output tokens per call, and cache hit rate.
+- **Per-category one-shot rates.** Breaks down one-shot success by task category (Coding, Debugging, Feature Dev, etc.) for each model.
+- **Working style comparison.** Delegation rate, planning rate (TaskCreate, TaskUpdate, TodoWrite), average tools per turn, and fast mode usage.
+- **TUI auto-refresh enabled by default.** Dashboard now refreshes every 30 seconds out of the box. Pass `--refresh 0` to disable. Closes #107.
+- **36 comparison tests.** Full coverage for metric computation, category breakdown, working style, self-correction scanning, and planning tool detection. Total suite: 274 tests.
+
+### Fixed
+- **Planning rate showed ~0% in model comparison.** Only counted `EnterPlanMode` (rarely used) instead of all planning tools (TaskCreate, TaskUpdate, TodoWrite, EnterPlanMode, ExitPlanMode). Now detects planning at the turn level across all five tool types.
+- **Menubar "All" tab showed stale data.** Three-layer caching (300s in-memory TTL, daily disk cache, 60s parser cache) prevented tab switches from showing fresh numbers. Cache TTL reduced from 300s to 30s, tab switches always fetch fresh data, background refresh interval reduced from 60s to 15s.
+
 ## 0.7.4 - 2026-04-19
 
 ### Added
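
The planning-rate fix described in the 0.8.0 changelog entry above can be illustrated with a minimal sketch. It assumes each turn has already been reduced to the list of tool names it invoked; the `Turn` shape and the `planningRate` helper are hypothetical and are not taken from `compare-stats.ts`. Only the five tool names come from the changelog.

```typescript
// Hypothetical turn shape: just the tool names invoked during one turn.
interface Turn {
  tools: string[]
}

// The five planning tools named in the 0.8.0 changelog entry.
const PLANNING_TOOLS = new Set([
  'TaskCreate',
  'TaskUpdate',
  'TodoWrite',
  'EnterPlanMode',
  'ExitPlanMode',
])

// Fraction of turns that used at least one planning tool.
// A turn counts once, no matter how many planning tools it called.
function planningRate(turns: Turn[]): number {
  if (turns.length === 0) return 0
  const planned = turns.filter((t) => t.tools.some((name) => PLANNING_TOOLS.has(name)))
  return planned.length / turns.length
}

const sampleTurns: Turn[] = [
  { tools: ['TodoWrite', 'Edit'] },
  { tools: ['Read', 'Bash'] },
  { tools: ['TaskCreate', 'TaskUpdate'] },
  { tools: ['Edit'] },
]
console.log(planningRate(sampleTurns)) // 0.5: two of four turns used a planning tool
```

Counting only `EnterPlanMode` on the same sample would give 0, which matches the "~0%" symptom the entry describes.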

README.md

Lines changed: 35 additions & 3 deletions
@@ -51,7 +51,7 @@ codeburn report -p 30days # rolling 30-day window
 codeburn report -p all # every recorded session
 codeburn report --from 2026-04-01 --to 2026-04-10 # exact date range
 codeburn report --format json # full dashboard data as JSON
-codeburn report --refresh 60 # auto-refresh every 60 seconds
+codeburn report --refresh 60 # auto-refresh every 60s (default: 30s)
 codeburn status # compact one-liner (today + month)
 codeburn status --format json
 codeburn export # CSV with today, 7 days, 30 days
@@ -60,7 +60,7 @@ codeburn optimize # find waste, get copy-paste fixes
 codeburn optimize -p week # scope the scan to last 7 days
 ```
 
-Arrow keys switch between Today / 7 Days / 30 Days / Month / All Time. Press `q` to quit, `1` `2` `3` `4` `5` as shortcuts. The dashboard also shows average cost per session and the five most expensive sessions across all projects.
+Arrow keys switch between Today / 7 Days / 30 Days / Month / All Time. Press `q` to quit, `1` `2` `3` `4` `5` as shortcuts, `c` to open model comparison. The dashboard auto-refreshes every 30 seconds by default (`--refresh 0` to disable). The dashboard also shows average cost per session and the five most expensive sessions across all projects.
 
 ### JSON output
 
@@ -176,7 +176,7 @@ The menu bar widget includes a currency picker with 17 common currencies. For an
 npx codeburn menubar
 ```
 
-One command: downloads the latest `.app`, installs into `~/Applications`, and launches it. Re-run with `--force` to reinstall. Native Swift + SwiftUI app lives in `mac/` (see `mac/README.md` for build details). Shows today's cost with a flame icon, opens a popover with agent tabs, period switcher (Today / 7 Days / 30 Days / Month / All), Trend / Forecast / Pulse / Stats / Plan insights, activity and model breakdowns, optimize findings, and CSV/JSON export. Refreshes live via FSEvents plus a 60-second poll.
+One command: downloads the latest `.app`, installs into `~/Applications`, and launches it. Re-run with `--force` to reinstall. Native Swift + SwiftUI app lives in `mac/` (see `mac/README.md` for build details). Shows today's cost with a flame icon, opens a popover with agent tabs, period switcher (Today / 7 Days / 30 Days / Month / All), Trend / Forecast / Pulse / Stats / Plan insights, activity and model breakdowns, optimize findings, and CSV/JSON export. Refreshes live via FSEvents plus a 15-second poll.
 
 ## What it tracks
 
@@ -250,6 +250,37 @@ Each finding shows the estimated token and dollar savings plus a ready-to-paste
 
 You can also open it inline from the dashboard: press `o` when a finding count appears in the status bar, `b` to return.
 
+## Compare
+
+Side-by-side model comparison across any two models in your session data. Pick any pair and see how they stack up on real usage from your own sessions.
+
+```bash
+codeburn compare # interactive model picker (default: all time)
+codeburn compare -p week # last 7 days
+codeburn compare -p today # today only
+codeburn compare --provider claude # Claude Code sessions only
+```
+
+Or press `c` in the dashboard to enter compare mode. Arrow keys switch periods, `b` to return.
+
+**Metrics compared**
+
+| Section | Metric | What it measures |
+|---------|--------|-----------------|
+| Performance | One-shot rate | Edits that succeed without retries |
+| Performance | Retry rate | Average retries per edit turn |
+| Performance | Self-correction | Turns where the model corrected its own mistake |
+| Efficiency | Cost / call | Average cost per API call |
+| Efficiency | Cost / edit | Average cost per edit turn |
+| Efficiency | Output tok / call | Average output tokens per call |
+| Efficiency | Cache hit rate | Proportion of input from cache |
+
+**Per-category one-shot rates.** Breaks down one-shot success by task category (Coding, Debugging, Feature Dev, etc.) so you can see where each model excels or struggles.
+
+**Working style.** Compares delegation rate (agent spawns), planning rate (TaskCreate, TaskUpdate, TodoWrite usage), average tools per turn, and fast mode usage.
+
+All metrics are computed from your local session data. No LLM calls, fully deterministic.
+
 ## How it reads data
 
 **Claude Code** stores session transcripts as JSONL at `~/.claude/projects/<sanitized-path>/<session-id>.jsonl`. Each assistant entry contains model name, token usage (input, output, cache read, cache write), tool_use blocks, and timestamps.
@@ -280,6 +311,7 @@ src/
 parser.ts JSONL reader, dedup, date filter, provider orchestration
 models.ts LiteLLM pricing, cost calculation
 classifier.ts 13-category task classifier
+compare-stats.ts Model comparison engine (metrics, category breakdown, working style)
 types.ts Type definitions
 format.ts Text rendering (status bar)
 menubar-json.ts Payload builder consumed by the native macOS menubar app in mac/
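
As a rough illustration of the Performance rows in the Compare table added to the README above, the two rates could be computed as follows. This is a sketch under stated assumptions: the `EditTurn` record and both helper names are hypothetical, and the real logic in `compare-stats.ts` may define an edit turn or a retry differently.

```typescript
// Hypothetical per-edit-turn record: how many retries the edit needed.
interface EditTurn {
  retries: number
}

// One-shot rate: share of edit turns that succeeded with zero retries.
function oneShotRate(turns: EditTurn[]): number {
  if (turns.length === 0) return 0
  return turns.filter((t) => t.retries === 0).length / turns.length
}

// Retry rate: average number of retries per edit turn.
function retryRate(turns: EditTurn[]): number {
  if (turns.length === 0) return 0
  const total = turns.reduce((sum, t) => sum + t.retries, 0)
  return total / turns.length
}

const edits: EditTurn[] = [{ retries: 0 }, { retries: 2 }, { retries: 0 }, { retries: 1 }]
console.log(oneShotRate(edits)) // 0.5
console.log(retryRate(edits)) // 0.75
```

Both functions are pure and depend only on local data, which is consistent with the README's claim that the metrics are deterministic and need no LLM calls.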

mac/Sources/CodeBurnMenubar/AppStore.swift

Lines changed: 3 additions & 5 deletions
@@ -1,7 +1,7 @@
 import Foundation
 import Observation
 
-private let cacheTTLSeconds: TimeInterval = 300
+private let cacheTTLSeconds: TimeInterval = 30
 
 struct CachedPayload {
     let payload: MenubarPayload
@@ -52,17 +52,15 @@ final class AppStore {
         payload.optimize.findingCount
     }
 
-    /// Switch to a period. Uses cached payload if fresh; otherwise fetches.
+    /// Switch to a period. Always fetches fresh data so the user never sees stale numbers.
     func switchTo(period: Period) async {
         selectedPeriod = period
-        if let cached = cache[currentKey], cached.isFresh { return }
         await refresh(includeOptimize: true)
     }
 
-    /// Switch to a provider filter. Uses cached payload if fresh; otherwise fetches.
+    /// Switch to a provider filter. Always fetches fresh data so the user never sees stale numbers.
     func switchTo(provider: ProviderFilter) async {
         selectedProvider = provider
-        if let cached = cache[currentKey], cached.isFresh { return }
         await refresh(includeOptimize: true)
     }
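
The behavioral change in `AppStore.swift` above can be sketched outside Swift. The example below is a hypothetical TypeScript analogue (chosen to match the CLI's language), not a port of the app: `createStore`, `switchToCached`, and `switchToFresh` are illustrative names, and only the 30-second TTL and the "skip if fresh" versus "always fetch" distinction come from the diff.

```typescript
const CACHE_TTL_MS = 30_000 // mirrors cacheTTLSeconds = 30 in the diff

interface CachedPayload<T> {
  payload: T
  fetchedAt: number // epoch milliseconds
}

function isFresh<T>(entry: CachedPayload<T>, now: number): boolean {
  return now - entry.fetchedAt < CACHE_TTL_MS
}

// Hypothetical store; the clock is injected so the example is deterministic.
function createStore<T>(fetcher: (key: string) => T, clock: () => number) {
  const cache = new Map<string, CachedPayload<T>>()
  let fetchCount = 0

  function refresh(key: string): T {
    fetchCount += 1
    const payload = fetcher(key)
    cache.set(key, { payload, fetchedAt: clock() })
    return payload
  }

  return {
    // Before the commit: skip the fetch while the cached entry is fresh.
    switchToCached(key: string): T {
      const cached = cache.get(key)
      if (cached !== undefined && isFresh(cached, clock())) return cached.payload
      return refresh(key)
    },
    // After the commit: every switch refetches and overwrites the cache.
    switchToFresh(key: string): T {
      return refresh(key)
    },
    fetches: () => fetchCount,
  }
}

let now = 0
const store = createStore((key) => `${key}@${now}`, () => now)
store.switchToFresh('all') // first fetch, at t = 0
now = 10_000
const stale = store.switchToCached('all') // within TTL: returns the t = 0 payload
const fresh = store.switchToFresh('all') // refetches at t = 10s
console.log(stale, fresh, store.fetches())
```

The trade-off is more fetches per tab switch in exchange for never showing a payload older than the moment of the switch.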

mac/Sources/CodeBurnMenubar/CodeBurnApp.swift

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ import SwiftUI
 import AppKit
 import Observation
 
-private let refreshIntervalSeconds: UInt64 = 60
+private let refreshIntervalSeconds: UInt64 = 15
 private let nanosPerSecond: UInt64 = 1_000_000_000
 private let refreshIntervalNanos: UInt64 = refreshIntervalSeconds * nanosPerSecond
 /// Fixed so the popover's anchor point doesn't shift each time today's cost changes.

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "codeburn",
-  "version": "0.7.4",
+  "version": "0.8.0",
   "description": "See where your AI coding tokens go - by task, tool, model, and project",
   "type": "module",
   "main": "./dist/cli.js",

src/cli.ts

Lines changed: 3 additions & 3 deletions
@@ -247,7 +247,7 @@ program
   .option('--format <format>', 'Output format: tui, json', 'tui')
   .option('--project <name>', 'Show only projects matching name (repeatable)', collect, [])
   .option('--exclude <name>', 'Exclude projects matching name (repeatable)', collect, [])
-  .option('--refresh <seconds>', 'Auto-refresh interval in seconds', parseInt)
+  .option('--refresh <seconds>', 'Auto-refresh interval in seconds (0 to disable)', parseInt, 30)
   .action(async (opts) => {
     let customRange: DateRange | null = null
     try {
@@ -502,7 +502,7 @@ program
   .option('--format <format>', 'Output format: tui, json', 'tui')
   .option('--project <name>', 'Show only projects matching name (repeatable)', collect, [])
   .option('--exclude <name>', 'Exclude projects matching name (repeatable)', collect, [])
-  .option('--refresh <seconds>', 'Auto-refresh interval in seconds', parseInt)
+  .option('--refresh <seconds>', 'Auto-refresh interval in seconds (0 to disable)', parseInt, 30)
   .action(async (opts) => {
     if (opts.format === 'json') {
       await runJsonReport('today', opts.provider, opts.project, opts.exclude)
@@ -518,7 +518,7 @@ program
   .option('--format <format>', 'Output format: tui, json', 'tui')
   .option('--project <name>', 'Show only projects matching name (repeatable)', collect, [])
   .option('--exclude <name>', 'Exclude projects matching name (repeatable)', collect, [])
-  .option('--refresh <seconds>', 'Auto-refresh interval in seconds', parseInt)
+  .option('--refresh <seconds>', 'Auto-refresh interval in seconds (0 to disable)', parseInt, 30)
   .action(async (opts) => {
     if (opts.format === 'json') {
       await runJsonReport('month', opts.provider, opts.project, opts.exclude)
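
One detail in the `--refresh` change above may be worth verifying. Commander's documentation for custom option processing says the parser callback receives two arguments: the raw option string and the previous value. If that applies here, adding the default `30` means `parseInt` would receive it as the radix. The sketch below reproduces that call shape without Commander; `applyParser` and `parseSeconds` are hypothetical helpers and are not part of the commit.

```typescript
// Simulates how an option parser callback is invoked: (rawValue, previousValue).
// Assumption: on first use, the previous value is the declared default (30).
type OptionParser = (value: string, previous: number) => number

function applyParser(parser: OptionParser, raw: string, defaultValue: number): number {
  return parser(raw, defaultValue)
}

// Hypothetical wrapper that ignores the previous value and pins the radix to 10.
function parseSeconds(value: string, _previous: number): number {
  const seconds = Number.parseInt(value, 10)
  if (Number.isNaN(seconds) || seconds < 0) {
    throw new Error(`Invalid refresh interval: ${value}`)
  }
  return seconds
}

// Passing parseInt directly treats the default as the radix: "60" read in base 30.
const viaParseInt = applyParser(parseInt, '60', 30)
const viaWrapper = applyParser(parseSeconds, '60', 30)
const disabled = applyParser(parseSeconds, '0', 30)
console.log(viaParseInt, viaWrapper, disabled) // 180 60 0
```

Before this commit no default was declared, so the second argument would have been `undefined` and `parseInt` would have fallen back to base 10, which may be why the issue would only surface now.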
