Skip to content

Pilot vitest+v8 code coverage on ten packages#27093

Draft
tylerbutler wants to merge 5 commits intomicrosoft:mainfrom
tylerbutler:wt1
Draft

Pilot vitest+v8 code coverage on ten packages#27093
tylerbutler wants to merge 5 commits intomicrosoft:mainfrom
tylerbutler:wt1

Conversation

@tylerbutler
Copy link
Copy Markdown
Member

@tylerbutler tylerbutler commented Apr 18, 2026

Description

Pilots vitest with its v8 coverage provider as an alternative to the current c8-based test:coverage script on 10 packages across the runtime, loader, DDS, and common utility layers.

Motivation. Repo-wide code coverage was disabled in CI in April 2026 (tools/pipelines/build-client.yml, testCoverage: false) because c8 performance regressed on Node 22 and further on Node 24, causing test timeouts. This pilot introduces a faster alternative without touching the existing c8 path or CI config.

What this PR does

  • Adds test:coverage:vitest to each of 10 packages.
  • Adds vitest.config.ts per package, with package-specific excludes for fuzz/benchmark/describe-level-this.timeout tests that don't work under strict-ESM vitest.
  • Adds common/build/build-common/vitest-test-setup.mjs — a pre-compiled ESM shim shared by all pilot packages. It aliases mocha's before/after to vitest's beforeAll/afterAll, strips the optional leading-string arg from beforeEach/afterEach (mocha allows beforeEach("name", fn), vitest doesn't), no-ops this.timeout()/this.retries()/this.skip() on the test context, and stubs getTestLogger (normally set by @fluid-internal/mocha-test-setup's beforeAll).
  • Adds scripts/bench-coverage.sh + scripts/bench-coverage-all.sh — a hyperfine-based harness for reproducible c8-vs-vitest timing comparisons, per-package or across the pilot set.

Packages

Layer Package Runs against
DDS @fluidframework/tree lib/** (decorators)
DDS @fluidframework/map src/**
DDS @fluidframework/matrix src/**
DDS @fluidframework/merge-tree src/**
Runtime @fluidframework/container-runtime src/**
Runtime @fluidframework/runtime-utils src/**
Runtime @fluidframework/id-compressor src/**
Loader @fluidframework/container-loader src/**
Common @fluidframework/core-utils src/**
Common @fluid-internal/client-utils src/**

Why one package runs against lib/** and the rest against src/**

Vitest runs directly on source TypeScript via its esbuild transform. It's the idiomatic approach and means no build step is required before running coverage. 9 of the 10 packages use this.

@fluidframework/tree is the exception. It uses the standard-ES decorator syntax (@breakingClass, @breakingMethod in src/util/breakable.ts). OXC — vitest 4's TS transformer — doesn't currently lower these to runtime __decorate calls, so Node's VM can't execute the transformed source. tsc-compiled lib/** has already done the lowering, so pointing vitest there sidesteps the issue. The v8 coverage provider follows source maps back to src/** for line-accurate reporting. If OXC gains decorator-lowering support, the tree config can flip.

Timing (hyperfine, N=3 runs + 1 warmup, macOS on Node 24)

Measured via scripts/bench-coverage-all.sh. Full per-package .md + .json outputs land in .benchmarks/ locally (gitignored).

Package c8+mocha (s) vitest+v8 (s) Speedup
core-utils 1.66 ± 0.02 0.99 ± 0.02 1.7×
client-utils 5.09 ± 0.24 0.99 ± 0.01 5.1×
runtime-utils 1.71 ± 0.02 1.27 ± 0.02 1.3×
id-compressor 10.06 ± 0.19 1.07 ± 0.02 9.4×
container-loader 2.22 ± 0.07 1.84 ± 0.04 1.2×
container-runtime 9.75 ± 0.18 4.11 ± 0.21 2.4×
map 8.09 ± 0.03 1.61 ± 0.01 5.0×
matrix 401.10 ± 29.36 13.62 ± 0.07 29.4×
merge-tree 775.64 ± 12.88 10.83 ± 0.16 71.6×
tree 151.35 ± 4.43 25.84 ± 0.21 5.9×
Sum of means 1366.67 62.16 22.0×

merge-tree and matrix are the dramatic cases — their c8+mocha paths spend 13 min and 6.7 min per run respectively; vitest+v8 brings them under 15 seconds. Smaller utility packages see modest wins because their tests are already fast; gains scale with test-suite size.

Coverage

Package Tests Pass % Cov (stmt)
map 157/157 100% 92.75%
matrix 274/274 100% 90.23%
merge-tree 893/895 (+2s) 99.8% 84.60%
container-runtime 902/904 99.8% 77.84%
container-loader 224/224 100% 66.84%
runtime-utils 92/93 98.9% 56.41%
client-utils 30/30 100% 50.00%
core-utils 73/73 100% 33.51%
id-compressor 24/24 100% 31.03%
tree 7190/7770 92.5% 91.58%

Coverage caveats

Coverage numbers are not strictly apples-to-apples with the c8 baseline:

  • Vitest excludes fuzz/perf/benchmark/this.timeout-at-describe-scope test files that it can't execute; their source contributions are missing from the numerator.
  • For packages running against src/**, coverage reflects what the tests actually exercise — not what merely gets imported by the tsc-compiled output. For some packages (core-utils notably) this drops the % noticeably vs. the lib/** strategy, but the denominator is unchanged.
  • client-utils coverage is limited to the mocha suite; the jest suite is out of scope for this pilot.

Known vitest-only test failures

Documented, not fixed here. None are vitest-compat issues — they're real assertions that behave differently under a single-ESM-pass vitest vs. the multi-pass mocha run:

  • tree — 266 failures in codec/snapshot round-trip suites (likely filesystem/CWD-sensitive).
  • container-runtime — 2 assertion edge cases.
  • runtime-utils — 1 test has a nested it() inside another it(); mocha tolerates it, vitest rejects (this is genuinely a bug in the test).
  • merge-tree — 2 skipped (existing it.skip in the source, not a vitest issue).

Reproducing the timings

# Per package:
scripts/bench-coverage.sh @fluidframework/map

# All 10:
scripts/bench-coverage-all.sh

# Env knobs:
BENCH_MODE=vitest scripts/bench-coverage.sh @fluidframework/tree 5 2

Requires hyperfine (brew install hyperfine). Outputs: .benchmarks/bench-<slug>.{md,json}.

Reviewer Guidance

The review process is outlined on this wiki page.

Draft until we've decided:

  1. Is common/build/build-common the right home for vitest-test-setup.mjs? The file is pre-compiled ESM (no build step required), consistent with build-common's role as the canonical place for shared build/test configs (tsconfig.test.node16.json, api-extractor-base.json, etc.). Alternative would be a new @fluid-internal/vitest-test-setup package paralleling @fluid-internal/mocha-test-setup.
  2. Is 10 packages the right pilot scope, or should we go further (or narrower) before committing to the approach?
  3. Not touching CI. Reenabling coverage in CI is intentionally out of scope — we can do that in a follow-up once the approach is agreed.

How to try it

# src/** packages — no build step needed
pnpm --filter @fluidframework/map run test:coverage:vitest
open packages/dds/map/nyc/report-vitest/index.html

# tree still needs a build
pnpm --filter @fluidframework/tree run build
pnpm --filter @fluidframework/tree run test:coverage:vitest
open packages/dds/tree/nyc/report-vitest/index.html

Adds a local-only `test:coverage:vitest` script to tree, map,
container-runtime, and container-loader. Coverage runs via vitest's
v8 provider against the tsc-compiled lib/** output, with source maps
mapping back to src/**. A shared mocha-compat shim lives in
common/build/build-common/vitest-test-setup.mjs (pre-compiled ESM
so no build step is needed).

Does not touch CI configuration. The c8-based `test:coverage`
script remains intact alongside.

Context: repo-wide code coverage was disabled in CI in April 2026
(tools/pipelines/build-client.yml, testCoverage: false) because
c8 regressed on Node 22/24 and caused timeouts. Head-to-head
timings on this branch (Node 24, local, sequential):

| Package           | vitest+v8 | c8+mocha | Speedup |
|-------------------|-----------|----------|---------|
| map               | 1.36s     | 7.33s    | 5.4x    |
| container-loader  | 1.39s     | 1.73s    | 1.2x    |
| container-runtime | 3.35s     | 8.48s    | 2.5x    |
| tree              | 22.63s    | 133.63s  | 5.9x    |
@tylerbutler tylerbutler self-assigned this Apr 18, 2026
Adds `test:coverage:vitest` to core-utils, client-utils, runtime-utils,
id-compressor, matrix, and merge-tree. Follows the same pattern as the
initial four pilots (tree, map, container-runtime, container-loader),
reusing the shared setup file at common/build/build-common/vitest-test-setup.mjs.

Per-package excludes were tailored:
- core-utils: excludes src/test/bench/** (uses @fluid-tools/benchmark)
- client-utils: only covers the mocha suite under lib/test/mocha/**;
  jest suite under lib/test/jest/** is separate
- matrix: excludes memory/, time/, *.stress/fuzz/big.spec.js
- merge-tree: excludes *.perf.spec.js, *Farm.spec.js, beastTest.spec.js
- runtime-utils, id-compressor: standard excludes only

Coverage results (Node 24, local):

| Package         | Tests         | Pass % | Cov (stmt) | Wall |
|-----------------|---------------|--------|------------|------|
| core-utils      | 73/73         | 100%   | 65.08%     | 1.0s |
| client-utils    | 30/30         | 100%   | 43.24%     | 0.8s |
| runtime-utils   | 92/93         | 98.9%  | 52.38%     | 1.1s |
| id-compressor   | 24/28         | 85.7%  | 31.24%     | 1.0s |
| matrix          | 274/274       | 100%   | 90.13%     | ~5s  |
| merge-tree      | 893/895 (+2s) | 99.8%  | 84.60%     | 9.6s |

client-utils coverage is intentionally limited to what the mocha subset
exercises; raising it would require also running the jest suite, which
is out of scope for this pilot.
Switches the 9 non-tree pilot packages to run vitest directly against
`src/test/**/*.{test,spec}.ts` instead of the tsc-compiled `lib/test/**`
output. This is the idiomatic vitest approach — esbuild transforms TS
on the fly, no prior build step is required.

The `tree` pilot stays on `lib/**` because OXC currently doesn't lower
the standard-ES decorator syntax used by
src/util/breakable.ts (`@breakingClass`/`@breakingMethod`). Running
tsc-compiled output sidesteps that.

Also adds a few minor excludes discovered during verification:
- map: `*.snapshot.spec.ts` (file asserts on `_dirname` ending in
  `(dist|lib)/test/mocha`) and `*FuzzTests.spec.ts`
- id-compressor: `src/test/snapshots/**` (same build-path assertion
  pattern) and `idCompressor.spec.ts` (chains `.timeout()` on `it()`)
- matrix: `src/test/time/**` (benchmark suite missed in first pass)

Coverage results after migration (Node 24, local):

| Package           | Tests         | Pass % | Cov (stmt) | Wall |
|-------------------|---------------|--------|------------|------|
| core-utils        | 73/73         | 100%   | 33.51%     | 0.6s |
| client-utils      | 30/30         | 100%   | 50.00%     | 0.7s |
| runtime-utils     | 92/93         | 98.9%  | 56.41%     | 0.8s |
| id-compressor     | 24/24         | 100%   | 31.03%     | 0.9s |
| matrix            | 274/274       | 100%   | 90.23%     | ~11s |
| map               | 157/157       | 100%   | 92.75%     | 1.3s |
| container-loader  | 224/224       | 100%   | 66.84%     | 1.3s |
| container-runtime | 902/904       | 99.8%  | 77.84%     | 3.1s |
| merge-tree        | 893/895 (+2s) | 99.8%  | 84.60%     | 9.7s |

Statement-count denominators are identical to the lib/** runs, which
confirms the coverage provider is walking the same source tree. The
numerators differ for some packages (notably core-utils) because
running against transformed source evaluates less module-level code
than executing the tsc-compiled output — i.e. coverage under
vitest+src/** reflects what the tests actually exercise rather than
what merely gets imported.
`scripts/bench-coverage.sh <pnpm-filter>` runs hyperfine against one
pilot package, comparing `test:coverage` (c8+mocha) to `test:coverage:vitest`
(vitest+v8) with warmup, multiple runs, and markdown + JSON export to
`.benchmarks/`. The wrapper `scripts/bench-coverage-all.sh` loops over
all 10 pilot packages sequentially, smallest → largest, so a failure
on tree doesn't abort the smaller packages' results.

Design choices:
- `--setup "pnpm --filter X run build"` runs once, not per iteration:
  tree's vitest and every package's c8+mocha need the build to exist.
- `--prepare "rm -rf nyc/report* nyc/.nyc_output"` before every run so
  the second measured run doesn't short-circuit coverage file writes.
- `BENCH_MODE=compare|c8|vitest` env toggle when only one tool's
  timing is of interest (e.g. debugging a regression in just one path).
- No new npm deps. hyperfine is a system binary; script errors with
  an install hint if it's missing.
- Output dir `.benchmarks/` added to .gitignore.
- `biome check --write` collapsed the short exclude arrays to one line
  in the 8 non-tree, non-container-runtime vitest configs (container-runtime
  was already compliant). This unblocks `pnpm run build` on those packages,
  which the hyperfine bench's --setup step depends on.
- Pass `-i`/`--ignore-failure` to hyperfine so the bench doesn't abort when
  the vitest or c8 run exits non-zero due to documented test failures. We
  want timing data regardless of whether all tests pass.
@tylerbutler
Copy link
Copy Markdown
Member Author

tylerbutler commented Apr 18, 2026

/azp run Build - protocol-definitions,Build - test-tools,server-gitrest,server-gitssh,server-routerlicious,Build - client packages,repo-policy-check,Build - build-tools

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 5 pipeline(s).

@tylerbutler tylerbutler changed the title Pilot vitest+v8 code coverage on four packages Pilot vitest+v8 code coverage on ten packages Apr 19, 2026
@microsoft microsoft deleted a comment from azure-pipelines bot Apr 19, 2026
@microsoft microsoft deleted a comment from azure-pipelines bot Apr 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant