Fix operator self-time through Compute Scalar children + one-decimal benefit % (#215 C5, C2) by erikdarlingdata · Pull Request #255 · erikdarlingdata/PerformanceStudio

erikdarlingdata · 2026-04-22T09:59:10Z

Summary

Two Joe-feedback items from #215:

C5 (bug): operator self-time was wrong for any operator whose direct children were Compute Scalars. `GetSerialOwnElapsed` subtracted each child's `ActualElapsedMs`, but Compute Scalars don't carry runtime stats, so it subtracted 0 and the parent kept its full elapsed. Joe's node-6 Hash Spill reported 95% / 8,829ms; correct is 17.4% / 1,609ms (`8.829 − 7.121 − 0.099`). New `GetEffectiveChildElapsedMs` helper walks through pass-through operators to the first descendant with real stats.
C2: benefit % was `N0` everywhere, so tiny waits showed as "up to 0%". Now one decimal except at 100 (shown as "100"). Applied to web strip + wait-stats card, HTML export, text formatter, Avalonia plan + node warning headers.

Details

`GetSerialOwnElapsed` → iterates children calling `GetEffectiveChildElapsedMs(child)` instead of reading `child.ActualElapsedMs` directly. The helper:

Parallelism exchange child → max-child elapsed (unchanged, exchange times unreliable)
Child has `ActualElapsedMs > 0` → use it
Otherwise → recurse into grandchildren and sum

Applies to every operator-time benefit scorer: spills (Rule 7), key lookups (Rule 10), scan with predicate (Rule 11), non-SARGable (Rule 12), bare scan (Rule 34), scan cardinality misestimate (Rule 32), eager index spool, filter operator, nested loops high executions, row estimate mismatch.

Test plan

Joe's private plan — node-6 Hash Spill now shows 17.4% benefit, 1,609ms self-time (matches his math)
Wait items show 0.0–0.5% benefit instead of 0%
Regression check on `20260415_1.sqlplan` — all existing benefits unchanged
Web viewer visual post-deploy
Desktop Plan Warnings expander visual post-deploy

Version 1.7.1 → 1.7.2.

🤖 Generated with Claude Code

…decimal on benefit % C5 (operator self-time look-through): GetSerialOwnElapsed subtracted each direct child's ActualElapsedMs, but Compute Scalar operators don't carry runtime stats (ActualElapsedMs = 0). For an operator whose direct children are Compute Scalars — like the node-6 Hash Match in Joe's private plan — that meant "children elapsed = 0" and the parent kept its full elapsed as self-time. Effect: node-6 Hash Spill reported 95% benefit / 8,829ms self-time. With look-through through the Compute Scalars it's 1,609ms self-time / 17.4% — matching Joe's math of 8.829 - 7.121 - 0.099. New GetEffectiveChildElapsedMs helper walks through pass-through operators (Compute Scalar and anything else missing runtime stats) to the first descendant with real stats. Parallelism exchange children keep existing max-child behavior because exchange times are unreliable. Applies to every operator-time-based benefit score (spills, key lookups, bare scans, scan-with-predicate, scan cardinality, eager index spool, etc). C2 (decimal precision on benefit %): Benefit % was rounded to N0 everywhere, so tiny waits showed as "up to 0%". Switched to one decimal except for 100 (shown as "100"), matching the existing pattern in ComparisonFormatter. Applied to web strip, wait-stats card, HTML export (both surfaces), text formatter (both paths), Avalonia Plan Warnings + Node Warnings headers. Version bump 1.7.1 -> 1.7.2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

erikdarlingdata

Summary

Fixes two issues from #215 feedback:

Self-time bug (C5): GetSerialOwnElapsed subtracted raw child.ActualElapsedMs, which is 0 for Compute Scalars and similar pass-through operators, so the parent kept its full elapsed time. New GetEffectiveChildElapsedMs recurses through stats-less children to the first descendant with real timing data. Joe's node-6 Hash Spill goes from 95%/8,829ms to a correct 17.4%/1,609ms.
Benefit % precision (C2): N0 formatting turned sub-1% benefits into "up to 0%". Now one decimal, reverting to integer at 100. Applied to Avalonia plan/node warning headers, Razor wait strip + warnings list, HtmlExporter, and TextFormatter.

Version bump 1.7.1 → 1.7.2. Targets dev.

What's good

Self-time fix is localized to one helper; serial-mode change is easy to reason about.
Recursion is bounded by plan tree depth (no cycle risk).
Parallelism handling is preserved and now also looks through chained pass-throughs.
XML doc comments on both GetSerialOwnElapsed and GetEffectiveChildElapsedMs explain the intent.

Needs attention

Cross-repo sync (PlanAnalyzer.cs:1605): this is a behavior change to the scoring pipeline used by Rules 7/10/11/12/32/34 plus several operator scorers. PerformanceMonitor Dashboard and Lite need the same fix or the three products will disagree on the same plan.
Test coverage gap: core analyzer logic changed but nothing in tests/PlanViewer.Core.Tests/ was added/updated. A fixture for the "parent → Compute Scalar → Real Op" shape (and the Parallelism-over-Compute-Scalar shape) would lock in the fix and catch regressions. The PR body's regression check is manual against one .sqlplan.
Formatting duplication: the >= 100 ? N0 : N1 ternary is inlined in six places across three files while PlanViewerControl has a FormatBenefitPercent helper for the same thing. Worth hoisting a single formatter into PlanViewer.Core.
99.95 rounding edge (PlanViewerControl.axaml.cs:805): N1 rounds 99.95 up to "100.0", but the ternary only switches to N0 at >= 100, so you can still get "up to 100.0%". Round before compare if the intent is "integer only at exactly 100".
Multi-child pass-through (PlanAnalyzer.cs:1633): summing over all grandchildren is safe for unary Compute Scalars but would over-subtract for a hypothetical stats-less operator with multiple input branches in serial row mode. A Children.Count == 1 guard or a comment stating the assumption would document the intent.

Inline notes left on the specific lines. Not approving or requesting changes — maintainer's call.

Generated by Claude Code

erikdarlingdata · 2026-04-22T10:00:18Z

+    {
+        // Exchange operators: unreliable times, use max child
+        if (child.PhysicalOp == "Parallelism" && child.Children.Count > 0)
+            return child.Children.Max(GetEffectiveChildElapsedMs);


This is a scoring-rule change — GetSerialOwnElapsed feeds Rules 7, 10, 11, 12, 32, 34 plus eager index spool / filter / nested loops / row estimate scorers per the PR body. The equivalent fix needs to land in PerformanceMonitor's Dashboard and Lite copies of the analyzer or the three products will disagree on the same plan (Joe's node-6 Hash Spill would report 95% in Dashboard/Lite but 17.4% here).

Generated by Claude Code

erikdarlingdata · 2026-04-22T10:00:23Z

+        foreach (var grandchild in child.Children)
+            sum += GetEffectiveChildElapsedMs(grandchild);
+        return sum;
+    }


Summing over all grandchildren works for the common case (Compute Scalar with a single input) but could over-subtract if a pass-through-without-stats ever has multiple input branches — the parent's elapsed includes those branches sequentially in row mode, but the two sibling branches' elapsed are not additive in wall time. Compute Scalar is always unary so this is theoretical today; worth either a comment stating the assumption or a guard that only recurses when child.Children.Count == 1.

Generated by Claude Code

erikdarlingdata · 2026-04-22T10:00:28Z

+        // Child has its own stats: use them
+        if (child.ActualElapsedMs > 0)
+            return child.ActualElapsedMs;
+


Behavior change for Parallelism children worth calling out: the old code did child.Children.Max(c => c.ActualElapsedMs), which would return 0 if the exchange's direct child was itself a Compute Scalar. The new recursive Max(GetEffectiveChildElapsedMs) now looks through that, which is probably what you want, but it changes the reported self-time for any operator sitting above a Parallelism → Compute Scalar → Real Op chain. Make sure the 20260415_1.sqlplan regression check you ran exercises that shape.

Generated by Claude Code

erikdarlingdata · 2026-04-22T10:00:33Z

                var barPct = maxWait > 0 ? (double)w.WaitTimeMs / maxWait * 100 : 0;
                var benefitTag = benefitLookup.TryGetValue(w.WaitType, out var pct)
-                    ? $" <span class=\"warn-benefit\">up to {pct:N0}%</span>"
+                    ? $" <span class=\"warn-benefit\">up to {(pct >= 100 ? pct.ToString("N0") : pct.ToString("N1"))}%</span>"


The >= 100 ? N0 : N1 ternary is now duplicated inline in HtmlExporter (x2), TextFormatter (x2), and Index.razor (x2), while PlanViewerControl has a FormatBenefitPercent helper for the same logic. Consider hoisting a single BenefitFormatter.Format(double pct) (or extension) into PlanViewer.Core so all six call sites agree — currently a future tweak (e.g. changing the cutoff to >= 99.95) has to be made in six places and could drift.

Generated by Claude Code

erikdarlingdata · 2026-04-22T10:00:38Z

    }

+    private static string FormatBenefitPercent(double pct) =>
+        pct >= 100 ? $"{pct:N0}" : $"{pct:N1}";


Edge case: pct in the 99.95–99.99 range will format as "100.0" under N1 (rounds up) but the ternary only switches to N0 at >= 100. You'll get strings like "up to 100.0%" which reads weird next to the "100" case the branch is trying to produce. If the intent was "show integer only at exactly 100", consider rounding before the compare: var r = Math.Round(pct, 1); return r >= 100 ? $"{r:N0}" : $"{r:N1}";.

Generated by Claude Code

erikdarlingdata · 2026-04-22T10:00:41Z

                                @if (benefitPct > 0)
                                {
-                                    <span class="wait-benefit">up to @benefitPct.ToString("N0")%</span>
+                                    <span class="wait-benefit">up to @(benefitPct >= 100 ? benefitPct.ToString("N0") : benefitPct.ToString("N1"))%</span>


Inline ternary inside the Razor expression is hard to read and duplicates the logic from FormatBenefitPercent in PlanViewerControl. Extracting to a Core helper (see HtmlExporter.cs comment) cleans this line up to @BenefitFormatter.Format(benefitPct)%.

Generated by Claude Code

erikdarlingdata commented Apr 22, 2026

View reviewed changes

erikdarlingdata merged commit 3a40ada into dev Apr 22, 2026
2 checks passed

erikdarlingdata deleted the fix/serial-own-elapsed-lookthrough-and-decimal-benefit branch April 22, 2026 10:04

erikdarlingdata mentioned this pull request Apr 22, 2026

Release v1.7.2 — operator self-time fix + decimal benefit % #256

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix operator self-time through Compute Scalar children + one-decimal benefit % (#215 C5, C2)#255

Fix operator self-time through Compute Scalar children + one-decimal benefit % (#215 C5, C2)#255
erikdarlingdata merged 1 commit intodevfrom
fix/serial-own-elapsed-lookthrough-and-decimal-benefit

erikdarlingdata commented Apr 22, 2026

Uh oh!

erikdarlingdata left a comment

Uh oh!

erikdarlingdata Apr 22, 2026

Uh oh!

erikdarlingdata Apr 22, 2026

Uh oh!

erikdarlingdata Apr 22, 2026

Uh oh!

erikdarlingdata Apr 22, 2026

Uh oh!

erikdarlingdata Apr 22, 2026

Uh oh!

erikdarlingdata Apr 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

erikdarlingdata commented Apr 22, 2026

Summary

Details

Test plan

Uh oh!

erikdarlingdata left a comment

Choose a reason for hiding this comment

Summary

What's good

Needs attention

Uh oh!

erikdarlingdata Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

erikdarlingdata Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

erikdarlingdata Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

erikdarlingdata Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

erikdarlingdata Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

erikdarlingdata Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant