
FEAT: Add CI perf baseline regression detection & insertmanyvalues benchmark#527

Open
bewithgaurav wants to merge 1 commit into main from bewithgaurav/insertmany-benchmark

Conversation

@bewithgaurav
Collaborator

@bewithgaurav bewithgaurav commented Apr 17, 2026

Work Item / Issue Reference

AB#44074


Summary

This pull request enhances the performance benchmarking process in the CI pipeline by adding support for baseline comparison and automated artifact publishing and retrieval. The main improvements focus on making benchmark results reproducible and comparable across pull-request and main-branch runs.

Performance benchmarking improvements:

  • The benchmark script (benchmarks/perf-benchmarking.py) is now run with --baseline benchmark_baseline.json --json benchmark_results.json, enabling baseline comparison and emitting results in JSON format.
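The two flags described above can be wired up with standard argparse. This is an illustrative sketch, not the PR's actual code; the flag names match the PR, but the function name and help strings are assumptions:

```python
import argparse


def parse_args(argv=None):
    """Parse the benchmark CLI flags described in this PR (sketch)."""
    p = argparse.ArgumentParser(description="Run driver perf benchmarks")
    p.add_argument("--baseline", metavar="PATH",
                   help="baseline JSON from a previous main-branch run to compare against")
    p.add_argument("--json", dest="json_out", metavar="PATH",
                   help="write this run's results as JSON for artifact publishing")
    return p.parse_args(argv)


# Example: the invocation used in the pipeline
args = parse_args(["--baseline", "benchmark_baseline.json",
                   "--json", "benchmark_results.json"])
```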

Artifact management for benchmarks:

  • Added a step to publish the current benchmark results (benchmark_results.json) as a pipeline artifact named perf-baseline-$(sqlVersion) when running on the main branch. This artifact serves as the baseline for future comparisons.
  • Added a step to download the latest baseline artifact from the main branch for pull request runs, enabling direct comparison of current benchmark results against the established baseline.
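The publish side could look roughly like the following Azure Pipelines step. This is a hedged sketch inferred from the download step quoted later in the thread, not the PR's actual YAML; the path and condition are assumptions:

```yaml
# Illustrative shape of the publish step (names mirror the download step below)
- task: PublishPipelineArtifact@1
  inputs:
    targetPath: '$(Build.SourcesDirectory)/benchmark_results.json'
    artifact: 'perf-baseline-$(sqlVersion)'
  displayName: 'Publish perf baseline'
  condition: eq(variables['Build.SourceBranch'], 'refs/heads/main')
```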

…n, CI artifact pipeline

- Add insertmanyvalues scenario (100K rows, 1000/batch, GH #500 repro)
- Add --baseline flag for 3-way comparison (main vs PR vs pyodbc)
- Add --json flag to save results for pipeline artifact consumption
- Publish benchmark results as CI artifact on main merge
- Download baseline artifact on PR runs for automatic regression detection
- Fix timing: time.time() -> time.perf_counter()
- Move connection setup outside timing window (measure execute+fetch only)
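The batched-insert timing described in these bullets can be sketched as below. This is a minimal illustration, assuming a DB-API cursor and a temp table named #bench_insert; helper and variable names are hypothetical, not the PR's code:

```python
import time


def run_insertmany(cursor, total_rows=100_000, batch_size=1000):
    """Time multi-row VALUES inserts; setup stays outside the timed window."""
    rows = [(i, f"val-{i}") for i in range(total_rows)]
    placeholders = ", ".join(["(?, ?)"] * batch_size)
    batch_sql = f"INSERT INTO #bench_insert (id, val) VALUES {placeholders}"
    # Flatten each batch's row tuples into one parameter list per execute()
    batches = [
        [p for row in rows[i:i + batch_size] for p in row]
        for i in range(0, total_rows, batch_size)
    ]
    start = time.perf_counter()  # monotonic clock, unlike time.time()
    for flat_params in batches:
        cursor.execute(batch_sql, flat_params)
    return time.perf_counter() - start
```

Using perf_counter() avoids the wall-clock adjustments that can skew time.time() over short intervals.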
@github-actions github-actions bot added the pr-size: large Substantial code update label Apr 17, 2026
Comment thread benchmarks/perf-benchmarking.py Dismissed
@github-actions

📊 Code Coverage Report

🔥 Diff Coverage

100%


🎯 Overall Coverage

79%


📈 Total Lines Covered: 6677 out of 8449
📁 Project: mssql-python


Diff Coverage

Diff: main...HEAD, staged and unstaged changes

No lines with coverage information in this diff.


📋 Files Needing Attention

📉 Files with overall lowest coverage (click to expand)
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.pybind.ddbc_bindings.h: 67.8%
mssql_python.row.py: 70.5%
mssql_python.pybind.logger_bridge.hpp: 70.8%
mssql_python.pybind.ddbc_bindings.cpp: 74.2%
mssql_python.pybind.connection.connection.cpp: 75.8%
mssql_python.__init__.py: 77.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.connection.py: 85.2%

🔗 Quick Links

⚙️ Build Summary 📋 Coverage Details

View Azure DevOps Build

Browse Full Coverage Report

@bewithgaurav bewithgaurav marked this pull request as ready for review April 17, 2026 10:45
Copilot AI review requested due to automatic review settings April 17, 2026 10:45
Contributor

Copilot AI left a comment


Pull request overview

This PR enhances CI performance benchmarking by generating JSON benchmark outputs, adding optional baseline comparisons against main, and introducing an insertmanyvalues-style insert benchmark alongside existing fetch/query scenarios.

Changes:

  • Update CI to run benchmarks/perf-benchmarking.py with --baseline ... --json ... and publish benchmark results as a pipeline artifact.
  • Add logic to download a baseline artifact from the latest main run for PR comparisons.
  • Extend the benchmark script with baseline/JSON support, updated reporting, and a new insertmanyvalues benchmark scenario.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File: eng/pipelines/pr-validation-pipeline.yml
  Runs benchmarks with baseline/JSON flags and adds publish/download artifact steps intended to enable baseline comparisons in CI.
File: benchmarks/perf-benchmarking.py
  Adds argparse-driven baseline + JSON I/O, improves output formatting, and adds an insertmanyvalues benchmark scenario.



Write-Host "`nRunning performance benchmarks..."
python benchmarks/perf-benchmarking.py
python benchmarks/perf-benchmarking.py --baseline benchmark_baseline.json --json benchmark_results.json

Copilot AI Apr 17, 2026


The benchmark run is invoked with --baseline benchmark_baseline.json, but this pipeline only ever produces/publishes benchmark_results.json (and the download step would bring down that file, typically under an artifact directory). As written, the script will never find the baseline file, so baseline comparison/regression detection won’t run. Align the baseline path with the downloaded artifact’s actual filename/location (and avoid clobbering the current run’s benchmark_results.json output).

Suggested change
python benchmarks/perf-benchmarking.py --baseline benchmark_baseline.json --json benchmark_results.json
python benchmarks/perf-benchmarking.py --baseline "$(Pipeline.Workspace)/perf-baseline-$(sqlVersion)/benchmark_results.json" --json benchmark_results.json

Comment on lines +402 to +415
# Download baseline from latest main run (for PR comparison)
- task: DownloadPipelineArtifact@2
  inputs:
    source: specific
    project: $(System.TeamProjectId)
    pipeline: $(System.DefinitionId)
    runVersion: latestFromBranch
    runBranch: refs/heads/main
    artifact: 'perf-baseline-$(sqlVersion)'
    path: $(Build.SourcesDirectory)
  displayName: 'Download baseline from main'
  condition: and(ne(variables['Build.SourceBranch'], 'refs/heads/main'), or(eq(variables['sqlVersion'], 'SQL2022'), eq(variables['sqlVersion'], 'SQL2025')))
  continueOnError: true


Copilot AI Apr 17, 2026


The baseline artifact download happens after the benchmark script runs, so the --baseline ... file cannot exist at the time the script tries to read it. Move the DownloadPipelineArtifact step before the benchmark execution step (and ensure it downloads into the location the script expects).

Suggested change
# Download baseline from latest main run (for PR comparison)
- task: DownloadPipelineArtifact@2
inputs:
source: specific
project: $(System.TeamProjectId)
pipeline: $(System.DefinitionId)
runVersion: latestFromBranch
runBranch: refs/heads/main
artifact: 'perf-baseline-$(sqlVersion)'
path: $(Build.SourcesDirectory)
displayName: 'Download baseline from main'
condition: and(ne(variables['Build.SourceBranch'], 'refs/heads/main'), or(eq(variables['sqlVersion'], 'SQL2022'), eq(variables['sqlVersion'], 'SQL2025')))
continueOnError: true

pip install pyodbc
echo 'Running performance benchmarks on $(distroName)'
python benchmarks/perf-benchmarking.py || echo 'Performance benchmark failed or database not available'
python benchmarks/perf-benchmarking.py --baseline benchmark_baseline.json --json benchmark_results.json || echo 'Performance benchmark failed or database not available'

Copilot AI Apr 17, 2026


This container benchmark invocation passes --baseline benchmark_baseline.json, but there is no preceding step that downloads/restores that baseline file into the container/workspace. Either add a baseline download + mount/copy into the container, or drop the --baseline flag here to avoid always falling back to the 2-column output.

Suggested change
python benchmarks/perf-benchmarking.py --baseline benchmark_baseline.json --json benchmark_results.json || echo 'Performance benchmark failed or database not available'
BASELINE_ARG=''
if [ -f /workspace/benchmark_baseline.json ]; then
  echo 'Using benchmark baseline file'
  BASELINE_ARG='--baseline benchmark_baseline.json'
else
  echo 'No benchmark baseline file found; running without baseline'
fi
python benchmarks/perf-benchmarking.py $BASELINE_ARG --json benchmark_results.json || echo 'Performance benchmark failed or database not available'

try:
    with open(path) as f:
        data = json.load(f)
    return data.get("baseline", {})

Copilot AI Apr 17, 2026


load_baseline() returns {} when the JSON file exists but has no baseline key, but print_results() treats any non-None value as “has baseline”. That can produce a misleading 3-column table full of N/A values. Consider returning None when the parsed baseline is missing/empty and/or changing has_baseline to bool(baseline).

Suggested change
return data.get("baseline", {})
baseline = data.get("baseline")
return baseline if baseline else None
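Folding that suggestion into a full loader might look like the sketch below. This is an illustration of the reviewer's point, not the script's actual implementation; the error handling is an assumption:

```python
import json
import os


def load_baseline(path):
    """Return the baseline dict, or None when the file is missing,
    unreadable, or has no (non-empty) "baseline" key."""
    if not os.path.exists(path):
        return None
    try:
        with open(path) as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError):
        return None
    baseline = data.get("baseline")
    return baseline if baseline else None  # empty dict also maps to None
```

Callers can then use `if baseline:` (or `baseline is not None`) to decide between the 2-column and 3-column report, avoiding a table full of N/A values.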

Comment on lines +161 to +175
conn = conn_factory(conn_str)
cursor = conn.cursor()
cursor.execute(
    "IF OBJECT_ID('tempdb..#bench_insert') IS NOT NULL DROP TABLE #bench_insert; "
    "CREATE TABLE #bench_insert (id INT, val VARCHAR(100))"
)

start = time.perf_counter()
for flat_params in batches:
    cursor.execute(batch_sql, flat_params)
elapsed = time.perf_counter() - start

result.add_time(elapsed, INSERTMANY_ROWS)
cursor.close()
conn.close()

Copilot AI Apr 17, 2026


The insertmanyvalues benchmark measures only the looped INSERT executes and stops timing before any commit()/transaction finalization or connection close. Since mssql_python.connect() defaults to autocommit=False and the connection closes with an implicit rollback when autocommit is off, this can substantially underreport real insert cost and makes results harder to compare. Consider enabling autocommit for this scenario or including an explicit commit (and any required close/cleanup) inside the timed window for both drivers.
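One way to address this, sketched below under the assumption of DB-API style `commit()`/`autocommit` semantics (function and variable names are illustrative, not the PR's code), is to include transaction finalization inside the timed window:

```python
import time


def timed_insert_with_commit(conn, cursor, batch_sql, batches, autocommit=False):
    """Measure the inserts plus transaction finalization, so drivers with
    autocommit off are not underreported (sketch of the suggested fix)."""
    start = time.perf_counter()
    for flat_params in batches:
        cursor.execute(batch_sql, flat_params)
    if not autocommit:
        conn.commit()  # count the commit cost in the measured elapsed time
    return time.perf_counter() - start
```

Applying the same rule to both drivers keeps the comparison apples-to-apples regardless of each driver's default autocommit setting.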


Labels

pr-size: large Substantial code update


3 participants