Skip to content

sync-diff-inspector: enforce splitter-strategy + add auto mode#12609

Open
joechenrh wants to merge 7 commits intopingcap:masterfrom
joechenrh:sync-diff-splitter-strategy-enforce
Open

sync-diff-inspector: enforce splitter-strategy + add auto mode#12609
joechenrh wants to merge 7 commits intopingcap:masterfrom
joechenrh:sync-diff-splitter-strategy-enforce

Conversation

@joechenrh
Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: close #xxxx

Today splitter-strategy in sync-diff-inspector is consumed only as a fallback when bucket-based splitting fails. The bucket iterator is always attempted first for chunk checksum, and global checksum ignores the setting entirely and always uses random. When TiDB bucket statistics are stale or misleading, bucket-based splits can produce skewed or incorrect chunk boundaries — and the user has no way to opt out.

What is changed and how it works?

splitter-strategy is now a three-valued enforced setting:

Value Behavior
auto (new default) Chunk checksum: try bucket first, fall back to random on error. Global checksum: random. Preserves the pre-existing behavior.
random Random iterator for both chunk checksum and global checksum. Bucket iterator is not attempted.
limit Limit iterator for both chunk checksum and global checksum. Bucket iterator is not attempted.

Implementation:

  • sync_diff_inspector/config/config.go: add SplitterStrategyAuto and update normalizeSplitterStrategy so empty/"auto" normalize to auto, and any other unknown value fails CheckConfig.
  • sync_diff_inspector/source/tidb.go:AnalyzeSplitter: switch on SplitterStrategy up-front. limit/random build their iterators directly; only auto falls through to the bucket-first path.
  • sync_diff_inspector/source/tidb.go:GetGlobalChecksumIterator: switch on SplitterStrategylimit returns a *LimitIterator (chunk count reported as 0 since LimitIterator produces chunks asynchronously with no upfront total); auto/random continue to use the random iterator.

Check List

Tests

  • Unit test

New / updated unit tests:

  • TestNormalizeSplitterStrategy in config/config_test.go — covers empty, auto, mixed case, surrounding whitespace, valid enum values, and an invalid value.
  • TestAnalyzeSplitterEnforcesLimitStrategy / TestAnalyzeSplitterEnforcesRandomStrategy / TestAnalyzeSplitterAutoFallsBackToRandomOnBucketError in source/tidb_splitter_test.go — verify (via sqlmock + iterator type assertions) that non-auto strategies skip the bucket path, and that auto still falls back to random on bucket failure.
  • TestGetGlobalChecksumIteratorUsesLimitWhenConfigured / TestGetGlobalChecksumIteratorDefaultsToRandom in source/tidb_checksum_test.go — verify the global-checksum dispatch honors limit and treats auto/random/"" identically as random.

Questions

Will it cause performance regression or break compatibility?

No performance regression. The default (auto) reproduces today's behavior byte-for-byte. Users who previously set splitter-strategy = "random" will now skip the bucket iterator — that matches the intent of having set the field explicitly.

Do you need to update user documentation, design documentation or monitoring documentation?

No. splitter-strategy is not publicly documented.

Release note

sync-diff-inspector: `splitter-strategy` is now enforced. Setting `random` or `limit` skips the bucket iterator entirely for both chunk checksum and global checksum. A new `auto` value (the default) preserves the previous bucket-first-with-random-fallback behavior.

joechenrh and others added 7 commits April 16, 2026 23:58
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Route GetGlobalChecksumIterator to LimitIterator when SplitterStrategy is
"limit", and keep RandomIterator for "auto", "random", and the empty default.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot bot commented Apr 17, 2026

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-linked-issue label, please provide the linked issue number on one line in the PR body, for example: Issue Number: close #123 or Issue Number: ref #456.

📖 For more info, you can check the "Contribute Code" section in the development guide.

@ti-chi-bot ti-chi-bot bot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Apr 17, 2026
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot bot commented Apr 17, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign overvenus for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Apr 17, 2026
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new "auto" splitter strategy for the sync_diff_inspector, making it the default option. The changes include updating the configuration normalization logic, modifying the TiDB table analyzer to explicitly handle "limit", "random", and "auto" strategies, and adding support for the "limit" strategy in global checksum iterations. Additionally, new unit tests were implemented to ensure correct strategy enforcement and configuration normalization. I have no feedback to provide as the existing review comments did not identify actionable issues or improvements within the scope of the provided code.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 53.4152%. Comparing base (7e0d4db) to head (55323a0).
⚠️ Report is 1 commits behind head on master.
✅ All tests successful. No failed tests found.

❌ Your project check has failed because the head coverage (53.4152%) is below the target coverage (60.0000%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
Components Coverage Δ
cdc 57.3566% <ø> (-0.0234%) ⬇️
dm 49.1975% <ø> (+0.0112%) ⬆️
engine 50.6659% <ø> (-0.0678%) ⬇️
Flag Coverage Δ
cdc 57.3566% <ø> (?)
unit 53.4152% <ø> (-0.1003%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@@               Coverage Diff                @@
##             master     #12609        +/-   ##
================================================
- Coverage   53.5155%   53.4152%   -0.1003%     
================================================
  Files          1037       1011        -26     
  Lines        145625     139975      -5650     
================================================
- Hits          77932      74768      -3164     
+ Misses        61845      59588      -2257     
+ Partials       5848       5619       -229     
🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot bot commented Apr 17, 2026

@joechenrh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cdc-integration-kafka-test 55323a0 link true /test pull-cdc-integration-kafka-test
pull-dm-integration-test-next-gen 55323a0 link false /test pull-dm-integration-test-next-gen
pull-dm-integration-test 55323a0 link true /test pull-dm-integration-test

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant