Skip to content

Add Cython ABI checking tool to toolshed#1929

Merged
mdboom merged 7 commits intoNVIDIA:mainfrom
mdboom:add-abi-checking-tool
Apr 17, 2026
Merged

Add Cython ABI checking tool to toolshed#1929
mdboom merged 7 commits intoNVIDIA:mainfrom
mdboom:add-abi-checking-tool

Conversation

@mdboom
Copy link
Copy Markdown
Contributor

@mdboom mdboom commented Apr 16, 2026

This is a rehash of the ABI-checking tool in #1067. That PR was much more ambitious, and aimed to automatically detect changes to the ABI in CI, but required committing ABI descriptions to the repo, so was ultimately too much burden.

This just adds a commandline script that we can run when we want to compare the ABI of two versions of cuda-python. Checking it in to toolshed makes it easier to find.

Though there is nothing cuda-python specific about it -- it could also be a standalone package.

@mdboom mdboom self-assigned this Apr 16, 2026
@mdboom mdboom added the CI/CD CI/CD infrastructure label Apr 16, 2026
@mdboom mdboom requested review from Copilot and leofang April 16, 2026 19:25
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a standalone toolshed command-line utility to snapshot and compare the Cython “ABI” surface (via __pyx_capi__) between two installed versions of a package, as a lightweight alternative to committing ABI baselines into the repo/CI.

Changes:

  • Introduces toolshed/check_cython_abi.py with generate and check subcommands.
  • Generates per-extension-module JSON snapshots of __pyx_capi__ and compares them to detect missing/changed/added symbols.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread toolshed/check_cython_abi.py
Comment thread toolshed/check_cython_abi.py Outdated
Comment thread toolshed/check_cython_abi.py
Comment thread toolshed/check_cython_abi.py Outdated
Comment thread toolshed/check_cython_abi.py Outdated
Comment thread toolshed/check_cython_abi.py Outdated
Comment thread toolshed/check_cython_abi.py Outdated
Comment thread toolshed/check_cython_abi.py Outdated
Comment thread toolshed/check_cython_abi.py Outdated
@github-actions

This comment has been minimized.

mdboom and others added 4 commits April 16, 2026 15:54
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Member

@leofang leofang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Mike — I did an independent read and cross-referenced cython/cython#7603 for the full ABI picture.

Approving. The core approach is right: __pyx_capi__ capsule names are exactly what Cython's cross-module cimport string-compares at runtime, so snapshotting them is the correct signal for catching cdef-function ABI drift. I also agree with pulling back from #1067's ambition — no baselines checked in means no maintenance tax, and running this on demand against a tag is a sensible workflow.

Left four non-blocking suggestions inline. The one I'd most like to see before we lean on this heavily is a scope caveat in the docstring pointing at #7603, so future users don't read a clean run as "full ABI is intact" — it explicitly isn't covering cdef class layout or vtable reorders, and those are the silent-UB failure mode the Cython maintainers called out. The other three (direct PyCapsule_GetName vs regex on repr, per-module try/except on import, Cython version stamped in the JSON) are all "would be nice, can follow up."

Also +1 on declining Copilot's defensive EXT_SUFFIX fallback and the unchanged return signature on check() — right calls.

-- Leo's bot



"""
Tool to check for Cython ABI changes in a given package.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scope caveat worth adding to the docstring. Per cython/cython#7603, Cython has four runtime ABI failure modes:

  1. cdef function signatures (capsule strings) — covered here
  2. cdef class struct size (tp_basicsize) — not covered
  3. cdef class vtable layout / method reordering — not covered, and this one fails as silent UB rather than an import-time error
  4. Fused specialization ordering — partially covered (reorders manifest as capsule-name deltas, but the mapping is non-obvious)

A one-liner pointing at #7603 would set expectations so users don't assume a clean run here means the full ABI is intact.

-- Leo's bot

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good callout. 3 (maybe 2 also) are an order of magnitude harder, because I think it requires parsing the .pxd files since the information is lost in the compiled artifact.

Comment thread toolshed/check_cython_abi.py Outdated

def extract_name(v: object) -> str:
v = str(v)
match = re.match(r'<capsule object "([^\"]+)" at 0x[0-9a-fA-F]+>', v)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional robustness nit: ctypes.pythonapi.PyCapsule_GetName is the canonical way to read a capsule's name, and it's what the reference inspection snippet in cython/cython#7603 uses. The regex works today because CPython's capsule repr() has been stable, but the direct C API removes that coupling:

import ctypes
_get_name = ctypes.pythonapi.PyCapsule_GetName
_get_name.restype = ctypes.c_char_p
_get_name.argtypes = [ctypes.py_object]
name = _get_name(capsule).decode("utf-8")

Not a blocker — mentioning it since this tool is meant to outlive a handful of CPython/Cython revs.

-- Leo's bot

Comment thread toolshed/check_cython_abi.py Outdated
for abi_path in Path(abi_dir).glob(f"**/*{ABI_SUFFIX}"):
so_path = abi_path_to_so_path(abi_path, build_dir, abi_dir)
if so_path.is_file():
module = import_from_path(package, build_dir, so_path)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking: if any one extension fails to import here (missing shared lib, init-time crash), the whole check run aborts. Same hazard at line 121 in this function and line 143 in regenerate. Copilot flagged this too. For a tool we may run over a whole package, a per-module try/except that records the failure and continues is more useful than a hard abort — especially in a "generate then diff" workflow where one flaky module shouldn't force the user to regenerate all baselines from scratch.

-- Leo's bot

Comment thread toolshed/check_cython_abi.py Outdated
abi_path = so_path_to_abi_path(so_path, build_dir, abi_dir)
abi_path.parent.mkdir(parents=True, exist_ok=True)
with open(abi_path, "w", encoding="utf-8") as f:
json.dump(pyx_capi_to_json(module.__pyx_capi__), f, indent=2)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: consider stamping the Cython (and maybe Python) version into the generated JSON — either as a top-level _meta key alongside the symbols, or a sibling file. Capsule-name formatting has drifted across Cython versions in the past (typedef mangling in particular), so a user who regenerates under Cython X and checks under Y can see false-positive "ABI breakage" that is really just a format shift. At minimum, worth calling out in the docstring that generate and check should be run with the same Cython version.

-- Leo's bot

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no direct way to get the Cython version from the compiled artifact. You can look for the string like _cython_3_4_2 in the module. But if Cython decided to compile with compressed strings, that fails.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed this, sorry @mdboom. This is doable similar to #1817 (comment). We just stringify the Cython compiler version instead of the CUDA version.

@mdboom mdboom enabled auto-merge (squash) April 17, 2026 13:40
@mdboom mdboom merged commit 56b53ca into NVIDIA:main Apr 17, 2026
91 checks passed
@github-actions
Copy link
Copy Markdown

Doc Preview CI
Preview removed because the pull request was closed or merged.

mdboom added a commit to mdboom/cuda-python that referenced this pull request Apr 20, 2026
* Add Cython ABI checking tool to toolshed

* Update toolshed/check_cython_abi.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update toolshed/check_cython_abi.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update toolshed/check_cython_abi.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Address comments in PR

* Address some of the comments in the PR

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI/CD CI/CD infrastructure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants