Add Cython ABI checking tool to toolshed#1929
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a standalone toolshed command-line utility to snapshot and compare the Cython “ABI” surface (via __pyx_capi__) between two installed versions of a package, as a lightweight alternative to committing ABI baselines into the repo/CI.
Changes:
- Introduces
toolshed/check_cython_abi.pywithgenerateandchecksubcommands. - Generates per-extension-module JSON snapshots of
__pyx_capi__and compares them to detect missing/changed/added symbols.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This comment has been minimized.
This comment has been minimized.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
leofang
left a comment
There was a problem hiding this comment.
Thanks, Mike — I did an independent read and cross-referenced cython/cython#7603 for the full ABI picture.
Approving. The core approach is right: __pyx_capi__ capsule names are exactly what Cython's cross-module cimport string-compares at runtime, so snapshotting them is the correct signal for catching cdef-function ABI drift. I also agree with pulling back from #1067's ambition — no baselines checked in means no maintenance tax, and running this on demand against a tag is a sensible workflow.
Left four non-blocking suggestions inline. The one I'd most like to see before we lean on this heavily is a scope caveat in the docstring pointing at #7603, so future users don't read a clean run as "full ABI is intact" — it explicitly isn't covering cdef class layout or vtable reorders, and those are the silent-UB failure mode the Cython maintainers called out. The other three (direct PyCapsule_GetName vs regex on repr, per-module try/except on import, Cython version stamped in the JSON) are all "would be nice, can follow up."
Also +1 on declining Copilot's defensive EXT_SUFFIX fallback and the unchanged return signature on check() — right calls.
-- Leo's bot
|
|
||
|
|
||
| """ | ||
| Tool to check for Cython ABI changes in a given package. |
There was a problem hiding this comment.
Scope caveat worth adding to the docstring. Per cython/cython#7603, Cython has four runtime ABI failure modes:
cdeffunction signatures (capsule strings) — covered herecdef classstruct size (tp_basicsize) — not coveredcdef classvtable layout / method reordering — not covered, and this one fails as silent UB rather than an import-time error- Fused specialization ordering — partially covered (reorders manifest as capsule-name deltas, but the mapping is non-obvious)
A one-liner pointing at #7603 would set expectations so users don't assume a clean run here means the full ABI is intact.
-- Leo's bot
There was a problem hiding this comment.
That's a good callout. 3 (maybe 2 also) are an order of magnitude harder, because I think it requires parsing the .pxd files since the information is lost in the compiled artifact.
|
|
||
| def extract_name(v: object) -> str: | ||
| v = str(v) | ||
| match = re.match(r'<capsule object "([^\"]+)" at 0x[0-9a-fA-F]+>', v) |
There was a problem hiding this comment.
Optional robustness nit: ctypes.pythonapi.PyCapsule_GetName is the canonical way to read a capsule's name, and it's what the reference inspection snippet in cython/cython#7603 uses. The regex works today because CPython's capsule repr() has been stable, but the direct C API removes that coupling:
import ctypes
_get_name = ctypes.pythonapi.PyCapsule_GetName
_get_name.restype = ctypes.c_char_p
_get_name.argtypes = [ctypes.py_object]
name = _get_name(capsule).decode("utf-8")Not a blocker — mentioning it since this tool is meant to outlive a handful of CPython/Cython revs.
-- Leo's bot
| for abi_path in Path(abi_dir).glob(f"**/*{ABI_SUFFIX}"): | ||
| so_path = abi_path_to_so_path(abi_path, build_dir, abi_dir) | ||
| if so_path.is_file(): | ||
| module = import_from_path(package, build_dir, so_path) |
There was a problem hiding this comment.
Non-blocking: if any one extension fails to import here (missing shared lib, init-time crash), the whole check run aborts. Same hazard at line 121 in this function and line 143 in regenerate. Copilot flagged this too. For a tool we may run over a whole package, a per-module try/except that records the failure and continues is more useful than a hard abort — especially in a "generate then diff" workflow where one flaky module shouldn't force the user to regenerate all baselines from scratch.
-- Leo's bot
| abi_path = so_path_to_abi_path(so_path, build_dir, abi_dir) | ||
| abi_path.parent.mkdir(parents=True, exist_ok=True) | ||
| with open(abi_path, "w", encoding="utf-8") as f: | ||
| json.dump(pyx_capi_to_json(module.__pyx_capi__), f, indent=2) |
There was a problem hiding this comment.
Suggestion: consider stamping the Cython (and maybe Python) version into the generated JSON — either as a top-level _meta key alongside the symbols, or a sibling file. Capsule-name formatting has drifted across Cython versions in the past (typedef mangling in particular), so a user who regenerates under Cython X and checks under Y can see false-positive "ABI breakage" that is really just a format shift. At minimum, worth calling out in the docstring that generate and check should be run with the same Cython version.
-- Leo's bot
There was a problem hiding this comment.
There is no direct way to get the Cython version from the compiled artifact. You can look for the string like _cython_3_4_2 in the module. But if Cython decided to compile with compressed strings, that fails.
There was a problem hiding this comment.
Oh I missed this, sorry @mdboom. This is doable similar to #1817 (comment). We just stringify the Cython compiler version instead of the CUDA version.
|
* Add Cython ABI checking tool to toolshed * Update toolshed/check_cython_abi.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update toolshed/check_cython_abi.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update toolshed/check_cython_abi.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Address comments in PR * Address some of the comments in the PR --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This is a rehash of the ABI-checking tool in #1067. That PR was much more ambitious, and aimed to automatically detect changes to the ABI in CI, but required committing ABI descriptions to the repo, so was ultimately too much burden.
This just adds a commandline script that we can run when we want to compare the ABI of two versions of cuda-python. Checking it in to
toolshedmakes it easier to find.Though there is nothing cuda-python specific about it -- it could also be a standalone package.