cuda.core: two API contracts that could complicate future changes

## Background

A comprehensive audit of all `cuda_core` wrapper code was performed to find places where the handling of input arguments or output values limits the capabilities of the underlying low-level `cuda_bindings` API. The audit identified 18 such limitations. A follow-up analysis checked which of those would require breaking API or ABI changes to address, to ensure the project doesn't paint itself into a corner.

**Good news**: all 18 limitations can be resolved with purely additive changes (new keyword args with defaults, new methods/properties, new classes). However, two areas establish behavioral contracts or property shapes that deserve attention now, before more users depend on them.

---

## 1. Python `int` kernel argument convention

**File**: `cuda/core/_kernel_arg_handler.pyx`

When a bare Python `int` is passed as a kernel argument, it is unconditionally treated as an `intptr_t` (pointer address). This is documented in a code comment as an intentional judgment call:

> We want to have a fast path to pass in Python integers as pointer addresses, but one could also (mistakenly) pass it with the intention of passing a scalar integer. It's a mistake because a Python int is ambiguous (arbitrary width). Our judgement call here is to treat it as a pointer address, without any warning!

**Why this matters**: This establishes a semantic contract that users will build against. If the project ever wanted `int` to mean "scalar integer of kernel-parameter-width" instead, that would be a **silent behavioral breaking change** — existing code passing pointer addresses as `int` would break without any error.

**Recommendation**: The current convention is defensible and the alternative (typed scalars via `numpy`/`ctypes`) covers the scalar case. However, consider:
- Adding an explicit note in public documentation that `int` means pointer address
- Optionally, adding a `Pointer(addr)` wrapper type so the intent is unambiguous, giving a future path to change the bare-`int` behavior if ever desired (with a deprecation cycle)

**Risk level**: Low, as long as the convention is documented and stable.
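To make the ambiguity concrete, here is a minimal sketch of how an explicit wrapper could sit alongside the bare-`int` convention. It does not use or reproduce the real handler in `_kernel_arg_handler.pyx`: `Pointer` and `pack_kernel_arg` are hypothetical names, and the packing via `ctypes` is only an assumption chosen to show the width difference between a pointer-sized value and a typed scalar.

```python
import ctypes
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    """Hypothetical wrapper (not part of cuda.core): marks an int as an address."""
    addr: int


# Typed scalars whose width is unambiguous (illustrative subset).
_TYPED_SCALARS = (ctypes.c_int32, ctypes.c_int64, ctypes.c_float, ctypes.c_double)


def pack_kernel_arg(arg) -> bytes:
    """Illustrative packing only; an assumption, not the real cuda.core logic."""
    if isinstance(arg, _TYPED_SCALARS):
        # Width comes from the type, so there is nothing to guess.
        return bytes(arg)
    if isinstance(arg, Pointer):
        # Explicit intent: always a pointer-sized address.
        return bytes(ctypes.c_void_p(arg.addr))
    if isinstance(arg, int):
        # Convention described in this issue: bare int means pointer address.
        return bytes(ctypes.c_void_p(arg))
    raise TypeError(f"unsupported kernel argument type: {type(arg).__name__}")


ptr_width = ctypes.sizeof(ctypes.c_void_p)
bare_int = pack_kernel_arg(5)                    # pointer-sized, not a 32-bit scalar
typed_scalar = pack_kernel_arg(ctypes.c_int32(5))  # always 4 bytes
wrapped = pack_kernel_arg(Pointer(0x1000))       # same bytes as a bare int today
```

With a wrapper like this, `Pointer(addr)` and a bare `int` pack identically under the current convention, so adopting the wrapper would not change behavior; it would only give callers a spelling that stays correct if the bare-`int` meaning were ever deprecated.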

---

## 2. `KernelNode.config` and `MemcpyNode` — lossy round-trip of graph node parameters

**File**: `cuda/core/graph/_subclasses.pyx`

### KernelNode.config

`KernelNode.config` reconstructs a `LaunchConfig` from `CUDA_KERNEL_NODE_PARAMS_v3` but **silently drops** `cluster_dimension` and `cooperative_launch`. The docstring acknowledges this:

> cluster dimensions and cooperative_launch are not preserved by the CUDA driver's kernel node params, so they are not included.

Code that reads `.config`, mutates it, and passes it to a new launch will silently lose cluster/cooperative settings. Fixing this later (populating the missing fields) is purely additive and non-breaking.
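The read-mutate-relaunch hazard can be sketched with plain dataclasses. Everything here is a stand-in: `LaunchConfigSketch`, `NodeParamsSketch`, and the `grid`/`block`/`shmem_size` field names are assumptions for illustration, not the real `LaunchConfig` or driver structures. Only `cluster_dimension` and `cooperative_launch` are taken from the description above.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

Dim3 = Tuple[int, int, int]


@dataclass(frozen=True)
class LaunchConfigSketch:
    """Hypothetical stand-in for LaunchConfig."""
    grid: Dim3
    block: Dim3
    shmem_size: int = 0
    cluster_dimension: Optional[Dim3] = None
    cooperative_launch: bool = False


@dataclass(frozen=True)
class NodeParamsSketch:
    """Hypothetical stand-in for the fields that survive in kernel node params."""
    grid: Dim3
    block: Dim3
    shmem_size: int


def to_node_params(cfg: LaunchConfigSketch) -> NodeParamsSketch:
    # Cluster and cooperative settings have no slot here, so they are dropped.
    return NodeParamsSketch(cfg.grid, cfg.block, cfg.shmem_size)


def config_from_node(params: NodeParamsSketch) -> LaunchConfigSketch:
    # Reconstruction can only fill the dropped fields with defaults.
    return LaunchConfigSketch(params.grid, params.block, params.shmem_size)


original = LaunchConfigSketch(
    grid=(8, 1, 1),
    block=(128, 1, 1),
    cluster_dimension=(2, 1, 1),
    cooperative_launch=True,
)
recovered = config_from_node(to_node_params(original))

# Caller-side workaround: carry the lost fields over explicitly.
restored = replace(
    recovered,
    cluster_dimension=original.cluster_dimension,
    cooperative_launch=original.cooperative_launch,
)
```

In this sketch `recovered` compares unequal to `original` even though no error was raised along the way, which is the silent loss described above. Until the fields are populated, callers who need them would have to keep the original configuration around, as the `replace` call does.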

### MemcpyNode

`MemcpyNode` flattens a `CUDA_MEMCPY3D_v2` descriptor to 1D — only `dst`, `src`, and `size` (all `int`) are exposed as public properties. The `Height`, `Depth`, `srcPitch`, `srcHeight`, `dstPitch`, `dstHeight` fields are discarded.

**Why this matters**: The current properties (`dst: int`, `src: int`, `size: int`) define a public contract. If users write code that unpacks these three values, adding richer 3D properties later is safe (additive), but **changing the meaning or type of the existing properties would be breaking**. As long as new dimensions are exposed via *new* properties (e.g. `height`, `depth`, `src_pitch`), there is no conflict.

**Recommendation**:
- For `KernelNode.config`: populate the missing `LaunchConfig` fields as soon as the driver exposes them through node params, or store them at node-creation time. This is additive.
- For `MemcpyNode`: add `height`, `depth`, `src_pitch`, `dst_pitch` etc. as new properties rather than changing `dst`/`src`/`size`. Document that the current 1D view is intentionally minimal.
- Do **not** rename or retype the existing `dst`, `src`, `size` properties in the future — that would be a breaking change.

**Risk level**: Low, as long as the additive-only approach is followed.
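The additive-only approach for `MemcpyNode` might look like the following sketch. `MemcpyNodeSketch` is a hypothetical stand-in, not the real class; the new property names follow the suggestions above, and their default values (`1` for extents, `0` for pitches) are assumptions picked so that a 1D copy looks the same as it does today.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MemcpyNodeSketch:
    """Hypothetical stand-in for MemcpyNode with additive 3D fields."""
    # Existing contract: names, types, and meaning stay exactly as they are.
    dst: int
    src: int
    size: int
    # New, additive fields; defaults are assumptions that describe a 1D copy.
    height: int = 1
    depth: int = 1
    src_pitch: int = 0
    dst_pitch: int = 0


def legacy_unpack(node: MemcpyNodeSketch) -> Tuple[int, int, int]:
    """Code written against the current 1D view only."""
    return node.dst, node.src, node.size


flat = MemcpyNodeSketch(dst=0x2000, src=0x1000, size=256)
pitched = MemcpyNodeSketch(
    dst=0x2000, src=0x1000, size=256,
    height=4, depth=2, src_pitch=512, dst_pitch=512,
)

# Existing callers see the same three values for both nodes.
flat_view = legacy_unpack(flat)
pitched_view = legacy_unpack(pitched)
```

Because `legacy_unpack` never touches the new fields, it returns the same tuple for `flat` and `pitched`. That is the property worth preserving: richer descriptors become visible to code that asks for them, while code that only knows `dst`, `src`, and `size` keeps working unchanged.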

