UCP/CORE: Detect memory type on cache miss with non-host detect MDs by yafshar · Pull Request #11332 · openucx/ucx

yafshar · 2026-04-08T21:30:14Z

What?

On memtype cache miss, avoid assuming host memory when detect-capable MDs supporting non-host memory types are present.
Add a context-level flag indicating whether any detect-capable MD supports non-host memory types.
Use this flag to choose cache-miss behavior: run slowpath detection or fall back to host memory.

Why?

External runtimes may allocate accelerator memory outside UCX-visible contexts, resulting in missing memtype cache entries.
Falling back to host memory on cache miss can select host-only transports for accelerator pointers, leading to incorrect behavior or runtime failures.
Running slowpath detection when non-host detection is possible prevents wrong transport and protocol selection.

How?

Set the new context flag during resource discovery when any MD advertises non-host detect support.
On memtype cache miss, invoke slowpath detection when the flag is set.
Slowpath queries detect-capable MDs to resolve memory type and sys_dev before transport and protocol selection.
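
The three steps above can be sketched as a small stand-alone model. This is not the patch itself: `mem_type_t`, `md_t`, `context_t`, `context_init`, `detect_slowpath`, `on_cache_miss` and the fake detector are hypothetical stand-ins for the UCX types and functions, and only the flag name `has_non_host_detect_md` comes from this PR. The model resolves the memory type only and leaves out sys_dev.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in memory types (hypothetical; UCX has its own memory type enum). */
typedef enum {
    MEM_TYPE_HOST,
    MEM_TYPE_ACCEL,
    MEM_TYPE_LAST
} mem_type_t;

#define MEM_TYPE_BIT(t) (1u << (t))

/* Hypothetical MD model: a bitmask of detectable memory types plus a
 * detection callback that returns true when it recognizes the address. */
typedef struct {
    unsigned detect_mem_types;
    bool     (*detect)(const void *address, size_t length, mem_type_t *type);
} md_t;

typedef struct {
    const md_t *mds;
    size_t      num_mds;
    bool        has_non_host_detect_md; /* flag name taken from the PR */
} context_t;

/* Fake accelerator region, used only to exercise the model. */
static char fake_accel_buffer[64];

static bool fake_accel_detect(const void *address, size_t length,
                              mem_type_t *type)
{
    uintptr_t start = (uintptr_t)fake_accel_buffer;
    uintptr_t addr  = (uintptr_t)address;

    if ((addr >= start) &&
        ((addr - start) + length <= sizeof(fake_accel_buffer))) {
        *type = MEM_TYPE_ACCEL;
        return true;
    }
    return false;
}

/* Resource discovery: set the flag if any MD can detect a non-host type. */
static void context_init(context_t *ctx, const md_t *mds, size_t num_mds)
{
    size_t i;

    ctx->mds                    = mds;
    ctx->num_mds                = num_mds;
    ctx->has_non_host_detect_md = false;
    for (i = 0; i < num_mds; ++i) {
        if (mds[i].detect_mem_types & ~MEM_TYPE_BIT(MEM_TYPE_HOST)) {
            ctx->has_non_host_detect_md = true;
        }
    }
}

/* Slowpath: ask each detect-capable MD in turn; host if none claims it. */
static mem_type_t detect_slowpath(const context_t *ctx, const void *address,
                                  size_t length)
{
    mem_type_t type;
    size_t i;

    for (i = 0; i < ctx->num_mds; ++i) {
        if ((ctx->mds[i].detect != NULL) &&
            ctx->mds[i].detect(address, length, &type)) {
            return type;
        }
    }
    return MEM_TYPE_HOST;
}

/* Cache miss: run the slowpath only when the flag is set. */
static mem_type_t on_cache_miss(const context_t *ctx, const void *address,
                                size_t length)
{
    assert(ctx != NULL);
    if (ctx->has_non_host_detect_md) {
        return detect_slowpath(ctx, address, length);
    }
    return MEM_TYPE_HOST; /* immediate host fallback */
}
```

In this model a context with only host-detecting MDs keeps the immediate host fallback, while a context with an accelerator-detecting MD resolves a pointer inside `fake_accel_buffer` to `MEM_TYPE_ACCEL` on a miss.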

On memtype cache miss, avoid assuming host memory when non-host detect-capable MDs are present. Run the detection slowpath first to determine memory type and sys_dev. This prevents incorrect transport selection on cold-cache paths (e.g. host paths chosen for accelerator memory). Add has_non_host_detect_md flag to ucp_context and use it to trigger slowpath detection instead of immediate host fallback.

tvegas1 · 2026-04-09T07:08:02Z

@@ -683,6 +684,14 @@ ucp_memory_detect_internal(ucp_context_h context, const void *address,

    status = ucs_memtype_cache_lookup(address, length, mem_info);
    if (ucs_likely(status == UCS_ERR_NO_ELEM)) {


Do you know which memory allocator is being used for such unknown memory, and would it make sense to add a hook under src/ucm instead? AFAIU the slow path was meant to be used when the memtype cache is disabled.


Unknown memtype here is not a specific allocator. It can happen even with memtype cache enabled, for example when UCM reports UNKNOWN for existing allocations or paths it cannot classify immediately, or when cache coverage is incomplete for the queried range.

Because of that, the internal slowpath is not only for the cache-disabled case. It is the correctness fallback for unknown or non-covered entries while cache is active.

Adding a hook under src/ucm is useful only if we identify a concrete allocator/runtime path that currently bypasses UCM memtype events. That may reduce slowpath frequency, but it will not remove the need for fallback in cross-context cases such as separate L0 contexts (for example PyTorch or SYCL vs UCX).

Thanks for the details. I also think the memtype cache only tracks non-host memory, so no element currently means host memory type. If so, running the slowpath for those cases could have a perf impact. For all the cases you mention, maybe the memtype could be passed along with the pointer?

Thanks, that makes sense. I agree we should avoid per-call warnings in this path, but we can add lightweight observability: track slowpath hits (miss and unknown cases) in the existing UCP stats tree using UCS_STATS_NODE_DECLARE, and emit a one-time or end-of-run summary hint when the counters are non-zero. That gives actionable feedback to pass explicit memtype hints without adding hot-path log noise.
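
A rough sketch of that idea, using plain counters instead of the UCS stats macros so it stays self-contained. `slowpath_stats_t`, the two counting helpers and `stats_emit_hint` are hypothetical names, and the message text is only illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical counters for slowpath hits; a real change would more likely
 * use the existing stats infrastructure than a plain struct. */
typedef struct {
    unsigned long miss_slowpath;    /* slowpath entered after a cache miss */
    unsigned long unknown_slowpath; /* slowpath entered for UNKNOWN entries */
    bool          hint_emitted;     /* ensures the summary is printed once */
} slowpath_stats_t;

static void stats_count_miss(slowpath_stats_t *stats)
{
    stats->miss_slowpath++;
}

static void stats_count_unknown(slowpath_stats_t *stats)
{
    stats->unknown_slowpath++;
}

/* Print a summary at most once, and only when a counter is non-zero.
 * Returns true if the hint was written on this call. */
static bool stats_emit_hint(slowpath_stats_t *stats, FILE *out)
{
    assert((stats != NULL) && (out != NULL));
    if (stats->hint_emitted ||
        ((stats->miss_slowpath + stats->unknown_slowpath) == 0)) {
        return false;
    }

    fprintf(out,
            "memtype detection slowpath hits: miss=%lu unknown=%lu; "
            "consider passing an explicit memory type\n",
            stats->miss_slowpath, stats->unknown_slowpath);
    stats->hint_emitted = true;
    return true;
}
```

The counting helpers only increment a field, so the hot path pays no logging cost; the single `fprintf` happens when `stats_emit_hint` is called, for example at teardown.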

Also agreed that passing memtype with the pointer is the preferred fix at the application boundary. In our NIXL/Dynamo integration we already do this for PyTorch GPU buffers by setting UCP_MEM_MAP_PARAM_FIELD_MEMORY_TYPE to UCS_MEMORY_TYPE_ZE_DEVICE, and in that path detection is bypassed.
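
To illustrate why an explicit hint sidesteps the whole question, here is a minimal model of "hint wins, detection only without a hint". The types and functions (`mem_kind_t`, `map_request_t`, `detect_stub`, `resolve_mem_kind`) are hypothetical and do not reproduce the UCP API; in real code the hint is supplied through the memory-map parameter field named above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in memory kinds; MEM_UNKNOWN models "no hint supplied". */
typedef enum {
    MEM_HOST,
    MEM_ACCEL,
    MEM_UNKNOWN
} mem_kind_t;

/* Hypothetical request carrying an optional memory type hint. */
typedef struct {
    const void *address;
    size_t      length;
    mem_kind_t  memory_type;
} map_request_t;

/* Counts how often detection actually ran (for illustration only). */
static unsigned detect_calls = 0;

/* Stub detector: always reports host memory and records the call. */
static mem_kind_t detect_stub(const void *address, size_t length)
{
    (void)address;
    (void)length;
    ++detect_calls;
    return MEM_HOST;
}

/* Use the caller's hint when present; otherwise fall back to detection. */
static mem_kind_t resolve_mem_kind(const map_request_t *req)
{
    assert(req != NULL);
    if (req->memory_type != MEM_UNKNOWN) {
        return req->memory_type; /* detection bypassed */
    }
    return detect_stub(req->address, req->length);
}
```

With a hint set, `detect_calls` stays at zero, which matches the statement that detection is bypassed on the hinted path.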

I think there is the logging aspect, but also the implied uct_md_mem_query() call, which could have a perf impact for the host memory case (it would be called repeatedly, since host memory is never in the memtype cache).
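
The concern can be made concrete with a toy cost model. It assumes, as stated in this thread, that host memory is never present in the memtype cache, so every host lookup is a miss. `model_ctx_t`, `lookup_host_buffer` and `count_md_queries` are hypothetical, and the model counts MD queries only; it says nothing about how expensive one real query is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context model that only tracks how many MD queries ran. */
typedef struct {
    bool          has_non_host_detect_md;
    size_t        num_detect_mds;
    unsigned long md_queries;
} model_ctx_t;

/* One lookup of a host buffer. The cache always misses for host memory,
 * so with the flag set every lookup asks each detect-capable MD. */
static void lookup_host_buffer(model_ctx_t *ctx)
{
    assert(ctx != NULL);
    if (ctx->has_non_host_detect_md) {
        ctx->md_queries += ctx->num_detect_mds;
    }
    /* Flag clear: immediate host fallback, no MD query. */
}

/* Total MD queries issued for num_lookups host-buffer lookups. */
static unsigned long count_md_queries(bool flag, size_t num_detect_mds,
                                      unsigned long num_lookups)
{
    model_ctx_t ctx = { flag, num_detect_mds, 0 };
    unsigned long i;

    for (i = 0; i < num_lookups; ++i) {
        lookup_host_buffer(&ctx);
    }
    return ctx.md_queries;
}
```

In this model the query count grows linearly with the number of host lookups once the flag is set, and stays at zero with the current fallback, which is the host-heavy risk raised here.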


I agree we should avoid changing NO_ELEM semantics globally due to the performance risk on host-heavy paths. I will convert this PR to a draft for now.

yafshar marked this pull request as ready for review April 8, 2026 21:30

tvegas1 reviewed Apr 9, 2026


yafshar marked this pull request as draft April 9, 2026 18:36

yafshar mentioned this pull request Apr 17, 2026

NIXL/UCX: enforce VRAM memtype hint query behavior and add tests ai-dynamo/nixl#1536

Open

yafshar wants to merge 1 commit into openucx:master from intel-staging:fix/ucp-core-memtype-cache-miss
