
scx_utils: use proper NVML API for GPU NUMA node lookup #3312

Open
devnexen wants to merge 1 commit into sched-ext:main from devnexen:scx_get_numa_id

@devnexen
Contributor

This uses the nvml-wrapper API but still falls back to the slower sysfs path in case the call is not supported. It is a new attempt after #3108, now that a new nvml-wrapper version has been released.

@devnexen
Contributor Author

cc @arighi

@EricccTaiwan EricccTaiwan requested a review from arighi February 25, 2026 13:13
Contributor

@htejun htejun left a comment


Please prove that this optimization is meaningful.

using the wrapper but still falling back to the slow(er) path in case.
new attempt after sched-ext#3108 only this time a new nvml-wrapper version
had been released.

Signed-off-by: David Carlier <devnexen@gmail.com>
@devnexen
Contributor Author

devnexen commented Mar 8, 2026

To be honest, the performance win when I tried it was small, mostly the saved filesystem operation time. I'm digging for the benchmark (hopefully I still have it somewhere). To summarize: the primary motivation is correctness rather than raw speed. The old code reverse-engineers a sysfs path from NVML's bus ID format by stripping a "0000" prefix; this is fragile, and the FIXME comment in the code acknowledges that there may not always be 4 leading zeros.

numa_node_id() uses the proper NVML API (nvmlDeviceGetNumaNodeId) directly. The fallback to the sysfs path is kept for older NVML versions that don't support it.
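The lookup order described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `sysfs_numa_path` and `gpu_numa_node` are hypothetical names, the `direct` closure stands in for the nvml-wrapper `numa_node_id()` call so the sketch builds without the crate, and the lowercasing of the bus ID is an assumption about how sysfs names PCI devices.

```rust
use std::fs;

/// Hypothetical helper mirroring the old approach: turn an NVML-style bus ID
/// (8-digit PCI domain, e.g. "00000000:65:00.0") into a sysfs path by
/// stripping a "0000" prefix. Returns None when the prefix is absent, which
/// is exactly the fragile case the FIXME mentions.
fn sysfs_numa_path(nvml_bus_id: &str) -> Option<String> {
    let short = nvml_bus_id.strip_prefix("0000")?;
    // Assumption: sysfs uses lowercase hex in PCI device directory names.
    Some(format!(
        "/sys/bus/pci/devices/{}/numa_node",
        short.to_lowercase()
    ))
}

/// Try the direct query first; fall back to reading sysfs only if it fails
/// (for example on an NVML version that does not support the call).
fn gpu_numa_node<F>(direct: F, nvml_bus_id: &str) -> Option<i32>
where
    F: FnOnce() -> Result<i32, String>,
{
    if let Ok(node) = direct() {
        return Some(node);
    }
    let path = sysfs_numa_path(nvml_bus_id)?;
    fs::read_to_string(path).ok()?.trim().parse().ok()
}
```

With a working direct call, `gpu_numa_node(|| Ok(1), "00000000:65:00.0")` returns `Some(1)` without touching the filesystem; the sysfs read only happens when the closure returns an error.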

@devnexen
Contributor Author

devnexen commented Mar 8, 2026

Here it is, as a gist. Locally it is only ~1.1 faster on average (I have only one NVIDIA GPU, though). Anyhow, using the now-available direct NVML wrapper call is, on its own, a valid reason IMHO.

@devnexen devnexen changed the title from "scx_utils: optimise gpu NUMA node id info retrieval" to "scx_utils: use proper NVML API for GPU NUMA node lookup" Mar 8, 2026
