Skip to content

Make MPS sharing available#97

Draft
tehut wants to merge 18 commits intomainfrom
NMD-1234
Draft

Make MPS sharing available#97
tehut wants to merge 18 commits intomainfrom
NMD-1234

Conversation

@tehut
Copy link
Copy Markdown

@tehut tehut commented Mar 6, 2026

  • If a change needs to be reverted, we will roll out an update to the code within 7 days.

Changes to Security Controls

Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.

Copy link
Copy Markdown
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking like a good first pass!

Comment thread README.md
from the list passed to it to retrieve device specific MPS config values like
log and pipe directory environment variables.

## Config
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we moved all the MPS configuration under a top-level MPS object? That would make it easy to bail out early from MPS code paths with just a nil check. And configuration keys would be less repetitive. Something like this:

plugin "nvidia" {

  fingerprint_period = "5s"
  ignored_gpu_ids    = ["acd90ec4-be06-4461-a8cb-e1f4b17b1126"]

  mps {
    user     = "nvidiamps"
    pipe_dir = "/tmp/nvidia-mps"
    log_dir  = "/var/log/nvidia-mps"

    device {
      id       = "b52cf8ab-5045-43d5-b724-119e7208d7f4"
      pipe_dir = "/tmp/nvidia-mps-dev1"
      log_dir  = "/var/logs/nvidia-mps-dev1"
    }

    device {
      id       = "e51fd310-8c60-4762-9ae3-b3c8b19e030d"
      pipe_dir = "/tmp/nvidia-mps-dev2"
      log_dir  = "/var/logs/nvidia-mps-dev2"
    }
  }
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is cleaner, though I do want to keep the device-specific config in it's own list so we can be clear that it is either all device-specific config or no device-specific config.

Comment thread README.md Outdated
- MPS is only supported on linux runtimes in Docker or Podman containers
- MPS is not currently supported on MIG partitioned GPUs or their parents
- An individual task can only be run against a single MPS server. In a multi
gpu environment, the Reserve() function arbitrarily selects a single GPU UID
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Referring to the Reserve() function mixes internal API documentation with the user-facing configuration docs. We'll probably want to focus this on the user-facing stuff.

Comment thread go.mod Outdated
github.com/NVIDIA/go-nvml v0.13.0-1
github.com/hashicorp/go-hclog v1.6.3
github.com/hashicorp/nomad v1.11.2
github.com/hashicorp/nomad v1.11.3
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once 1.11.3 goes out next week we'll want to pull this out to its own PR

Comment thread fingerprint.go Outdated
// getDeviceSharingStatus attempts to connect to the mps-control socket
// using the configured mps_pip_directory
func (d *NvidiaDevice) getDeviceSharingStatus(dialtype string, mps_pipe_directory string) device.DeviceSharing {
sockAddr := mps_pipe_directory + "/control"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is the only consumer of this value, should we let the user configure the whole path instead of just the directory? That way if Nvidia changes anything at their end the user can fix that without having to get a whole new device driver build.

Edit: oh, is this because we need the whole directory to be bind-mounted to the task?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: oh, is this because we need the whole directory to be bind-mounted to the task?
Yes, but we could make this an undocumented optional variable so that we're not confusing people with it now but we just have to update the readme if Nvidia does change things.

Comment thread fingerprint.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this take a context so that we can bail out early from the various timeouts and retries if the plugin is shut down?

Comment thread device.go Outdated
Comment on lines +312 to +314
if i == 0 {
reserveDeviceBuilder.WriteString(strings.Join(deviceIDs, ","))
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than doing string manipulation in the middle of this function, maybe we pass around a []string{} of reserved IDs instead and then do a strings.Join at the end?

Comment thread fingerprint.go Outdated
Comment on lines +81 to +84
// skip mig mode devices marked ineligible by the client
if dev.SharingStatus != device.SharingIneligible {
continue
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure from context which "client" we're referring to here. This isn't in the Nomad client agent config, right? Is it coming from the Nvidia API client?

Copy link
Copy Markdown
Author

@tehut tehut Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, yes. this is the nvml.client. The nvml.GetFingerprintData function marks the mig mode clients as ineligible so that we don't have to worry about dynamic UUIDs that don't map to anything in the map of MPS configured GPUs.

Comment thread nvml/client.go Outdated
}

// only set sharing status for mig devices in the client
// otherwise leave nil for device driver to set
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is an enum, it's some non-nil default value, isn't it?

Comment thread device.go Outdated
Comment thread device.go Outdated
Comment on lines +341 to +344
mountPipeDir := &device.Mount{
TaskPath: containerEnvs[MpsPipeDirectory],
ReadOnly: false,
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MPS pipe directory will be the path on disk from the perspective of the host, so I think that needs to be passed as the host path. The task path could be literally anything, but we're giving the task the environment variable that points to the MPS pipe directory so we can give it the same value there too.

i.e. if the MPS pipe directory is /tmp/nvidia-mps on the host, we want to do the equivalent of:

docker run -d \
  -v /tmp/nvidia-mps:/tmp/nvidia-mps \
  -e MPS_PIPE_DIRECTORY=/tmp/nvidia-mps \
  myimage mycommand

While I'm here though, given that we're giving the task the MPS logs directory as an environment variable, do we need to mount the MPS logs directory as well? Is that really something we're supposed to expose to the task?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I wasn't quite sure about the mounts.

do we need to mount the MPS logs directory as well?
I don't think we do. I think we need to mount the pipe directory for the task to talk to the daemon over IPC but the envvar should be enough for the logs.

Comment thread device.go Outdated
[]*hclspec.Spec{
hclspec.NewObject(map[string]*hclspec.Spec{
"uuid": hclspec.NewAttr("uuid", "string", true),
"mps_pipe_directory": hclspec.NewDefault(
Copy link
Copy Markdown
Member

@tgross tgross Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having moved this all under mps lets us make the keys less verbose. Ex. right now this is plugin.config.mps.device_specific_mps_config.mps_pipe_directory, which means we have "mps" 3 times and "config" twice in the same config key. I think we could boil this down to plugin.config.mps.device_specific.pipe_directory with no loss of clarity.

@tehut tehut force-pushed the NMD-1234 branch 2 times, most recently from d9d8195 to a717ebc Compare March 16, 2026 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants