Skip to content

chore: update the number of nbd devices#2400

Merged
jakubno merged 1 commit intomainfrom
chore/update-number-of-nbd-devices
Apr 15, 2026
Merged

chore: update the number of nbd devices#2400
jakubno merged 1 commit intomainfrom
chore/update-number-of-nbd-devices

Conversation

@jakubno
Copy link
Copy Markdown
Member

@jakubno jakubno commented Apr 15, 2026

No description provided.

@cursor
Copy link
Copy Markdown

cursor Bot commented Apr 15, 2026

PR Summary

Medium Risk
Changes a runtime tuning default that can affect block-device I/O concurrency and performance; mis-sizing could cause throughput regressions or increased contention until overridden via LaunchDarkly.

Overview
Lowers the default value of the nbd-connections-per-device feature flag from 4 to 1, reducing the number of NBD socket connections used per device unless overridden.

Reviewed by Cursor Bugbot for commit e8e5ee3. Bugbot is set up for automated code reviews on this repo. Configure here.

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — trivial one-line default change from 4 to 1, overridable via LaunchDarkly at any time.

Extended reasoning...

Overview

Single-line change to the NBDConnectionsPerDevice feature flag default in packages/shared/pkg/featureflags/flags.go, reducing the default from 4 to 1.

Security risks

None. This is a tuning parameter with no security implications. The value is runtime-overridable via LaunchDarkly.

Level of scrutiny

Minimal scrutiny required. The change is mechanical and self-contained. Both the old default (4) and the new default (1) are valid positive integers. The flag is designed for runtime tuning and can be reverted instantly without a deploy.

Other factors

The only bug reported is pre-existing (missing lower-bound guard for zero values) and was present before this PR. This PR does not introduce or worsen that issue — the new default of 1 is still a safe positive value. No outstanding reviewer comments.

Comment on lines 168 to +169
// NBDConnectionsPerDevice the number of NBD socket connections per device
NBDConnectionsPerDevice = newIntFlag("nbd-connections-per-device", 4)
NBDConnectionsPerDevice = newIntFlag("nbd-connections-per-device", 1)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟣 The consumer in path_direct.go uses for i := range connections (Go 1.22+ integer range) with no lower-bound guard, so a LaunchDarkly operator setting nbd-connections-per-device to 0 would leave d.socksClient empty and cause nbdnl.Connect to fail for every NBD-backed sandbox. This validation gap pre-dates this PR (it existed when the default was 4 as well); the PR only changes the default from 4 to 1, both of which are safe positive values. Adding a max(1, connections) guard in path_direct.go before the socket loop would eliminate the risk.

Extended reasoning...

What the bug is and how it manifests

The feature flag nbd-connections-per-device is an IntFlag with no lower-bound enforcement at the definition site or at the consumer site. In path_direct.go:126-128, the consumer reads the flag with connections := d.featureFlags.IntFlag(ctx, featureflags.NBDConnectionsPerDevice) and immediately iterates with for i := range connections. In Go 1.22+, ranging over an integer value of 0 (or any non-positive int) produces zero iterations. When this happens, d.socksClient is never populated and remains an empty slice.

The specific code path that triggers it

At path_direct.go:180, nbdnl.Connect(deviceIndex, d.socksClient, ...) is called with that empty slice. The Linux kernel NBD netlink API requires at least one socket; passing zero causes an immediate error. The verifier analysis notes that path_direct.go:204 catches invalid argument and returns immediately without retry, meaning all NBD-backed sandboxes fail to mount with no recovery path.

Why existing code does not prevent it

newIntFlag in flags.go only stores a fallback value — it applies no validation or clamping. There is no max(1, connections) guard, no if connections < 1 check, and no assertions before the loop. The LaunchDarkly flag system allows any integer to be pushed at runtime, including 0 or negative values.

Impact

If a LaunchDarkly operator accidentally sets this flag to 0, every sandbox that relies on NBD-backed storage will fail to mount, producing a complete outage for that workload class. Because the error path does not retry, there is no self-healing until the flag is corrected and new mounts are attempted.

How to fix it

Add a lower-bound guard in path_direct.go before the socket loop:

if connections < 1 {
connections = 1
}

or equivalently connections = max(1, connections). A point-of-use guard is sufficient and most explicit.

Step-by-step proof

  1. Operator sets nbd-connections-per-device to 0 in LaunchDarkly.
  2. Orchestrator reads connections = 0 via IntFlag(ctx, featureflags.NBDConnectionsPerDevice).
  3. for i := range 0 produces zero iterations; d.socksClient remains empty.
  4. nbdnl.Connect(deviceIndex, []*os.File{}, ...) is called.
  5. Kernel rejects the call with EINVAL; path_direct.go:204 returns the error immediately.
  6. Every new sandbox creation requiring an NBD mount fails at this step with no retry or fallback.
  7. Existing sandboxes are unaffected until restarted; new sandboxes cannot be created.

Pre-existing nature

This validation gap existed identically when the default was 4. This PR changes only the default to 1, which is still a safe positive value. The risk is not materially increased — a LaunchDarkly operator could have set the flag to 0 at any time before this PR. All four verifiers independently confirmed severity as pre_existing.

@jakubno jakubno merged commit 7de6311 into main Apr 15, 2026
48 checks passed
@jakubno jakubno deleted the chore/update-number-of-nbd-devices branch April 15, 2026 10:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants