Conversation
PR SummaryMedium Risk Overview Reviewed by Cursor Bugbot for commit e8e5ee3. Bugbot is set up for automated code reviews on this repo. Configure here. |
There was a problem hiding this comment.
LGTM — trivial one-line default change from 4 to 1, overridable via LaunchDarkly at any time.
Extended reasoning...
Overview
Single-line change to the NBDConnectionsPerDevice feature flag default in packages/shared/pkg/featureflags/flags.go, reducing the default from 4 to 1.
Security risks
None. This is a tuning parameter with no security implications. The value is runtime-overridable via LaunchDarkly.
Level of scrutiny
Minimal scrutiny required. The change is mechanical and self-contained. Both the old default (4) and the new default (1) are valid positive integers. The flag is designed for runtime tuning and can be reverted instantly without a deploy.
Other factors
The only bug reported is pre-existing (missing lower-bound guard for zero values) and was present before this PR. This PR does not introduce or worsen that issue — the new default of 1 is still a safe positive value. No outstanding reviewer comments.
| // NBDConnectionsPerDevice the number of NBD socket connections per device | ||
| NBDConnectionsPerDevice = newIntFlag("nbd-connections-per-device", 4) | ||
| NBDConnectionsPerDevice = newIntFlag("nbd-connections-per-device", 1) |
There was a problem hiding this comment.
🟣 The consumer in path_direct.go uses for i := range connections (Go 1.22+ integer range) with no lower-bound guard, so a LaunchDarkly operator setting nbd-connections-per-device to 0 would leave d.socksClient empty and cause nbdnl.Connect to fail for every NBD-backed sandbox. This validation gap pre-dates this PR (it existed when the default was 4 as well); the PR only changes the default from 4 to 1, both of which are safe positive values. Adding a max(1, connections) guard in path_direct.go before the socket loop would eliminate the risk.
Extended reasoning...
What the bug is and how it manifests
The feature flag nbd-connections-per-device is an IntFlag with no lower-bound enforcement at the definition site or at the consumer site. In path_direct.go:126-128, the consumer reads the flag with connections := d.featureFlags.IntFlag(ctx, featureflags.NBDConnectionsPerDevice) and immediately iterates with for i := range connections. In Go 1.22+, ranging over an integer value of 0 (or any non-positive int) produces zero iterations. When this happens, d.socksClient is never populated and remains an empty slice.
The specific code path that triggers it
At path_direct.go:180, nbdnl.Connect(deviceIndex, d.socksClient, ...) is called with that empty slice. The Linux kernel NBD netlink API requires at least one socket; passing zero causes an immediate error. The verifier analysis notes that path_direct.go:204 catches invalid argument and returns immediately without retry, meaning all NBD-backed sandboxes fail to mount with no recovery path.
Why existing code does not prevent it
newIntFlag in flags.go only stores a fallback value — it applies no validation or clamping. There is no max(1, connections) guard, no if connections < 1 check, and no assertions before the loop. The LaunchDarkly flag system allows any integer to be pushed at runtime, including 0 or negative values.
Impact
If a LaunchDarkly operator accidentally sets this flag to 0, every sandbox that relies on NBD-backed storage will fail to mount, producing a complete outage for that workload class. Because the error path does not retry, there is no self-healing until the flag is corrected and new mounts are attempted.
How to fix it
Add a lower-bound guard in path_direct.go before the socket loop:
if connections < 1 {
connections = 1
}
or equivalently connections = max(1, connections). A point-of-use guard is sufficient and most explicit.
Step-by-step proof
- Operator sets nbd-connections-per-device to 0 in LaunchDarkly.
- Orchestrator reads connections = 0 via IntFlag(ctx, featureflags.NBDConnectionsPerDevice).
- for i := range 0 produces zero iterations; d.socksClient remains empty.
- nbdnl.Connect(deviceIndex, []*os.File{}, ...) is called.
- Kernel rejects the call with EINVAL; path_direct.go:204 returns the error immediately.
- Every new sandbox creation requiring an NBD mount fails at this step with no retry or fallback.
- Existing sandboxes are unaffected until restarted; new sandboxes cannot be created.
Pre-existing nature
This validation gap existed identically when the default was 4. This PR changes only the default to 1, which is still a safe positive value. The risk is not materially increased — a LaunchDarkly operator could have set the flag to 0 at any time before this PR. All four verifiers independently confirmed severity as pre_existing.
No description provided.