Skip to content

perf(storage): remove unnecessary mutex from multipart upload#2393

Closed
ValentaTomas wants to merge 1 commit intomainfrom
perf/minimal-multipart-fix
Closed

perf(storage): remove unnecessary mutex from multipart upload#2393
ValentaTomas wants to merge 1 commit intomainfrom
perf/minimal-multipart-fix

Conversation

@ValentaTomas
Copy link
Copy Markdown
Member

@ValentaTomas ValentaTomas commented Apr 14, 2026

Remove mutex on parts array — goroutines write to distinct indices. Supersedes #2381.

- Remove per-part MD5 computation (Content-MD5 is optional for GCS XML
  multipart uploads and the final object has no MD5 hash anyway)
- Configure HTTP/2 transport with connection pooling and larger buffers
- Remove unnecessary mutex on parts array (goroutines write to distinct
  indices)
@cursor
Copy link
Copy Markdown

cursor Bot commented Apr 14, 2026

PR Summary

Medium Risk
Removes an integrity check header and a mutex in a concurrent upload path; while intended as a pure optimization, any latent loop-variable capture issues could now manifest as concurrent writes to the same parts index without synchronization.

Overview
Optimizes GCS XML multipart uploads by removing per-part MD5 computation/Content-MD5 headers and dropping the mutex around parts collection during parallel part uploads, relying on each goroutine writing its own slice slot to reduce CPU and contention.

Reviewed by Cursor Bugbot for commit c4f0100. Bugbot is set up for automated code reviews on this repo. Configure here.

g.SetLimit(maxConcurrency) // Limit concurrent goroutines

// Thread-safe map to collect parts
var partsMu sync.Mutex
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're removing the mutex because each slot in the array is modified exactly once, and never read until it is returned, right?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

@linear
Copy link
Copy Markdown

linear Bot commented Apr 14, 2026

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — removing optional Content-MD5 and the now-unnecessary mutex are both correct optimizations.

Extended reasoning...

Overview

Single-file change to packages/shared/pkg/storage/gcp_multipart.go. Removes MD5 per-part computation and a mutex guarding the parts slice in the concurrent upload loop.

Security risks

None. MD5 here was used only as an optional data-integrity hint to GCS, not for authentication or authorization. Removing it has no security implications.

Level of scrutiny

Low. Both removals are mechanical and independently verifiable: (1) GCS XML API documents Content-MD5 as optional; (2) each goroutine writes to a distinct slice index (parts[partNumber-1]), so no two goroutines can race on the same element, and errgroup.Wait() provides the required happens-before barrier. The -race test run is a strong signal.

Other factors

No interface or behavioral changes. Single file, no CODEOWNER concerns, no security-sensitive code paths touched.

@ValentaTomas
Copy link
Copy Markdown
Member Author

We will reopen this when we merge compression as that will provide us a way to check integrity that makes this obsolete.

@ValentaTomas ValentaTomas changed the title perf(storage): remove MD5 and mutex from multipart upload perf(storage): remove unnecessary mutex from multipart upload Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants