
Allocation tracing: fix panic in debug builds #25223

@pront

Context

PR #25136 fixed a latent panic/UB in the allocation-tracing custom allocator's dealloc path, but it did so by always wrapping every allocation with a 1-byte header — even when --allocation-tracing is not passed at runtime. Because the allocation-tracing Cargo feature is included in the default unix feature set (Cargo.toml:559), every production Linux build pays a small-but-real overhead on every alloc / dealloc whether the user ever enables tracing or not.

#25136 has been reverted via #25222 to restore the pre-#25136 fast path. This issue tracks picking a proper long-term fix, since the original panic #25136 addressed is still latent.

Root cause of the original panic

The allocator is installed as #[global_allocator] at static-init time, but TRACK_ALLOCATIONS is flipped from false to true at runtime inside main() after CLI parsing (src/main.rs:32-38). Allocations made before the flip use the unwrapped layout; when any of them is freed after the flip, dealloc reads an out-of-bounds byte as the group ID and hits NonZeroU8::new_unchecked(0), which is UB that recent Rust toolchains turn into an abort in debug builds.

The comment at src/main.rs:32-33 documents the assumption that "the heap does not contain any allocations that have a shorter lifetime than the program" at the point the flag is flipped. In practice that invariant is not enforced and gets violated.
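
The transition hazard can be illustrated with a small safe model. This is only a sketch, not the real allocator: `Block`, `model_alloc`, and `model_dealloc` are hypothetical names, a `Vec<u8>` stands in for raw memory, and the group-ID byte is assumed to sit directly after the requested bytes. Where the real dealloc path would read past the allocation, the model returns an error instead.

```rust
use std::num::NonZeroU8;
use std::sync::atomic::{AtomicBool, Ordering};

// Mirrors the runtime flag described above; starts off, flipped later.
static TRACK_ALLOCATIONS: AtomicBool = AtomicBool::new(false);

/// A modelled allocation: the backing bytes plus the size the caller asked for.
struct Block {
    bytes: Vec<u8>,
    requested: usize,
}

/// Allocate `requested` bytes; append a group-ID byte only if tracking is on
/// at the time of the allocation.
fn model_alloc(requested: usize, group: NonZeroU8) -> Block {
    let mut bytes = vec![0u8; requested];
    if TRACK_ALLOCATIONS.load(Ordering::Relaxed) {
        bytes.push(group.get());
    }
    Block { bytes, requested }
}

/// Free a block. With tracking on, the group-ID byte is looked up at
/// `requested`, regardless of whether the block was allocated wrapped.
fn model_dealloc(block: &Block) -> Result<Option<NonZeroU8>, &'static str> {
    if !TRACK_ALLOCATIONS.load(Ordering::Relaxed) {
        return Ok(None); // fast path: nothing to read
    }
    let raw = *block
        .bytes
        .get(block.requested)
        .ok_or("group-ID byte is out of bounds")?;
    // The real code skips this check via new_unchecked, hence the UB.
    NonZeroU8::new(raw).map(Some).ok_or("group ID is zero")
}
```

In this model, a block allocated before the flag is flipped and freed afterwards yields the out-of-bounds error, while a block allocated after the flip round-trips its group ID.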

Long-term options

Option A: Set the tracing state before any allocation via a pre-main init hook

  • Use a .init_array / __mod_init_func entry (raw #[link_section] or the ctor crate) to scan argv for --allocation-tracing and libc::getenv("ALLOCATION_TRACING") before any Rust allocation happens. Latch a static AtomicU8 once; never mutate after.
  • Allocator branches on the latch: disabled → fast path (direct passthrough), enabled → wrapped. No transition boundary, no panic.
  • Pro: preserves current CLI/env UX, zero hot-path cost when off.
  • Con: linker-section hackery or a new dependency, platform-specific shims, adds meaningful complexity to a rarely-used feature.
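
A rough sketch of the latch-and-branch part of Option A follows. All names here (`TRACING_LATCH`, `tracing_requested`, `latch_tracing`, `LatchedAllocator`) are hypothetical, the accepted values for ALLOCATION_TRACING are an assumption, and the group ID written is a placeholder. The pre-main hook itself is left out because it is platform-specific; it would call `latch_tracing(tracing_requested(..))` before any Rust allocation. Latching to "disabled" on the first allocation when nothing has been latched yet is an extra safeguard added for this sketch, not something the issue proposes.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicU8, Ordering};

const UNSET: u8 = 0;
const DISABLED: u8 = 1;
const ENABLED: u8 = 2;

// Written at most once; only read afterwards.
static TRACING_LATCH: AtomicU8 = AtomicU8::new(UNSET);

/// Pure decision the init hook would make from argv and the environment.
/// Assumption: "true" or "1" enables tracing via ALLOCATION_TRACING.
fn tracing_requested(args: &[&str], env_value: Option<&str>) -> bool {
    args.iter().any(|arg| *arg == "--allocation-tracing")
        || matches!(env_value, Some("true") | Some("1"))
}

/// Latch the state once and return whichever state is in effect.
fn latch_tracing(enabled: bool) -> u8 {
    let wanted = if enabled { ENABLED } else { DISABLED };
    match TRACING_LATCH.compare_exchange(UNSET, wanted, Ordering::Relaxed, Ordering::Relaxed) {
        Ok(_) => wanted,
        Err(existing) => existing,
    }
}

fn tracing_enabled() -> bool {
    match TRACING_LATCH.load(Ordering::Relaxed) {
        ENABLED => true,
        DISABLED => false,
        // Nothing latched yet: settle on "disabled" so a late enable cannot
        // create a mix of wrapped and unwrapped allocations.
        _ => latch_tracing(false) == ENABLED,
    }
}

/// Layout extended by one trailing byte for the group ID, plus its offset.
fn wrapped(layout: Layout) -> Option<(Layout, usize)> {
    let (extended, offset) = layout.extend(Layout::new::<u8>()).ok()?;
    Some((extended.pad_to_align(), offset))
}

struct LatchedAllocator;

unsafe impl GlobalAlloc for LatchedAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if !tracing_enabled() {
            return unsafe { System.alloc(layout) }; // direct passthrough
        }
        let Some((wrapped_layout, offset)) = wrapped(layout) else {
            return std::ptr::null_mut(); // report failure, do not panic
        };
        unsafe {
            let ptr = System.alloc(wrapped_layout);
            if !ptr.is_null() {
                ptr.add(offset).write(1); // placeholder group ID
            }
            ptr
        }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        let target = if tracing_enabled() {
            // alloc succeeded with this layout, so wrapping cannot fail here.
            wrapped(layout).expect("layout was wrapped in alloc").0
        } else {
            layout
        };
        unsafe { System.dealloc(ptr, target) }
    }
}
```

Because the latch never changes after it is first set, every allocation and its matching deallocation take the same branch, which is what removes the transition boundary.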

Option B (recommended): Drop allocation-tracing from the default unix feature set

  • Cargo.toml:559: change unix = ["tikv-jemallocator", "allocation-tracing"] to unix = ["tikv-jemallocator"]. The tracing allocator is only compiled in when someone explicitly builds with --features allocation-tracing.
  • Pro: default release binaries are completely unaffected — the fast path is literally the system allocator, no branch, no atomic, no wrapping. Simplest possible fix.
  • Con: narrow usage change — vector --allocation-tracing on a stock distribution binary would error instead of enabling tracing. Users who want tracing build a custom binary.
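
On the Rust side, Option B amounts to compile-time gating. The sketch below is illustrative only: `TRACING_COMPILED_IN`, `GLOBAL`, and `validate_tracing_flag` are hypothetical names, the error text is made up, and `System` stands in for whichever allocator a default build would otherwise install.

```rust
use std::alloc::System;

/// True only when the binary was built with `--features allocation-tracing`.
const TRACING_COMPILED_IN: bool = cfg!(feature = "allocation-tracing");

// Without the feature, no wrapper type exists in the binary at all: the
// global allocator is installed directly, with no branch and no atomic.
#[cfg(not(feature = "allocation-tracing"))]
#[global_allocator]
static GLOBAL: System = System;

/// What a stock binary could do when the CLI flag is passed anyway.
fn validate_tracing_flag(requested: bool) -> Result<(), String> {
    if requested && !TRACING_COMPILED_IN {
        return Err(
            "--allocation-tracing requires a build with the allocation-tracing feature"
                .to_string(),
        );
    }
    Ok(())
}
```

With the feature absent, `validate_tracing_flag(true)` returns the error, which matches the usage change described in the Con bullet.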

Recommendation

To be discussed internally.
