view: opt-in ViewEncode for serializing from borrowed fields by rpb-ant · Pull Request #55 · anthropics/buffa

rpb-ant · 2026-04-18T19:50:57Z

view: `ViewEncode` — serialize from borrowed fields

log_record build+encode from &str source: 1.35 µs → 227 ns (5.96×). Serialize-only at parity with owned Message::encode (±5%).

Why

Encoding currently requires building an owned message — cloning every string and map entry just to pass &self to Message::encode(), then dropping the struct. When the source data is already in-memory &strs (RPC handlers serializing from app state), that's ~19 allocs/message for a typical string-heavy payload. Views already have &'a str fields and pub constructors; this adds the encode side.
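To make the allocation cost concrete, here is a minimal sketch of the two construction paths. `LogRecordOwned`, `LogRecordView`, `build_owned`, and `build_view` are hypothetical stand-ins written for illustration, not buffa's generated types.

```rust
/// Hypothetical owned message: every field owns its bytes.
struct LogRecordOwned {
    message: String,
    labels: Vec<(String, String)>,
}

/// Hypothetical view: every field borrows from the caller's data.
struct LogRecordView<'a> {
    message: &'a str,
    labels: Vec<(&'a str, &'a str)>,
}

/// Owned path: one heap allocation per string, plus one for the label list.
fn build_owned(message: &str, labels: &[(&str, &str)]) -> LogRecordOwned {
    LogRecordOwned {
        message: message.to_owned(),
        labels: labels
            .iter()
            .map(|&(k, v)| (k.to_owned(), v.to_owned()))
            .collect(),
    }
}

/// View path: string bytes are never copied; only the pair list is allocated.
fn build_view<'a>(message: &'a str, labels: &[(&'a str, &'a str)]) -> LogRecordView<'a> {
    LogRecordView {
        message,
        labels: labels.to_vec(),
    }
}
```

The view's `message` points at the same bytes as the source string, while the owned build holds a fresh copy that is dropped right after encoding.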

Change

ViewEncode<'a>: MessageView<'a> sub-trait with the same two-pass compute_size/write_to model as Message. Generated *View<'a> types implement it whenever views are generated (generate_views(true), the default).

The codegen reuses the existing per-field encode-stmt builders unchanged — they already emit &self.field-relative code that takes &str/&[u8], so they apply to view fields as-is. Oneof same: arm builders are duck-typed; pointing them at the view-side enum is the only delta. MessageView itself is unchanged.
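The two-pass model can be sketched as follows. This is an illustration under stated assumptions: `TwoPassEncode`, `InnerView`, and `OuterView` are hypothetical, the buffer is a plain `Vec<u8>` rather than a generic `BufMut`, and fields are always written (no default-value skipping). The point is that pass 1 caches each nested size so pass 2 can emit length prefixes without recomputing them.

```rust
use std::cell::Cell;

/// Number of bytes a value occupies as a protobuf varint.
fn varint_len(mut v: u64) -> u32 {
    let mut n = 1;
    while v >= 0x80 {
        v >>= 7;
        n += 1;
    }
    n
}

/// Appends `v` as a protobuf varint.
fn put_varint(buf: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        buf.push((v as u8 & 0x7F) | 0x80);
        v >>= 7;
    }
    buf.push(v as u8);
}

/// Hypothetical stand-in for the trait shape; not buffa's actual definition.
trait TwoPassEncode {
    /// Pass 1: compute the encoded size and cache it.
    fn compute_size(&self) -> u32;
    /// Pass 2: write bytes, reading cached sizes for nested length prefixes.
    fn write_to(&self, buf: &mut Vec<u8>);

    fn encode_to_vec(&self) -> Vec<u8> {
        let mut buf = Vec::with_capacity(self.compute_size() as usize);
        self.write_to(&mut buf);
        buf
    }
}

struct InnerView<'a> {
    name: &'a str, // field 1, string
    cached_size: Cell<u32>,
}

struct OuterView<'a> {
    id: u64,              // field 1, varint
    inner: InnerView<'a>, // field 2, nested message
    cached_size: Cell<u32>,
}

impl TwoPassEncode for InnerView<'_> {
    fn compute_size(&self) -> u32 {
        let len = self.name.len() as u32;
        let size = 1 + varint_len(u64::from(len)) + len;
        self.cached_size.set(size);
        size
    }
    fn write_to(&self, buf: &mut Vec<u8>) {
        buf.push(0x0A); // field 1, length-delimited
        put_varint(buf, self.name.len() as u64);
        buf.extend_from_slice(self.name.as_bytes()); // written by borrow
    }
}

impl TwoPassEncode for OuterView<'_> {
    fn compute_size(&self) -> u32 {
        let inner = self.inner.compute_size();
        let size = 1 + varint_len(self.id) + 1 + varint_len(u64::from(inner)) + inner;
        self.cached_size.set(size);
        size
    }
    fn write_to(&self, buf: &mut Vec<u8>) {
        buf.push(0x08); // field 1, varint
        put_varint(buf, self.id);
        buf.push(0x12); // field 2, length-delimited
        put_varint(buf, u64::from(self.inner.cached_size.get()));
        self.inner.write_to(buf);
    }
}
```

Calling `write_to` directly without a prior `compute_size` would emit a stale length prefix in this sketch, which is why the provided `encode_to_vec` always runs both passes in order.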

Compatibility — 0.4.0

The trait surface is additive, but every generated view struct gains a __buffa_cached_size field — public-shape change for code that constructs views without ..Default::default(). Folded into 0.4.0 alongside the existing Any.value: Bytes break in Unreleased. Uncalled .text cost is negligible (~12 B/message in release, no LTO; the linker dedups identical cached_size bodies and write_to is generic so it monomorphizes only at call sites).
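A small sketch of why the extra field is a shape change. `PointView` is hypothetical, and the `Cell<u32>` type of the cache field is an assumption; the generated field is named `__buffa_cached_size` per the description above.

```rust
use std::cell::Cell;

/// Hypothetical view struct after the change: two proto fields plus a size cache.
#[derive(Default, Debug)]
struct PointView<'a> {
    label: &'a str,
    x: i32,
    /// Stand-in for the generated `__buffa_cached_size` field (type assumed).
    cached_size: Cell<u32>,
}

/// Forward-compatible construction: unnamed fields come from `Default`.
/// An exhaustive literal `PointView { label, x }` compiled before the cache
/// field existed and fails with "missing field" afterwards.
fn make_point<'a>(label: &'a str, x: i32) -> PointView<'a> {
    PointView {
        label,
        x,
        ..Default::default()
    }
}
```

Call sites that already end their struct literals with `..Default::default()` are unaffected by the new field.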

Testing

Every view-generating buffa-test proto exercises the encode path (proto3, proto2 groups/closed-enums, nested oneofs, WKT, utf8_validation=NONE, use_bytes_type, preserve_unknown_fields=false, edge_cases) — compile-time coverage for the duck-type-reuse claim across syntaxes. 9 view-encode round-trip / construct-from-borrows tests. Conformance 6× PASSED, 0 unexpected.

Companion

For connectrpc handlers, connect-rust needs a way to return pre-encoded bytes (PreEncoded newtype). Companion PR linked once open. This change is independently useful for any non-connectrpc encode path.

Follow-ups (not here)

MessageFieldView boxes the inner view (recursive types) — one alloc per nested-message field; non-boxed variant for non-recursive fields would close the gap for tree-shaped messages.
via-view-encode conformance mode (decode_view → encode_view).

Adds ViewEncode<'a>: MessageView<'a> with compute_size / write_to / cached_size, mirroring Message's two-pass model. Provided encode/encode_to_vec/encode_to_bytes. Field bytes are written by borrow — no String/Vec<u8> allocations. MessageFieldView gains compute_size/write_to/cached_size forwarding when V: ViewEncode. MapView gains new()/From<Vec>/FromIterator so callers can build a view map from borrowed (&str, &str) pairs (duplicate keys are encoded as-is). UnknownFieldsView gains write_to (concatenate raw spans). MessageView itself is unchanged. View remains the zero-copy read path by default; encode is opt-in via codegen. Targets 0.4.0: enabling view_encode on buffa-types adds a __buffa_cached_size field to WKT view structs (see CHANGELOG).
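As an illustration of building a view map from borrowed pairs, the sketch below uses a hypothetical `MapViewSketch` (not buffa's `MapView`) backed by an ordered `Vec`. It assumes string keys and values and always writes both entry fields. Because no deduplication happens, duplicate keys reach the wire as separate entries.

```rust
use std::iter::FromIterator;

fn varint_len(mut v: u64) -> usize {
    let mut n = 1;
    while v >= 0x80 {
        v >>= 7;
        n += 1;
    }
    n
}

fn put_varint(buf: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        buf.push((v as u8 & 0x7F) | 0x80);
        v >>= 7;
    }
    buf.push(v as u8);
}

/// Hypothetical borrowed map view: an ordered list of (key, value) pairs.
struct MapViewSketch<'a> {
    entries: Vec<(&'a str, &'a str)>,
}

impl<'a> From<Vec<(&'a str, &'a str)>> for MapViewSketch<'a> {
    fn from(entries: Vec<(&'a str, &'a str)>) -> Self {
        Self { entries }
    }
}

impl<'a> FromIterator<(&'a str, &'a str)> for MapViewSketch<'a> {
    fn from_iter<I: IntoIterator<Item = (&'a str, &'a str)>>(iter: I) -> Self {
        Self { entries: iter.into_iter().collect() }
    }
}

/// Encoded size of one string field with a single-byte tag.
fn string_field_len(s: &str) -> usize {
    1 + varint_len(s.len() as u64) + s.len()
}

fn put_string_field(buf: &mut Vec<u8>, tag: u8, s: &str) {
    buf.push(tag);
    put_varint(buf, s.len() as u64);
    buf.extend_from_slice(s.as_bytes());
}

/// Writes each pair as a length-delimited map entry (key = field 1, value = field 2).
fn write_map_field(field_number: u32, map: &MapViewSketch<'_>, buf: &mut Vec<u8>) {
    for &(key, value) in &map.entries {
        put_varint(buf, u64::from((field_number << 3) | 2));
        put_varint(buf, (string_field_len(key) + string_field_len(value)) as u64);
        put_string_field(buf, 0x0A, key);
        put_string_field(buf, 0x12, value);
    }
}
```

A decoder that collects entries into a hash map would keep the last value for a repeated key, so callers that care about uniqueness need to deduplicate before building the view.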

CodeGenConfig::view_encode (default false). When set, codegen emits __buffa_cached_size on each view struct and a separate impl ViewEncode<'a> for FooView<'a> block. The per-field encode-stmt builders (scalar_/repeated_/map_/oneof_*_stmt) emit &self.field-relative code that already takes &str/&[u8], so they apply to view field types unchanged. Oneof: the view-side enum (mod::FooOneofView) has the same variant names as owned with borrowed payload types; oneof_{size,write}_arm dispatch by name and call duck-typed primitives (string_encoded_len(x), x.compute_size()), so they work once pointed at the view enum path. buffa_build::Config::view_encode(bool) setter; protoc-gen-buffa view_encode=true option.
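The duck-typed reuse described here can be imitated with a macro: the same arm bodies compile against either enum because the variant names match and the payloads coerce to `&str`. This is a sketch only. `Contact`, `ContactView`, and `contact_size!` are hypothetical, the real codegen emits token streams rather than `macro_rules!`, and `string_encoded_len` is reimplemented locally with an assumed signature.

```rust
/// Length prefix plus payload bytes for a string (assumed signature).
fn string_encoded_len(s: &str) -> u32 {
    let mut prefix = 1;
    let mut v = s.len() as u64;
    while v >= 0x80 {
        v >>= 7;
        prefix += 1;
    }
    prefix + s.len() as u32
}

/// Hypothetical owned oneof.
enum Contact {
    Email(String),
    Phone(String),
}

/// Hypothetical view-side oneof: same variant names, borrowed payloads.
enum ContactView<'a> {
    Email(&'a str),
    Phone(&'a str),
}

/// One set of size arms, pointed at whichever enum is named.
/// `x` binds as `&String` or `&&str`; both deref-coerce to `&str` at the call.
macro_rules! contact_size {
    ($enum:ident, $value:expr) => {
        match $value {
            // 1 byte for the field tag, assuming field numbers below 16.
            $enum::Email(x) => 1 + string_encoded_len(x),
            $enum::Phone(x) => 1 + string_encoded_len(x),
        }
    };
}
```

Usage: `contact_size!(Contact, &owned)` and `contact_size!(ContactView, &view)` expand to the same arms, differing only in the enum path.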

WKTs are commonly nested inside application messages (Timestamp, Any, Struct/Value); a consumer that opts into view_encode needs the WKT view types to implement ViewEncode for nested-message dispatch to compile. The extra __buffa_cached_size field is 4 bytes per WKT view. Regenerated via `task gen-wkt-types`.

github-actions · 2026-04-18T19:51:08Z

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

Enable view_encode(true) for basic.proto. Add Address home_address = 15 to Person's contact oneof to cover ViewEncode dispatch through Box<View<'a>>. New tests: round-trip (decode_view → encode_to_vec byte-equal), construct-from-borrows for scalars/repeated/map/oneof, compute_size matches encoded len.

encode_view: serialize a pre-decoded view (parity with owned encode; wire-compat asserted by decode-and-compare). build_encode vs build_encode_view: construct a string+map LogRecord from borrowed source data and encode. Includes the per-field String allocs that the view path avoids — 1.35 µs → 227 ns (5.96×) on a 15-label fixture.

rpb-ant · 2026-04-18T20:11:26Z

I have read the CLA Document and I hereby sign the CLA

binary-encode.svg gains a "buffa (view)" series (encode_view, all 4 messages). New build-encode.svg shows build_encode vs build_encode_view for LogRecord — the alloc-elimination case (522 -> 3011 MiB/s, 5.77x). generate.py: messages_with_data() so per-chart message lists auto-shrink to those with data (build-encode is LogRecord-only); skip all-None series. Regenerated from native buffa/prost cargo bench + Docker google/Go.

…d owned test, doc notes

- impl_message.rs: extract classify_fields() shared between generate_message_impl and build_view_encode_methods so a new field category only needs adding once. Add doc-notes on MessageSet (verbatim spans need no Item-group rewrap) and MapView/HashMap match-ergonomics.
- benches/protobuf.rs: bench_build_encode! macro + per-type fixtures for all 4 message types (was log_record only). Spectrum: GoogleMessage1 1.34x to LogRecord 5.80x; view never slower.
- buffa-test/src/tests/view.rs: restore owned test_compute_size_matches_encode_len alongside the new _view variant.
- charts/: regenerate (build-encode.svg now shows all 4).

ViewEncode is now generated whenever views are generated (generate_views, on by default). The separate view_encode opt-in flag is removed: the uncalled .text cost measured ~12 B/message in release without LTO (linker dedups identical cached_size bodies; write_to is generic so monomorphizes only at call sites), and the WKT struct-shape break landed in 0.4.0 regardless. generate_views(false) remains the escape hatch for targets that want neither. Removes Config::view_encode field/builder/plugin-param and the 9+1 build.rs callsites. WKT regen byte-identical.

rpb-ant · 2026-04-18T21:45:24Z

Why ViewEncode is always-on (no `view_encode` flag)

An earlier draft of this PR gated ViewEncode behind Config::view_encode(false) (opt-in, mirroring generate_text). Review pushed back on that, and after measuring we dropped the flag. Here's the reasoning.

Options considered:

| Option | Description | Downside |
|---|---|---|
| A | keep `view_encode` flag, flip default to `true` | one more config knob (#9); the WKT struct-shape break lands in 0.4.0 regardless of the flag, so it doesn't actually protect against the break |
| B | drop the flag; tie ViewEncode to `generate_views` | the (narrow) audience that wants zero-copy decode but never encode-from-borrows pays for unused codegen |
| C | keep flag, default `false` (original draft) | common case requires config; WKT break still lands; `generate_text` analogy is weak (that flag exists because it gates a runtime feature dep; ViewEncode has none) |

Why B: the cost B imposes turns out to be negligible. Measured on buffa-test (82 messages, release build, no LTO, ViewEncode generated but never called):

| Configuration | `.text` size |
|---|---|
| `view_encode` off | 2,724,340 B |
| `view_encode` on, uncalled | 2,725,288 B |
| Δ | +948 B (~12 B/message) |

write_to(&mut impl BufMut) is generic — zero .text unless monomorphized at a call site. cached_size impls are 4 bytes each and the linker deduplicates identical bodies (nm shows e.g. Int32ValueView::cached_size and FloatValueView::cached_size at the same address). compute_size is the only per-message unique code.

For a hypothetical 1000-message embedded schema without LTO that's ~12 KB; with LTO (standard for size-constrained builds) it's zero. The escape hatch for anyone who genuinely can't afford it is the existing generate_views(false).

So: one fewer config knob, no meaningful cost, and the __buffa_cached_size struct-shape break is already in 0.4.0 via WKTs anyway.

rpb-ant · 2026-04-19T01:05:18Z

Built a standalone echo bench to test whether view-decode pays off without view-encode: 906-byte string-heavy request (4 strings, 8 tags, 10-entry label map), measuring the full bytes-in → decode → build response → encode → bytes-out pipeline. Baseline uses move semantics (most charitable to owned).

| Path | Latency | RPS/core | vs baseline |
|---|---|---|---|
| owned-decode → owned-encode | 1.433 µs | 698k | 1.00× |
| view-decode → owned-encode | 1.223 µs | 818k | 1.17× |
| owned-decode → view-encode | 1.397 µs | 716k | 1.03× |
| view-decode → view-encode | 655 ns | 1.53M | 2.19× |

View-decode alone: +17%. View-encode alone: +3%. Both: +119% — far more than the sum, because only view→view eliminates the string-alloc pass entirely (the &'a str from the request wire buffer flows straight to the response buffer with zero intermediate String). Either half alone still does one full alloc pass. So ViewEncode isn't a standalone win — it's the piece that unlocks the view-decode investment for echo-shaped handlers.
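The view→view path can be sketched for a one-field message. Everything here is hypothetical (`EchoView`, `decode_view`, `encode_view`, `echo`) and far simpler than the bench: the message has a single string field, and any other tag is rejected. It shows the `&str` borrowed from the request buffer being written straight into the response with no intermediate `String`.

```rust
/// Reads one varint starting at `*pos`, advancing `*pos` past it.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 64 {
            return None; // malformed: too many continuation bytes
        }
        value |= u64::from(byte & 0x7F) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
}

fn put_varint(buf: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        buf.push((v as u8 & 0x7F) | 0x80);
        v >>= 7;
    }
    buf.push(v as u8);
}

/// Hypothetical view of a message with one string field (field 1).
struct EchoView<'a> {
    text: &'a str,
}

/// Zero-copy decode: `text` borrows directly from `wire`.
fn decode_view<'a>(wire: &'a [u8]) -> Option<EchoView<'a>> {
    let mut pos = 0;
    let mut text = "";
    while pos < wire.len() {
        let tag = read_varint(wire, &mut pos)?;
        if tag != 0x0A {
            return None; // sketch: only field 1 (length-delimited) is handled
        }
        let len = read_varint(wire, &mut pos)? as usize;
        let end = pos.checked_add(len)?;
        text = std::str::from_utf8(wire.get(pos..end)?).ok()?;
        pos = end;
    }
    Some(EchoView { text })
}

/// Encode from the borrowed field; empty strings are skipped as in proto3.
fn encode_view(view: &EchoView<'_>, out: &mut Vec<u8>) {
    if !view.text.is_empty() {
        out.push(0x0A);
        put_varint(out, view.text.len() as u64);
        out.extend_from_slice(view.text.as_bytes());
    }
}

/// Echo-shaped handler: request bytes in, response bytes out.
fn echo(request: &[u8]) -> Option<Vec<u8>> {
    let view = decode_view(request)?;
    let mut response = Vec::with_capacity(request.len());
    encode_view(&view, &mut response);
    Some(response)
}
```

The only allocation in `echo` is the response buffer itself; replacing either half with an owned path would reintroduce a per-string copy.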

rpb-ant added 3 commits April 18, 2026 19:16

rpb-ant requested a review from iainmcgin April 18, 2026 19:51

rpb-ant added 2 commits April 18, 2026 19:52

rpb-ant force-pushed the rpb/view-encode-optin branch from fec0c02 to ea68c0d Compare April 18, 2026 19:53

rpb-ant marked this pull request as ready for review April 18, 2026 20:10

github-actions bot added a commit that referenced this pull request Apr 18, 2026

@rpb-ant has signed the CLA in #55

040ec1c

rpb-ant added 3 commits April 18, 2026 20:22

rpb-ant wants to merge 8 commits into main from rpb/view-encode-optin
