anthropics/buffa


buffa

A pure-Rust Protocol Buffers implementation with first-class protobuf editions support. Written by Claude ❣️

Why buffa?

The Rust ecosystem lacks an actively maintained, pure-Rust library that supports protobuf editions. Buffa fills that gap with a ground-up design that treats editions as the core abstraction. It passes the full protobuf conformance suite — binary, JSON, and text — with zero expected failures.

Features

  • Editions-first. Proto2 and proto3 are understood as feature presets within the editions model. One code path, parameterized by resolved features.

  • Two-tier owned/borrowed types. Each message generates both MyMessage (owned, heap-allocated) and MyMessageView<'a> (zero-copy from the wire). OwnedView<V> wraps a view with its backing Bytes buffer for use across async boundaries.

  • MessageField<T>. Optional message fields deref to a default instance when unset -- no Option<Box<T>> unwrapping ceremony.

  • EnumValue<T>. Type-safe open enums with proper Rust enum types and preservation of unknown values, instead of raw i32.

  • Linear-time serialization. Cached encoded sizes prevent the exponential blowup that affects libraries without a size-caching pass.

  • Unknown field preservation. Round-trip fidelity for proxy and middleware use cases.

  • no_std + alloc. The core runtime works without std, including JSON serialization via serde. Enabling std adds std::io integration, std::time conversions, and thread-local JSON parse options.
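The MessageField<T> behaviour described in the list above (an unset field dereferences to a default instance) can be pictured with a small standalone sketch. This is not buffa's implementation: Address, DefaultInstance, and OptionalMessage are hypothetical names invented for the illustration, and std's OnceLock is used only to keep the example short.

```rust
use std::ops::Deref;
use std::sync::OnceLock;

/// Hypothetical message type, used only for this illustration.
#[derive(Debug, Default, PartialEq)]
pub struct Address {
    pub city: String,
    pub zip: u32,
}

/// Types that can hand out a shared, immutable default instance.
pub trait DefaultInstance: Sized + 'static {
    fn default_instance() -> &'static Self;
}

impl DefaultInstance for Address {
    fn default_instance() -> &'static Self {
        // Built lazily once, then shared by every unset field.
        static INSTANCE: OnceLock<Address> = OnceLock::new();
        INSTANCE.get_or_init(Address::default)
    }
}

/// Sketch of an optional message field: boxed when set, empty otherwise.
#[derive(Debug)]
pub struct OptionalMessage<T: DefaultInstance>(Option<Box<T>>);

impl<T: DefaultInstance> OptionalMessage<T> {
    pub fn unset() -> Self {
        Self(None)
    }

    pub fn set(value: T) -> Self {
        Self(Some(Box::new(value)))
    }

    pub fn is_set(&self) -> bool {
        self.0.is_some()
    }
}

impl<T: DefaultInstance> Deref for OptionalMessage<T> {
    type Target = T;

    fn deref(&self) -> &T {
        match &self.0 {
            // Set: borrow the boxed value.
            Some(boxed) => &**boxed,
            // Unset: fall back to the shared default, no allocation.
            None => T::default_instance(),
        }
    }
}
```

With this shape, reading a nested field such as msg.address.city needs no unwrapping at the call site, and is_set() remains available when presence matters.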

Wire formats

buffa supports binary, JSON, and text protobuf encodings:

  • Binary wire format -- full support for all scalar types, nested messages, repeated/packed fields, maps, oneofs, groups, and unknown fields.

  • Proto3 JSON -- canonical protobuf JSON mapping via optional serde integration. Includes well-known type serialization (Timestamp as RFC 3339, Duration as "1.5s", int64/uint64 as quoted strings, bytes as base64, etc.).

  • Text format (textproto) -- the human-readable debug format. Covers Any expansion ([type.googleapis.com/...] { ... }), extension bracket syntax ([pkg.ext] { ... }), and group/DELIMITED fields. no_std-compatible.
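For readers new to the binary wire format, the sketch below shows its two basic building blocks as defined by the protobuf encoding specification: base-128 varints and field tags. It is a standalone illustration, not buffa's codec; the function names are made up for this example.

```rust
/// Wire types defined by the protobuf encoding specification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WireType {
    Varint = 0,
    I64 = 1,
    Len = 2,
    I32 = 5,
}

/// Appends `value` as a base-128 varint: 7 payload bits per byte,
/// least-significant group first, high bit set on all but the last byte.
pub fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push(((value as u8) & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Reads a varint from the start of `buf`.
/// Returns the value and the number of bytes consumed, or None if the
/// input ends early or the varint is longer than 10 bytes.
pub fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if (byte & 0x80) == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// A tag is the varint of (field_number << 3) | wire_type.
pub fn encode_tag(field_number: u32, wire_type: WireType, out: &mut Vec<u8>) {
    encode_varint((u64::from(field_number) << 3) | wire_type as u64, out);
}

/// Length-delimited string field: tag, byte length, then the UTF-8 bytes.
pub fn encode_string_field(field_number: u32, value: &str, out: &mut Vec<u8>) {
    encode_tag(field_number, WireType::Len, out);
    encode_varint(value.len() as u64, out);
    out.extend_from_slice(value.as_bytes());
}
```

Encoding field 1 with the varint value 150 yields the bytes 08 96 01, which matches the worked example in the protobuf encoding documentation.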

Unsupported features

These features are not supported today, either by design or because they are deferred to a future release:

  • Runtime reflection (DynamicMessage, descriptor-driven introspection) — planned for a future release. The descriptor types are now available in buffa-descriptor as a first step. Buffa remains a codegen-first library; if you need schema-agnostic processing today, consider preserving unknown fields or using Any.
  • Proto2 optional-field getter methods -- [default = X] on optional fields does not generate fn field_name(&self) -> T unwrap-to-default accessors. Custom defaults are applied only to required fields via impl Default. Optional fields are Option<T>; use pattern matching or .unwrap_or(X).
  • Scoped JsonParseOptions in no_std — serde's Deserialize trait has no context parameter, so runtime options must be passed through ambient state. In std builds, with_json_parse_options provides per-closure, per-thread scoping via a thread-local. In no_std builds, set_global_json_parse_options provides process-wide set-once configuration via a global atomic. The two APIs are mutually exclusive. The no_std global supports singular-enum accept-with-default but not repeated/map container filtering (which requires scoped strict-mode override).
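The "ambient state" idea behind the scoped JSON parse options can be sketched with a thread-local and a closure. This is only an illustration of the general pattern under stated assumptions: ParseOptions, its field, with_options, and current_options are hypothetical names and do not reflect buffa's actual types or signatures.

```rust
use std::cell::Cell;

/// Hypothetical options struct; the single field is illustrative only.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct ParseOptions {
    pub ignore_unknown_enum_values: bool,
}

thread_local! {
    // One slot per thread, so concurrent parsers do not see each other's options.
    static CURRENT: Cell<ParseOptions> = Cell::new(ParseOptions {
        ignore_unknown_enum_values: false,
    });
}

/// Runs `f` with `opts` installed for the current thread and restores the
/// previous value afterwards, including when `f` panics.
pub fn with_options<R>(opts: ParseOptions, f: impl FnOnce() -> R) -> R {
    struct Restore(ParseOptions);
    impl Drop for Restore {
        fn drop(&mut self) {
            CURRENT.with(|c| c.set(self.0));
        }
    }
    // Swap in the new options and remember the old ones in the guard.
    let _guard = Restore(CURRENT.with(|c| c.replace(opts)));
    f()
}

/// What a context-free Deserialize impl could call to read the ambient options.
pub fn current_options() -> ParseOptions {
    CURRENT.with(|c| c.get())
}
```

Because the options live in a thread-local, a deserializer that receives no context argument can still read them, and the scope ends when the closure returns. A no_std build has no thread-locals, which is consistent with the set-once global described above.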

Known limitations

These are gaps we intend to address in future releases:

  • Closed-enum unknown values in packed-repeated view decode are silently dropped (not routed to unknown fields). The owned decoder handles this correctly; the view decoder handles singular, optional, oneof, and unpacked repeated correctly. Packed blobs have no per-element tag to borrow, so the zero-copy UnknownFieldsView<'a> has no span to reference.
  • Closed-enum unknown values in map values are silently dropped (not routed to unknown fields). The proto spec requires the entire map entry (key + value) to go to unknown fields, which requires re-encoding. This affects proto2 schemas with map<K, ClosedEnum> where an evolved sender adds new enum values.
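To make the packed-repeated case concrete, here is a minimal sketch of what routing unknown closed-enum values to unknown fields looks like when the decoder owns its output. It is not buffa's decoder: Color, Decoded, and decode_packed_colors are hypothetical, and unknown fields are modelled as plain (field number, value) pairs.

```rust
/// Hypothetical closed enum as known to an older schema revision.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Color {
    Red = 0,
    Green = 1,
}

impl Color {
    fn from_i32(value: i32) -> Option<Self> {
        match value {
            0 => Some(Color::Red),
            1 => Some(Color::Green),
            _ => None,
        }
    }
}

#[derive(Debug, Default, PartialEq)]
pub struct Decoded {
    pub colors: Vec<Color>,
    /// (field number, raw value) pairs kept so they can be re-encoded later.
    pub unknown_varints: Vec<(u32, u64)>,
}

/// Reads one varint at `*pos` and advances the cursor.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    for shift in (0..64).step_by(7) {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if (byte & 0x80) == 0 {
            return Some(value);
        }
    }
    None
}

/// Decodes the payload of a packed repeated enum field. Values the enum
/// does not know are recorded as unknown varints instead of being dropped.
pub fn decode_packed_colors(field_number: u32, payload: &[u8]) -> Option<Decoded> {
    let mut out = Decoded::default();
    let mut pos = 0;
    while pos < payload.len() {
        let raw = read_varint(payload, &mut pos)?;
        match Color::from_i32(raw as i32) {
            Some(color) => out.colors.push(color),
            // An owned decoder can synthesize a (tag, value) record here.
            None => out.unknown_varints.push((field_number, raw)),
        }
    }
    Some(out)
}
```

The None arm is the step a borrowed view cannot mirror directly: the packed payload contains only values, so there is no tagged byte span for a zero-copy unknown-fields view to point at.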

Semver and API stability

Buffa is pre-1.0. We follow the Rust community convention for 0.x crates: breaking changes increment the minor version (0.1.x → 0.2.0), additive changes increment the patch version (0.1.0 → 0.1.1). Pin to a minor version (buffa = "0.5") to avoid surprises.

The generated code API (struct shapes, Message trait, MessageView trait, EnumValue, MessageField) is considered the primary stability surface. Internal helper modules marked #[doc(hidden)] (__private, __buffa_* fields) may change at any time.

Quick start

Using buf generate (recommended)

Install buf and the protoc plugins, then create a buf.gen.yaml:

version: v2
plugins:
  - local: protoc-gen-buffa
    out: src/gen
  - local: protoc-gen-buffa-packaging
    out: src/gen
    strategy: all

buf generate

Using buffa-build in build.rs

Alternatively, use buffa-build for a build.rs-based workflow (requires protoc on PATH):

// build.rs
fn main() {
    buffa_build::Config::new()
        .files(&["proto/my_service.proto"])
        .includes(&["proto/"])
        .compile()
        .unwrap();
}

Encoding and decoding

use buffa::Message;

// Encode
let msg = MyMessage { id: 42, name: "hello".into(), ..Default::default() };
let bytes = msg.encode_to_vec();

// Decode (owned)
let decoded = MyMessage::decode_from_slice(&bytes).unwrap();

// Decode (zero-copy view)
let view = MyMessageView::decode_view(&bytes).unwrap();
println!("name: {}", view.name); // &str, no allocation

// Decode (owned view — zero-copy + 'static, for async/RPC use)
let owned_view = OwnedView::<MyMessageView>::decode(bytes.into()).unwrap();
println!("name: {}", owned_view.name); // still zero-copy, but 'static + Send
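The difference between the owned and view decode paths above can be illustrated without buffa at all. The sketch below hand-parses a single string field; Greeting, GreetingView, and name_bytes are hypothetical, and the parser assumes field 1 with a one-byte length, which real generated code does not.

```rust
/// Owned form: the string is copied onto the heap.
#[derive(Debug, PartialEq)]
pub struct Greeting {
    pub name: String,
}

/// View form: `name` borrows directly from the input buffer.
#[derive(Debug, PartialEq)]
pub struct GreetingView<'a> {
    pub name: &'a str,
}

/// Returns the payload of a length-delimited field 1 (tag byte 0x0A).
/// Simplification: only single-byte lengths (< 128) are handled.
fn name_bytes(buf: &[u8]) -> Option<&[u8]> {
    let (&tag, rest) = buf.split_first()?;
    let (&len, rest) = rest.split_first()?;
    if tag != 0x0A || len >= 0x80 {
        return None;
    }
    rest.get(..usize::from(len))
}

impl Greeting {
    pub fn decode(buf: &[u8]) -> Option<Self> {
        // to_owned() allocates and copies the bytes.
        let name = std::str::from_utf8(name_bytes(buf)?).ok()?.to_owned();
        Some(Self { name })
    }
}

impl<'a> GreetingView<'a> {
    pub fn decode_view(buf: &'a [u8]) -> Option<Self> {
        // No copy: the &str points into `buf` and cannot outlive it.
        let name = std::str::from_utf8(name_bytes(buf)?).ok()?;
        Some(Self { name })
    }
}
```

The lifetime on GreetingView is what ties the view to its buffer. Pairing a view with an owned, reference-counted buffer, as OwnedView is described as doing, is one way to lift that restriction for async code.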

JSON serialization (with json feature)

let json = serde_json::to_string(&msg).unwrap();
let decoded: MyMessage = serde_json::from_str(&json).unwrap();

Documentation

  • User Guide — comprehensive guide to buffa's API, generated code shape, encoding/decoding, views, JSON, well-known types, and editions support.
  • Migrating from prost — step-by-step migration guide with before/after code examples.
  • Migrating from protobuf — migration guide covering both stepancheg v3 and Google official v4.

Workspace layout

| Crate | Purpose |
|---|---|
| buffa | Core runtime: Message trait, wire format codec, no_std support |
| buffa-types | Well-known types: Timestamp, Duration, Any, Struct, wrappers, etc. |
| buffa-descriptor | Protobuf descriptor types (FileDescriptorProto, DescriptorProto, ...) |
| buffa-codegen | Code generation from protobuf descriptors |
| buffa-build | build.rs helper for invoking codegen via protoc |
| protoc-gen-buffa | protoc plugin binary |

Performance

Throughput comparison across five representative message types, measured on an Intel Xeon Platinum 8488C (x86_64) at buffa v0.5.0. Cross-implementation benchmarks run in Docker for toolchain consistency (task bench-cross). Higher is better.

Binary decode

[Charts: binary decode throughput for ApiResponse, LogRecord, AnalyticsEvent, GoogleMessage1, and MediaFrame]

Raw data (MiB/s)

| Message | buffa | buffa (view) | prost | prost (bytes) | protobuf-v4 | Go |
|---|---|---|---|---|---|---|
| ApiResponse | 825 | 1,399 (+70%) | 756 (−8%) | 677 (−18%) | 689 (−16%) | 272 (−67%) |
| LogRecord | 741 | 1,869 (+152%) | 735 (−1%) | 682 (−8%) | 867 (+17%) | 251 (−66%) |
| AnalyticsEvent | 192 | 317 (+65%) | 254 (+32%) | 197 (+3%) | 359 (+87%) | 91 (−53%) |
| GoogleMessage1 | 905 | 1,201 (+33%) | 989 (+9%) | 930 (+3%) | 643 (−29%) | 348 (−62%) |
| MediaFrame | 17,682 | 71,426 (+304%) | 9,612 (−46%) | 23,577 (+33%) | 17,894 (+1%) | 1,250 (−93%) |
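The parenthesized percentages in these tables appear to express each throughput relative to the buffa column of the same row. That reading is inferred from the numbers rather than stated in the text, and the helper below (a hypothetical name) simply reproduces the arithmetic.

```rust
/// Percentage difference of `other` relative to `baseline`,
/// rounded to the nearest whole percent.
pub fn relative_delta_percent(baseline: f64, other: f64) -> i64 {
    ((other / baseline - 1.0) * 100.0).round() as i64
}
```

For the ApiResponse decode row, relative_delta_percent(825.0, 1399.0) gives 70 and relative_delta_percent(825.0, 272.0) gives -67, matching the +70% and −67% shown for buffa (view) and Go.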

Binary encode

[Charts: binary encode throughput for ApiResponse, LogRecord, AnalyticsEvent, GoogleMessage1, and MediaFrame]

Raw data (MiB/s)

| Message | buffa | buffa (view) | prost | protobuf-v4 | Go |
|---|---|---|---|---|---|
| ApiResponse | 2,566 | 2,537 (−1%) | 1,801 (−30%) | 1,033 (−60%) | 561 (−78%) |
| LogRecord | 4,029 | 4,703 (+17%) | 3,116 (−23%) | 1,651 (−59%) | 305 (−92%) |
| AnalyticsEvent | 582 | 623 (+7%) | 359 (−38%) | 509 (−13%) | 161 (−72%) |
| GoogleMessage1 | 2,441 | 2,725 (+12%) | 1,817 (−26%) | 865 (−65%) | 362 (−85%) |
| MediaFrame | 43,830 | 45,425 (+4%) | 38,652 (−12%) | 10,616 (−76%) | 1,673 (−96%) |

Build + binary encode

The build + encode measure starts from raw field values rather than a pre-built message struct, so it counts struct construction. The buffa (view) path constructs a borrowed view directly over the input slices and never allocates an owned message at all, which is why it is consistently faster than building owned structs and then encoding them.

[Charts: build + binary encode throughput for ApiResponse, LogRecord, AnalyticsEvent, GoogleMessage1, and MediaFrame]

Raw data (MiB/s)

| Message | buffa | buffa (view) |
|---|---|---|
| ApiResponse | 732 | 1,649 (+125%) |
| LogRecord | 498 | 2,843 (+471%) |
| AnalyticsEvent | 520 | 1,166 (+124%) |
| GoogleMessage1 | 818 | 1,169 (+43%) |
| MediaFrame | 20,893 | 52,910 (+153%) |

JSON encode

[Charts: JSON encode throughput for ApiResponse, LogRecord, AnalyticsEvent, GoogleMessage1, and MediaFrame]

Raw data (MiB/s)

| Message | buffa | prost | Go |
|---|---|---|---|
| ApiResponse | 872 | 942 (+8%) | 115 (−87%) |
| LogRecord | 1,332 | 1,401 (+5%) | 139 (−90%) |
| AnalyticsEvent | 766 | 849 (+11%) | 52 (−93%) |
| GoogleMessage1 | 968 | 1,033 (+7%) | 125 (−87%) |
| MediaFrame | 1,460 | 1,445 (−1%) | 209 (−86%) |

JSON decode

[Charts: JSON decode throughput for ApiResponse, LogRecord, AnalyticsEvent, GoogleMessage1, and MediaFrame]

Raw data (MiB/s)

| Message | buffa | prost | Go |
|---|---|---|---|
| ApiResponse | 680 | 299 (−56%) | 68 (−90%) |
| LogRecord | 795 | 701 (−12%) | 108 (−86%) |
| AnalyticsEvent | 268 | 239 (−11%) | 45 (−83%) |
| GoogleMessage1 | 649 | 253 (−61%) | 71 (−89%) |
| MediaFrame | 1,910 | 1,958 (+3%) | 264 (−86%) |

Message types: ApiResponse (~200 B, flat scalars), LogRecord (~1 KB, strings + map + nested message), AnalyticsEvent (~10 KB, deeply nested + repeated sub-messages), GoogleMessage1 (standard protobuf benchmark message), MediaFrame (~10 KB, dominated by bytes fields — primary body + chunked sub-blobs + named attachments).

Libraries: prost 0.13 + pbjson 0.7, protobuf-v4 (Google Rust/upb, v4.33.1), Go google.golang.org/protobuf v1.36.6. JSON results for protobuf-v4 are not included because it does not provide a JSON codec.

prost (bytes) uses prost-build's .bytes(["."]) config so every proto bytes field is generated as bytes::Bytes instead of Vec<u8>, and decodes from a bytes::Bytes input to exercise Bytes' zero-copy copy_to_bytes slicing. The substitution only affects the decode path, so only decode numbers are reported — prost (bytes) encode tracks default prost by construction. On the four non-bytes messages, prost (bytes) tracks default prost within noise (and is slightly slower on ApiResponse where the per-message Bytes::clone refcount overhead isn't offset by any actual zero-copy). On MediaFrame it runs ~2.4× faster than default prost at decode, confirming that prost's feature does land when it has bytes fields to work with. buffa views are in a different regime again: they borrow directly from the input buffer for strings, bytes, and nested message bodies, so buffa (view) on MediaFrame is ~3× the prost (bytes) number and ~4× buffa's own owned decode. Views also benefit on the four non-bytes messages, where prost's bytes feature is inert.

Owned decode trade-offs: buffa's owned decode is typically within ±10% of prost, trading a small throughput cost for features prost omits: unknown-field preservation by default, typed EnumValue<E> wrappers (not raw i32), and a type-stable decode loop that supports recursive message types without manual boxing. The zero-copy view path (MyMessageView::decode_view) sidesteps allocation entirely and is the recommended fast decode path. protobuf-v4's decode advantage on deeply-nested messages comes from upb's arena allocator — all sub-messages are bump-allocated in one arena rather than individually boxed.
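As a rough picture of what a typed open-enum wrapper does on decode, compared with storing a raw i32, consider the sketch below. It is an assumption-laden illustration rather than buffa's EnumValue<E>: Status and OpenEnum are hypothetical, and the real type's variants and methods may differ.

```rust
use std::convert::TryFrom;

/// Hypothetical generated enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Unspecified = 0,
    Active = 1,
    Suspended = 2,
}

impl TryFrom<i32> for Status {
    type Error = i32;

    fn try_from(raw: i32) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(Status::Unspecified),
            1 => Ok(Status::Active),
            2 => Ok(Status::Suspended),
            other => Err(other),
        }
    }
}

impl From<Status> for i32 {
    fn from(status: Status) -> i32 {
        status as i32
    }
}

/// Sketch of an open-enum wrapper: either a known variant or the raw number.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OpenEnum<E> {
    Known(E),
    Unknown(i32),
}

impl<E: TryFrom<i32>> OpenEnum<E> {
    /// Classifies a decoded wire value without losing unrecognized numbers.
    pub fn from_wire(raw: i32) -> Self {
        match E::try_from(raw) {
            Ok(known) => OpenEnum::Known(known),
            Err(_) => OpenEnum::Unknown(raw),
        }
    }
}

impl<E: Into<i32>> OpenEnum<E> {
    /// Returns the number to write back, so unknown values round-trip.
    pub fn to_wire(self) -> i32 {
        match self {
            OpenEnum::Known(known) => known.into(),
            OpenEnum::Unknown(raw) => raw,
        }
    }
}
```

The extra work per decoded enum is the try_from match, which is the kind of small cost the paragraph above trades for being able to match on real Rust variants while still preserving values added by a newer schema.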

Conformance

buffa passes the protobuf conformance test suite for binary, JSON, and text format (v33.5, editions up to 2024). Both std and no_std builds pass the full suite including JSON. Run with task conformance.

Compiler compatibility

buf is the recommended way to compile .proto files. The buf CLI has its own built-in compiler, so no separate protoc install is needed — just install buf and protoc-gen-buffa.

protoc is also fully supported. protoc-gen-buffa and buffa-build work with protoc v21.12 and later. The minimum version varies by feature:

| Feature | Minimum protoc |
|---|---|
| Proto2 + proto3 | v21.12 |
| Editions 2023 | v27.0 |
| Editions 2024 | v33.0 |

Note that Linux distro packages (Debian Bookworm, Ubuntu 24.04) ship protoc v21.12, which does not support editions. Install protoc v27+ from GitHub releases or use buf if you need editions support.
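The version table above can be encoded as a small lookup, for example in a build script that wants to fail early with a clear message. The sketch below is hypothetical helper code, not part of buffa-build; it takes the installed version as a (major, minor) pair and leaves obtaining that pair to the caller.

```rust
/// Schema flavours listed in the compatibility table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Syntax {
    Proto2OrProto3,
    Edition2023,
    Edition2024,
}

/// Minimum protoc (major, minor) for each flavour, copied from the table.
pub fn min_protoc(syntax: Syntax) -> (u32, u32) {
    match syntax {
        Syntax::Proto2OrProto3 => (21, 12),
        Syntax::Edition2023 => (27, 0),
        Syntax::Edition2024 => (33, 0),
    }
}

/// True if the installed protoc is new enough for `syntax`.
/// Tuples compare lexicographically, so (27, 3) >= (27, 0) holds.
pub fn supports(installed: (u32, u32), syntax: Syntax) -> bool {
    installed >= min_protoc(syntax)
}
```

With the distro-packaged v21.12, supports((21, 12), Syntax::Edition2023) is false, which is the situation the note above warns about.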

Compatibility is tested against protoc v21.12, v22.5, v25.5, v27.3, v29.5, and v33.5 (task protoc-compat).

Minimum supported Rust version

1.85

License

Apache-2.0
