feat(protoutils): UnmarshalWithLimit — pre-decode allocation estimate#3615
feat(protoutils): UnmarshalWithLimit — pre-decode allocation estimate#3615wen-coding wants to merge 3 commits into
Conversation
PR SummaryMedium Risk Overview
Extensive unit, gogo, and load/benchmark tests cover amplification rejection, legitimate max blocks, and fast rejection paths. Reviewed by Cursor Bugbot for commit 15f42cb. Bugbot is set up for automated code reviews on this repo. Configure here. |
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3615 +/- ##
==========================================
- Coverage 59.01% 58.12% -0.90%
==========================================
Files 2224 2151 -73
Lines 182814 174443 -8371
==========================================
- Hits 107893 101399 -6494
+ Misses 65220 64029 -1191
+ Partials 9701 9015 -686
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
| } | ||
|
|
||
| const ( | ||
| pointerSize = 8 // pointer on 64-bit |
There was a problem hiding this comment.
reflect.TypeFor[...]().Size() ?
| case protoreflect.Int32Kind, protoreflect.Uint32Kind, | ||
| protoreflect.Sint32Kind, protoreflect.EnumKind: | ||
| return countVarintsInPacked(bz) * 4 | ||
| default: // int64, uint64, sint64 |
There was a problem hiding this comment.
make it explicit case, panic on default
| for len(bz) > 0 { | ||
| _, n := protowire.ConsumeVarint(bz) | ||
| if n < 0 { | ||
| break // malformed; allocEstimate will catch the outer error |
There was a problem hiding this comment.
wdym by "will catch the outer error"?
| case protowire.VarintType: | ||
| _, n := protowire.ConsumeVarint(data) | ||
| if n < 0 { | ||
| return 0, fmt.Errorf("alloc scan: malformed varint field %d: %w", num, protowire.ParseError(n)) |
There was a problem hiding this comment.
nit: this shared error message prefix looks like it can be prepended at the call site.
| // the unknown fields blob, allocating exactly tagLen+n bytes. | ||
| total += tagLen + n | ||
| } else if fd.IsList() { | ||
| total += sliceHeaderSize + 8 // over-counts slice header per element, noise at 1MB scale |
There was a problem hiding this comment.
technically this can cause up to 32x higher estimate than the real thing.
There was a problem hiding this comment.
fixed, scalarElementSize now returns 1 for bool, 4 for 32-bit kinds, 8 for 64-bit kinds. The 32× overcount for bool (1 byte wire, was returning 32) is gone.
|
Note that this impl is meaningfully slower than the current wireguard approach because of the use of descriptors (note that proto.Unmarshal does not use protoreflect on the hot path). Probably still acceptable overhead though, just saying. |
| for len(bz) > 0 { | ||
| _, n := protowire.ConsumeVarint(bz) | ||
| if n < 0 { | ||
| break // partial count on malformed input; proto.Unmarshal will reject it |
There was a problem hiding this comment.
should we return an error here, rather than a partial count?
There was a problem hiding this comment.
done, returning an error now
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.
There are 3 total unresolved issues (including 1 from previous review).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 4aec24b. Configure here.
Adds allocation-bounded protobuf unmarshaling. Before calling Unmarshal, walks raw wire bytes with allocEstimate to bound the heap allocation the decode would make. Returns an error if the estimate exceeds limitBytes, without allocating anything. Public API: UnmarshalWithLimit[T](bz []byte, limitBytes int) — google v2 proto UnmarshalGogoWithLimit(bz []byte, msg, limitBytes int) — gogoproto allocEstimate accounts for: - sizeof(GoStruct) per message occurrence (protoregistry / gogoproto) - backing arrays for bytes/string fields - packed varint amplification (up to 8× wire-to-Go size difference) - map field runtime overhead (hmap + bucket slots) - unknown fields and wire-type mismatches (full wire record: tagLen+n) - all wire occurrences of singular fields (gogoproto merge behavior) - truncated or corrupt input returns an error immediately Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
60a08ff to
a5c8a03
Compare
A zero or negative limit is a programming error. Panic with a descriptive message rather than silently rejecting all messages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Map fields satisfy IsList() (maps encode as repeated entries), so the previous ordering added a spurious sliceHeaderSize per map occurrence. A Go map is not a slice — check IsMap first and skip the slice header. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Summary
Adds allocation-bounded protobuf unmarshaling to bound heap allocation during decode.
UnmarshalWithLimit[T](bz []byte, limitBytes int)— google v2 protoUnmarshalGogoWithLimit(bz []byte, msg gogoproto.Message, limitBytes int)— gogoproto (Tendermint P2P types)Both functions run
Scan/ScanAny(wireguard schema checks) first, then walk the raw wire bytes withallocEstimatebefore callingUnmarshal. If the estimated heap allocation exceedslimitBytes, the message is rejected without allocating anything.How
allocEstimateworksThe function walks raw wire bytes using
protoreflect.MessageDescriptorfor the field schema andprotoregistry/gogoproto.MessageTypefor Go struct sizes. It accumulates:sizeof(GoStruct)from type registrypointerSizeper element (backing array pointer)bytesfieldsliceHeaderSize + len(val)stringfieldstringHeaderSize + len(val)mapEntryOverheadper entry (runtime map internals) + key/value contenttagLen + ntagLen + n) — no panicThe estimate is conservative (may over-count). A 10× estimation error is acceptable — the goal is to distinguish legitimate messages from wire payloads whose decoded size far exceeds their wire size.
Gogoproto bridge
Tendermint P2P types (
tmproto.*) are generated byprotoc-gen-gogofasterand do not implementgoogle.golang.org/protobuf/proto.Message.UnmarshalGogoWithLimitusesgithub.com/golang/protobuf/proto.MessageReflect(deprecated but the only bridge) to obtain theprotoreflect.MessageDescriptor, and falls back togogoproto.MessageTypefor struct size lookups.Singular field merge behavior
gogoproto merges repeated wire occurrences of a singular field (e.g.
Proposal.last_commit) by appending sub-fields. Per-occurrence wireguard checks pass individually, butallocEstimaterecurses into every wire occurrence and accumulates the totals, so the full merged allocation is estimated correctly.Map fields
Map fields are counted at
mapEntryOverheadper entry (64 bytes) for runtime overhead plus key/value content from the recursive walk. This covers thehmapstruct and bucket array slots conservatively. Currently no wireguard-protected message has map fields; support is included for future messages.Test plan
SmallMessageAccepted— small repeated-message field passesManyEmptyEntriesRejected— 10k empty repeated entries caught before UnmarshalLargePayloadRejected— large bytes field caughtUnknownBytesFieldCounted— unknown bytes field counted toward limitSmallUnknownFieldsAccepted— small unknown scalars pass cleanlyResultIsCorrect— decoded result is correct when within limitPackedVarintAmplificationCounted— 200k packed uint64 (1 byte/wire, 8 bytes/Go) rejectedWireTypeMismatchNoPanic— mismatched wire type: no panic, counted as unknownTruncatedInputReturnsError— truncated mid-field returns error, not panicZeroLimitPanics—limitBytes <= 0panics (programming error)LargeMapRejected— 50k map entries exceed 1MB limitSmallMapAccepted— 2-entry map passesGogoManyEmptySignaturesRejected— gogoproto Commit with 10k signatures caughtGogoSmallCommitAccepted— legitimate gogoproto Commit passesGogoSingularFieldMerge— 500 × last_commit each with 50 signatures exceeds 1MBLoad tests (
alloc_scan_load_test.go):TestUnmarshalWithLimit_MaxBlockAccepted— max-sized autobahn Block (2000 txs × 1024 bytes) accepted at 4MB limitBenchmarkUnmarshalWithLimit_MaxBlock— ~242µs for 2MB block, 8.5 GB/sBenchmarkUnmarshalWithLimit_AmplifiedPayload— ~316ns for 20KB payload rejected before Unmarshal