fix(cpp): row encoder truncates container size, causing out-of-bounds heap writes #3758

@ayush00git

Description

The C++ row encoder (cpp/fory/encoder) passes container element counts into the row writers without checking that they fit the integer types the writers use. For containers large enough to overflow those types, encoding produces out-of-bounds heap writes. These are encode-side bugs, reachable whenever an application encodes very large (or attacker-influenced) containers.

1. Container size truncated to uint32_t (heap overflow)

ArrayWriter::reset takes a uint32_t element count, but the encoder passes the container's size(), which is a size_t. On a 64-bit platform, the count is silently truncated for any container with 2^32 or more elements. Such containers are reachable in practice: a std::vector<bool> of 2^32 elements occupies only about 512 MB of RAM.

After truncation, the writer allocates space for the truncated (small) count, but the encoder still loops over all the real elements and writes each one into the buffer without a bounds check. The result is a heap write far past the allocation.

The existing size check inside ArrayWriter::reset runs on the already-truncated value, so it does not catch this.
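The narrowing described above can be illustrated in isolation. This is a minimal sketch, not the actual Fory code: `count_seen_by_writer` and `narrowed_count` are hypothetical names standing in for a writer entry point with a 32-bit count parameter and for the call site that passes a 64-bit size into it.

```cpp
#include <cstdint>

// Hypothetical stand-in for a writer entry point that takes a 32-bit count
// (an assumption for illustration, not the real ArrayWriter::reset signature).
inline std::uint32_t count_seen_by_writer(std::uint32_t num_elements) {
  return num_elements;
}

// Mimics the call site: a 64-bit size is narrowed to 32 bits on the way in.
// The cast is written out here; in real code the conversion is implicit.
inline std::uint32_t narrowed_count(std::uint64_t real_count) {
  return count_seen_by_writer(static_cast<std::uint32_t>(real_count));
}
```

With this sketch, a real count of 2^32 reaches the writer as 0 and a real count of 2^32 + 5 reaches it as 5, so any check performed inside the writer only ever sees the small value.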

2. Signed overflow of the element loop counter

The encoder iterates over container elements with a signed int counter. For containers with more than 2^31 elements, incrementing that counter is signed-overflow undefined behavior, and a wrapped negative index makes the writer compute an offset before the start of the buffer. This is the same class of memory-safety bug as (1), but it triggers at half the element count.
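To show why a negative index is dangerous, here is a small sketch of an offset computation. The layout (a fixed header followed by fixed-width elements) and the name `element_offset` are assumptions for illustration only and may not match the real writer.

```cpp
#include <cstdint>

// Hypothetical offset computation: header bytes plus index * element width.
// A negative index yields an offset below the header, and for large
// magnitudes an offset before the start of the buffer.
inline std::int64_t element_offset(std::int64_t header_bytes,
                                   std::int32_t index,
                                   std::int64_t elem_bytes) {
  return header_bytes + static_cast<std::int64_t>(index) * elem_bytes;
}
```

For example, with an 8-byte header and 8-byte elements, an index of -1 maps to offset 0 (inside the header), and any more negative index maps to a negative offset.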

Suggested fix

Validate the element count once at the encoder entry points, before any writes happen, so an oversized container fails cleanly instead of silently truncating. Use an unsigned, size_t-wide counter for the element loops (or bound the count up front).

Environment

  • Module: cpp/fory/encoder (row encoder) driving cpp/fory/row writers.
