feat(data-pipeline): OTLP HTTP/protobuf trace export#2115
Conversation
Records the approved design: vendor OTLP trace + collector protos and generate prost types (zero new runtime deps), keep the hand-rolled serde JSON path, share one mapper with a serde->prost converter, and select the protocol via builder + C FFI. Includes the dd-trace-py companion wiring and the layered E2E plan (local receiver, system-tests, sdk-backend-verify). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Bite-sized, TDD-structured plan across 9 phases: vendor+generate prost types, serde->prost converter, encoder dispatch, protocol config through builder + C FFI, full validation gauntlet + libdatadog PR, then dd-trace-py PyO3/writer wiring and three E2E tiers (local receiver, system-tests, sdk-backend-verify) + dd-trace-py PR. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Vendors opentelemetry/proto/trace/v1/trace.proto and opentelemetry/proto/collector/trace/v1/trace_service.proto from open-telemetry/opentelemetry-proto commit 1e725b853bc8f6b46ee62e8232e4c83017b9536f (matching the already-vendored common.proto and resource.proto). Adds both protos to the prost_build compile list in build.rs, generates the committed Rust types (opentelemetry.proto.trace.v1.rs and opentelemetry.proto.collector.trace.v1.rs), and updates _includes.rs. Also qualifies "Span" → "pb.Span" / "pb.idx.Span" in build.rs type_attribute calls to prevent serde derives from leaking into the new opentelemetry::proto::trace::v1::Span type.
…ests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-type match Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…o avoid doctest failures
… Grpc content-type arm
🎉 All green!🧪 All tests passed 🎯 Code Coverage (details) 🔗 Commit SHA: 80aad24 | Docs | Datadog PR Page | Give us feedback! |
📚 Documentation Check Results📦
|
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
🔒 Cargo Deny Results📦
|
ee71538 to
664f16f
Compare
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
The OTLP design spec and implementation plan are linked from the PR description (internal chonk) rather than committed to the repo. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… encoding, errors on grpc Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The build-time gRPC check fired unconditionally, so a `grpc` protocol (e.g. resolved from the OTel-default OTEL_EXPORTER_OTLP_PROTOCOL) failed the build of a normal Datadog-agent exporter even with no OTLP endpoint set — violating the "protocol is inert without an endpoint" contract. Move the rejection inside the OTLP-endpoint branch so it only fires when OTLP export is actually enabled. The send-time arm remains a guard. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…per-span alloc Pre-push review cleanups: keep the internal OtlpWireProtocol and the json_serializer module crate-private (not public API surface), and use eq_ignore_ascii_case in the span-kind mappers to avoid a per-span to_lowercase() allocation on the encode hot path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Criterion benches for the OTLP encoder hot paths — native spans -> prost IR (the mapper), and prost IR -> HTTP/protobuf and HTTP/JSON wire — plus end-to-end map+encode, over ~1000-span payloads (one large trace and many small traces). Inputs are decoded from msgpack into borrowed SpanSlices, matching the production exporter path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… hex Pre-size the per-export span Vec and per-span attribute Vec in the prost mapper to avoid reallocations as they fill. This also makes the resulting prost IR more compact, speeding up the downstream protobuf encode (a sequential read of the IR). Serialize OTLP/JSON trace/span ids from a stack buffer (hex::encode_to_slice) instead of allocating a String per id. Benches (~1000 spans): map -21%, protobuf encode -19%, JSON encode -9%, end-to-end -10..12%. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ck buffer OTLP encodes 64-bit integers (nanosecond timestamps and intValue attributes) as JSON strings to avoid precision loss; these were each allocating a String per span via to_string(). Format them into a stack buffer instead (NumStr, mirroring the HexId id writer). Removes ~3 timestamp + N int-attribute String allocations per span on the JSON path. Bench: encode_json a further -4.5% (now -12.5% vs the original baseline). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s text Addresses the build.rs comment-strip review note. `[lib] doctest = false` does not suppress doctests under `cargo test --doc` (a required gate), so instead fence the one offending example block (Span.attributes in trace.proto) as a ```text block — rustdoc renders it as text, not a Rust doctest — and drop `disable_comments` so the OTLP proto docs are generated onto the prost structs again. `cargo test --doc` stays green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
test_otlp_grpc_without_endpoint_still_builds builds a real TraceExporter, which spawns the agent_info worker and makes syscalls miri can't execute. Add #[cfg_attr(miri, ignore)] to match the file's other live-build tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
[EDIT: you can ignore, see my later comment below].
Following @ekump 's comment, I also wonder why all the helpers in the mapper take spans by reference, while the main mapping functions actually has ownership. I feel like we could avoid most of the string clones when we're converting native spans to the IR, if we did this transformation destructively (which is fine, we have ownership)? Though I might be missing some details.
|
Ok, I'll take back the comment on owning spans. Spans text is an abstract |
…ress review
Per reviewer feedback (ekump/bengl/yannham), the public OTLP protocol no longer
carries the unsupported gRPC variant:
- OtlpProtocol is now {HttpJson, HttpProtobuf}; `grpc` is rejected at the parse
boundary (FromStr) instead of constructed and guarded downstream.
- Remove the internal OtlpWireProtocol; fold content_type()/encode() onto
OtlpProtocol (identical once gRPC was gone). The send path encodes via
OtlpProtocol::encode; the exporter derives content-type from config.protocol.
- Drop #[non_exhaustive]; remove the build-time gRPC rejection and the obsolete
builder tests, since gRPC can no longer be constructed.
- Narrow the public surface: otlp is pub(crate); only OtlpProtocol is re-exported
(libdd_data_pipeline::OtlpProtocol). OtlpTraceConfig and the mapper helpers are
pub(crate).
Also: nest push_str_attr, `(p >= 1.0) as u32`, drop a type annotation, idiomatic
iterator/.map(), doc-links, document DecimalBuf, declutter bench code, and add a
protobuf round-trip test.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Integrate the OTLP metrics export feature that landed on main with this PR's protobuf trace export + prost-IR rework: - exporter: send_otlp_http takes a content_type param (was hardcoded JSON); the trace wrapper passes config.protocol.content_type() (JSON or protobuf), metrics passes application/json. - builder: keep both the otlp_protocol field and main's metrics fields; the trace OtlpTraceConfig uses self.otlp_protocol (not hardcoded HttpJson). - mapper: port main's `_dd.stats_computed` resource attribute (from the new client_computed_stats field) onto the prost build_resource; port its two stats tests to prost assertions. - metrics depended on the deleted json_types only for status_code constants; those now live in (and are re-used from) otlp_encoder::mapper::status_code. - otlp module keeps main's pub(crate) visibility; OtlpProtocol stays the one externally re-exported symbol (libdd_data_pipeline::OtlpProtocol). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
map_span_link hardcoded `flags: 0`, dropping the linked context's W3C trace
flags (the v04 SpanLink.flags field is populated by the msgpack decoder), so
OTLP consumers saw incorrect link metadata. Set `flags: link.flags`; add
regression tests on both the protobuf (mapper) and JSON (serializer) paths.
Also from pre-push review:
- json_serializer: fix a stale comment (output-buffer pre-sizing was measured and
reverted as a regression; plain `to_vec` is intentional).
- config: make OtlpProtocol::{content_type,encode} pub(crate) so serde_json::Error
does not leak into the public API (the type stays public for the FFI).
- lib: the OtlpProtocol re-export carries a non-doc impl note.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The shared benchmark suite runs ~53 min on main against a 60 min GitLab job cap; the 10 OTLP benches (5 types x 2 fixtures) pushed it over the limit. Drop the 100x10 fixture, keeping all five bench types (map_to_prost, encode_protobuf, encode_json, e2e_protobuf, e2e_json) on the 1x1000 fixture. Both fixtures totalled ~1000 spans and gave similar signal. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
yannham
left a comment
There was a problem hiding this comment.
All concerns addressed for me 👍
In the long run we might want to look at deserializing native span directly to protobuf without going through prost and the IR at some point, but this is obviously way out of the scope of this PR and OTLP in general.
…st mapper main landed feat(otlp)!: attribute-level OTel compatibility (#2091), which adds `otel_trace_semantics_enabled`: when set, the OTLP payload omits DD-specific per-span attributes (service.name/operation.name/resource.name/span.type) and the error.*/span.kind meta tags (that info lives in the OTLP Status and Span.kind fields). Ported that gating into the prost mapper: - map_traces_to_otlp / map_span / collect_span_attributes thread the flag; the promoted tags and the error.*/span.kind meta tags are skipped when enabled, and the dropped-count accounting mirrors what was emitted (promoted/excluded compat tags), matching #2091's behavior. - OtlpTraceConfig gains otel_trace_semantics_enabled; the builder wires it; the send path passes config.otel_trace_semantics_enabled. Trace protocol stays protobuf-capable (config.protocol.encode), metrics stay JSON. - Ported #2091's test as a prost-native assertion; existing OTLP/metrics tests unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…Status span_status only read meta["error.msg"], but the #2091 OTel-semantics gating drops both error.msg and error.message meta tags. For the common case (error.message — used by every SDK except .NET), that silently lost the error text: empty Status.message and, under OTel-semantics, no error.message attribute either. Fall back to error.message (mirroring main's json_types mapper) so the message is always promoted to Status. Add an OTel-semantics regression test. Found by Codex adversarial review of the #2091 integration. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds OTLP HTTP/protobuf trace export support to libdatadog’s data pipeline, making the trace exporter’s encoding selectable via an OTel-standard protocol setting, while switching OTLP trace encoding to a single prost-generated intermediate representation (IR) used by both JSON and protobuf encoders.
Changes:
- Replaces the hand-rolled OTLP JSON model with a prost OTLP IR + a custom OTLP/JSON serializer, and adds a protobuf encoder via
prost::Message::encode_to_vec. - Introduces
OtlpProtocol(HttpJson/HttpProtobuf) inlibdd-data-pipeline, threads it through the trace exporter + sender, and exposes it through the FFI config. - Vendors/generates the OTLP trace protobuf definitions needed for the prost IR and adds tests/benchmarks around encoding parity and performance.
Reviewed changes
Copilot reviewed 23 out of 24 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| libdd-trace-utils/src/otlp_encoder/mod.rs | Exposes prost OTLP request type and adds JSON/protobuf encoding entry points. |
| libdd-trace-utils/src/otlp_encoder/mapper.rs | Reworks mapping to build prost OTLP types directly; updates tests for the new IR. |
| libdd-trace-utils/src/otlp_encoder/json_types.rs | Removes the prior hand-rolled OTLP JSON model (breaking change). |
| libdd-trace-utils/src/otlp_encoder/json_serializer.rs | Adds a custom serializer implementing OTLP/JSON wire format from prost types. |
| libdd-trace-utils/Cargo.toml | Adds hex dependency used by the JSON serializer/tests. |
| libdd-trace-utils/benches/otlp_encoding.rs | Adds benchmarks for mapping + protobuf/JSON encoding paths. |
| libdd-trace-utils/benches/main.rs | Registers the new OTLP encoding benchmarks. |
| libdd-trace-protobuf/src/pb/opentelemetry/proto/trace/v1/trace.proto | Adds vendored OTLP trace protobuf schema. |
| libdd-trace-protobuf/src/pb/opentelemetry/proto/collector/trace/v1/trace_service.proto | Adds vendored OTLP trace service schema. |
| libdd-trace-protobuf/src/opentelemetry.proto.trace.v1.rs | Adds prost-generated Rust types for OTLP trace. |
| libdd-trace-protobuf/src/opentelemetry.proto.collector.trace.v1.rs | Adds prost-generated Rust types for OTLP trace service request/response. |
| libdd-trace-protobuf/src/_includes.rs | Wires generated OTLP trace modules into the crate’s include tree. |
| libdd-trace-protobuf/build.rs | Updates prost-build inputs and attributes to generate OTLP trace types. |
| libdd-data-pipeline/tests/test_trace_exporter_otlp_protobuf_export.rs | Adds an integration test validating protobuf payload decodes server-side. |
| libdd-data-pipeline/src/trace_exporter/mod.rs | Switches trace export to protocol-selected encoding via OtlpProtocol. |
| libdd-data-pipeline/src/trace_exporter/builder.rs | Adds builder support for selecting OTLP trace protocol. |
| libdd-data-pipeline/src/otlp/mod.rs | Re-exports OtlpProtocol and updates module docs around encoding choice. |
| libdd-data-pipeline/src/otlp/metrics.rs | Adjusts status code import and passes explicit JSON content-type to sender. |
| libdd-data-pipeline/src/otlp/exporter.rs | Generalizes OTLP HTTP sender to accept content-type + body bytes. |
| libdd-data-pipeline/src/otlp/config.rs | Implements OtlpProtocol parsing + content-type + request encoding helper. |
| libdd-data-pipeline/src/lib.rs | Re-exports OtlpProtocol at crate root for FFI consumption. |
| libdd-data-pipeline/Cargo.toml | Adds prost as a dev-dependency for tests decoding protobuf payloads. |
| libdd-data-pipeline-ffi/src/trace_exporter.rs | Adds FFI config setter for OTLP protocol selection and threads into builder. |
| Cargo.lock | Updates lockfile for newly introduced dependencies. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
What does this PR do?
Adds OTLP HTTP/protobuf as a trace-export encoding alongside HTTP/JSON, selectable via the OTel-standard
OTEL_EXPORTER_OTLP_TRACES_PROTOCOL(http/jsondefault,http/protobuf). The generatedprostOTLP types are the single intermediate representation: the mapper builds them directly from native spans, protobuf isprost::encode_to_vec, and JSON is a serde serializer over the same types.Motivation
libdatadog's OTLP trace export only spoke HTTP/JSON. SDKs that honor
OTEL_EXPORTER_OTLP_TRACES_PROTOCOLneedhttp/protobufto match the OTel default and to talk to collectors that expect protobuf.Additional Notes
encode_otlp_protobufisencode_to_vec;encode_otlp_jsonis a serde serializer (json_serializer.rs) emitting OTLP-spec JSON (hex ids, base64 bytes, int64-as-string, lowerCamelCase, proto3 defaults omitted).OtlpProtocolis{HttpJson, HttpProtobuf};grpcis rejected at the parse boundary (FromStr) rather than carried as an unsupported variant.content_type()/encode()live on the type.otel_trace_semantics_enabled(OTel attribute compatibility, feat(otlp)!: Export OTLP spans with attribute-level OTel compatibility #2091): when set, the prost mapper omits the DD-specific span attributes and promotes the error message to the OTLPStatus.Link.flags.How to test the change?
cargo test -p libdd-trace-utils -p libdd-data-pipeline -p libdd-data-pipeline-ffi,cargo test --doc, clippy, fmt, andcargo ffi-testpass. A parity test asserts the JSON and protobuf encodings carry the same span from the one prost IR; a protobuf round-trip test asserts the encoding is lossless.application/x-protobuf(HTTP 200), spans ingested with correct service/resource and a preserved 128-bit trace id.cargo bench -p libdd-trace-utils --bench main -- otlp/.BREAKING CHANGE: removes the previously public
libdd_trace_utils::otlp_encoder::json_typesmodule (the hand-rolled OTLP JSON model). OTLP encoding now builds prost-generated types as the single IR. libdatadog consumers pin by version, so they pick this up on the next release.