feat: add otel logger injection#427
Conversation
| 2. The invocation span from the plugin registry. This is the path used | ||
| for top-level handler code: the invocation span is never attached to | ||
| the worker thread's context, so the registry is the only way to | ||
| resolve it. |
There was a problem hiding this comment.
Currently, on_invocation_start is running on main thread (instead of worker thread). So, even if we attach the span to the context, it's invisible for worker thread.
|
|
||
| # enrich_logger defaults to True, so the execution logger is wrapped with OTel | ||
| # trace context injection (otel.trace_id, otel.span_id, otel.trace_sampled). | ||
| otel = DurableExecutionOtelPlugin(tracer_provider) |
There was a problem hiding this comment.
nit: For follow up in future pr, I think we also want to make the DurableExecutionOtelPlugin as "no-touch" as possible. The typical workflow for a customer adding OpenTelemetry is just adding a lambda layer and changing a few lambda environment variables and it should "just work".
There was a problem hiding this comment.
In our case, we'd like it to be as simple as the following (if possible):
- Adding ADOT layer (or whatever layer we recommend)
- adding the aws-durable-execution-sdk-python-otel to their dependencies
- changing lambda env var
- enabling tracing on the lambda function
- profit
| if span_context and span_context.is_valid: | ||
| return span_context | ||
|
|
||
| invocation_span = self._get_span(None) |
There was a problem hiding this comment.
does the self._get_span(None) always return the invocation span? I don't see it being set anywhere. Do we even need this line and the block after it?
There was a problem hiding this comment.
ah never mind, I see that on_invocation_start calls _start_span(None,...) which will call
self._operation_spans[operation_id] = span
setting the invocation_span with operation_id = None
But do we need this fallback code? I would think that trace.get_current_span().get_span_context() should return the current span, including the invocation span.
There was a problem hiding this comment.
But do we need this fallback code?
Yes, we need it for current implementation, reason:
-
Currently we only register span in
on_user_function_start. We still need to get the invocation span if user call logger outside of user func (step, child) -
Currently
on_invocation_startis called on main thread,on_user_function_startis called on worker thread. Handler code is called on worker thread. So, even if we register the span inon_invocation_start, the logger in worker thread cannot find the span.
| # Logged inside a child context: enriched with the child-context span_id. | ||
| child_context.logger.info("Entering child context") | ||
| result: str = child_context.step(greet(name), name="child-greet") | ||
| child_context.logger.info("Leaving child context", extra={"result": result}) |
There was a problem hiding this comment.
I don't see this log in your example logs, can you check if it actually is present and that the spanId is the same as the log for "Entering child context"?
There was a problem hiding this comment.
Update in the pr description
Issue #, if available: #385
Description
wrap_loggerhook to DurableInstrumentationPlugin and chain it inPluginExecutor.wrap_logger.OtelEnrichedLoggerthat injects the active span context into each log extra.get_current_span_context: the active operation span inside a step/child context, the enclosing operation span between steps, and the invocation span (from the registry) at the top level.on_user_function_endnow restores the enclosing operation span (parent_id) rather than always the invocation span, so between-step logs inside a child context correlate to that child context.Testing