[refine](exec/pipeline) replace std::mutex/std::unique_lock with annotated wrappers for thread safety analysis#63106
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
I found one blocking concurrency/lifetime regression. The PR's goal appears to be adding thread-safety annotations and replacing plain mutex guards in BE pipeline dependency/task queue code; the changes are generally focused, but PipelineTask::debug_string() now snapshots raw dependency pointers and releases the lifecycle lock before using them. That breaks the lock's previous lifetime protection against finalize() clearing the owning shared states.
Critical checkpoints: goal is partially achieved, but not safely due to the debug-string lifetime race; scope is small and focused; concurrency is involved and the new lock lifetime in debug_string() is unsafe; no new lifecycle/static initialization/config/serialization compatibility concerns found; parallel code paths touched consistently for the annotated mutex conversion; no new tests are included for this race; no test result changes; observability is the affected area and should not introduce crash risk; no transaction/data write/FE-BE variable passing concerns found; no additional performance concern beyond the small debug snapshot copy.
User focus: no additional user-provided review focus was specified.
| fmt::format_to(debug_string_buffer, "{}. {}\n", i, | ||
| _read_dependencies[i][j]->debug_string(cast_set<int>(i) + 1)); | ||
| read_dependencies[i][j]->debug_string(cast_set<int>(i) + 1)); | ||
| } |
There was a problem hiding this comment.
This now dereferences raw Dependency* values after _dependency_lifecycle_lock has been released. The lock comment says it protects the dependency containers and the raw pointers they contain, and finalize() takes the same lock before clearing _sink_shared_state, _op_shared_states, and _shared_state_map, which own these dependencies. A concrete race is: debug_string() copies read_dependencies and releases the lock, observes the task as not finalized, then another thread runs finalize() and clears the shared states; the subsequent read_dependencies[i][j]->debug_string() can use a freed Dependency. Please keep the lifetime protection until all raw dependency pointers are no longer dereferenced, or snapshot owning shared_ptr state under the lock instead of raw pointers.
e520e45 to
73b7805
Compare
|
run buildall |
TPC-H: Total hot run time: 29844 ms |
TPC-DS: Total hot run time: 172362 ms |
What problem does this PR solve?
Problem Summary:
This is a follow-up to #63070, extending the thread safety annotation migration into the pipeline executor (
be/src/exec/pipeline/). Plainstd::mutexandstd::unique_lock<std::mutex>were replaced withAnnotatedMutexandLockGuard/UniqueLockwrappers fromthread_safety_annotations.h, enabling Clang's-Wthread-safetystatic analysis to catch lock ordering and data race bugs at compile time.While applying annotations, two latent data races in
debug_string()methods were fixed:Dependency::debug_string()andCountedFinishDependency::debug_string()read_blocked_task.size()without holding_task_lock.PipelineTask::debug_string()read_blocked_depand all dependency vectors without holding_dependency_lifecycle_lock.CountedFinishDependency::debug_string()read_counterwithout holding_mtx.RuntimeFilterTimerQueue::start()held two separate mutexes (cv_m+_que_lock) simultaneously. Consolidated into a singleAnnotatedMutex _que_lockwithstd::condition_variable_anyand aUniqueLock, eliminating the redundant lock.PriorityTaskQueue::push()checked_closedbefore acquiring the lock — a data race now caught because_closedis annotatedGUARDED_BY(_work_size_mutex). Moved the_closedcheck inside the lock, and hoisted_compute_level()(which doesn't touch shared state) before the lock.Replaced C-style
<stddef.h>/<stdint.h>with<cstddef>/<cstdint>, and removed unused includes.Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)