[core] Add committer-side bucket consistency check#7793

Open
Aitozi wants to merge 1 commit into apache:master from Aitozi:aitozi/bucket-unorder

Conversation

@Aitozi
Contributor

@Aitozi Aitozi commented May 9, 2026

Purpose

Add committer-side bucket consistency validation for write-only unordered append tables.

After #6741, when bucket-append-ordered=false and write-only=true, writers skip restoring previous files, so bucket-count validation can be bypassed after a bucket rescale. This change adds an internal commit-side checkSameBucket path for fixed hash bucket tables that validates every touched partition before committing.
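For reviewers, the intent of the check can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the class name, the string partition keys, and the in-memory map standing in for the bucket counts read from existing files are all hypothetical. Only the name checkSameBucket comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; names and data structures are assumptions, not the PR's code. */
public class BucketConsistencySketch {

    // Hypothetical stand-in for the bucket count already recorded for each existing partition.
    private final Map<String, Integer> existingBuckets = new HashMap<>();

    public void recordExisting(String partition, int totalBuckets) {
        existingBuckets.put(partition, totalBuckets);
    }

    /**
     * Fails if any touched partition already holds data written with a different bucket count.
     * Partitions that do not exist yet are accepted, which matches the "new partition" test case.
     */
    public void checkSameBucket(Map<String, Integer> touchedPartitions) {
        for (Map.Entry<String, Integer> touched : touchedPartitions.entrySet()) {
            Integer existing = existingBuckets.get(touched.getKey());
            if (existing == null) {
                continue; // new partition, nothing to compare against
            }
            if (!existing.equals(touched.getValue())) {
                throw new IllegalStateException(
                        "Bucket count mismatch in partition " + touched.getKey()
                                + ": existing=" + existing
                                + ", committing=" + touched.getValue());
            }
        }
    }
}
```

In this sketch the failure surfaces at commit time rather than in the writer, which mirrors the behavior described in the tests: prepareCommit still succeeds and the commit is what rejects the mismatch.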

The check is integrated with ConflictDetection, reuses the existing conflict path when available, and uses a bounded partition cache to avoid repeatedly checking the same partition within one committer lifecycle.
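A minimal sketch of what a bounded partition cache could look like, assuming an LRU eviction policy built on java.util.LinkedHashMap. The description does not state the eviction policy or the cache type actually used, so the class name, the string partition keys, and the LRU choice are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; the actual cache in this PR may differ. */
public class BoundedPartitionCache {

    private final Map<String, Boolean> checked;

    public BoundedPartitionCache(final int maxEntries) {
        // accessOrder=true makes the map iterate from least to most recently accessed,
        // so removeEldestEntry evicts the least recently used partition.
        this.checked = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /**
     * Returns true if the partition was not cached, meaning the caller should validate it,
     * and records it as checked. Returns false if it was already cached.
     */
    public boolean markIfAbsent(String partition) {
        if (checked.get(partition) != null) {
            return false; // get() also refreshes the access order
        }
        checked.put(partition, Boolean.TRUE);
        return true;
    }

    public int size() {
        return checked.size();
    }
}
```

The bound keeps memory flat for long-running committers that touch many partitions. An evicted partition is simply validated again the next time it is touched. A real implementation would probably record a partition only after its check has passed; the sketch marks it eagerly to stay short.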

Tests

  • Added core coverage for unordered write-only append:
    • prepareCommit succeeds when bucket count changes.
    • commit fails for an existing partition with mismatched bucket count.
    • writing a new partition succeeds.
    • append-style commits with both DELETE and ADD still validate bucket consistency.
  • Added Flink IT coverage:
    • INSERT INTO fails after bucket count changes.
    • INSERT OVERWRITE succeeds for rescaling.

@Aitozi Aitozi force-pushed the aitozi/bucket-unorder branch 3 times, most recently from 06926eb to ab6ff42 Compare May 9, 2026 12:36
@Aitozi Aitozi force-pushed the aitozi/bucket-unorder branch from ab6ff42 to 8384547 Compare May 9, 2026 13:15
@Aitozi Aitozi requested a review from JingsongLi May 9, 2026 14:12
