[HUDI-8808] Fix concurrent execution of appending rollback blocks in the same file group #12568
+217
−3
Change Logs
When multiple log files are generated in the same file group in an inflight deltacommit, rollback of that deltacommit can fail, because (1) the rollback plan contains multiple rollback requests targeting the log files to roll back in the same file group, and (2) concurrent execution of these rollback requests in Spark executors creates new rollback log files (appending rollback blocks) in parallel. Each executor determines the new log version concurrently, so the same new log version can be used in multiple executors, which causes marker creation to fail (i.e., multiple executors try to create a marker with the same file name, and subsequent marker creation requests fail because the marker already exists).
Note that this problem only happens for table version 6 and below, and with the backwards-compatible writer in Hudi 1.0 writing table version 6. Hudi 1.x does not append rollback blocks any more because of the new format spec.
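The collision described above can be illustrated with a small, deterministic sketch. It is not Hudi's code: `MarkerDir`, `nextLogVersion`, and the `fileId.log.N` naming are hypothetical simplifications, and the "concurrency" is simulated by having two requests read the same file listing before either one writes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LogVersionCollisionSketch {
  // Hypothetical stand-in for a marker directory: creation fails if the name already exists.
  static final class MarkerDir {
    private final Set<String> markers = new HashSet<>();

    boolean create(String name) {
      return markers.add(name);
    }
  }

  // Simplified assumption: the next log version is the highest existing version plus one.
  static int nextLogVersion(List<Integer> existingVersions) {
    int max = 0;
    for (int v : existingVersions) {
      max = Math.max(max, v);
    }
    return max + 1;
  }

  // Simulates two rollback requests for the same file group that both read
  // the file listing before either has written its rollback log file.
  static boolean[] simulateConcurrentRequests(List<Integer> existingVersions, String fileId) {
    MarkerDir dir = new MarkerDir();
    int versionSeenByA = nextLogVersion(existingVersions);
    int versionSeenByB = nextLogVersion(existingVersions);
    boolean createdByA = dir.create(fileId + ".log." + versionSeenByA);
    boolean createdByB = dir.create(fileId + ".log." + versionSeenByB);
    return new boolean[] {createdByA, createdByB};
  }

  public static void main(String[] args) {
    boolean[] result = simulateConcurrentRequests(Arrays.asList(1, 2, 3), "fg-001");
    System.out.println("first marker created: " + result[0]);
    System.out.println("second marker created: " + result[1]);
  }
}
```

In this sketch both requests compute version 4, so the second marker creation is rejected, which mirrors the failure mode reported in this PR.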
This issue can be reproduced by the newly added test `TestMarkerBasedRollbackStrategy#testRollbackMultipleLogFilesInOneFileGroupInMOR` before applying the fix (the line `assertEquals(1, rollbackRequests.size());` needs to be changed to `assertEquals(199, rollbackRequests.size());`).
This PR makes two fixes to tackle the problem:
(1) For rollback plan generation, group the log files to roll back by file group (partition + file group ID) so that the log files in the same file group only appear in a single rollback request in the rollback plan (see `MarkerBasedRollbackStrategy#getRollbackRequests`). Note that we only fix marker-based rollback as that's the only supported rollback mode going forward.
(2) An existing rollback plan already generated on the timeline can still contain multiple rollback requests targeting the log files in the same file group. To avoid marker creation failure from concurrent execution of these original rollback requests, the rollback execution first groups the rollback requests in the rollback plan by file group (partition + file group ID) before parallelization (see `BaseRollbackHelper#maybeDeleteAndCollectStats`).
This PR adds new tests to comprehensively cover the rollback logic (see
`TestBaseRollbackHelper`, `TestMarkerBasedRollbackStrategy`, `TestRollbackUtils`).
Impact
Fixes rollback failures due to concurrent execution of appending rollback blocks in the same file group.
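The grouping by file group (partition + file group ID) used in both fixes in the change log can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the `Request` class and `groupByFileGroup` method are hypothetical, and Hudi's real rollback request type carries more fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RollbackGroupingSketch {
  // Hypothetical, simplified rollback request: one file group plus the log files to roll back.
  static final class Request {
    final String partitionPath;
    final String fileId;
    final List<String> logFiles;

    Request(String partitionPath, String fileId, List<String> logFiles) {
      this.partitionPath = partitionPath;
      this.fileId = fileId;
      this.logFiles = logFiles;
    }
  }

  // Merges requests that target the same (partition, file ID) pair so that each
  // file group is handled by exactly one request, and therefore by one task.
  static List<Request> groupByFileGroup(List<Request> requests) {
    Map<List<String>, List<String>> logFilesByGroup = new LinkedHashMap<>();
    for (Request r : requests) {
      logFilesByGroup
          .computeIfAbsent(Arrays.asList(r.partitionPath, r.fileId), k -> new ArrayList<>())
          .addAll(r.logFiles);
    }
    List<Request> grouped = new ArrayList<>();
    for (Map.Entry<List<String>, List<String>> e : logFilesByGroup.entrySet()) {
      grouped.add(new Request(e.getKey().get(0), e.getKey().get(1), e.getValue()));
    }
    return grouped;
  }

  public static void main(String[] args) {
    List<Request> plan = Arrays.asList(
        new Request("2024/01/01", "fg-001", Arrays.asList("fg-001.log.1")),
        new Request("2024/01/01", "fg-001", Arrays.asList("fg-001.log.2")),
        new Request("2024/01/02", "fg-002", Arrays.asList("fg-002.log.1")));
    List<Request> grouped = groupByFileGroup(plan);
    System.out.println("requests after grouping: " + grouped.size());
  }
}
```

Because each file group ends up in a single request, only one task determines the new log version for that file group, so two tasks cannot pick the same version.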
Risk level
low
Documentation Update
N/A
Contributor's checklist