[Draft][HUDI-7944] Adding config to stop job in case of failure in any table during Multi… #12563

Open
wants to merge 1 commit into
base: master

bibhu107
Contributor

@bibhu107 bibhu107 commented Jan 1, 2025

Change Logs and Description Update

Overview

This proposal introduces a feature that lets clients designate specific tables as critical. If any critical table fails during a write operation, the sync job terminates immediately. Two global settings additionally let a job tolerate a bounded number of table failures, so both strict and lenient failure handling policies can be configured.


Proposed Configurations

  1. Critical Table Flag (Table-level Configuration):

    • Description: Marks specific tables as critical.
    • Behavior: The job fails immediately if any critical table encounters a failure.
    • Default: false (tables are non-critical by default).
  2. Failed Table Limit (Global Configuration for Multi-Streamer Job):

    • Description: Sets a threshold for the number of table failures allowed before the job fails.
    • Default: 0 (once enforcement is enabled, any single table failure fails the job unless the limit is raised).
  3. Failed Table Handling Enabled (Global Configuration for Multi-Streamer Job):

    • Description: Toggles enforcement of the failed table limit.
    • Default: false (disabled by default).
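
One way to read the three settings together is as a single stop decision evaluated after each table's sync attempt. The sketch below is a minimal illustration in Python, not the PR's implementation: the function name `should_stop_job` and its parameters are hypothetical, and it assumes the limit is the number of failures tolerated, so the job stops only once the failure count exceeds it.

```python
from typing import Iterable, Set


def should_stop_job(
    failed_tables: Iterable[str],
    critical_tables: Set[str],
    limit_enabled: bool = False,
    failed_table_limit: int = 0,
) -> bool:
    """Return True if the multi-table job should stop.

    Hypothetical helper: names and semantics are assumptions based on the
    description above, not the actual Hudi code.
    """
    failed = set(failed_tables)

    # Rule 1: any failed critical table stops the job immediately.
    if failed & critical_tables:
        return True

    # Rule 2: with enforcement on, stop once failures exceed the limit.
    if limit_enabled and len(failed) > failed_table_limit:
        return True

    # With the defaults (no critical tables, enforcement off),
    # neither rule ever triggers.
    return False
```

Under these assumptions the defaults leave the decision untouched, which matches the stated impact of none for current jobs.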

Configuration Scenarios

Global Failure Threshold

  • The job fails if the total number of failed tables exceeds the configured limit.
  • Example Configuration:
    hoodie.streamer.failed_table_limit_enabled=true
    hoodie.streamer.failed_table_limit=2
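
To make the threshold concrete, here is a small Python sketch that reads the two keys from properties-style text and checks a failure count against them. The helper names `parse_properties` and `exceeds_failed_table_limit` are hypothetical, and the "exceeds" comparison is an assumption taken from the wording of the rule above, not from the PR's code.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def exceeds_failed_table_limit(props: dict, failed_count: int) -> bool:
    """Assumed semantics: fail only when enforcement is on and count > limit."""
    enabled = props.get(
        "hoodie.streamer.failed_table_limit_enabled", "false"
    ).lower() == "true"
    limit = int(props.get("hoodie.streamer.failed_table_limit", "0"))
    return enabled and failed_count > limit


example = """
hoodie.streamer.failed_table_limit_enabled=true
hoodie.streamer.failed_table_limit=2
"""
props = parse_properties(example)
print(exceeds_failed_table_limit(props, 2))  # two failures: within the limit
print(exceeds_failed_table_limit(props, 3))  # three failures: over the limit
```

With a limit of 2, the job in this sketch tolerates two failed tables and fails on the third.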

Table-Specific Failure Rule

  • Specific tables can be marked as critical:
    hoodie.streamer.table.t2.criticalTable=true
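
A job would need to collect the set of critical tables from its properties before syncing. The Python sketch below shows one possible way to do that. It assumes the key shape `hoodie.streamer.table.<table name>.criticalTable`, generalized from the single `t2` example above, and the function name `critical_tables` is hypothetical.

```python
import re

# Assumed key shape, generalized from the single example above:
#   hoodie.streamer.table.<table name>.criticalTable=true
CRITICAL_KEY = re.compile(
    r"^hoodie\.streamer\.table\.(?P<table>.+)\.criticalTable$"
)


def critical_tables(props: dict) -> set:
    """Return the names of tables whose criticalTable flag is 'true'."""
    tables = set()
    for key, value in props.items():
        match = CRITICAL_KEY.match(key)
        if match and value.strip().lower() == "true":
            tables.add(match.group("table"))
    return tables


props = {
    "hoodie.streamer.table.t1.criticalTable": "false",
    "hoodie.streamer.table.t2.criticalTable": "true",
}
print(critical_tables(props))
```

Tables without the flag, or with it set to false, stay non-critical, in line with the proposed default.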

Impact

None for existing jobs. The new configs are optional, and clients can add them in future releases.

Risk Level

Low

Documentation Updates


Contributor's Checklist

  • Read the contributor's guide.
  • Clearly state Change Logs and Impact.
  • Add adequate tests to validate failure configurations.
  • CI passed for implemented changes.

@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Jan 1, 2025

hudi-bot commented Jan 1, 2025

CI report:

Bot commands: @hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build
