Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP][Pull-based Ingestion] Support segment replication for pull-based ingestion #17359

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

varunbharadwaj
Copy link
Contributor

@varunbharadwaj varunbharadwaj commented Feb 14, 2025

Description

This PR is a follow up for pull-based-ingestion to support segment replication. The primary shard will ingest from the streaming source and replica shards will rely on segment replication.

Disaster scenarios, replica promotion and remote store tests will be added in a separate PR.

Related Issues

Resolves #16929

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing labels Feb 14, 2025
Copy link
Contributor

❌ Gradle check result for 7a682ef: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@@ -600,6 +638,11 @@ public void writeIndexingBuffer() throws EngineException {

@Override
public boolean shouldPeriodicallyFlush() {
ensureOpen();
if (shouldPeriodicallyFlushAfterBigMerge.get()) {
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious, why is this needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not required for segrep. I found the use of shouldPeriodicallyFlushAfterBigMerge after a big merge, but was not used anywhere except for here. We can remove this if not required.

@@ -632,7 +675,8 @@ public void flush(boolean force, boolean waitIfOngoing) throws EngineException {
// (3) the newly created commit points to a different translog generation (can free translog),
// or (4) the local checkpoint information in the last commit is stale, which slows down future recoveries.
boolean hasUncommittedChanges = indexWriter.hasUncommittedChanges();
if (hasUncommittedChanges || force) {
boolean shouldPeriodicallyFlush = shouldPeriodicallyFlush();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i remember i discussed this with @msfroh , and we thought shouldPeriodicallyFlush is enabled only with translog?

Copy link
Contributor

❌ Gradle check result for cea42e1: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@varunbharadwaj varunbharadwaj changed the title [Pull-based Ingestion] Support segment replication for pull-based ingestion [WIP][Pull-based Ingestion] Support segment replication for pull-based ingestion Feb 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] A new IngestionEngine that can pull data from streaming sources.
2 participants