TimeSeries Desc Sort gets skipped with Lucene 10 upgrade #17329

expani · 2025-02-11T22:50:45Z

Description

We override IndexSearcher#search(List<LeafReaderContext> leaves, Weight weight, Collector collector), which contains the logic to apply the time series desc optimisation: scanning segments in reverse for descending sort use-cases.
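
To illustrate why segment order matters for a descending sort, here is a minimal, self-contained sketch. It does not use Lucene: `Segment`, `Result`, and `topNDescending` are hypothetical stand-ins, and the sketch assumes that later segments hold newer timestamps, as is typical for append-only time-series data. The skip check is a simplification of the idea, not the actual OpenSearch or Lucene logic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class ReverseScanSketch {

    /** Hypothetical stand-in for a segment; timestamps are stored in ascending order. */
    record Segment(long[] timestamps) {
        long max() {
            return timestamps[timestamps.length - 1];
        }
    }

    /** Top-N timestamps (descending) plus the number of segments that had to be scanned. */
    record Result(List<Long> top, int segmentsVisited) {}

    static Result topNDescending(List<Segment> segments, int n, boolean reverse) {
        List<Segment> order = new ArrayList<>(segments);
        if (reverse) {
            Collections.reverse(order); // newest segment first
        }
        PriorityQueue<Long> heap = new PriorityQueue<>(); // min-heap of the current top N
        int visited = 0;
        for (Segment segment : order) {
            // A segment whose largest value cannot beat the current Nth best is skipped.
            if (heap.size() == n && segment.max() <= heap.peek()) {
                continue;
            }
            visited++;
            for (long t : segment.timestamps()) {
                if (heap.size() < n) {
                    heap.add(t);
                } else if (t > heap.peek()) {
                    heap.poll();
                    heap.add(t);
                }
            }
        }
        List<Long> top = new ArrayList<>(heap);
        top.sort(Collections.reverseOrder());
        return new Result(top, visited);
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
            new Segment(new long[] {1, 2, 3}),
            new Segment(new long[] {4, 5, 6}),
            new Segment(new long[] {7, 8, 9}));
        System.out.println(topNDescending(segments, 2, false));
        System.out.println(topNDescending(segments, 2, true));
    }
}
```

With this toy data, the forward scan has to open all three segments, while the reverse scan fills the top 2 from the newest segment and skips the other two.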

Existing Lucene 9.12.1 and OS 2.19 call flow

Lucene 10.1.0 and OS 3.0 call flow

With Lucene 10 changes, the new replacement method

search(LeafReaderContextPartition[] partitions, Weight weight, Collector collector)

only gets invoked via IndexSearcher#search(Query, CollectorManager), which is not used by OpenSearch in QueryPhase#searchWithCollector.

So the overridden method was never called, causing the time series desc optimization to be skipped.

My changes ensure it is called in the same way that IndexSearcher calls it for the CollectorManager variant.
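
The failure mode described above can be sketched without Lucene. In this minimal example, every class and method name (`BaseSearcher`, `DescSortSearcher`, `searchPartitions`, and so on) is a hypothetical stand-in rather than the real Lucene or OpenSearch API. It assumes only what the description states: one entry point delegates to an overridable hook, while the other does not.

```java
import java.util.ArrayList;
import java.util.List;

final class OverrideBypassSketch {

    /** Hypothetical stand-in for a searcher with two entry points (not Lucene's class). */
    static class BaseSearcher {
        final List<String> visited = new ArrayList<>();
        final List<String> leaves = List.of("seg0", "seg1", "seg2");

        /** Collector-style entry point: walks the leaves itself and never calls the hook. */
        void searchWithCollector() {
            for (String leaf : leaves) {
                visited.add(leaf);
            }
        }

        /** CollectorManager-style entry point: delegates to the overridable hook. */
        void searchWithCollectorManager() {
            searchPartitions(leaves);
        }

        /** Overridable hook, analogous in spirit to search(partitions, weight, collector). */
        protected void searchPartitions(List<String> partitions) {
            for (String partition : partitions) {
                visited.add(partition);
            }
        }
    }

    /** Subclass that applies the reverse-order optimisation inside the hook. */
    static class DescSortSearcher extends BaseSearcher {
        @Override
        protected void searchPartitions(List<String> partitions) {
            for (int i = partitions.size() - 1; i >= 0; i--) {
                visited.add(partitions.get(i));
            }
        }

        /** The fix in miniature: the collector path calls the hook explicitly. */
        void searchWithCollectorFixed() {
            searchPartitions(leaves);
        }
    }

    public static void main(String[] args) {
        DescSortSearcher broken = new DescSortSearcher();
        broken.searchWithCollector();
        System.out.println("collector path, before fix: " + broken.visited);

        DescSortSearcher fixed = new DescSortSearcher();
        fixed.searchWithCollectorFixed();
        System.out.println("collector path, after fix:  " + fixed.visited);
    }
}
```

Before the fix, the collector path visits the stand-in segments in forward order even though the subclass overrides the hook. After the fix, the same path goes through the override and visits them in reverse.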

Related Issues

Resolves #16934


expani · 2025-02-11T22:52:23Z

@msfroh @reta Please help in reviewing the same.

I am trying to see if there is an easy way to catch this with existing UTs as well.

reta · 2025-02-11T22:59:20Z

> I am trying to see if there is an easy way to catch this with existing UTs as well.

This explains the xxx_desc regressions in OSB, thanks @expani!

expani · 2025-02-11T23:21:08Z

@reta With this change, OS 2.19 and OS 3.0 are on par for the following query types in big5:

asc_sort_timestamp
asc_sort_timestamp_can_match_shortcut
asc_sort_timestamp_no_can_match_shortcut

Without it, the latency for OS 3.0 was 10 ms higher than for OS 2.19.

github-actions · 2025-02-11T23:28:27Z

❌ Gradle check result for 21d5903: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions · 2025-02-12T00:09:45Z

❌ Gradle check result for 368383f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

reta · 2025-02-12T01:06:31Z

server/src/main/java/org/opensearch/search/internal/ContextIndexSearcher.java

+        // TODO : Remove when switching to use the @org.apache.lucene.search.IndexSearcher#search(Query, CollectorManager) variant from
+        // @org.opensearch.search.query.QueryPhase#searchWithCollector which then calls the overridden
+        // search(LeafReaderContextPartition[] partitions, Weight weight, Collector collector)
+        query = collector.scoreMode().needsScores() ? rewrite(query) : rewrite(new ConstantScoreQuery(query));


@msfroh this change makes sense to me as a way to restore the searchContext.shouldUseTimeSeriesDescSortOptimization() optimization, wdyt?

github-actions · 2025-02-12T01:40:01Z

❌ Gradle check result for 21a924d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

reta · 2025-02-12T02:15:36Z

server/src/main/java/org/opensearch/search/internal/ContextIndexSearcher.java

+        query = collector.scoreMode().needsScores() ? rewrite(query) : rewrite(new ConstantScoreQuery(query));
+        Weight weight = createWeight(query, collector.scoreMode(), 1);
+        LeafSlice[] leafSlices = getSlices();
+        for (LeafSlice leafSlice : leafSlices) {


@expani I think we should get all partitions across all slices and call search only once:

LeafReaderContextPartition[] partitions = Arrays.stream(getSlices()).flatMap(slice -> Arrays.stream(slice.partitions)).toArray(LeafReaderContextPartition[]::new);
search(partitions, weight, collector);

If we can override the logic in getSlices, we could even handle the desc sort use-case more efficiently by returning slices that contain the LeafReaderContextPartitions with the higher doc IDs first, front-loading those.
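
The two suggestions above (flatten the partitions of all slices into one array, then put the partitions with higher doc IDs first) can be sketched as follows. This is a minimal illustration with hypothetical stand-in types: `Partition` and `Slice` are not Lucene's `LeafReaderContextPartition` and `LeafSlice`, and ordering by `docBase` is an assumption about what "higher doc IDs" would mean in practice.

```java
import java.util.Arrays;

final class FlattenPartitionsSketch {

    /** Hypothetical stand-in for a leaf partition, identified by its segment's doc base. */
    record Partition(String name, int docBase) {}

    /** Hypothetical stand-in for a slice: a group of partitions. */
    record Slice(Partition[] partitions) {}

    /** Collects the partitions of every slice into one array, preserving slice order. */
    static Partition[] flatten(Slice[] slices) {
        return Arrays.stream(slices)
            .flatMap(slice -> Arrays.stream(slice.partitions()))
            .toArray(Partition[]::new);
    }

    /** Same as flatten, but partitions with the highest doc base come first. */
    static Partition[] flattenHighestDocBaseFirst(Slice[] slices) {
        Partition[] all = flatten(slices);
        Arrays.sort(all, (a, b) -> Integer.compare(b.docBase(), a.docBase()));
        return all;
    }

    public static void main(String[] args) {
        Slice[] slices = {
            new Slice(new Partition[] {new Partition("seg0", 0), new Partition("seg1", 100)}),
            new Slice(new Partition[] {new Partition("seg2", 200)})
        };
        System.out.println(Arrays.toString(flatten(slices)));
        System.out.println(Arrays.toString(flattenHighestDocBaseFirst(slices)));
    }
}
```

A single flattened array means the search method is invoked once for the whole shard, so any ordering decision is made across all partitions rather than per slice.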

Thanks for the suggestions @reta @msfroh

I am still seeing a regression with desc_sort_timestamp and its variants. Let me try fixing it using these suggestions, and I will post an update shortly.

expani added 2 commits February 11, 2025 12:37

Ensuring time series desc sort optimisation invokes searchLeaf on Ctx…

0442ed9

…IdxSearcher Signed-off-by: expani <[email protected]>

Added TODO to remove when migrating towards CollectorManager

21d5903

Signed-off-by: expani <[email protected]>

github-actions bot added the bug and Other labels Feb 11, 2025

Merge branch 'opensearch-project:main' into perf_16934_1

368383f

reta added the skip-changelog label Feb 12, 2025

remove post collection as it happens in child method

21a924d

Signed-off-by: expani <[email protected]>

reta reviewed Feb 12, 2025

reta approved these changes Feb 12, 2025

reta reviewed Feb 12, 2025

opensearch-ci-bot mentioned this pull request Feb 11, 2025

[AUTOCUT] Gradle Check Flaky Test Report for MetadataCreateIndexServiceTests #17291

Open
