Skip to content

Avoid over collecting in Limit or Lucene Operator #123296

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 1, 2025

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Feb 24, 2025

Currently, we rely on signal propagation for early termination. For example, FROM index | LIMIT 10 can be executed by multiple Drivers: several Drivers to read document IDs and extract fields, and the final Driver to select at most 10 rows. In this scenario, each Lucene Driver can independently collect up to 10 rows until the final Driver has enough rows and signals them to stop collecting. In most cases, this model works fine, but when extracting fields from indices in the warm/cold tier, it can impact performance. This change introduces a Limiter used between LimitOperator and LuceneSourceOperator to avoid over-collecting. We will also need a follow-up to ensure that we do not over-collect between multiple stages of query execution.

@dnhatn dnhatn force-pushed the shared-limiter branch 2 times, most recently from 7754300 to 80a3936 Compare February 24, 2025 18:19
@dnhatn dnhatn changed the title Share limiter between limit or lucene source operators Avoid over collecting in Limit or Lucene Operator Feb 24, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @dnhatn, I've created a changelog YAML for you.

@dnhatn dnhatn added the auto-backport Automatically create backport pull requests when merged label Feb 24, 2025
@dnhatn dnhatn marked this pull request as ready for review February 24, 2025 23:56
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Feb 24, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff. Sharing state in the pipeline execution (limit, min/max for filters) to trigger early execution is going to help significantly in high cardinality scenarios.

@dnhatn
Copy link
Member Author

dnhatn commented Mar 1, 2025

Thanks Costin!

@dnhatn dnhatn merged commit 333e252 into elastic:main Mar 1, 2025
17 checks passed
@dnhatn dnhatn deleted the shared-limiter branch March 1, 2025 00:43
@elasticsearchmachine
Copy link
Collaborator

elasticsearchmachine commented Mar 1, 2025

Status Branch Result
8.18
8.x
9.0

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 123296

dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Mar 1, 2025
Currently, we rely on signal propagation for early termination. For 
example, FROM index | LIMIT 10 can be executed by multiple Drivers:
several Drivers to read document IDs and extract fields, and the final
Driver to select at most 10 rows. In this scenario, each Lucene Driver
can independently collect up to 10 rows until the final Driver has
enough rows and signals them to stop collecting. In most cases, this
model works fine, but when extracting fields from indices in the
warm/cold tier, it can impact performance. This change introduces a
Limiter used between LimitOperator and LuceneSourceOperator to avoid
over-collecting. We will also need a follow-up to ensure that we do not
over-collect between multiple stages of query execution.
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Mar 1, 2025
Currently, we rely on signal propagation for early termination. For 
example, FROM index | LIMIT 10 can be executed by multiple Drivers:
several Drivers to read document IDs and extract fields, and the final
Driver to select at most 10 rows. In this scenario, each Lucene Driver
can independently collect up to 10 rows until the final Driver has
enough rows and signals them to stop collecting. In most cases, this
model works fine, but when extracting fields from indices in the
warm/cold tier, it can impact performance. This change introduces a
Limiter used between LimitOperator and LuceneSourceOperator to avoid
over-collecting. We will also need a follow-up to ensure that we do not
over-collect between multiple stages of query execution.
elasticsearchmachine pushed a commit that referenced this pull request Mar 1, 2025
Currently, we rely on signal propagation for early termination. For 
example, FROM index | LIMIT 10 can be executed by multiple Drivers:
several Drivers to read document IDs and extract fields, and the final
Driver to select at most 10 rows. In this scenario, each Lucene Driver
can independently collect up to 10 rows until the final Driver has
enough rows and signals them to stop collecting. In most cases, this
model works fine, but when extracting fields from indices in the
warm/cold tier, it can impact performance. This change introduces a
Limiter used between LimitOperator and LuceneSourceOperator to avoid
over-collecting. We will also need a follow-up to ensure that we do not
over-collect between multiple stages of query execution.
elasticsearchmachine pushed a commit that referenced this pull request Mar 1, 2025
…23784)

* Avoid over collecting in Limit or Lucene Operator (#123296)

Currently, we rely on signal propagation for early termination. For 
example, FROM index | LIMIT 10 can be executed by multiple Drivers:
several Drivers to read document IDs and extract fields, and the final
Driver to select at most 10 rows. In this scenario, each Lucene Driver
can independently collect up to 10 rows until the final Driver has
enough rows and signals them to stop collecting. In most cases, this
model works fine, but when extracting fields from indices in the
warm/cold tier, it can impact performance. This change introduces a
Limiter used between LimitOperator and LuceneSourceOperator to avoid
over-collecting. We will also need a follow-up to ensure that we do not
over-collect between multiple stages of query execution.

* Fix compilation after #123784

* fix compile

* fix compile
elasticsearchmachine pushed a commit that referenced this pull request Mar 1, 2025
…123783)

* Avoid over collecting in Limit or Lucene Operator (#123296)

Currently, we rely on signal propagation for early termination. For 
example, FROM index | LIMIT 10 can be executed by multiple Drivers:
several Drivers to read document IDs and extract fields, and the final
Driver to select at most 10 rows. In this scenario, each Lucene Driver
can independently collect up to 10 rows until the final Driver has
enough rows and signals them to stop collecting. In most cases, this
model works fine, but when extracting fields from indices in the
warm/cold tier, it can impact performance. This change introduces a
Limiter used between LimitOperator and LuceneSourceOperator to avoid
over-collecting. We will also need a follow-up to ensure that we do not
over-collect between multiple stages of query execution.

* Fix compilation after #123784

* fix compile

* fix compile
elasticsearchmachine pushed a commit that referenced this pull request Mar 3, 2025
A follow-up to #123296 to address a potential block leak that may occur
when a circuit-breaking exception is triggered while truncating the docs
or scores blocks.

Relates  #123296
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Mar 3, 2025
A follow-up to elastic#123296 to address a potential block leak that may occur
when a circuit-breaking exception is triggered while truncating the docs
or scores blocks.

Relates  elastic#123296

(cherry picked from commit 7560e2e)
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Mar 3, 2025
A follow-up to elastic#123296 to address a potential block leak that may occur
when a circuit-breaking exception is triggered while truncating the docs
or scores blocks.

Relates  elastic#123296

(cherry picked from commit 7560e2e)
dnhatn added a commit that referenced this pull request Mar 3, 2025
A follow-up to #123296 to address a potential block leak that may occur
when a circuit-breaking exception is triggered while truncating the docs
or scores blocks.

Relates  #123296

(cherry picked from commit 7560e2e)
dnhatn added a commit that referenced this pull request Mar 3, 2025
A follow-up to #123296 to address a potential block leak that may occur
when a circuit-breaking exception is triggered while truncating the docs
or scores blocks.

Relates  #123296

(cherry picked from commit 7560e2e)
dnhatn added a commit that referenced this pull request Mar 3, 2025
A follow-up to #123296 to address a potential block leak that may occur
when a circuit-breaking exception is triggered while truncating the docs
or scores blocks.

Relates  #123296
/**
* A shared limiter used by multiple drivers to collect hits in parallel without exceeding the output limit.
* For example, if the query `FROM test-1,test-2 | LIMIT 100` is run with two drivers, and one driver (e.g., querying `test-1`)
* has collected 60 hits, then the other driver querying `test-2` should collect at most 40 hits.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea!

I wonder if we should make it explicit that this works as long as test-1 and test-2 are on the same node?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.1 v8.19.0 v9.0.1 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants