Improve downsample performance by buffering docids and do bulk processing. #124477

martijnvg · 2025-03-10T12:45:32Z

No description provided.

elasticsearchmachine · 2025-03-10T12:45:58Z

Hi @martijnvg, I've created a changelog YAML for you.

martijnvg · 2025-03-11T11:22:25Z

Result without this change, downsampling a tsdb index into 1m interval buckets:

[2025-03-11T12:13:51,906][INFO ][o.e.x.d.DownsampleShardIndexer] [runTask-0] Shard [[tsdb][0]] successfully sent [116633696], received source doc [7089492], indexed downsampled doc [7089492], failed [0], took [2.8m]

Result without this change, downsampling a tsdb index into 1h interval buckets:

[2025-03-11T12:16:53,664][INFO ][o.e.x.d.DownsampleShardIndexer] [runTask-0] Shard [[tsdb][0]] successfully sent [116633696], received source doc [229256], indexed downsampled doc [229256], failed [0], took [1.5m]

Result with this change, downsampling a tsdb index into 1m interval buckets:

[2025-03-11T12:07:38,856][INFO ][o.e.x.d.DownsampleShardIndexer] [runTask-0] Shard [[tsdb][0]] successfully sent [116633696], received source doc [7089492], indexed downsampled doc [7089492], failed [0], took [2.4m]

Result with this change, downsampling a tsdb index into 1h interval buckets:

[2025-03-11T12:09:31,252][INFO ][o.e.x.d.DownsampleShardIndexer] [runTask-0] Shard [[tsdb][0]] successfully sent [116633696], received source doc [229256], indexed downsampled doc [229256], failed [0], took [56.6s]
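To put the four log lines above in relative terms, here is a small sketch that computes the reduction in elapsed time. The class and method names are hypothetical, and the inputs are the rounded `took` values from the logs (2.8m, 2.4m, 1.5m, 56.6s), so the percentages are only approximate.

```java
final class DownsampleTimings {
    /** Percentage reduction in elapsed time when going from beforeSeconds to afterSeconds. */
    static double reductionPercent(double beforeSeconds, double afterSeconds) {
        return (beforeSeconds - afterSeconds) / beforeSeconds * 100.0;
    }

    public static void main(String[] args) {
        // 1m interval buckets: took 2.8m without the change, 2.4m with it.
        System.out.printf("1m buckets: %.1f%% less time%n", reductionPercent(2.8 * 60, 2.4 * 60));
        // 1h interval buckets: took 1.5m without the change, 56.6s with it.
        System.out.printf("1h buckets: %.1f%% less time%n", reductionPercent(1.5 * 60, 56.6));
    }
}
```

With these rounded inputs the reduction works out to roughly 14% for 1m buckets and roughly 37% for 1h buckets.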

elasticsearchmachine · 2025-03-11T12:08:52Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

kkrik-es · 2025-03-12T12:54:47Z

...ugin/downsample/src/main/java/org/elasticsearch/xpack/downsample/DownsampleShardIndexer.java

+        }
+
+        void bulkCollection() throws IOException {
+            // The leaf bucked collectors newer timestamp go first, other we capture the incorrect last value for counters and labels.


The leaf bucket collectors with newer timestamp go first, to correctly capture the last value for counters and labels.
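A minimal sketch of why the ordering matters, under the assumption that a "last value" holder keeps the first value it is offered and ignores later ones. All class and method names below are hypothetical and are not taken from the PR; the point is only that replaying buffered batches newest first lets such a holder end up with the most recent value.

```java
import java.util.ArrayList;
import java.util.List;

final class NewestFirstSketch {
    /** Hypothetical batch of values buffered for one leaf (segment), already ordered newest first. */
    static final class Batch {
        final long newestTimestamp;
        final List<String> valuesNewestFirst;

        Batch(long newestTimestamp, List<String> valuesNewestFirst) {
            this.newestTimestamp = newestTimestamp;
            this.valuesNewestFirst = valuesNewestFirst;
        }
    }

    /** Assumed "last value" holder: it keeps the first value offered and ignores the rest. */
    static final class LastValue {
        private String value;

        void collect(String candidate) {
            if (value == null) {
                value = candidate;
            }
        }

        String get() {
            return value;
        }
    }

    /** Replays the batches newest first, so the holder ends up with the most recent value. */
    static String lastValueOf(List<Batch> batches) {
        List<Batch> ordered = new ArrayList<>(batches);
        ordered.sort((a, b) -> Long.compare(b.newestTimestamp, a.newestTimestamp));
        LastValue last = new LastValue();
        for (Batch batch : ordered) {
            for (String v : batch.valuesNewestFirst) {
                last.collect(v);
            }
        }
        return last.get();
    }
}
```

If the batches were replayed in arrival order instead, a holder like this would lock in a value from whichever batch happened to come first, which is the incorrect-last-value case the comment describes.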

kkrik-es · 2025-03-12T13:01:45Z

...k/plugin/downsample/src/main/java/org/elasticsearch/xpack/downsample/LabelFieldProducer.java

+                if (docValuesCount == 1) {
+                    label.collect(docValues.nextValue());
+                } else {
+                    var values = new Object[docValuesCount];


How large can this be? Do we need to use bigarrays to track its memory?

Seems like it's smaller than DOCID_BUFFER_SIZE, so it should be fine.

It is for multi-valued fields. If a document has a JSON array, this is the size of that array (more precisely, the number of unique values the document has for the field being read). Typically this is small.

Note that it already worked this way before this change, and I don't recall any issues with it.
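For readers following the thread, here is a rough sketch of the single-value versus multi-value split being discussed. The `DocValues` interface is a hypothetical stand-in rather than the real Elasticsearch type; it only illustrates that the array is sized by one document's value count for the field, which is independent of how many doc IDs are buffered.

```java
import java.util.Arrays;
import java.util.Iterator;

final class MultiValueSketch {
    /** Hypothetical stand-in for a reader positioned on one document's values for a field. */
    interface DocValues {
        int count();

        Object nextValue();
    }

    /** Returns the single value directly, or an array holding every value of the document. */
    static Object readLabel(DocValues docValues) {
        int docValuesCount = docValues.count();
        if (docValuesCount == 1) {
            return docValues.nextValue(); // common case: no array is allocated
        }
        Object[] values = new Object[docValuesCount]; // one slot per value of this document
        for (int i = 0; i < docValuesCount; i++) {
            values[i] = docValues.nextValue();
        }
        return values;
    }

    /** Test helper: wraps a fixed list of values as a DocValues instance. */
    static DocValues of(Object... items) {
        Iterator<Object> it = Arrays.asList(items).iterator();
        return new DocValues() {
            public int count() {
                return items.length;
            }

            public Object nextValue() {
                return it.next();
            }
        };
    }
}
```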

kkrik-es · 2025-03-12T13:04:32Z

...ugin/downsample/src/main/java/org/elasticsearch/xpack/downsample/DownsampleShardIndexer.java

+                    firstTimeStampForBulkCollection = aggCtx.getTimestamp();
+                }
+                // buffer.add() always delegates to system.arraycopy() and checks buffer size for resizing purposes:
+                buffer.buffer[buffer.elementsCount++] = docId;


Nit: let's rename buffer to docIdBuffer to avoid the buffer.buffer occurrence.
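The buffering pattern in the hunk above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the class is hypothetical, `DOCID_BUFFER_SIZE` is given an arbitrary small value because the real one is not shown in this thread, and "flushing" just records a copy of the batch where the real code would run its bulk collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DocIdBufferSketch {
    // Hypothetical value; the real constant's value is not shown in this thread.
    static final int DOCID_BUFFER_SIZE = 4;

    private final int[] docIdBuffer = new int[DOCID_BUFFER_SIZE];
    private int count;
    private final List<int[]> flushedBatches = new ArrayList<>();

    /** Buffers one doc ID and flushes the batch once the buffer is full. */
    void collect(int docId) {
        // Direct array write: capacity is fixed and we flush when full,
        // so no per-call resize check or array copy is needed.
        docIdBuffer[count++] = docId;
        if (count == docIdBuffer.length) {
            flush();
        }
    }

    /** Hands the buffered doc IDs over as one batch (stand-in for bulk collection). */
    void flush() {
        if (count == 0) {
            return;
        }
        flushedBatches.add(Arrays.copyOf(docIdBuffer, count));
        count = 0;
    }

    List<int[]> flushed() {
        return flushedBatches;
    }
}
```

A caller would invoke `collect` once per matching document and call `flush` once more at the end, so that a partially filled buffer is not lost.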

kkrik-es

LGTM, added Nhat too to play it safe.

elasticsearchmachine · 2025-03-13T06:47:59Z

💚 Backport successful

Status	Branch	Result
✅	8.18
✅	8.x
✅	9.0

martijnvg added the >enhancement, :StorageEngine/Downsampling, v8.18.1, v8.19.0, v9.0.1, and v9.1.0 labels Mar 10, 2025

martijnvg mentioned this pull request Mar 10, 2025

A number of improvement to improve downsampling performance #124450

Closed

martijnvg force-pushed the downsample_bulk_processing branch 2 times, most recently from ff95cf7 to 0f70d99 March 11, 2025 10:58

martijnvg added 3 commits March 11, 2025 13:07

Improve downsample performance by buffering docids and bulk processing.

c637900

Update docs/changelog/124477.yaml

34fda3d

Handle buffering / bulk processing correctly if multiple segments are involved.

04c909d

martijnvg force-pushed the downsample_bulk_processing branch from 0f70d99 to 04c909d March 11, 2025 12:08

martijnvg marked this pull request as ready for review March 11, 2025 12:08

elasticsearchmachine added the Team:StorageEngine label Mar 11, 2025

martijnvg added the auto-backport label Mar 11, 2025

martijnvg requested a review from kkrik-es March 12, 2025 08:17

Merge remote-tracking branch 'es/main' into downsample_bulk_processing

2dc38f1

kkrik-es reviewed Mar 12, 2025

kkrik-es requested a review from dnhatn March 12, 2025 13:08

kkrik-es approved these changes Mar 12, 2025

rename and update comment in code.

5b32d82

martijnvg merged commit ce3a778 into elastic:main Mar 13, 2025
17 checks passed

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Mar 13, 2025

Improve downsample performance by buffering docids and do bulk processing. (elastic#124477)

b03e313

martijnvg mentioned this pull request Mar 13, 2025

[8.18] Improve downsample performance by buffering docids and do bulk processing. (#124477) #124696

Merged

This was referenced Mar 13, 2025

[8.x] Improve downsample performance by buffering docids and do bulk processing. (#124477) #124697

Merged

[9.0] Improve downsample performance by buffering docids and do bulk processing. (#124477) #124698

Merged

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Mar 13, 2025

Improve downsample performance by buffering docids and do bulk processing. (elastic#124477)

fd0e731

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Mar 13, 2025

Improve downsample performance by buffering docids and do bulk processing. (elastic#124477)

33f528f

elasticsearchmachine pushed a commit that referenced this pull request Mar 13, 2025

Improve downsample performance by buffering docids and do bulk processing. (#124477) (#124698)

34ec9f2

elasticsearchmachine pushed a commit that referenced this pull request Mar 13, 2025

Improve downsample performance by buffering docids and do bulk processing. (#124477) (#124697)

79be659

elasticsearchmachine pushed a commit that referenced this pull request Mar 13, 2025

Improve downsample performance by buffering docids and do bulk processing. (#124477) (#124696)

74f402c

albertzaharovits pushed a commit to albertzaharovits/elasticsearch that referenced this pull request Mar 13, 2025

Improve downsample performance by buffering docids and do bulk processing. (elastic#124477)

5c404f0

martijnvg added the backport pending label Mar 13, 2025

jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Mar 13, 2025

Improve downsample performance by buffering docids and do bulk processing. (elastic#124477)

5a408f7

martijnvg removed the backport pending label Mar 13, 2025
