[Elasticsearch][Ingest pipeline] Stop truncating the elasticsearch server log messages #12813

consulthys · 2025-02-17T14:27:16Z

Proposed commit message

This PR fixes the logs-elasticsearch.server-<version>-pipeline-json ingest pipeline, which parses Elasticsearch server logs.

The pipeline was inherited from the Filebeat elasticsearch module and hasn't changed in several years. The main issue is that it assumes that when the message field value starts with square brackets (e.g. [xyz] some log message), xyz is an index name: the pipeline stores it in the elasticsearch.index.name field and truncates the message to what comes after the square brackets (i.e. some log message). This assumption may have held at some point in the past, but it no longer does: the square brackets can contain anything, such as component names or class names. Truncating the message field breaks downstream processes that expect to find the full log message in that field.

For instance, when the pipeline is applied to the following log message

[co.elastic.elasticsearch.metering.sampling.SampledStorageMetricsProvider] is not ready for collect yet

the truncated message field contains only is not ready for collect yet, which lacks context and is unusable.
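To make the problem concrete, here is a minimal Python sketch that emulates the truncating behavior described above. It is not the pipeline's actual processor definition: the regular expression and the legacy_parse helper are assumptions made for illustration, and the flat dotted key stands in for the nested elasticsearch.index.name field.

```python
import re

# Assumed pattern: a leading "[...]" prefix followed by the rest of the line.
# The real pipeline's pattern may differ.
BRACKET_PREFIX = re.compile(r"^\[(?P<name>[^\]]*)\]\s*(?P<rest>.*)$", re.DOTALL)


def legacy_parse(doc: dict) -> dict:
    """Emulate the old behavior: extract the prefix and truncate the message."""
    out = dict(doc)
    match = BRACKET_PREFIX.match(out.get("message", ""))
    if match:
        # The bracket content is treated as an index name, whatever it really is.
        out["elasticsearch.index.name"] = match.group("name")
        # The message loses its prefix, and with it the context.
        out["message"] = match.group("rest")
    return out


legacy_result = legacy_parse({
    "message": "[co.elastic.elasticsearch.metering.sampling.SampledStorageMetricsProvider] is not ready for collect yet"
})
print(legacy_result["message"])
```

The printed message no longer says which component is not ready, which is the loss of context described above.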

Because it is hard to identify every (internal and external) downstream process that relies on the index name being extracted from the log message, we need to keep extracting whatever is in the square brackets, but without truncating the message field. This PR proposes a non-breaking change that keeps extracting the index name (if it doesn't already exist in the document) while leaving the message field alone.
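The intended behavior can be sketched in Python as follows. This is an illustration under assumptions, not the pipeline's actual processors: the regular expression and the extract_index_name helper are hypothetical, and the flat dotted key stands in for the nested elasticsearch.index.name field.

```python
import re

# Assumed pattern for a leading "[...]" prefix; the real pipeline's pattern may differ.
LEADING_BRACKETS = re.compile(r"^\[([^\]]*)\]")


def extract_index_name(doc: dict) -> dict:
    """Extract the bracket content without touching the message field."""
    out = dict(doc)
    # Never override an index name that is already present in the document.
    if "elasticsearch.index.name" in out:
        return out
    match = LEADING_BRACKETS.match(out.get("message", ""))
    if match:
        out["elasticsearch.index.name"] = match.group(1)
    # The message field is deliberately left as-is.
    return out


original_message = "[co.elastic.elasticsearch.metering.sampling.SampledStorageMetricsProvider] is not ready for collect yet"
extracted = extract_index_name({"message": original_message})
preserved = extract_index_name({
    "message": original_message,
    "elasticsearch.index.name": "index-123",
})
```

In both cases the message keeps its bracketed prefix; only the second document keeps its pre-existing index name.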

Checklist

I have reviewed tips for building integrations and this pull request is aligned with them.
I have added an entry to my package's changelog.yml file.

How to test this PR locally

Install the elasticsearch integration update
Run the following ingest pipeline simulation in Dev Tools and make sure that the message field is unaltered

POST _ingest/pipeline/logs-elasticsearch.server-1.17.2-pipeline-json/_simulate
{
  "docs": [
    {
      "_source": {
        "@timestamp": "2025-01-22T15:50:04.517Z",
        "log.level": "INFO",
        "message": "[co.elastic.elasticsearch.metering.sampling.SampledStorageMetricsProvider] is not ready for collect yet",
        "ecs.version": "1.2.0",
        "service.name": "ES_ECS",
        "event.dataset": "elasticsearch.server",
        "process.thread.name": "elasticsearch[es-es-search-c886b6975-9tsbz][metering_reporter][T#1]",
        "log.logger": "co.elastic.elasticsearch.metering.usagereports.UsageReportCollector",
        "elasticsearch.cluster.uuid": "vaYFhXk-Q0WFOdSjyaylnA",
        "elasticsearch.node.id": "fnqA_AXpTFm6uZdtXDNIPA",
        "elasticsearch.node.name": "es-es-search-c886b6975-9tsbz",
        "elasticsearch.cluster.name": "es"
      }
    }
  ]
}

Response:

{
  "docs": [
    {
      "doc": {
        "_source": {
          "elasticsearch": {
            "index": {
>>>>          "name": "co.elastic.elasticsearch.metering.sampling.SampledStorageMetricsProvider"
            }
          },
          ...
>>>>      "message": "[co.elastic.elasticsearch.metering.sampling.SampledStorageMetricsProvider] is not ready for collect yet",
          ...
        }
      }
    }
  ]
}

If the elasticsearch.index.name field is already present in the log document, it is not overwritten:

POST _ingest/pipeline/logs-elasticsearch.server-1.17.2-pipeline-json/_simulate
{
  "docs": [
    {
      "_source": {
        "@timestamp": "2025-01-22T15:50:04.517Z",
        "log.level": "INFO",
        "message": "[co.elastic.elasticsearch.metering.sampling.SampledStorageMetricsProvider] is not ready for collect yet",
        "ecs.version": "1.2.0",
        "service.name": "ES_ECS",
        "event.dataset": "elasticsearch.server",
        "process.thread.name": "elasticsearch[es-es-search-c886b6975-9tsbz][metering_reporter][T#1]",
        "log.logger": "co.elastic.elasticsearch.metering.usagereports.UsageReportCollector",
        "elasticsearch.cluster.uuid": "vaYFhXk-Q0WFOdSjyaylnA",
        "elasticsearch.node.id": "fnqA_AXpTFm6uZdtXDNIPA",
        "elasticsearch.node.name": "es-es-search-c886b6975-9tsbz",
        "elasticsearch.cluster.name": "es",
        "elasticsearch.index.name": "index-123"
      }
    }
  ]
}

Response:

{
  "docs": [
    {
      "doc": {
        "_source": {
          "elasticsearch": {
            "index": {
>>>>          "name": "index-123"
            }
          },
          ...
>>>>      "message": "[co.elastic.elasticsearch.metering.sampling.SampledStorageMetricsProvider] is not ready for collect yet",
          ...
        }
      }
    }
  ]
}
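When simulating several documents at once, the comparison between the submitted and returned message fields can be scripted. The sketch below is one possible way to do that in Python, assuming the request and response bodies have already been loaded as dictionaries; the unchanged_messages helper and the abbreviated sample bodies are illustrative and are not part of the integration.

```python
def unchanged_messages(request: dict, response: dict) -> list:
    """Return one boolean per simulated document: True if message is unaltered."""
    sent = [d["_source"]["message"] for d in request["docs"]]
    got = [d["doc"]["_source"]["message"] for d in response["docs"]]
    return [s == g for s, g in zip(sent, got)]


# Abbreviated versions of the request and response bodies shown above.
sample_message = "[co.elastic.elasticsearch.metering.sampling.SampledStorageMetricsProvider] is not ready for collect yet"
sample_request = {"docs": [{"_source": {"message": sample_message}}]}
sample_response = {
    "docs": [
        {
            "doc": {
                "_source": {
                    "elasticsearch": {"index": {"name": "index-123"}},
                    "message": sample_message,
                }
            }
        }
    ]
}

check = unchanged_messages(sample_request, sample_response)
```

A False entry in the result would indicate a document whose message was altered by the pipeline.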

Related issues

Closes #12501

elastic-vault-github-plugin-prod · 2025-02-17T15:47:38Z

🚀 Benchmarks report

To see the full report, comment with /test benchmark fullreport

elastic-sonarqube · 2025-02-18T17:37:31Z

Quality Gate passed

Issues
0 New issues
0 Fixed issues
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube

elasticmachine · 2025-02-18T17:42:19Z

💚 Build Succeeded

Buildkite Build
Commit: b479c12

History

💔 Build #22457 failed b6a2b95
💚 Build #22418 succeeded 9eb9015
💔 Build #22415 failed ee0b087

cc @consulthys

pickypg

LGTM

elastic-vault-github-plugin-prod · 2025-02-18T19:51:19Z

Package elasticsearch - 1.17.2 containing this change is available at https://epr.elastic.co/package/elasticsearch/1.17.2/

Stop truncating the elasticsearch server log messages

fd53352

consulthys added the Integration:elasticsearch (Elasticsearch), Feature:Stack Monitoring (Stack Monitoring Feature), bugfix (Pull request that fixes a bug issue), and Team:Stack Monitoring (Stack Monitoring team [elastic/stack-monitoring]) labels Feb 17, 2025

consulthys self-assigned this Feb 17, 2025

consulthys requested a review from a team as a code owner February 17, 2025 14:27

Stop truncating the elasticsearch server log messages (Update changelog.yml)

d1e49d2

3kt mentioned this pull request Feb 17, 2025

Fixed ingest pipeline used in Elasticsearch transform #12757

Merged

consulthys mentioned this pull request Feb 17, 2025

[Elasticsearch]: Ingest pipeline created to process Elasticserver logs truncates log messages #12501

Closed

Stop truncating the elasticsearch server log messages (Fixed test)

ee0b087

consulthys requested a review from pickypg February 17, 2025 14:40

Stop truncating the elasticsearch server log messages (bump version)

9eb9015

consulthys added 3 commits February 18, 2025 17:02

Add more test cases

b6a2b95

Add more test cases (fix)

d398679

Add more test cases (fix)

b479c12

pickypg approved these changes Feb 18, 2025

consulthys merged commit 29006f1 into main Feb 18, 2025
6 checks passed

consulthys deleted the 12501-es-server-logs-truncation branch February 18, 2025 19:36

flexitrev pushed a commit that referenced this pull request Mar 20, 2025

[Elasticsearch][Ingest pipeline] Stop truncating the elasticsearch server log messages (#12813)

961d8f9

* Stop truncating the elasticsearch server log messages
* Add more test cases
