
[Kubernetes][Container Logs] Add ability to configure data_stream.dataset & "Custom configuration" #5555


Merged: 4 commits merged into elastic:main from add-configs-k8s-logs on Mar 22, 2023

Conversation

@BenB196 (Contributor) commented Mar 15, 2023

Enhancement

What does this PR do?

This PR does two things for the Kubernetes Container Logs integration:

  1. Adds the ability to configure data_stream.dataset.
  2. Adds the ability to add custom configuration to container logs.

Both changes are intended to bring this integration more closely in line with the Custom Logs integration.
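
For context, here is a minimal sketch of how such options are typically declared in an integration's data stream manifest. The variable names and defaults below are illustrative, modeled on the Custom Logs integration; they are not necessarily the exact ones in this PR:

# data_stream/container_logs/manifest.yml (sketch only; names are illustrative)
streams:
  - input: filestream
    vars:
      - name: data_stream.dataset
        type: text
        title: Dataset name
        description: Dataset to write container logs to.
        show_user: true
        default: kubernetes.container_logs
      - name: custom
        type: yaml
        title: Custom configuration
        description: Additional YAML appended to the filestream input configuration.
        show_user: true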

Checklist

  • [x] I have reviewed tips for building integrations and this pull request is aligned with them.
  • [x] I have verified that all data streams collect metrics or logs.
  • [x] I have added an entry to my package's changelog.yml file.
  • [ ] I have verified that Kibana version constraints are current according to guidelines.

Author's Checklist

  • data_stream.dataset is now configurable
  • "Custom configuration" section injects additional yaml configuration.
Before Upgrade
inputs:
  - id: filestream-container-logs-329e9d5f-dd99-438e-8271-3e3598c8300a
    name: kubernetes-1
    revision: 1
    type: filestream
    use_output: default
    meta:
      package:
        name: kubernetes
        version: 1.32.0
    data_stream:
      namespace: default
    package_policy_id: 329e9d5f-dd99-438e-8271-3e3598c8300a
    streams:
      - id: >-
          kubernetes-container-logs-${kubernetes.pod.name}-${kubernetes.container.id}
        data_stream:
          dataset: kubernetes.container_logs
          type: logs
        paths:
          - '/var/log/containers/*${kubernetes.container.id}.log'
        prospector.scanner.symlinks: true
        parsers:
          - container:
              stream: all
              format: auto
After Upgrade (No modifications)
inputs:
  - id: filestream-container-logs-329e9d5f-dd99-438e-8271-3e3598c8300a
    name: kubernetes-1
    revision: 2
    type: filestream
    use_output: default
    meta:
      package:
        name: kubernetes
        version: 1.33.0
    data_stream:
      namespace: default
    package_policy_id: 329e9d5f-dd99-438e-8271-3e3598c8300a
    streams:
      - id: >-
          kubernetes-container-logs-${kubernetes.pod.name}-${kubernetes.container.id}
        data_stream:
          dataset: kubernetes.container_logs
        prospector.scanner.symlinks: true
        paths:
          - '/var/log/containers/*${kubernetes.container.id}.log'
        parsers:
          - container:
              stream: all
              format: auto

Note: After the upgrade, the policy configuration no longer sets streams.data_stream.type: logs. I don't believe this matters: I checked the Custom Logs integration and it does the same thing, and the static tests pass. But if desired, I can have it add streams.data_stream.type: logs.

After Upgrade (With modifications)
inputs:
  - id: filestream-container-logs-329e9d5f-dd99-438e-8271-3e3598c8300a
    name: kubernetes-1
    revision: 4
    type: filestream
    use_output: default
    meta:
      package:
        name: kubernetes
        version: 1.33.0
    data_stream:
      namespace: default
    package_policy_id: 329e9d5f-dd99-438e-8271-3e3598c8300a
    streams:
      - id: >-
          kubernetes-container-logs-${kubernetes.pod.name}-${kubernetes.container.id}
        data_stream:
          dataset: custom.dataset
        prospector.scanner.symlinks: true
        paths:
          - '/var/log/containers/*${kubernetes.container.id}.log'
        file_identity.inode_marker.path: /logs/.filebeat-marker # <-- Custom configuration line injected here.
        parsers:
          - container:
              stream: all
              format: auto
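
The injected line above is one example; any standard Filebeat filestream option can be supplied the same way. A few more snippets that could be pasted into the Custom configuration box (all standard filestream settings; the values here are chosen purely for illustration):

exclude_lines:
  - '^\s*DEBUG'
prospector.scanner.exclude_files:
  - '\.gz$'
ignore_older: 48h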

How to test this PR locally

  1. Install the current version of the integration.
  2. Upgrade the integration to the version built from this PR.
  3. Observe that the policy is equivalent to the pre-upgrade policy and that the integration still works.
  4. Change the data_stream.dataset value and observe that the change is reflected in the agent policy.
  5. Change the Custom configuration value and observe that the change is reflected in the agent policy.
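
For step 2, building the package from this branch with the repo's standard elastic-package workflow should work (assuming elastic-package is installed; commands below are the usual ones, not specific to this PR):

cd packages/kubernetes
elastic-package build         # build the kubernetes package from this branch
elastic-package stack up -d   # start a local stack that serves locally built packages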

Related issues

Screenshots

[Screenshot of the new configuration options in the integration policy editor]

@BenB196 BenB196 requested a review from a team as a code owner March 15, 2023 23:44
@BenB196 BenB196 requested review from ChrsMark and devamanv March 15, 2023 23:44
@elasticmachine commented Mar 15, 2023

💚 Build Succeeded


Build stats

  • Start Time: 2023-03-21T10:59:58.792+0000

  • Duration: 27 min 9 sec

Test stats 🧪

Test Results:
  • Failed: 0
  • Passed: 92
  • Skipped: 0
  • Total: 92

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@ChrsMark (Member) left a comment

These additions look good to me.
@gizas @tommyers-elastic, raising this with you too in case you have any additional comments or objections.

@gizas (Contributor) commented Mar 21, 2023

> These additions look good to me. @gizas @tommyers-elastic, raising this with you too in case you have any additional comments or objections.

Since we are introducing the dataset change, I would advise adding the same functionality to all the rest of the data streams. It would be strange not to have the same experience across all of them.

@gizas (Contributor) commented Mar 21, 2023

We had another internal round of discussions, and I would like to retract my comment in #5555 (comment).

The scenario where a user retrieves logs only from specific applications/pods and wants a dedicated dataset containing, e.g., only nginx logs is valid. But let's not expose the change for metrics, as those will still be coming from Kubernetes and it would create more confusion.

Sorry for the mess here.

So this change, #5555 (comment), is also to be done only for logs.

@elasticmachine commented

🌐 Coverage report

Name          Metrics % (covered/total)   Diff
Packages      100.0% (0/0) 💚
Files         100.0% (0/0) 💚
Classes       100.0% (0/0) 💚
Methods       96.154% (75/78) 👍          7.063
Lines         100.0% (0/0) 💚             10.742
Conditionals  100.0% (0/0) 💚

@ChrsMark (Member) left a comment

lgtm, thanks!

@ChrsMark ChrsMark merged commit 4fcc6b5 into elastic:main Mar 22, 2023
@ChrsMark (Member) commented

Thank you @BenB196 for contributing this!

@elasticmachine commented

Package kubernetes - 1.33.0 containing this change is available at https://epr.elastic.co/search?package=kubernetes

@BenB196 BenB196 deleted the add-configs-k8s-logs branch November 29, 2023 12:37
Development

Successfully merging this pull request may close these issues:

  • Allow customizing the data_stream.dataset of Kubernetes container logs

5 participants