[O11y][Apache] Update grok pattern for access and error log data streams #10228

Merged
harnish-crest-data merged 10 commits into elastic:main on Jul 22, 2024

Conversation

harnish-crest-data
Contributor

@harnish-crest-data harnish-crest-data commented Jun 24, 2024

  • Enhancement

Proposed commit message

  • Update grok pattern to support combined log format for access data stream.
    • %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"
    • %A:%p %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"
    • %h:%p %l %u %t \"%{req}i %U %H\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"
  • Update grok pattern to support new logs for error data stream. Refer
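
For readers unfamiliar with these `LogFormat` strings, here is a minimal Python sketch of what the first two access formats above look like when parsed. It is only an illustration: the integration uses grok patterns in an ingest pipeline, which are not reproduced here, and the regex, the group names, and `parse_access_line` are all hypothetical. The third format (`%h:%p` with a `%{req}i %U %H` request) is not handled separately.

```python
import re

# Hypothetical regex, for illustration only. It is NOT the grok pattern
# from this PR; the group names are made up for this sketch.
COMBINED = re.compile(
    r'^(?:(?P<local_addr>\S+):(?P<local_port>\d+) )?'  # optional %A:%p prefix
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '     # %h %l %u
    r'\[(?P<time>[^\]]+)\] '                           # %t
    r'"(?P<request>[^"]*)" '                           # "%r"
    r'(?P<status>\d{3}) (?P<size>\d+|-) '              # %>s %b
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'        # Referer, User-Agent
)


def parse_access_line(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None
```

A line without the `%A:%p` prefix leaves `local_addr` and `local_port` as `None`, so one expression covers both layouts.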

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Related issues

@elasticmachine

elasticmachine commented Jun 24, 2024

🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

@harnish-crest-data harnish-crest-data marked this pull request as ready for review July 8, 2024 10:42
@harnish-crest-data harnish-crest-data requested a review from a team as a code owner July 8, 2024 10:42
@harnish-crest-data harnish-crest-data changed the title [O11y][Apache] Update grok pattern to support combined log format [O11y][Apache] Update grok pattern for access and error log data streams Jul 9, 2024
…pache-access-logs

Conflicts:
	packages/apache/_dev/build/docs/README.md
	packages/apache/changelog.yml
	packages/apache/data_stream/access/_dev/test/pipeline/test-access-basic.log-expected.json
	packages/apache/data_stream/access/_dev/test/pipeline/test-access-darwin.log-expected.json
	packages/apache/data_stream/access/_dev/test/pipeline/test-access-ssl-request.log-expected.json
	packages/apache/data_stream/access/_dev/test/pipeline/test-access-ubuntu.log-expected.json
	packages/apache/data_stream/access/_dev/test/pipeline/test-access-vhost.log-expected.json
	packages/apache/data_stream/access/fields/ecs.yml
	packages/apache/data_stream/error/_dev/test/pipeline/test-error-basic.log-expected.json
	packages/apache/docs/README.md
Contributor

@kush-elastic kush-elastic left a comment

LGTM!

ignore_missing: true
on_failure:
  - set:
      field: tmp_host
Contributor

Are we removing the tmp_host field after processing is done?

Contributor Author

No. It will be removed automatically by the null processor, because we set its value to an empty string. We need this temporary field in order to swap the source.address value.
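
To make the cleanup behaviour concrete, here is a small Python sketch. It assumes that the "null processor" mentioned above is a cleanup step that drops fields whose value is null or an empty string; that reading, and the `drop_empty` helper itself, are assumptions for illustration and not the pipeline's actual code.

```python
def drop_empty(value):
    """Recursively remove None, empty strings, and empty containers.

    Hypothetical stand-in for a cleanup step at the end of a pipeline.
    """
    empty = (None, "", {}, [])
    if isinstance(value, dict):
        cleaned = {key: drop_empty(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if item not in empty}
    if isinstance(value, list):
        cleaned = [drop_empty(item) for item in value]
        return [item for item in cleaned if item not in empty]
    return value


# tmp_host was set to "" on failure, so the cleanup step removes it.
doc = {"tmp_host": "", "source": {"address": "89.160.20.156"}}
print(drop_empty(doc))
```

Under that assumption, no explicit `remove` step is needed for a field that has been set to an empty string.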

@@ -2,3 +2,4 @@
[Mon Dec 26 16:15:55.103786 2016] [core:notice] [pid 11379] AH00094: Command line: '/usr/local/Cellar/httpd24/2.4.23_2/bin/httpd'
[Fri Sep 09 10:42:29.902022 2011] [core:error] [pid 35708:tid 4328636416] [client 89.160.20.156] File does not exist: /usr/local/apache2/htdocs/favicon.ico
[Thu Jun 27 06:58:09.169510 2019] [include:warn] [pid 15934] [client 89.160.20.156:12345] AH01374: mod_include: Options +Includes (or IncludesNoExec) wasn't set, INCLUDES filter removed: /test.html
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message
Contributor

Is this a valid error log message? Does this entry not log a timestamp?

Contributor Author

Please check out the quote that I attached in the description. This is a warning that appears in the error log file, and it is the main reason the grok pattern fails.
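
The failure mode is easy to reproduce outside the pipeline. The Python sketch below treats the bracketed prefix (timestamp, module and level, pid, client) as optional, so a bare `AH00558: ...` line still parses. The regex and the `parse_error_line` helper are hypothetical and only illustrate the idea; they are not the grok pattern added in this PR.

```python
import re

# Hypothetical regex, for illustration only. The whole bracketed prefix is
# optional so that lines such as "AH00558: ..." without a timestamp match.
ERROR_LINE = re.compile(
    r'^(?:\[(?P<timestamp>[^\]]+)\] '
    r'\[(?P<module>[^:\]]*):(?P<level>[^\]]+)\] '
    r'\[pid (?P<pid>\d+)(?::tid (?P<tid>\d+))?\] '
    r'(?:\[client (?P<client>[^\]]+)\] )?)?'
    r'(?:(?P<code>AH\d+): )?'
    r'(?P<message>.*)$'
)


def parse_error_line(line):
    """Return the named fields; groups that did not take part are None."""
    match = ERROR_LINE.match(line)
    return match.groupdict() if match else None
```

With the sample lines from the test file, the `AH00558` entry yields a `code` and a `message` but no `timestamp`, while the regular entries fill in the bracketed fields.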

@@ -184,6 +184,24 @@
"tags": [
"preserve_original_event"
]
},
{
Contributor

Did we try loading this document without the @timestamp field in the dashboard? What happens if the user filters data for a specific time range in Discover or in the dashboard?

Contributor Author

Yes, we have tried this. If no timestamp is present, the document takes its ingestion time.
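
The fallback described here can be sketched in a few lines of Python. The `event_time` helper and the format constant are hypothetical; the sketch assumes the error log timestamp has the layout seen in the sample lines (for example `Mon Dec 26 16:15:55.103786 2016`) and that the caller supplies the ingestion time.

```python
from datetime import datetime, timezone

# Layout of the timestamp in the sample error log lines (assumption).
APACHE_ERROR_TIME = "%a %b %d %H:%M:%S.%f %Y"


def event_time(raw_timestamp, ingest_time):
    """Use the parsed log timestamp when present, else the ingestion time."""
    if raw_timestamp:
        try:
            # The log line carries no zone, so the result is a naive datetime.
            return datetime.strptime(raw_timestamp, APACHE_ERROR_TIME)
        except ValueError:
            pass  # unparsable timestamp: fall through to the ingestion time
    return ingest_time


ingested = datetime.now(timezone.utc)
print(event_time("Mon Dec 26 16:15:55.103786 2016", ingested))
print(event_time(None, ingested))
```

Time-range filters therefore place a timestamp-less entry at the moment it was ingested, not at the moment Apache wrote it.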

@muthu-mps
Contributor

Are we supporting this log format as well?

@harnish-crest-data
Contributor Author

> Are we supporting this log format as well?

Yes, we support this log format. Let me update the pipeline tests accordingly.

@muthu-mps
Contributor

> > Are we supporting this log format as well?
>
> Yes, we support this log format. Let me update the pipeline tests accordingly.

Can you update the sample log in the pipeline test?

@harnish-crest-data
Contributor Author

> > > Are we supporting this log format as well?
> >
> > Yes, we support this log format. Let me update the pipeline tests accordingly.
>
> Can you update the sample log in the pipeline test?

Updated, thanks!

@@ -23,30 +25,49 @@ Please refer to the following [document](https://www.elastic.co/guide/en/ecs/cur
Supported format for the access logs are:

- [Common Log Format](https://en.wikipedia.org/wiki/Common_Log_Format)
Contributor

  • Please change the reference link to point to the official Apache documentation.

  • Verify the link below; if it is not specific to a release version, we can include it for reference:
    common log format

Contributor Author

Updated, thanks!

harnish-crest-data and others added 4 commits July 17, 2024 18:13
Co-authored-by: muthu-mps <101238137+muthu-mps@users.noreply.github.com>
Co-authored-by: muthu-mps <101238137+muthu-mps@users.noreply.github.com>
Co-authored-by: muthu-mps <101238137+muthu-mps@users.noreply.github.com>
Contributor

@muthu-mps muthu-mps left a comment

LGTM!

@andrewkroh andrewkroh added Integration:apache Apache HTTP Server Team:Obs-InfraObs Observability Infrastructure Monitoring team [elastic/obs-infraobs-integrations] labels Jul 19, 2024
…pache-access-logs

Conflicts:
	packages/apache/changelog.yml
@elasticmachine

💚 Build Succeeded

History

cc @harnish-elastic

@harnish-crest-data harnish-crest-data merged commit 35cd691 into elastic:main Jul 22, 2024
5 checks passed
@elasticmachine

Package apache - 1.23.0 containing this change is available at https://epr.elastic.co/search?package=apache

Labels
Integration:apache Apache HTTP Server Team:Obs-InfraObs Observability Infrastructure Monitoring team [elastic/obs-infraobs-integrations]
Development

Successfully merging this pull request may close these issues.

Apache integration Provided Grok expressions do not match field value
5 participants