Updating grok pattern for awss3 access ingest pipeline #13486

Merged
gizas merged 5 commits into main from s3grok
Apr 11, 2025

Conversation

gizas
Contributor

@gizas gizas commented Apr 9, 2025

  • Enhancement

Proposed commit message

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

How to test this PR locally

Use the Grok debugger

[Screenshot 2025-04-09 at 6:03:17 PM]

Signed-off-by: Andreas Gkizas <andreas.gkizas@elastic.co>
@gizas gizas requested review from a team as code owners April 9, 2025 15:09
Signed-off-by: Andreas Gkizas <andreas.gkizas@elastic.co>
Contributor

@Kavindu-Dodan Kavindu-Dodan left a comment

LGTM

@andrewkroh andrewkroh added Integration:aws AWS Team:obs-ds-hosted-services Observability Hosted Services team [elastic/obs-ds-hosted-services] labels Apr 9, 2025
Signed-off-by: Andreas Gkizas <andreas.gkizas@elastic.co>
elastic-vault-github-plugin-prod bot commented Apr 10, 2025

🚀 Benchmarks report

Package aws 👍(10) 💚(3) 💔(6)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| route53_resolver_logs | 8547.01 | 4366.81 | -4180.2 (-48.91%) | 💔 |
| apigateway_logs | 10526.32 | 6535.95 | -3990.37 (-37.91%) | 💔 |
| cloudwatch_logs | 333333.33 | 250000 | -83333.33 (-25%) | 💔 |
| ec2_logs | 45454.55 | 28571.43 | -16883.12 (-37.14%) | 💔 |
| emr_logs | 16129.03 | 10526.32 | -5602.71 (-34.74%) | 💔 |
| firewall_logs | 3039.51 | 2493.77 | -545.74 (-17.95%) | 💔 |

To see the full report comment with /test benchmark fullreport
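
The Diff column appears consistent with a plain relative change in events per second, (new - previous) / previous. Below is a minimal Python sketch under that assumption. The function name `eps_diff` is hypothetical and this is not the benchmark tool's own code.

```python
def eps_diff(previous_eps: float, new_eps: float) -> tuple[float, float]:
    """Return (absolute diff, percent diff) between two EPS readings.

    Assumption: the percentage is taken relative to the previous value.
    """
    diff = new_eps - previous_eps
    pct = diff / previous_eps * 100
    return round(diff, 2), round(pct, 2)


# Values copied from the benchmark report above.
for name, prev, new in [
    ("route53_resolver_logs", 8547.01, 4366.81),
    ("cloudwatch_logs", 333333.33, 250000),
    ("firewall_logs", 3039.51, 2493.77),
]:
    print(name, eps_diff(prev, new))
```

A negative percentage means the data stream processed fewer events per second in the new run than in the previous one.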

Contributor

@zmoog zmoog left a comment

LGTM!

I just added a couple of non-blocking nits.

Comment on lines 100 to 103
- name: point_arn
  type: keyword
  description: |
    The Amazon Resource Name (ARN) of the access point of the request.
Contributor

@zmoog zmoog Apr 11, 2025

nit:

I see we named the new field point_arn.

Since the log field name is "Access Point ARN", shouldn't we call this access_point_arn?

I know that using aws.s3access.access_point_arn feels like we're repeating "access", but the two occurrences of "access" have different semantics:

  • Amazon S3 server access (log format)
  • Access Point ARN (log field)

Comment on lines +8 to +10
67797214d75628047d9c76b18a78cded1a4b069b71f2a9d5a53649c38da8770b flow-log-test [14/Jul/2021:18:57:31 +0000] - svc:delivery.logs.amazonaws.com MVGXZXEVN3IG9S24 REST.PUT.OBJECT AWSLogs/000000000000/vpcflowlogs/us-gov-east-1/2021/07/13/000000000000_vpcflowlogs_us-gov-east-1_fl-_20210713T1855Z_f12aa632.log.gz "PUT /AWSLogs/000000000000/vpcflowlogs/us-gov-east-1/2021/07/13/000000000000_vpcflowlogs_us-gov-east-1_fl-0e7c13bf00cf15bfe_20210713T1855Z_f12aa632.log.gz HTTP/1.1" 200 - - 773 103 13 "-" "-" - 02SxwfXpO5UysN0GsKGa3uGDQ6E/W7+Hwo/luRH8p1VEexULoe66RCM+nja0dEq2JqLrtgjocvVRRkVt4= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader flow-log-test.s3.us-gov-west-1.amazonaws.com TLSv1.2 arn:aws:s3:us-west-1:123456789012:accesspoint/example-AP Yes
67797214d75628047d9c76b18a78cded1a4b069b71f2a9d5a53649c38da8770b flow-log-test [14/Jul/2021:18:57:31 +0000] - svc:delivery.logs.amazonaws.com MVGXZXEVN3IG9S24 REST.PUT.OBJECT AWSLogs/000000000000/vpcflowlogs/us-gov-east-1/2021/07/13/000000000000_vpcflowlogs_us-gov-east-1_fl-_20210713T1855Z_f12aa632.log.gz "PUT /AWSLogs/000000000000/vpcflowlogs/us-gov-east-1/2021/07/13/000000000000_vpcflowlogs_us-gov-east-1_fl-0e7c13bf00cf15bfe_20210713T1855Z_f12aa632.log.gz HTTP/1.1" - - - 773 103 13 "-" "-" - 02SxwfXpO5UysN0GsKGa3uGDQ6E/W7+Hwo/luRH8p1VEexULoe66RCM+nja0dEq2JqLrtgjocvVRRkVt4= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader flow-log-test.s3.us-gov-west-1.amazonaws.com TLSv1.2 arn:aws:s3:us-west-1:123456789012:accesspoint/example-AP -
b854390a51155554b82ce2759564a1135bce83133d004f4d2001f157e13985d7 flow-log-test [25/Mar/2025:19:28:02 +0000] - AmazonS3 366DB3C4B325AB11 S3.EXPIRE.OBJECT 0/chum/_vars/logtests/PlannerModule/5f6ea3b7da96ab304a77225d5b2b2a55e54b74e4ddfdf14b9b1d853d77515b88_9febba22f08b11ef8cf6020058a9efab/2024/12/30/164700/kitt_189/_spcu_sride__state_svx__feature__flags.sst "-" - - - 317 - - "-" "-" qsEq9bDa2VyxyZ4cz0c7oBnF67VYTTij DMlPb9al4CvVBck150CgpEIIYgtSI3HC/atetNVYwPtHZffW6jfpg+BrffhbT9/B - - - - - - -

nit:

While reading the log examples in the Amazon S3 server access log format documentation, I noticed we can have an aclrequired value without an access_point_arn:

79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be amzn-s3-demo-bucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 3E57427F3EXAMPLE REST.GET.VERSIONING - "GET /amzn-s3-demo-bucket1?versioning HTTP/1.1" 200 - 113 - 7 - "-" "S3Console/0.4" - s9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com TLSV1.2 arn:aws:s3:us-west-1:123456789012:accesspoint/example-AP Yes
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be amzn-s3-demo-bucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 891CE47D2EXAMPLE REST.GET.LOGGING_STATUS - "GET /amzn-s3-demo-bucket1?logging HTTP/1.1" 200 - 242 - 11 - "-" "S3Console/0.4" - 9vKBE6vMhrNiWHZmb2L0mXOcqPGzQOI5XLnCtZNPxev+Hf+7tpT6sxDwDty4LHBUOZJG96N1234= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com TLSV1.2 - -
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be amzn-s3-demo-bucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be A1206F460EXAMPLE REST.GET.BUCKETPOLICY - "GET /amzn-s3-demo-bucket1?policy HTTP/1.1" 404 NoSuchBucketPolicy 297 - 38 - "-" "S3Console/0.4" - BNaBsXZQQDbssi6xMBdBU2sLt+Yf5kZDmeBUP35sFoKa3sLLeMC78iwEIWxs99CRUrbS4n11234= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com TLSV1.2 - Yes 
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be amzn-s3-demo-bucket1 [06/Feb/2019:00:01:00 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be 7B4A0FABBEXAMPLE REST.GET.VERSIONING - "GET /amzn-s3-demo-bucket1?versioning HTTP/1.1" 200 - 113 - 33 - "-" "S3Console/0.4" - Ke1bUcazaN1jWuUlPJaxF64cQVpUEhoZKEG/hmy/gijN/I1DeWqDfFvnpybfEseEME/u7ME1234= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com TLSV1.2 - -
79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be amzn-s3-demo-bucket1 [06/Feb/2019:00:01:57 +0000] 192.0.2.3 79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be DD6CC733AEXAMPLE REST.PUT.OBJECT s3-dg.pdf "PUT /amzn-s3-demo-bucket1/s3-dg.pdf HTTP/1.1" 200 - - 4406583 41754 28 "-" "S3Console/0.4" - 10S62Zv81kBW7BB6SX4XJ48o6kpcl6LPwEoizZQQxJd5qDSCTLX0TgS37kYUBKQW3+bPdrg1234= SigV4 ECDHE-RSA-AES128-SHA AuthHeader amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com TLSV1.2 - Yes 

What about adding one more test case with aclrequired and no access_point_arn?
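
To make the possible combinations of the last two fields concrete, here is a minimal Python sketch. It is not the grok pattern from this PR. It assumes the 26-field layout seen in the examples above, with the access point ARN at position 25 and the ACL-required flag at position 26. `TOKEN`, `parse_tail`, `PREFIX`, and the output keys are hypothetical names used only for illustration.

```python
import re

# One token is a bracketed timestamp, a double-quoted string, or a bare word.
TOKEN = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')

# First 24 fields of one of the sample lines above (everything up to the TLS version).
PREFIX = (
    '79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be '
    'amzn-s3-demo-bucket1 [06/Feb/2019:00:00:38 +0000] 192.0.2.3 '
    '79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be '
    '891CE47D2EXAMPLE REST.GET.LOGGING_STATUS - '
    '"GET /amzn-s3-demo-bucket1?logging HTTP/1.1" 200 - 242 - 11 - '
    '"-" "S3Console/0.4" - '
    '9vKBE6vMhrNiWHZmb2L0mXOcqPGzQOI5XLnCtZNPxev+Hf+7tpT6sxDwDty4LHBUOZJG96N1234= '
    'SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader '
    'amzn-s3-demo-bucket1.s3.us-west-1.amazonaws.com TLSV1.2'
)


def parse_tail(line: str) -> dict:
    """Extract the two trailing fields; "-" or a missing field becomes None."""
    tokens = TOKEN.findall(line)

    def field(index: int):
        if index >= len(tokens) or tokens[index] == "-":
            return None
        return tokens[index]

    # Zero-based positions 24 and 25 are fields 25 and 26 (assumed layout).
    return {"access_point_arn": field(24), "aclrequired": field(25)}


ARN = "arn:aws:s3:us-west-1:123456789012:accesspoint/example-AP"
for tail in (f" {ARN} Yes", f" {ARN} -", " - Yes", " - -"):
    print(parse_tail(PREFIX + tail))
```

The third tail (`- Yes`) is the case suggested here: aclrequired is set while the access point ARN is absent.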


gizas added 2 commits April 11, 2025 14:00
Signed-off-by: Andreas Gkizas <andreas.gkizas@elastic.co>
Signed-off-by: Andreas Gkizas <andreas.gkizas@elastic.co>
@elasticmachine
💚 Build Succeeded

History

@gizas gizas merged commit c15756e into main Apr 11, 2025
7 checks passed
@gizas gizas deleted the s3grok branch April 11, 2025 13:14
@elastic-vault-github-plugin-prod

Package aws - 2.45.2 containing this change is available at https://epr.elastic.co/package/aws/2.45.2/


5 participants