-
Notifications
You must be signed in to change notification settings - Fork 474
aws: allow user specification of fields to retain in the cloudtrail data stream #14236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7d17abf
to
40b1eeb
Compare
🚀 Benchmarks reportPackage
|
Data stream | Previous EPS | New EPS | Diff (%) | Result |
---|---|---|---|---|
waf |
6666.67 | 5649.72 | -1016.95 (-15.25%) | 💔 |
To see the full report comment with /test benchmark fullreport
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations) |
aebfd82
to
f5e0b91
Compare
Cloudtrail `response_elements`, `request_parameters` and `additional_eventdata` data can | ||
be placed in keyword and text fields as JSON, and in flattened fields. Depending on requirements | ||
This configuration determines which fields will be retained in the final document. The Minimal | ||
option retains the minmal set of fields required for the Security Detection Engine rules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cloudtrail `response_elements`, `request_parameters` and `additional_eventdata` data can | |
be placed in keyword and text fields as JSON, and in flattened fields. Depending on requirements | |
This configuration determines which fields will be retained in the final document. The Minimal | |
option retains the minmal set of fields required for the Security Detection Engine rules. | |
Cloudtrail `response_elements`, `request_parameters` and `additional_eventdata` data can | |
be placed in keyword and text fields as JSON, and in flattened fields. Depending on requirements | |
this configuration determines which fields will be retained in the final document. The Minimal | |
option retains the minimal set of fields required for the Security Detection Engine rules. |
- text: Flattened | ||
value: flattened | ||
- text: Neither | ||
value: none |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is minimal
not applicable for this input?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I just forgot to add it; it came in later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after resolving conflict. Thanks!
…ata stream Storage of the response_elements, request_parameters and additional_eventdata is a potentially significant cost, but different users have different requirements for their present, so there is no ideal approach. Given that it is likely that this optimisation will be a common desire, provide a UI option to allow users to easily configure this behaviour without the requirement of adding processors to remove the fields in an @Custom pipeline. Note also that there is a TODO in the pipeline addition here to move from a remove after creation model, spending fruitless work, to a non-creation model, which would not be possible to implement in an @Custom pipeline.
The fields were identified by running the following shell script in the the security_detection_engine/kibana/security_rule directory. for f in *; do jq 'select(.attributes.required_fields != null)|.attributes.required_fields|.[]|select(.name != null)|select(.name|contains("cloudtrail.flattened"))|.name'<$f done|sort|uniq The test for this is derived from the test-copy-object-json.log test case which includes one of the required fields and a number of other fields under cloudtrail.flattened. So comparing the test added here to that demonstrates whether is works.
c784f79
to
3253fb8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall.
Question. I'm actually not sure what is the httpjson stream. But why apply hbs changes to s3 and cloudwatch, but not to httpjson?
required_flattened_fields: | ||
- aws.cloudtrail.flattened.additional_eventdata.SSEApplied | ||
- aws.cloudtrail.flattened.request_parameters.cidrIp | ||
- aws.cloudtrail.flattened.request_parameters.dryRun | ||
- aws.cloudtrail.flattened.request_parameters.fromPort | ||
- aws.cloudtrail.flattened.request_parameters.includeDeprecated | ||
- aws.cloudtrail.flattened.request_parameters.policyArn | ||
- aws.cloudtrail.flattened.request_parameters.serialNumber | ||
- aws.cloudtrail.flattened.request_parameters.withDecryption | ||
- aws.cloudtrail.flattened.request_parameters.x-amz-server-side-encryption-customer-algorithm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally worry about this duplicated list from the detection engine. This will be easily missed in future iterations. Do you have thoughts on process to avoid it? Or maybe can we automate fetching this list on pre-commit? Or automate tests to verify consistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the consequence of the technical debt that has been built up by the unconstrained use of fields in the detection rules. Fixing this would require a significant refactor and I think is outside the scope of this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree fixing is outside. But what is the strategy to keep consistency in the list with the rules over time?
packages/aws/data_stream/cloudtrail/elasticsearch/ingest_pipeline/default.yml
Show resolved
Hide resolved
/test |
💚 Build Succeeded
History
cc @efd6 |
|
Package aws - 3.10.0 containing this change is available at https://epr.elastic.co/package/aws/3.10.0/ |
Follow up issue: #14429 |
…ata stream (elastic#14236) Storage of the response_elements, request_parameters and additional_eventdata is a potentially significant cost, but different users have different requirements for their present, so there is no ideal approach. Given that it is likely that this optimisation will be a common desire, provide a UI option to allow users to easily configure this behaviour without the requirement of adding processors to remove the fields in an @Custom pipeline. Note also that there is a TODO in the pipeline addition here to move from a remove after creation model, spending fruitless work, to a non-creation model, which would not be possible to implement in an @Custom pipeline.
In #14236 we allowed users to select which extended fields they wanted to retain in order to reduce storage costs in cases where they did not what the full set of capacities that the data stream can provide. We did not however prevent the work of collecting those unwanted fields. This change does that, avoiding retaining fields that will ultimately not be kept if possible. It is unfortunate that the wide variety of fields is needed at all, but resolving that depends on improving platform support for the diversity of fields that the data source provides and then making more efficient use of those improvements in the detection rules. Until then, this is what we have.
Proposed commit message
Checklist
changelog.yml
file.Author's Checklist
How to test this PR locally
Related issues
Screenshots