[Cloudflare Logpush] Fix data type for http_request.bot.detection_tags field #13581
Conversation
Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)
🚀 Benchmarks report. To see the full report, comment with …
Shouldn't the version be increased to …?
Since this is a breaking change to the current integration, we've increased the major version.
I see, but is this a requirement? This could make the integration version increase too fast. I've just reported another mapping error that would lead to another breaking change, so in a couple of weeks this integration's version could jump from … There are cases where the PR was …
Got it! I've modified the version to …
@moxarth-rathod, I don't think this should be categorized as a breaking change. This is a mapping issue and is technically a bugfix. A breaking change is something where the schema or mapping rules change drastically and might even break current indexing rules. It could also be a complete rework of the underlying input, or a change that breaks existing state such that it is incompatible with the previous version. This is also not an enhancement, hence we should not increment the minor version; only the patch version should be incremented in the case of a bugfix.
Agree with @ShourieG here. In this case the field is never indexed and is being ignored, so it wouldn't result in any mapping conflicts (long vs keyword).
This change will lead to a mapping conflict in Kibana: the user may not be able to use the field until all older indices with the wrong mapping are removed from the cluster. Shouldn't this be considered a breaking change? In this specific case it may not have much impact because this field was added recently, but with the wrong mapping.
Changing the mapping from …
Hi @leandrojmp, could you elaborate a bit more on this mapping conflict and the performance impact? I was assuming it would not be a major issue, as this field is not used in any dashboards or visualizations and it was not even indexed previously. With an index rollover and sufficient passage of time, won't this issue self-correct?
@moxarth-rathod, it would be nice to test this scenario (whether a mapped but unindexed field can cause conflicts). @leandrojmp, at the moment we are working on better communicating breaking changes to users through UI changes: elastic/package-spec#817.
@ShourieG sure! The fact that the field was not indexed does not matter for the conflict; the field is explicitly mapped, so when the backing index for the data stream is created it will have the wrong mapping. When we send a document with strings for a field that has a … If the mapping is corrected to … When a data view or index pattern matches multiple indices and a field has conflicting mappings on those indices, the field cannot be used in Kibana: you cannot filter on it, you cannot search on it, and you cannot use it in visualizations. There are 2 main ways to solve this:
The performance impact could happen if you use runtime fields, which by definition have a performance impact; it is one of their compromises, especially with large datasets, which is normally the case for Cloudflare logs. So, in summary, changing the mapping and rolling over the index fixes the mapping issue in Elasticsearch and the new index will be able to index this field, but it will result in conflicts in Kibana, which makes the field unusable until the conflicts are solved.
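To make the runtime-field option above concrete, a query-time sketch is shown below. This is only an illustration, assuming Kibana Dev Tools console syntax, an illustrative index name, and that the original string values are still present in `_source`; the field path is taken from the PR title.

```
# Shadow the wrongly mapped field with a keyword runtime field at query time,
# so the string values kept in _source become searchable despite the mapping.
GET cloudflare-logpush-old-index/_search
{
  "runtime_mappings": {
    "http_request.bot.detection_tags": {
      "type": "keyword"
    }
  },
  "fields": ["http_request.bot.detection_tags"],
  "query": {
    "exists": { "field": "http_request.bot.detection_tags" }
  }
}
```

This is also where the performance cost comes from: the runtime field is resolved from `_source` for every matching document at query time instead of using the index.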
What causes the conflict in Kibana is the mapping; it doesn't matter whether there is data in the index or not. If you create two empty indices where a field has a different mapping and use those indices in the same index pattern, Kibana will complain about a mapping conflict even without any data in the indices. You can confirm this pretty easily with the following steps. First, create 2 indices where a field has a different mapping, along the lines sketched below:
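A minimal sketch of those two requests, assuming Kibana Dev Tools console syntax; the index and field names are illustrative:

```
# Two empty indices that map the same field with different types.
PUT mapping-conflict-test-1
{
  "mappings": {
    "properties": {
      "detection_tags": { "type": "long" }
    }
  }
}

PUT mapping-conflict-test-2
{
  "mappings": {
    "properties": {
      "detection_tags": { "type": "keyword" }
    }
  }
}
```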
Then, in Kibana, create a data view for an index pattern that matches both indices: the field will show up with a mapping conflict.
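The same conflict can also be observed from the Elasticsearch side with a field capabilities request across both indices (again a sketch using the illustrative names above): when a field resolves to more than one type, the response lists each type with the indices that use it, which corresponds to what Kibana surfaces as a conflict.

```
# Check how the field is mapped across both test indices; two entries
# (long and keyword) indicate the conflicting mappings.
GET mapping-conflict-test-*/_field_caps?fields=detection_tags
```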
@andrewkroh, it would be awesome to have your thoughts on this. I feel that because the particular field in question is not used in any dashboard/visualization, with an index rollover and the passage of time the conflicts should auto-correct, but I also understand the concern here. What would be the best way to move forward? One non-intrusive option is to introduce a new field with the correct data type while keeping the existing field as is, but it feels like a hack/workaround.
Yeah, the passage of time and the removal of old indices from the cluster would solve the Kibana conflicts. Also, this field was added not long ago, in this commit merged on April 2nd, and since it has the wrong mapping there is no one using it, so it is expected not to break anything; Kibana will just start showing a conflict for this field, and if people investigate the reason for that conflict they can easily find this issue, which has the explanation in the previous comments. I just noticed the mistake in the mapping earlier because I'm migrating from a custom ingestion with Logstash to the native one, and some fields that we use for security rules were missing or, in this case, had the wrong mapping. Also, this requires extra features on the Cloudflare side, which limits even more the number of people that may be affected.
@leandrojmp Thanks for providing this information. I can see that the introduction of the incorrect field definition falls on my shoulders in #12782; I even documented the type correctly in the owning issue. Unfortunately, this addition was in the set of changes for which I was unable to obtain any test samples. Would you be in a position to donate test samples to improve the test coverage for the package?
@moxarth-rathod, approving this since, as discussed with @leandrojmp, the real-world impact will be minimal and the issue will eventually self-correct with time. But I'd wait for @andrewkroh or @efd6 to give their sign-off on this issue before merging.
I'm happy with this, but I think Andrew should have the final say.
@efd6 you mean some sample events from real usage? Sure! I can get some, redact sensitive information, and share them here.
@leandrojmp That would be wonderful. Note the concern that I expressed in the PR that introduced this.
@leandrojmp did you get a chance to collect some sample events?
@@ -1,4 +1,9 @@
# newer versions go on top
- version: "1.37.3"
  changes:
    - description: Fix field type of `http_request.bot.detection_tags` field.
Personally I would like more details in this message, e.g. what type was it? What type is it now? What problem does it fix?
Something like this gives a more complete view of the problem being fixed by the update:
> Fix the field type of `http_request.bot.detection_tags` from `long` to `keyword`, aligning it with the log source type of string array. This resolves an issue where the field was being ignored due to the `ignore_malformed` index setting, making it unusable in searches.
💚 Build Succeeded
Package cloudflare_logpush - 1.37.4 containing this change is available at https://epr.elastic.co/package/cloudflare_logpush/1.37.4/
Proposed commit message
Checklist
- … `changelog.yml` file.
How to test this PR locally
Related issues