Skip to content

[Modsec] Fix date format & Json issues #3363

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 33 commits into from
Jul 20, 2022
Merged

[Modsec] Fix date format & Json issues #3363

merged 33 commits into from
Jul 20, 2022

Conversation

emnp
Copy link
Contributor

@emnp emnp commented May 17, 2022

What does this PR do?

In the previously released version, some logs are unparsed because of date format and some are because of JSON duplicate keys. I fixed that first and then found one issue #2859 opened.
The previous version was for ModSecurity with Nginx and according to this issue #2859, I also updated for the ModSecurity with apache logs.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

How to test this PR locally

# cd integrations/packages/modsecurity
# elastic-package build
# elastic-package stack up -d -v
# eval "$(elastic-package stack shellinit)"
# elastic-package test  -v

Related issues

Screenshots

@emnp emnp requested a review from a team as a code owner May 17, 2022 07:49
@elasticmachine
Copy link

elasticmachine commented May 17, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-07-20T09:25:34.892+0000

  • Duration: 17 min 47 sec

Test stats 🧪

Test Results
Failed 0
Passed 6
Skipped 0
Total 6

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine
Copy link

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@jamiehynds jamiehynds added the Integration:modsecurity ModSecurity Audit (Community supported) label May 23, 2022
Copy link
Contributor

@r00tu53r r00tu53r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution. I have a few comments around the date format. Also, can you please add a change log entry to packages/modsecurity/changelog.yml and update the version in packages/modsecurity/manifest.ymltoo.

@emnp
Copy link
Contributor Author

emnp commented Jun 1, 2022

Hi @r00tu53r, this issue #2859 is fixed in this version too.

@r00tu53r
Copy link
Contributor

r00tu53r commented Jun 6, 2022

/test

Copy link
Contributor

@efd6 efd6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear from the PR description what this is fixing (there is no issue and no comment on what the change is intended to address).

Can you explain why the old test inputs are being completely replaced? Were they wrong in some way? If so, please explain this in the PR. If they are not explicitly incorrect, please leave them and append new cases where needed.

A number of files have had there modes changed to include the execute bit set for all users. Please make sure that regular files are ugo-x unless the need to be otherwise.

@emnp
Copy link
Contributor Author

emnp commented Jun 6, 2022

Hi @efd6, I edited the pr description. Please kindly check.

I used new test inputs with multiple date formats to check all the date formats were working well and also added new test inputs for the ModSecurity with apache.

@efd6
Copy link
Contributor

efd6 commented Jun 6, 2022

Many of the files still show an addition of the x flags on their modes and there are still significant non-append changes to inputs. Please revert the deletions in test cases or explain why they need to be deleted.

@emnp emnp requested a review from efd6 June 24, 2022 03:09
@efd6
Copy link
Contributor

efd6 commented Jun 24, 2022

Thank you, it is looking good now. Please resolve the conflict.

@emnp
Copy link
Contributor Author

emnp commented Jun 24, 2022

Hello @efd6 , the conflicts are resolved.

@efd6
Copy link
Contributor

efd6 commented Jun 24, 2022

/test

emnp and others added 2 commits June 24, 2022 13:37
Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com>
Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com>
@efd6
Copy link
Contributor

efd6 commented Jun 27, 2022

Please run elastic-package build.

@emnp
Copy link
Contributor Author

emnp commented Jun 27, 2022

Already run elastic-package build before pushing.

@efd6
Copy link
Contributor

efd6 commented Jun 27, 2022

When I run it I get this

diff --git a/packages/modsecurity/docs/README.md b/packages/modsecurity/docs/README.md
index a76c4ec07..3bb671561 100644
--- a/packages/modsecurity/docs/README.md
+++ b/packages/modsecurity/docs/README.md
@@ -59,6 +59,7 @@ The `Audit Log` dataset collects Modsecurity Audit logs.
 | host.os.family | OS family (such as redhat, debian, freebsd, windows). | keyword |
 | host.os.kernel | Operating system kernel version as a raw string. | keyword |
 | host.os.name | Operating system name, without the version. | keyword |
+| host.os.name.text | Multi-field of `host.os.name`. | text |
 | host.os.platform | Operating system platform (such centos, ubuntu, windows). | keyword |
 | host.os.version | Operating system version as a raw string. | keyword |
 | host.type | Type of host. For Cloud providers this can be the machine type like `t2.medium`. If vm, this could be the container, for example, or other information meaningful in your environment. | keyword |
@@ -85,6 +86,7 @@ The `Audit Log` dataset collects Modsecurity Audit logs.
 | source.address | Some event source addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket.  You should always store the raw address in the `.address` field. Then it should be duplicated to `.ip` or `.domain`, depending on which one it is. | keyword |
 | source.as.number | Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet. | long |
 | source.as.organization.name | Organization name. | keyword |
+| source.as.organization.name.text | Multi-field of `source.as.organization.name`. | match_only_text |
 | source.geo.city_name | City name. | keyword |
 | source.geo.continent_name | Name of the continent. | keyword |
 | source.geo.country_iso_code | Country ISO code. | keyword |
@@ -99,16 +101,21 @@ The `Audit Log` dataset collects Modsecurity Audit logs.
 | url.extension | The field contains the file extension from the original request url, excluding the leading dot. The file extension is only set if it exists, as not every url has a file extension. The leading period must not be included. For example, the value must be "png", not ".png". Note that when the file name has multiple extensions (example.tar.gz), only the last one should be captured ("gz", not "tar.gz"). | keyword |
 | url.fragment | Portion of the url after the `#`, such as "top". The `#` is not part of the fragment. | keyword |
 | url.original | Unmodified original url as seen in the event source. Note that in network monitoring, the observed URL may be a full URL, whereas in access logs, the URL is often just represented as a path. This field is meant to represent the URL as it was observed, complete or not. | wildcard |
+| url.original.text | Multi-field of `url.original`. | match_only_text |
 | url.path | Path of the request, such as "/search". | wildcard |
 | url.port | Port of the request, such as 443. | long |
 | url.query | The query field describes the query string of the request, such as "q=elasticsearch". The `?` is excluded from the query string. If a URL contains no `?`, there is no query field. If there is a `?` but no query, the query field exists with an empty string. The `exists` query can be used to differentiate between the two cases. | keyword |
 | url.scheme | Scheme of the request, such as "https". Note: The `:` is not part of the scheme. | keyword |
 | user.name | Short name or login of the user. | keyword |
+| user.name.text | Multi-field of `user.name`. | match_only_text |
 | user_agent.device.name | Name of the device. | keyword |
 | user_agent.name | Name of the user agent. | keyword |
 | user_agent.original | Unparsed user_agent string. | keyword |
+| user_agent.original.text | Multi-field of `user_agent.original`. | match_only_text |
 | user_agent.os.full | Operating system name, including the version or code name. | keyword |
+| user_agent.os.full.text | Multi-field of `user_agent.os.full`. | match_only_text |
 | user_agent.os.name | Operating system name, without the version. | keyword |
+| user_agent.os.name.text | Multi-field of `user_agent.os.name`. | match_only_text |
 | user_agent.os.version | Operating system version as a raw string. | keyword |
 | user_agent.version | Version of the user agent. | keyword |
 

Comment on lines 57 to 65
- foreach:
field: modsec.audit.messages
ignore_missing: true
processor:
grok:
field: '_ingest._value'
ignore_failure: true
patterns:
- '%{GREEDYDATA} \[file "%{GREEDYDATA:modsec.audit.details.file}"\] \[line "%{GREEDYDATA:_temps.line_number}"\] \[id "%{GREEDYDATA:modsec.audit.details.rule_id}"\] \[msg "%{GREEDYDATA:modsec.audit.details.msg}"\] \[data "%{GREEDYDATA:modsec.audit.details.match}"\] \[severity "%{GREEDYDATA:modsec.audit.details.severity}"\] \[ver "%{DATA:modsec.audit.details.version}"\] %{GREEDYDATA:_temps.tags}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't going to do what we want. I would like to think about this some more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is currently feasible.

Copy link
Contributor

@efd6 efd6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should defer some of the changes here until they are more able to be handled. The changes that I'm suggesting don't lock us into anything in the future and retain the data in a reasonably searchable format.

@emnp
Copy link
Contributor Author

emnp commented Jun 28, 2022

Thanks for your suggestion <3. I have updated it. Please take a look.

@efd6
Copy link
Contributor

efd6 commented Jun 28, 2022

/test

@efd6
Copy link
Contributor

efd6 commented Jun 29, 2022

Sorry there was a version bump. Can you resolve the conflicts? Looks good to me though after that.

@emnp
Copy link
Contributor Author

emnp commented Jun 29, 2022

Is it necessary to set "ecs.version" to "8.3.0" due to the conflict in this file packages/modsecurity/data_stream/auditlog/_dev/test/pipeline/test-audit.log-expected.json ?

@efd6
Copy link
Contributor

efd6 commented Jun 29, 2022

Yes, if you resolve the conflicts using the current tip, you will have ecs.version set to 8.3.0. This is here.

@efd6
Copy link
Contributor

efd6 commented Jun 29, 2022

/test

@efd6
Copy link
Contributor

efd6 commented Jun 29, 2022

This needs to have elastic-package build run.

@emnp
Copy link
Contributor Author

emnp commented Jun 30, 2022

I already run that command before pushing. What can be the problem?

@efd6
Copy link
Contributor

efd6 commented Jun 30, 2022

When I run that locally, I get this diff

diff --git a/packages/modsecurity/docs/README.md b/packages/modsecurity/docs/README.md
index 77024f1bb..38af18462 100644
--- a/packages/modsecurity/docs/README.md
+++ b/packages/modsecurity/docs/README.md
@@ -59,6 +59,7 @@ The `Audit Log` dataset collects Modsecurity Audit logs.
 | host.os.family | OS family (such as redhat, debian, freebsd, windows). | keyword |
 | host.os.kernel | Operating system kernel version as a raw string. | keyword |
 | host.os.name | Operating system name, without the version. | keyword |
+| host.os.name.text | Multi-field of `host.os.name`. | text |
 | host.os.platform | Operating system platform (such centos, ubuntu, windows). | keyword |
 | host.os.version | Operating system version as a raw string. | keyword |
 | host.type | Type of host. For Cloud providers this can be the machine type like `t2.medium`. If vm, this could be the container, for example, or other information meaningful in your environment. | keyword |
@@ -80,6 +81,7 @@ The `Audit Log` dataset collects Modsecurity Audit logs.
 | source.address | Some event source addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket.  You should always store the raw address in the `.address` field. Then it should be duplicated to `.ip` or `.domain`, depending on which one it is. | keyword |
 | source.as.number | Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet. | long |
 | source.as.organization.name | Organization name. | keyword |
+| source.as.organization.name.text | Multi-field of `source.as.organization.name`. | match_only_text |
 | source.geo.city_name | City name. | keyword |
 | source.geo.continent_name | Name of the continent. | keyword |
 | source.geo.country_iso_code | Country ISO code. | keyword |
@@ -94,16 +96,21 @@ The `Audit Log` dataset collects Modsecurity Audit logs.
 | url.extension | The field contains the file extension from the original request url, excluding the leading dot. The file extension is only set if it exists, as not every url has a file extension. The leading period must not be included. For example, the value must be "png", not ".png". Note that when the file name has multiple extensions (example.tar.gz), only the last one should be captured ("gz", not "tar.gz"). | keyword |
 | url.fragment | Portion of the url after the `#`, such as "top". The `#` is not part of the fragment. | keyword |
 | url.original | Unmodified original url as seen in the event source. Note that in network monitoring, the observed URL may be a full URL, whereas in access logs, the URL is often just represented as a path. This field is meant to represent the URL as it was observed, complete or not. | wildcard |
+| url.original.text | Multi-field of `url.original`. | match_only_text |
 | url.path | Path of the request, such as "/search". | wildcard |
 | url.port | Port of the request, such as 443. | long |
 | url.query | The query field describes the query string of the request, such as "q=elasticsearch". The `?` is excluded from the query string. If a URL contains no `?`, there is no query field. If there is a `?` but no query, the query field exists with an empty string. The `exists` query can be used to differentiate between the two cases. | keyword |
 | url.scheme | Scheme of the request, such as "https". Note: The `:` is not part of the scheme. | keyword |
 | user.name | Short name or login of the user. | keyword |
+| user.name.text | Multi-field of `user.name`. | match_only_text |
 | user_agent.device.name | Name of the device. | keyword |
 | user_agent.name | Name of the user agent. | keyword |
 | user_agent.original | Unparsed user_agent string. | keyword |
+| user_agent.original.text | Multi-field of `user_agent.original`. | match_only_text |
 | user_agent.os.full | Operating system name, including the version or code name. | keyword |
+| user_agent.os.full.text | Multi-field of `user_agent.os.full`. | match_only_text |
 | user_agent.os.name | Operating system name, without the version. | keyword |
+| user_agent.os.name.text | Multi-field of `user_agent.os.name`. | match_only_text |
 | user_agent.os.version | Operating system version as a raw string. | keyword |
 | user_agent.version | Version of the user agent. | keyword |
 

@efd6
Copy link
Contributor

efd6 commented Jul 1, 2022

/test

@efd6
Copy link
Contributor

efd6 commented Jul 3, 2022

@emnp Thank you for your patience.

@r00tu53r Would you please take a look since some of the changes are mine.

@efd6
Copy link
Contributor

efd6 commented Jul 20, 2022

@emnp Would you please fix the conflicts and update the manifest version.

Copy link
Contributor

@r00tu53r r00tu53r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Integration:modsecurity ModSecurity Audit (Community supported)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants