Skip to content

ESQL: Initial support for unmapped fields #119886

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 69 commits into from
Feb 13, 2025

Conversation

GalLalouche
Copy link
Contributor

@GalLalouche GalLalouche commented Jan 9, 2025

This PR adds initial support for unmapped fields, using the INSIST clause. For starters, this only support a single unmapped field, without casting. Casting may appear after the INSIST clause, similar to union types.
Note that the INSIST keyword is potentially a placeholder, as the method of defining an unmapped field might change in the future, e.g., use a special magic function.

First stage of #120072.

Specifically, the following features are implemented in this PR:

  • Support for INSIST keyword and a single parameter without a cast. In particular, if the type being INSISTed upon is mapped to anything other than KEYWORD, it will result in an InvalidMappedField. There is no support for union type resolution on top of INSIST. Future PRs will handle these conflicts.
  • Enforcing that INSIST must always be on top of a FROM. While this may change in the future, e.g., in cases like FROM foo | EVAL x = 1 | INSIST bar, it will not be done in this PR, as it makes handling INSIST too complicated.

@GalLalouche GalLalouche force-pushed the feature/unmapped_fields_squashed branch 7 times, most recently from 5f04c1f to fd1abd0 Compare January 12, 2025 12:18
@GalLalouche GalLalouche added >feature Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v8.18.0 labels Jan 12, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @GalLalouche, I've created a changelog YAML for you.

@GalLalouche GalLalouche marked this pull request as ready for review January 12, 2025 14:04
@GalLalouche GalLalouche requested a review from a team as a code owner January 12, 2025 14:04
@GalLalouche GalLalouche requested review from craigtaverner and removed request for a team January 12, 2025 14:04
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@GalLalouche GalLalouche requested a review from nik9000 January 12, 2025 16:33
Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heya! I only took a quick glance at this, but I noticed the following:

  • The new syntax and command are immediately released. Can we please hide this behind a SNAPSHOT-only? Otherwise it's gonna be made available in Serverless on the next rollout. Update: sorry, it's already hidden behind SNAPSHOT in the parser, so that's good.
  • I can see some unrelated parser files being changed (SQL, EQL, KQL and Painless). I guess these were accidentally changed due to regenerating the parsers? I'd like to exclude these from the PR, please.
  • If at all possible, it'd be great to include a csv test to showcase the functionality. Or a yaml test if csv tests are too inflexible. In case that'd blow the PR's scope because not all is in place to run the feature end-to-end, let's at least add a short javadoc to the Insist class to describe its purpose. Update: Sorry, you correctly added a csv-spec test which I missed.

@alex-spies alex-spies self-requested a review January 13, 2025 11:07
@GalLalouche GalLalouche force-pushed the feature/unmapped_fields_squashed branch from 4289393 to 0ca5e79 Compare January 13, 2025 11:54
Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some quick comments, without going deep with the code:

  • disabling the _source should be covered so that whatever we do know should still happen after this PR
  • excluding certain fields from _source behavior should also be kept as is. If we already have tests for this somewhere in yaml, csv or other IT tests, that's great and apologies for mentioning it.
  • there is also the scenario where index: false and doc_values: false can be configured for a certain field. In this case, I guess this PR should address loading the value from _source as well (if _source is available, of course, considering my previous two use cases)

@GalLalouche GalLalouche dismissed costin’s stale review February 12, 2025 13:01

He already LGTMed, but can't approve right now

@GalLalouche GalLalouche enabled auto-merge (squash) February 12, 2025 13:05
@GalLalouche GalLalouche added v9.0.0 auto-backport Automatically create backport pull requests when merged v9.0.1 and removed auto-backport Automatically create backport pull requests when merged v8.19.0 v9.0.0 labels Feb 13, 2025
@astefan
Copy link
Contributor

astefan commented Feb 13, 2025

@elasticsearchmachine run elasticsearch-ci/part-1

@GalLalouche GalLalouche removed the test-release Trigger CI checks against release build label Feb 13, 2025
@GalLalouche GalLalouche merged commit d3ed999 into elastic:main Feb 13, 2025
17 checks passed
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
9.0 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 119886

Comment on lines +237 to +245
private static void checkInsist(LogicalPlan p, Failures failures) {
if (p instanceof Insist i) {
LogicalPlan child = i.child();
if ((child instanceof EsRelation || child instanceof Insist) == false) {
failures.add(fail(i, "[insist] can only be used after [from] or [insist] commands, but was [{}]", child.sourceText()));
}
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more piece that could be refactored: have Insist implement PostAnalysisVerificationAware and move the check there.
Nit: I think we capitalise the commands elsewhere (including tests for this check), we should do the same here.

GalLalouche added a commit to GalLalouche/elasticsearch that referenced this pull request Feb 17, 2025
This PR adds initial support for unmapped fields, using the INSIST clause. For starters, this unmapped fields without a cast.
Note that the INSIST keyword is potentially a placeholder, as the method of defining an unmapped field might change in the future, e.g., use a special magic function. As this is currently under development, the actual syntax is INSIST_🐔.

First stage of elastic#120072.

Specifically, the following features are implemented in this PR:

* Support for INSIST keyword without a cast. In particular, if the type being INSISTed upon is mapped to anything other than KEYWORD, it will result in an InvalidMappedField. There is no support for union type resolution on top of INSIST. Future PRs will handle these conflicts. There is support for multiple parameters, or INSIST on top of INSIST which is equivalent.
* Enforcing that INSIST must always be on top of a FROM or another INSIST. While this may change in the future, e.g., handling cases like `FROM foo | EVAL x = 1 | INSIST bar` will not be done in this PR, as it makes handling INSIST too complicated.
GalLalouche added a commit to GalLalouche/elasticsearch that referenced this pull request Feb 18, 2025
This PR adds initial support for unmapped fields, using the INSIST clause. For starters, this unmapped fields without a cast.
Note that the INSIST keyword is potentially a placeholder, as the method of defining an unmapped field might change in the future, e.g., use a special magic function. As this is currently under development, the actual syntax is INSIST_🐔.

First stage of elastic#120072.

Specifically, the following features are implemented in this PR:

* Support for INSIST keyword without a cast. In particular, if the type being INSISTed upon is mapped to anything other than KEYWORD, it will result in an InvalidMappedField. There is no support for union type resolution on top of INSIST. Future PRs will handle these conflicts. There is support for multiple parameters, or INSIST on top of INSIST which is equivalent.
* Enforcing that INSIST must always be on top of a FROM or another INSIST. While this may change in the future, e.g., handling cases like `FROM foo | EVAL x = 1 | INSIST bar` will not be done in this PR, as it makes handling INSIST too complicated.
elasticsearchmachine pushed a commit that referenced this pull request Feb 18, 2025
#121261, #122607) (#122805)

* ESQL: Support for _index metadata field in CsvTests (#121261)

* ESQL: Support for _index metadata field in CsvTests

* Extract INDEX constant to MetadataAttribute

* Add comment on capability

* ESQL: Initial support for unmapped fields (#119886)

This PR adds initial support for unmapped fields, using the INSIST clause. For starters, this unmapped fields without a cast.
Note that the INSIST keyword is potentially a placeholder, as the method of defining an unmapped field might change in the future, e.g., use a special magic function. As this is currently under development, the actual syntax is INSIST_🐔.

First stage of #120072.

Specifically, the following features are implemented in this PR:

* Support for INSIST keyword without a cast. In particular, if the type being INSISTed upon is mapped to anything other than KEYWORD, it will result in an InvalidMappedField. There is no support for union type resolution on top of INSIST. Future PRs will handle these conflicts. There is support for multiple parameters, or INSIST on top of INSIST which is equivalent.
* Enforcing that INSIST must always be on top of a FROM or another INSIST. While this may change in the future, e.g., handling cases like `FROM foo | EVAL x = 1 | INSIST bar` will not be done in this PR, as it makes handling INSIST too complicated.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >feature Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.0.1 v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants