-
Notifications
You must be signed in to change notification settings - Fork 25.4k
Add match support for semantic_text fields #117839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add match support for semantic_text fields #117839
Conversation
… non-inference fields
…rs for match query builder and semantic text
34af576
to
bcb93c3
Compare
…E_INTERCEPTION_SUPPORTED to Test feature
test/framework/src/main/java/org/elasticsearch/test/AbstractBuilderTestCase.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/plugins/internal/rewriter/QueryRewriteInterceptor.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/plugins/internal/rewriter/QueryRewriteInterceptor.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/index/query/InterceptedQueryBuilderWrapperTests.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! I left a few comments, mostly nits.
server/src/main/java/org/elasticsearch/index/query/InnerHitContextBuilder.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/query/InterceptedQueryBuilderWrapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/query/InterceptedQueryBuilderWrapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/query/InterceptedQueryBuilderWrapper.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/index/query/InterceptedQueryBuilderWrapperTests.java
Show resolved
Hide resolved
} | ||
|
||
public QueryBuilder interceptAndRewrite(QueryRewriteContext context, QueryBuilder queryBuilder) { | ||
assert (queryBuilder instanceof MatchQueryBuilder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dumb question: how does this interceptor know to only intercept match queries?
Update: After reviewing the code further, I understand how this interceptor only intercepts match
queries. However, this behavior is based on an implementation detail that I think could easily trip us up later. We use QueryRewriteInterceptor#multi
to create an interceptor lookup map. However, QueryRewriteContext
will accept any QueryRewriteInterceptor
, including one not created by QueryRewriteInterceptor#multi
. Could we update the classes and method signatures to to enforce that that the rewrite context should take a map of interceptors?
We don't need to do it in this PR, but I think it's something good for a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get it. I think that we certainly could do that, and I'd argue that if we decide to go down that route, maybe we get rid of the CompositeQueryRewriteInterceptor
concept at all in favor of a service class that will do the right thing. I'll think about it, but I'd like to do that in a followup if we decide that's the best direction to go in, as it won't impact transport serialization or other features.
for (String inferenceIndexName : inferenceIndices) { | ||
// Add a separate clause for each semantic query, because they may be using different inference endpoints | ||
// TODO - consolidate this to a single clause once the semantic query supports multiple inference endpoints | ||
boolQueryBuilder.should( | ||
createSemanticSubQuery(inferenceIndexName, matchQueryBuilder.fieldName(), (String) matchQueryBuilder.value()) | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Match query support with multiple inference endpoints is already broken (due to the index filter clause running as a post-filter), so we can consolidate to a single clause now.
server/src/main/java/org/elasticsearch/plugins/internal/rewriter/QueryRewriteInterceptor.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/query/InterceptedQueryBuilderWrapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/query/InterceptedQueryBuilderWrapper.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing the rewriting test!
rewritten = matchQueryBuilder.rewrite(context); | ||
assertTrue(rewritten instanceof InterceptedQueryBuilderWrapper); | ||
assertTrue(((InterceptedQueryBuilderWrapper) rewritten).queryBuilder instanceof MatchQueryBuilder); | ||
MatchQueryBuilder rewrittenMatchQueryBuilder = (MatchQueryBuilder) ((InterceptedQueryBuilderWrapper) rewritten).queryBuilder; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice test! 💯
Should we test that an additional rewrite on an intercepted query returns the same query, so we know the rewrite cycle ends?
server/src/test/java/org/elasticsearch/index/query/InterceptedQueryBuilderWrapperTests.java
Show resolved
Hide resolved
test/framework/src/main/java/org/elasticsearch/test/AbstractBuilderTestCase.java
Show resolved
Hide resolved
…textBuilder.java Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
* Added query name to inference field metadata * Fix build error * Added query builder service * Add query builder service to query rewrite context * Updated match query to support querying semantic text fields * Fix build error * Fix NPE * Update the POC to rewrite to a bool query when combined inference and non-inference fields * Separate clause for each inference index (to avoid inference ID clashes) * Simplify query builder service concept to a single default inference query * Rename QueryBuilderService, remove query name from inference metadata * Fix too many rewrite rounds error by injecting booleans in constructors for match query builder and semantic text * Fix test compilation errors * Fix tests * Add yaml test for semantic match * Add NodeFeature * Fix license headers * Spotless * Updated getClass comparison in MatchQueryBuilder * Cleanup * Add Mock Inference Query Builder Service * Spotless * Cleanup * Update docs/changelog/117839.yaml * Update changelog * Replace the default inference query builder with a query rewrite interceptor * Cleanup * Some more cleanup/renames * Some more cleanup/renames * Spotless * Checkstyle * Convert List<QueryRewriteInterceptor> to Map keyed on query name, error on query name collisions * PR feedback - remove check on QueryRewriteContext class only * PR feedback * Remove intercept flag from MatchQueryBuilder and replace with wrapper * Move feature to test feature * Ensure interception happens only once * Rename InterceptedQueryBuilderWrapper to AbstractQueryBuilderWrapper * Add lenient field to SemanticQueryBuilder * Clean up yaml test * Add TODO comment * Add comment * Spotless * Rename AbstractQueryBuilderWrapper back to InterceptedQueryBuilderWrapper * Spotless * Didn't mean to commit that * Remove static class wrapping the InterceptedQueryBuilderWrapper * Make InterceptedQueryBuilderWrapper part of QueryRewriteInterceptor * Refactor the interceptor to be an internal plugin that cannot be used outside inference plugin * Fix tests * Spotless * Minor cleanup * C'mon spotless * Test spotless * Cleanup InternalQueryRewriter * Change if statement to assert * Simplify template of InterceptedQueryBuilderWrapper * Change constructor of InterceptedQueryBuilderWrapper * Refactor InterceptedQueryBuilderWrapper to extend QueryBuilder * Cleanup * Add test * Spotless * Rename rewrite to interceptAndRewrite in QueryRewriteInterceptor * DOESN'T WORK - for testing * Add comment * Getting closer - match on single typed fields works now * Deleted line by mistake * Checkstyle * Fix over-aggressive IntelliJ Refactor/Rename * And another one * Move SemanticMatchQueryRewriteInterceptor.SEMANTIC_MATCH_QUERY_REWRITE_INTERCEPTION_SUPPORTED to Test feature * PR feedback * Require query name with no default * PR feedback & update test * Add rewrite test * Update server/src/main/java/org/elasticsearch/index/query/InnerHitContextBuilder.java Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> --------- Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> (cherry picked from commit c9a6a2c)
* Added query name to inference field metadata * Fix build error * Added query builder service * Add query builder service to query rewrite context * Updated match query to support querying semantic text fields * Fix build error * Fix NPE * Update the POC to rewrite to a bool query when combined inference and non-inference fields * Separate clause for each inference index (to avoid inference ID clashes) * Simplify query builder service concept to a single default inference query * Rename QueryBuilderService, remove query name from inference metadata * Fix too many rewrite rounds error by injecting booleans in constructors for match query builder and semantic text * Fix test compilation errors * Fix tests * Add yaml test for semantic match * Add NodeFeature * Fix license headers * Spotless * Updated getClass comparison in MatchQueryBuilder * Cleanup * Add Mock Inference Query Builder Service * Spotless * Cleanup * Update docs/changelog/117839.yaml * Update changelog * Replace the default inference query builder with a query rewrite interceptor * Cleanup * Some more cleanup/renames * Some more cleanup/renames * Spotless * Checkstyle * Convert List<QueryRewriteInterceptor> to Map keyed on query name, error on query name collisions * PR feedback - remove check on QueryRewriteContext class only * PR feedback * Remove intercept flag from MatchQueryBuilder and replace with wrapper * Move feature to test feature * Ensure interception happens only once * Rename InterceptedQueryBuilderWrapper to AbstractQueryBuilderWrapper * Add lenient field to SemanticQueryBuilder * Clean up yaml test * Add TODO comment * Add comment * Spotless * Rename AbstractQueryBuilderWrapper back to InterceptedQueryBuilderWrapper * Spotless * Didn't mean to commit that * Remove static class wrapping the InterceptedQueryBuilderWrapper * Make InterceptedQueryBuilderWrapper part of QueryRewriteInterceptor * Refactor the interceptor to be an internal plugin that cannot be used outside inference plugin * Fix tests * Spotless * Minor cleanup * C'mon spotless * Test spotless * Cleanup InternalQueryRewriter * Change if statement to assert * Simplify template of InterceptedQueryBuilderWrapper * Change constructor of InterceptedQueryBuilderWrapper * Refactor InterceptedQueryBuilderWrapper to extend QueryBuilder * Cleanup * Add test * Spotless * Rename rewrite to interceptAndRewrite in QueryRewriteInterceptor * DOESN'T WORK - for testing * Add comment * Getting closer - match on single typed fields works now * Deleted line by mistake * Checkstyle * Fix over-aggressive IntelliJ Refactor/Rename * And another one * Move SemanticMatchQueryRewriteInterceptor.SEMANTIC_MATCH_QUERY_REWRITE_INTERCEPTION_SUPPORTED to Test feature * PR feedback * Require query name with no default * PR feedback & update test * Add rewrite test * Update server/src/main/java/org/elasticsearch/index/query/InnerHitContextBuilder.java Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> --------- Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> (cherry picked from commit c9a6a2c)
…118587) * Added query name to inference field metadata * Fix build error * Added query builder service * Add query builder service to query rewrite context * Updated match query to support querying semantic text fields * Fix build error * Fix NPE * Update the POC to rewrite to a bool query when combined inference and non-inference fields * Separate clause for each inference index (to avoid inference ID clashes) * Simplify query builder service concept to a single default inference query * Rename QueryBuilderService, remove query name from inference metadata * Fix too many rewrite rounds error by injecting booleans in constructors for match query builder and semantic text * Fix test compilation errors * Fix tests * Add yaml test for semantic match * Add NodeFeature * Fix license headers * Spotless * Updated getClass comparison in MatchQueryBuilder * Cleanup * Add Mock Inference Query Builder Service * Spotless * Cleanup * Update docs/changelog/117839.yaml * Update changelog * Replace the default inference query builder with a query rewrite interceptor * Cleanup * Some more cleanup/renames * Some more cleanup/renames * Spotless * Checkstyle * Convert List<QueryRewriteInterceptor> to Map keyed on query name, error on query name collisions * PR feedback - remove check on QueryRewriteContext class only * PR feedback * Remove intercept flag from MatchQueryBuilder and replace with wrapper * Move feature to test feature * Ensure interception happens only once * Rename InterceptedQueryBuilderWrapper to AbstractQueryBuilderWrapper * Add lenient field to SemanticQueryBuilder * Clean up yaml test * Add TODO comment * Add comment * Spotless * Rename AbstractQueryBuilderWrapper back to InterceptedQueryBuilderWrapper * Spotless * Didn't mean to commit that * Remove static class wrapping the InterceptedQueryBuilderWrapper * Make InterceptedQueryBuilderWrapper part of QueryRewriteInterceptor * Refactor the interceptor to be an internal plugin that cannot be used outside inference plugin * Fix tests * Spotless * Minor cleanup * C'mon spotless * Test spotless * Cleanup InternalQueryRewriter * Change if statement to assert * Simplify template of InterceptedQueryBuilderWrapper * Change constructor of InterceptedQueryBuilderWrapper * Refactor InterceptedQueryBuilderWrapper to extend QueryBuilder * Cleanup * Add test * Spotless * Rename rewrite to interceptAndRewrite in QueryRewriteInterceptor * DOESN'T WORK - for testing * Add comment * Getting closer - match on single typed fields works now * Deleted line by mistake * Checkstyle * Fix over-aggressive IntelliJ Refactor/Rename * And another one * Move SemanticMatchQueryRewriteInterceptor.SEMANTIC_MATCH_QUERY_REWRITE_INTERCEPTION_SUPPORTED to Test feature * PR feedback * Require query name with no default * PR feedback & update test * Add rewrite test * Update server/src/main/java/org/elasticsearch/index/query/InnerHitContextBuilder.java Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> --------- Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> (cherry picked from commit c9a6a2c)
…118587) * Added query name to inference field metadata * Fix build error * Added query builder service * Add query builder service to query rewrite context * Updated match query to support querying semantic text fields * Fix build error * Fix NPE * Update the POC to rewrite to a bool query when combined inference and non-inference fields * Separate clause for each inference index (to avoid inference ID clashes) * Simplify query builder service concept to a single default inference query * Rename QueryBuilderService, remove query name from inference metadata * Fix too many rewrite rounds error by injecting booleans in constructors for match query builder and semantic text * Fix test compilation errors * Fix tests * Add yaml test for semantic match * Add NodeFeature * Fix license headers * Spotless * Updated getClass comparison in MatchQueryBuilder * Cleanup * Add Mock Inference Query Builder Service * Spotless * Cleanup * Update docs/changelog/117839.yaml * Update changelog * Replace the default inference query builder with a query rewrite interceptor * Cleanup * Some more cleanup/renames * Some more cleanup/renames * Spotless * Checkstyle * Convert List<QueryRewriteInterceptor> to Map keyed on query name, error on query name collisions * PR feedback - remove check on QueryRewriteContext class only * PR feedback * Remove intercept flag from MatchQueryBuilder and replace with wrapper * Move feature to test feature * Ensure interception happens only once * Rename InterceptedQueryBuilderWrapper to AbstractQueryBuilderWrapper * Add lenient field to SemanticQueryBuilder * Clean up yaml test * Add TODO comment * Add comment * Spotless * Rename AbstractQueryBuilderWrapper back to InterceptedQueryBuilderWrapper * Spotless * Didn't mean to commit that * Remove static class wrapping the InterceptedQueryBuilderWrapper * Make InterceptedQueryBuilderWrapper part of QueryRewriteInterceptor * Refactor the interceptor to be an internal plugin that cannot be used outside inference plugin * Fix tests * Spotless * Minor cleanup * C'mon spotless * Test spotless * Cleanup InternalQueryRewriter * Change if statement to assert * Simplify template of InterceptedQueryBuilderWrapper * Change constructor of InterceptedQueryBuilderWrapper * Refactor InterceptedQueryBuilderWrapper to extend QueryBuilder * Cleanup * Add test * Spotless * Rename rewrite to interceptAndRewrite in QueryRewriteInterceptor * DOESN'T WORK - for testing * Add comment * Getting closer - match on single typed fields works now * Deleted line by mistake * Checkstyle * Fix over-aggressive IntelliJ Refactor/Rename * And another one * Move SemanticMatchQueryRewriteInterceptor.SEMANTIC_MATCH_QUERY_REWRITE_INTERCEPTION_SUPPORTED to Test feature * PR feedback * Require query name with no default * PR feedback & update test * Add rewrite test * Update server/src/main/java/org/elasticsearch/index/query/InnerHitContextBuilder.java Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> --------- Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> (cherry picked from commit c9a6a2c)
Followup on #112166
Implements
match
query support forsemantic_text
fields.The following design changes were made from the original POC:
match
queries on combinedtext
andsemantic_text
field will no longer error but will return results.Example script of how to use it: