[ML] Checking for presence of error object when validating response #118375

jonathan-buttner · 2024-12-10T18:58:47Z

This PR adds functionality to check if the response body is an error object. Normally this will be caught by first checking the status code. For streaming the status code can be 200 and the response can be an error.

To achieve this I refactored how the error parsing logic works. Typically we'd return null if a field within the error object was missing (for example if the message field was missing). The problem is that we want to consider it an error if the json is shaped like an error even if the message field is missing.

I also realized that every *ResponseHandler had this code:

    public void validateResponse(ThrottlerManager throttlerManager, Logger logger, Request request, HttpResult result)
        throws RetryException {
        checkForFailureStatusCode(request, result);
        checkForEmptyBody(throttlerManager, logger, request, result);
    }

So I moved it up into the base class to remove the duplication.

jonathan-buttner · 2024-12-10T20:38:49Z

...ticsearch/xpack/inference/external/alibabacloudsearch/AlibabaCloudSearchResponseHandler.java

@@ -28,21 +24,15 @@ public AlibabaCloudSearchResponseHandler(String requestType, ResponseParser pars
        super(requestType, parseFunction, AlibabaCloudSearchErrorResponseEntity::fromResponse);
    }

-    @Override
-    public void validateResponse(ThrottlerManager throttlerManager, Logger logger, Request request, HttpResult result)


Move up into BaseResponseHandler.

jonathan-buttner · 2024-12-10T20:40:55Z

...src/main/java/org/elasticsearch/xpack/inference/external/http/retry/BaseResponseHandler.java

+
+        // When the response is streamed the status code could be 200 but the error object will be set
+        // so we need to check for that specifically
+        checkForErrorObject(request, result);


I have one potential concern for this. This will be executed for non-streaming and streaming code paths. So if for some reason we get a 200 response back (or some other non failure status code) and the response object is a valid response but also happens to have a field that the error object expects (which depends on each service implementation) then this would fail. I doubt that would happen.

If we were concerned about it we could create a new method validateStreamingResponse and only call checkForErrorObject in that method.

jonathan-buttner · 2024-12-10T20:41:16Z

...src/main/java/org/elasticsearch/xpack/inference/external/http/retry/BaseResponseHandler.java

+
+        if (errorEntity.errorStructureFound()) {
+            // we don't really know what happened because the status code was 200 so we'll just retry
+            throw new RetryException(true, buildError(SERVER_ERROR_OBJECT, request, result, errorEntity));


Should we just retry if we get a failure mid-stream?

Looking at the OpenAI java and python libraries, it seems like they don't retry on error messages.
https://github.com/openai/openai-java/blob/9a6fa4aeea6738190dc1eb28cd52d82fbbf0b18b/openai-java-core/src/main/kotlin/com/openai/core/handlers/SseHandler.kt#L58
https://github.com/openai/openai-python/blob/995cce048f9427bba4f7ac1e5fc60abbf1f8f0b7/src/openai/_streaming.py#L70

The SSE spec does seem to include some allowance for retries, but I'm not sure that it allows for retry on an error returned from the EventSource. It seems that retries mid-stream might only be for disconnections. I'm not totally sure though. https://html.spec.whatwg.org/multipage/server-sent-events.html#event-stream-interpretation

further, I think we need to consider what the UX will be if we retry. With streaming, the user will be seeing the response from the integration in real time. If we retry the request, will we be sending the full request again, and the streaming responses will start over from the beginning? That will probably cause weird messages to be displayed in the UI, unless the UI also handles this error case (and we would be need to communicate the error to the UI in a special way).

I think that it might be safer to simply relay the error message to the user, and let them retry if they want to.

Great points. I'll switch it to skip retrying. I think that's also fine for the non-streaming case too. If we are in the non-streaming code path I'd expect a failure status code first which would cause a throw so we shouldn't even be getting to this code block in that scenario.

jonathan-buttner · 2024-12-10T20:41:50Z

...erence/src/main/java/org/elasticsearch/xpack/inference/external/http/retry/ErrorMessage.java

@@ -1,12 +0,0 @@
-/*


I turned this into the class ErrorResponse below.

jonathan-buttner · 2024-12-10T20:44:04Z

...rence/src/main/java/org/elasticsearch/xpack/inference/external/http/retry/ErrorResponse.java

+public class ErrorResponse {
+
+    // Denotes an error object that was not found
+    public static final ErrorResponse UNDEFINED_ERROR = new ErrorResponse(false);


The reason I need this is to differentiate between an error object being present but a specific field in the object missing (for example message) and when no error exists. Previously we were returning null in both of those scenarios. Now if there's an exception or the main field that defines an error object doesn't exist we'll return this instance to indicate that there was no error present.

If a nested field is missing we return a created instance with errorStructureFound set to true.

jonathan-buttner · 2024-12-10T20:44:30Z

...e/src/main/java/org/elasticsearch/xpack/inference/external/openai/OpenAiResponseHandler.java

@@ -104,7 +95,7 @@ private static boolean isContentTooLarge(HttpResult result) {
        }

        if (statusCode == 400) {
-            var errorEntity = OpenAiErrorResponseEntity.fromResponse(result);
+            var errorEntity = ErrorMessageResponseEntity.fromResponse(result);


These were the same so I removed the OpenAiResponseEntity class.

jonathan-buttner · 2024-12-10T20:44:45Z

...ain/java/org/elasticsearch/xpack/inference/external/response/ErrorMessageResponseEntity.java

    }

-    @Override


Moved to the base class ErrorResponse.

jonathan-buttner · 2024-12-10T20:45:51Z

...ava/org/elasticsearch/xpack/inference/external/response/ErrorMessageResponseEntityTests.java

    public void testFromResponse_noMessage() {
        String responseJson = """
            {
              "error": {
-                "type": "not_found_error",
+                "type": "not_found_error"


This was actually causing a parsing failure instead of a missing field failure because of the trailing comma.

elasticsearchmachine · 2024-12-10T22:02:16Z

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine · 2024-12-10T22:11:18Z

Hi @jonathan-buttner, I've created a changelog YAML for you.

maxhniebergall

LGTM

maxhniebergall · 2024-12-11T15:48:22Z

docs/changelog/118375.yaml

@@ -0,0 +1,5 @@
+pr: 118375
+summary: Checking for presence of error object when validating response


Since this will end up in the changelog, I think we should make sure to mention streaming and the inference API:
"Check for presence of error object when validating streaming responses from integrations in the inference API"

Thanks forgot to update this 👍

maxhniebergall · 2024-12-11T15:53:37Z

...src/main/java/org/elasticsearch/xpack/inference/external/http/retry/BaseResponseHandler.java

+
+        if (errorEntity.errorStructureFound()) {
+            // we don't really know what happened because the status code was 200 so we'll just retry
+            throw new RetryException(true, buildError(SERVER_ERROR_OBJECT, request, result, errorEntity));


further, I think we need to consider what the UX will be if we retry. With streaming, the user will be seeing the response from the integration in real time. If we retry the request, will we be sending the full request again, and the streaming responses will start over from the beginning? That will probably cause weird messages to be displayed in the UI, unless the UI also handles this error case (and we would be need to communicate the error to the UI in a special way).

I think that it might be safer to simply relay the error message to the user, and let them retry if they want to.

…ming-error-obj

jonathan-buttner · 2024-12-16T20:15:04Z

💚 All backports created successfully

Status	Branch	Result
✅	8.x

Questions ?

Please refer to the Backport tool documentation

…lastic#118375) * Refactoring error handling logic * Refactoring base response handler to remove duplication * Update docs/changelog/118375.yaml * Addressing feedback --------- Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com> (cherry picked from commit f4dc716) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/alibabacloudsearch/AlibabaCloudSearchResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/anthropic/AnthropicResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/cohere/CohereResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/elastic/ElasticInferenceServiceResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googleaistudio/GoogleAiStudioResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googlevertexai/GoogleVertexAiResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/huggingface/HuggingFaceResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/ibmwatsonx/IbmWatsonxResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/openai/OpenAiResponseHandler.java

…118375) (#118795) * Refactoring error handling logic * Refactoring base response handler to remove duplication * Update docs/changelog/118375.yaml * Addressing feedback --------- Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com> (cherry picked from commit f4dc716) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/alibabacloudsearch/AlibabaCloudSearchResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/anthropic/AnthropicResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/cohere/CohereResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/elastic/ElasticInferenceServiceResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googleaistudio/GoogleAiStudioResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googlevertexai/GoogleVertexAiResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/huggingface/HuggingFaceResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/ibmwatsonx/IbmWatsonxResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/openai/OpenAiResponseHandler.java

Refactoring error handling logic

b934664

jonathan-buttner added :ml Machine learning Team:ML Meta label for the ML team Feature:GenAI Features around GenAI v9.0.0 v8.18.0 labels Dec 10, 2024

Refactoring base response handler to remove duplication

4148ca9

jonathan-buttner commented Dec 10, 2024

View reviewed changes

Merge branch 'main' into ml-streaming-error-obj

dc09a1c

jonathan-buttner requested review from davidkyle and maxhniebergall December 10, 2024 22:01

jonathan-buttner marked this pull request as ready for review December 10, 2024 22:01

jonathan-buttner added the >enhancement label Dec 10, 2024

Update docs/changelog/118375.yaml

b70638b

jonathan-buttner and others added 2 commits December 11, 2024 08:31

Merge branch 'main' into ml-streaming-error-obj

3128139

Merge branch 'main' into ml-streaming-error-obj

2f0544f

maxhniebergall approved these changes Dec 11, 2024

View reviewed changes

jonathan-buttner and others added 3 commits December 11, 2024 13:28

Addressing feedback

94ba505

Merge branch 'main' of github.com:elastic/elasticsearch into ml-strea…

6b719de

…ming-error-obj

Merge branch 'main' into ml-streaming-error-obj

ad3ade5

jonathan-buttner merged commit f4dc716 into elastic:main Dec 11, 2024
16 checks passed

jonathan-buttner deleted the ml-streaming-error-obj branch December 11, 2024 21:03

jonathan-buttner mentioned this pull request Dec 16, 2024

[8.x] [ML] Checking for presence of error object when validating response (#118375) #118795

Merged

jonathan-buttner mentioned this pull request Dec 18, 2024

[Inference API] Add Jina AI API to do inference for Embedding and Rerank models #118652

Merged

		@@ -0,0 +1,5 @@
		pr: 118375
		summary: Checking for presence of error object when validating response

[ML] Checking for presence of error object when validating response #118375

[ML] Checking for presence of error object when validating response #118375

Uh oh!

Conversation

jonathan-buttner commented Dec 10, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

elasticsearchmachine commented Dec 10, 2024

Uh oh!

elasticsearchmachine commented Dec 10, 2024

Uh oh!

maxhniebergall left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jonathan-buttner commented Dec 16, 2024

💚 All backports created successfully

Questions ?

Uh oh!

Uh oh!

jonathan-buttner commented Dec 10, 2024 •

edited

Loading