Skip to content

[ML] Checking for presence of error object when validating response #118375

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jonathan-buttner
Copy link
Contributor

@jonathan-buttner jonathan-buttner commented Dec 10, 2024

This PR adds functionality to check if the response body is an error object. Normally this will be caught by first checking the status code. For streaming the status code can be 200 and the response can be an error.

To achieve this I refactored how the error parsing logic works. Typically we'd return null if a field within the error object was missing (for example if the message field was missing). The problem is that we want to consider it an error if the json is shaped like an error even if the message field is missing.

I also realized that every *ResponseHandler had this code:

    public void validateResponse(ThrottlerManager throttlerManager, Logger logger, Request request, HttpResult result)
        throws RetryException {
        checkForFailureStatusCode(request, result);
        checkForEmptyBody(throttlerManager, logger, request, result);
    }

So I moved it up into the base class to remove the duplication.

@jonathan-buttner jonathan-buttner added :ml Machine learning Team:ML Meta label for the ML team Feature:GenAI Features around GenAI v9.0.0 v8.18.0 labels Dec 10, 2024
@@ -28,21 +24,15 @@ public AlibabaCloudSearchResponseHandler(String requestType, ResponseParser pars
super(requestType, parseFunction, AlibabaCloudSearchErrorResponseEntity::fromResponse);
}

@Override
public void validateResponse(ThrottlerManager throttlerManager, Logger logger, Request request, HttpResult result)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move up into BaseResponseHandler.


// When the response is streamed the status code could be 200 but the error object will be set
// so we need to check for that specifically
checkForErrorObject(request, result);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one potential concern for this. This will be executed for non-streaming and streaming code paths. So if for some reason we get a 200 response back (or some other non failure status code) and the response object is a valid response but also happens to have a field that the error object expects (which depends on each service implementation) then this would fail. I doubt that would happen.

If we were concerned about it we could create a new method validateStreamingResponse and only call checkForErrorObject in that method.


if (errorEntity.errorStructureFound()) {
// we don't really know what happened because the status code was 200 so we'll just retry
throw new RetryException(true, buildError(SERVER_ERROR_OBJECT, request, result, errorEntity));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just retry if we get a failure mid-stream?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the OpenAI java and python libraries, it seems like they don't retry on error messages.
https://github.com/openai/openai-java/blob/9a6fa4aeea6738190dc1eb28cd52d82fbbf0b18b/openai-java-core/src/main/kotlin/com/openai/core/handlers/SseHandler.kt#L58
https://github.com/openai/openai-python/blob/995cce048f9427bba4f7ac1e5fc60abbf1f8f0b7/src/openai/_streaming.py#L70

The SSE spec does seem to include some allowance for retries, but I'm not sure that it allows for retry on an error returned from the EventSource. It seems that retries mid-stream might only be for disconnections. I'm not totally sure though. https://html.spec.whatwg.org/multipage/server-sent-events.html#event-stream-interpretation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

further, I think we need to consider what the UX will be if we retry. With streaming, the user will be seeing the response from the integration in real time. If we retry the request, will we be sending the full request again, and the streaming responses will start over from the beginning? That will probably cause weird messages to be displayed in the UI, unless the UI also handles this error case (and we would be need to communicate the error to the UI in a special way).

I think that it might be safer to simply relay the error message to the user, and let them retry if they want to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great points. I'll switch it to skip retrying. I think that's also fine for the non-streaming case too. If we are in the non-streaming code path I'd expect a failure status code first which would cause a throw so we shouldn't even be getting to this code block in that scenario.

@@ -1,12 +0,0 @@
/*
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I turned this into the class ErrorResponse below.

public class ErrorResponse {

// Denotes an error object that was not found
public static final ErrorResponse UNDEFINED_ERROR = new ErrorResponse(false);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I need this is to differentiate between an error object being present but a specific field in the object missing (for example message) and when no error exists. Previously we were returning null in both of those scenarios. Now if there's an exception or the main field that defines an error object doesn't exist we'll return this instance to indicate that there was no error present.

If a nested field is missing we return a created instance with errorStructureFound set to true.

@@ -104,7 +95,7 @@ private static boolean isContentTooLarge(HttpResult result) {
}

if (statusCode == 400) {
var errorEntity = OpenAiErrorResponseEntity.fromResponse(result);
var errorEntity = ErrorMessageResponseEntity.fromResponse(result);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were the same so I removed the OpenAiResponseEntity class.

}

@Override
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to the base class ErrorResponse.

public void testFromResponse_noMessage() {
String responseJson = """
{
"error": {
"type": "not_found_error",
"type": "not_found_error"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was actually causing a parsing failure instead of a missing field failure because of the trailing comma.

@jonathan-buttner jonathan-buttner marked this pull request as ready for review December 10, 2024 22:01
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine
Copy link
Collaborator

Hi @jonathan-buttner, I've created a changelog YAML for you.

Copy link
Contributor

@maxhniebergall maxhniebergall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -0,0 +1,5 @@
pr: 118375
summary: Checking for presence of error object when validating response
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this will end up in the changelog, I think we should make sure to mention streaming and the inference API:
"Check for presence of error object when validating streaming responses from integrations in the inference API"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks forgot to update this 👍


if (errorEntity.errorStructureFound()) {
// we don't really know what happened because the status code was 200 so we'll just retry
throw new RetryException(true, buildError(SERVER_ERROR_OBJECT, request, result, errorEntity));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

further, I think we need to consider what the UX will be if we retry. With streaming, the user will be seeing the response from the integration in real time. If we retry the request, will we be sending the full request again, and the streaming responses will start over from the beginning? That will probably cause weird messages to be displayed in the UI, unless the UI also handles this error case (and we would be need to communicate the error to the UI in a special way).

I think that it might be safer to simply relay the error message to the user, and let them retry if they want to.

@jonathan-buttner jonathan-buttner merged commit f4dc716 into elastic:main Dec 11, 2024
16 checks passed
@jonathan-buttner jonathan-buttner deleted the ml-streaming-error-obj branch December 11, 2024 21:03
@jonathan-buttner
Copy link
Contributor Author

💚 All backports created successfully

Status Branch Result
8.x

Questions ?

Please refer to the Backport tool documentation

jonathan-buttner added a commit to jonathan-buttner/elasticsearch that referenced this pull request Dec 16, 2024
…lastic#118375)

* Refactoring error handling logic

* Refactoring base response handler to remove duplication

* Update docs/changelog/118375.yaml

* Addressing feedback

---------

Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com>
(cherry picked from commit f4dc716)

# Conflicts:
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/alibabacloudsearch/AlibabaCloudSearchResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/anthropic/AnthropicResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/cohere/CohereResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/elastic/ElasticInferenceServiceResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googleaistudio/GoogleAiStudioResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googlevertexai/GoogleVertexAiResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/huggingface/HuggingFaceResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/ibmwatsonx/IbmWatsonxResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/openai/OpenAiResponseHandler.java
jonathan-buttner added a commit that referenced this pull request Dec 16, 2024
…118375) (#118795)

* Refactoring error handling logic

* Refactoring base response handler to remove duplication

* Update docs/changelog/118375.yaml

* Addressing feedback

---------

Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com>
(cherry picked from commit f4dc716)

# Conflicts:
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/alibabacloudsearch/AlibabaCloudSearchResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/anthropic/AnthropicResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/cohere/CohereResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/elastic/ElasticInferenceServiceResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googleaistudio/GoogleAiStudioResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googlevertexai/GoogleVertexAiResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/huggingface/HuggingFaceResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/ibmwatsonx/IbmWatsonxResponseHandler.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/openai/OpenAiResponseHandler.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement Feature:GenAI Features around GenAI :ml Machine learning Team:ML Meta label for the ML team v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants