-
Notifications
You must be signed in to change notification settings - Fork 25.4k
[ML] Checking for presence of error object when validating response #118375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Checking for presence of error object when validating response #118375
Conversation
@@ -28,21 +24,15 @@ public AlibabaCloudSearchResponseHandler(String requestType, ResponseParser pars | |||
super(requestType, parseFunction, AlibabaCloudSearchErrorResponseEntity::fromResponse); | |||
} | |||
|
|||
@Override | |||
public void validateResponse(ThrottlerManager throttlerManager, Logger logger, Request request, HttpResult result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move up into BaseResponseHandler
.
|
||
// When the response is streamed the status code could be 200 but the error object will be set | ||
// so we need to check for that specifically | ||
checkForErrorObject(request, result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one potential concern for this. This will be executed for non-streaming and streaming code paths. So if for some reason we get a 200 response back (or some other non failure status code) and the response object is a valid response but also happens to have a field that the error object expects (which depends on each service implementation) then this would fail. I doubt that would happen.
If we were concerned about it we could create a new method validateStreamingResponse
and only call checkForErrorObject
in that method.
|
||
if (errorEntity.errorStructureFound()) { | ||
// we don't really know what happened because the status code was 200 so we'll just retry | ||
throw new RetryException(true, buildError(SERVER_ERROR_OBJECT, request, result, errorEntity)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just retry if we get a failure mid-stream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the OpenAI java and python libraries, it seems like they don't retry on error messages.
https://github.com/openai/openai-java/blob/9a6fa4aeea6738190dc1eb28cd52d82fbbf0b18b/openai-java-core/src/main/kotlin/com/openai/core/handlers/SseHandler.kt#L58
https://github.com/openai/openai-python/blob/995cce048f9427bba4f7ac1e5fc60abbf1f8f0b7/src/openai/_streaming.py#L70
The SSE spec does seem to include some allowance for retries, but I'm not sure that it allows for retry on an error returned from the EventSource. It seems that retries mid-stream might only be for disconnections. I'm not totally sure though. https://html.spec.whatwg.org/multipage/server-sent-events.html#event-stream-interpretation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
further, I think we need to consider what the UX will be if we retry. With streaming, the user will be seeing the response from the integration in real time. If we retry the request, will we be sending the full request again, and the streaming responses will start over from the beginning? That will probably cause weird messages to be displayed in the UI, unless the UI also handles this error case (and we would be need to communicate the error to the UI in a special way).
I think that it might be safer to simply relay the error message to the user, and let them retry if they want to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great points. I'll switch it to skip retrying. I think that's also fine for the non-streaming case too. If we are in the non-streaming code path I'd expect a failure status code first which would cause a throw so we shouldn't even be getting to this code block in that scenario.
@@ -1,12 +0,0 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I turned this into the class ErrorResponse
below.
public class ErrorResponse { | ||
|
||
// Denotes an error object that was not found | ||
public static final ErrorResponse UNDEFINED_ERROR = new ErrorResponse(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I need this is to differentiate between an error object being present but a specific field in the object missing (for example message
) and when no error exists. Previously we were returning null in both of those scenarios. Now if there's an exception or the main field that defines an error object doesn't exist we'll return this instance to indicate that there was no error present.
If a nested field is missing we return a created instance with errorStructureFound
set to true.
@@ -104,7 +95,7 @@ private static boolean isContentTooLarge(HttpResult result) { | |||
} | |||
|
|||
if (statusCode == 400) { | |||
var errorEntity = OpenAiErrorResponseEntity.fromResponse(result); | |||
var errorEntity = ErrorMessageResponseEntity.fromResponse(result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were the same so I removed the OpenAiResponseEntity
class.
} | ||
|
||
@Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to the base class ErrorResponse
.
public void testFromResponse_noMessage() { | ||
String responseJson = """ | ||
{ | ||
"error": { | ||
"type": "not_found_error", | ||
"type": "not_found_error" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was actually causing a parsing failure instead of a missing field failure because of the trailing comma.
Pinging @elastic/ml-core (Team:ML) |
Hi @jonathan-buttner, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
docs/changelog/118375.yaml
Outdated
@@ -0,0 +1,5 @@ | |||
pr: 118375 | |||
summary: Checking for presence of error object when validating response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this will end up in the changelog, I think we should make sure to mention streaming and the inference API:
"Check for presence of error object when validating streaming responses from integrations in the inference API"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks forgot to update this 👍
|
||
if (errorEntity.errorStructureFound()) { | ||
// we don't really know what happened because the status code was 200 so we'll just retry | ||
throw new RetryException(true, buildError(SERVER_ERROR_OBJECT, request, result, errorEntity)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
further, I think we need to consider what the UX will be if we retry. With streaming, the user will be seeing the response from the integration in real time. If we retry the request, will we be sending the full request again, and the streaming responses will start over from the beginning? That will probably cause weird messages to be displayed in the UI, unless the UI also handles this error case (and we would be need to communicate the error to the UI in a special way).
I think that it might be safer to simply relay the error message to the user, and let them retry if they want to.
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
…lastic#118375) * Refactoring error handling logic * Refactoring base response handler to remove duplication * Update docs/changelog/118375.yaml * Addressing feedback --------- Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com> (cherry picked from commit f4dc716) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/alibabacloudsearch/AlibabaCloudSearchResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/anthropic/AnthropicResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/cohere/CohereResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/elastic/ElasticInferenceServiceResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googleaistudio/GoogleAiStudioResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googlevertexai/GoogleVertexAiResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/huggingface/HuggingFaceResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/ibmwatsonx/IbmWatsonxResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/openai/OpenAiResponseHandler.java
…118375) (#118795) * Refactoring error handling logic * Refactoring base response handler to remove duplication * Update docs/changelog/118375.yaml * Addressing feedback --------- Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com> (cherry picked from commit f4dc716) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/alibabacloudsearch/AlibabaCloudSearchResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/anthropic/AnthropicResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/cohere/CohereResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/elastic/ElasticInferenceServiceResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googleaistudio/GoogleAiStudioResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/googlevertexai/GoogleVertexAiResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/huggingface/HuggingFaceResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/ibmwatsonx/IbmWatsonxResponseHandler.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/openai/OpenAiResponseHandler.java
This PR adds functionality to check if the response body is an error object. Normally this will be caught by first checking the status code. For streaming the status code can be 200 and the response can be an error.
To achieve this I refactored how the error parsing logic works. Typically we'd return null if a field within the error object was missing (for example if the
message
field was missing). The problem is that we want to consider it an error if the json is shaped like an error even if themessage
field is missing.I also realized that every
*ResponseHandler
had this code:So I moved it up into the base class to remove the duplication.