Skip to content

Fixing bug with bedrock client caching #118177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Dec 10, 2024
Merged

Fixing bug with bedrock client caching #118177

merged 6 commits into from
Dec 10, 2024

Conversation

ymao1
Copy link
Contributor

@ymao1 ymao1 commented Dec 6, 2024

Resolves #117916

Summary

  1. Fixes an order of operations bug with flushing the client cache and retrieving the client. Previously, we look in the cache for a client and return it, regardless of whether it is expired or not, and then we flush the cache of expired clients. This was causing the event executor terminated error. This PR changes the order so that we flush the cache of expired clients first and then we try to retrieve the client (creating a new one if the expired one was flushed).

  2. Updates the expiration date of the client if the client is used. This allows an actively used client to continue being cached. Previously, the client would be flushed every 5 minutes regardless of whether it's in use or not. If this is not the desired behavior, I can revert this change.

@ymao1 ymao1 added :ml Machine learning Team:ML Meta label for the ML team auto-backport Automatically create backport pull requests when merged v8.18.0 labels Dec 6, 2024
@ymao1 ymao1 self-assigned this Dec 6, 2024
@ymao1 ymao1 added the >bug label Dec 6, 2024
@elasticsearchmachine
Copy link
Collaborator

Hi @ymao1, I've created a changelog YAML for you.

@ymao1 ymao1 marked this pull request as ready for review December 6, 2024 19:07
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/ml-core (Team:ML)

@@ -0,0 +1,6 @@
pr: 118177
summary: Fixing bug with bedrock client caching
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like: Fixing bedrock event executor terminated cache issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 5882aa4

builtClient.resetExpiration();
return builtClient;
return clientsCache.compute(modelHash, (hashKey, client) -> {
if (client == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we could collapse this to something like:

AmazonBedrockBaseClient clientToUse = client;
if (clientToUse == null) {
  clientToUse = creator.apply(model, timeout);
}

clientToUse.setClock(clock);
clientToUse.resetExpiration();
return clientToUse;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 5882aa4

return builtClient;
} else {
// for testing
client.setClock(clock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This is completely unrelated to your PR but I wonder if we could refactor the client class' creator function to take the clock as a parameter so we can avoid the setClock() class here 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not tackle this comment in this PR but I left a code comment. Would you like me to open an issue?

@ymao1 ymao1 merged commit 43e298f into elastic:main Dec 10, 2024
16 checks passed
@ymao1 ymao1 deleted the es-117916 branch December 10, 2024 16:15
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.x

ymao1 added a commit to ymao1/elasticsearch that referenced this pull request Dec 10, 2024
* Fixing bug with bedrock client caching

* Update docs/changelog/118177.yaml

* PR feedback
elasticsearchmachine pushed a commit that referenced this pull request Dec 10, 2024
* Fixing bug with bedrock client caching

* Update docs/changelog/118177.yaml

* PR feedback
prwhelan pushed a commit to prwhelan/elasticsearch that referenced this pull request Mar 28, 2025
* Fixing bug with bedrock client caching

* Update docs/changelog/118177.yaml

* PR feedback
prwhelan added a commit that referenced this pull request Mar 28, 2025
* Fixing bug with bedrock client caching

* Update docs/changelog/118177.yaml

* PR feedback

Co-authored-by: Ying Mao <ying.mao@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Automatically create backport pull requests when merged >bug :ml Machine learning Team:ML Meta label for the ML team v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inference API returning Unable to execute HTTP request: event executor terminated
3 participants