feat: VoyageAI integration #122134

Merged
jonathan-buttner merged 28 commits into elastic:main on Feb 20, 2025

Conversation

fzowl
Contributor

fzowl commented Feb 9, 2025

Adds VoyageAI embeddings and rerank model integration through the VoyageAI API (see https://docs.voyageai.com/reference/embeddings-api and https://docs.voyageai.com/reference/reranker-api).
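
For readers who have not used the two endpoints linked above, here is a rough sketch of the requests involved. It is not the code added by this PR. The base URL, the `/embeddings` and `/rerank` paths, the JSON field names, and the model names `voyage-3` and `rerank-2` are assumptions taken from VoyageAI's public reference and should be checked against the linked pages. The class name, helper names, and the `VOYAGE_API_KEY` environment variable are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only; not the implementation from this PR.
class VoyageRequestSketch {
    // Assumed base URL; verify against the linked VoyageAI reference pages.
    static final String BASE_URL = "https://api.voyageai.com/v1";

    // Quotes a string as a JSON string literal (quotes, backslashes, control characters).
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
                }
            }
        }
        return sb.append('"').toString();
    }

    static String jsonArray(List<String> values) {
        return values.stream().map(VoyageRequestSketch::quote).collect(Collectors.joining(",", "[", "]"));
    }

    // Request body for the embeddings endpoint (field names assumed: "input", "model").
    static String embeddingsBody(String model, List<String> input) {
        return "{\"input\":" + jsonArray(input) + ",\"model\":" + quote(model) + "}";
    }

    // Request body for the rerank endpoint (field names assumed: "query", "documents", "model").
    static String rerankBody(String model, String query, List<String> documents) {
        return "{\"query\":" + quote(query) + ",\"documents\":" + jsonArray(documents)
            + ",\"model\":" + quote(model) + "}";
    }

    // Builds, but does not send, an authenticated JSON POST request.
    static HttpRequest post(String path, String apiKey, String body) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + path))
            .header("Authorization", "Bearer " + apiKey)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        // Hypothetical environment variable; a placeholder is used when it is not set.
        String key = System.getenv().getOrDefault("VOYAGE_API_KEY", "placeholder-key");

        String embeddings = embeddingsBody("voyage-3", List.of("first text", "second text"));
        HttpRequest embeddingsRequest = post("/embeddings", key, embeddings);
        System.out.println(embeddingsRequest.method() + " " + embeddingsRequest.uri());
        System.out.println(embeddings);

        String rerank = rerankBody("rerank-2", "what is reranking?", List.of("doc one", "doc two"));
        HttpRequest rerankRequest = post("/rerank", key, rerank);
        System.out.println(rerankRequest.method() + " " + rerankRequest.uri());
        System.out.println(rerank);
    }
}
```

The sketch only constructs the requests, so it can be run without network access or a real API key. Sending them would use `java.net.http.HttpClient`.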

fzowl and others added 15 commits February 5, 2025 17:09
 - embeddings works, tested
 - initial rerank code

What's missing:
 - unit and integration tests
 - rerank request/response mapping and verification
 - embeddings works, tested
 - rerank works, tested (https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank)

What's missing:
 - unit and integration tests
 - embeddings works, tested
 - rerank works, tested (https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank)

What's missing:
 - unit and integration tests
 - embeddings works, tested
 - rerank works, tested (https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank)

What's missing:
 - unit and integration tests
Moving dimensions to ServiceSettings
feat: VoyageAI integration
@elasticsearchmachine
Collaborator

@fzowl please enable the option "Allow edits and access to secrets by maintainers" on your PR. For more information, see the documentation.

@elasticsearchmachine added the needs:triage, v9.1.0, and external-contributor labels Feb 9, 2025
@fzowl
Contributor Author

fzowl commented Feb 9, 2025

@davidkyle, can you please take a look? (sorry, this PR became huge with all the integrations and the tests)

@davidkyle self-assigned this Feb 10, 2025
@davidkyle added the :ml and v8.19.0 labels and removed the needs:triage label Feb 10, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine added the Team:ML label Feb 10, 2025
@davidkyle
Member

@elasticmachine test this please

@jonathan-buttner
Contributor

@elasticmachine test this please

@jonathan-buttner
Contributor

Hmm, I think spotless is complaining. Running spotless like this might help:

./gradlew :x-pack:plugin:core:spotlessApply :x-pack:plugin:core:precommit :x-pack:plugin:ml:spotlessApply :x-pack:plugin:ml:precommit :x-pack:plugin:inference:spotlessApply :x-pack:plugin:inference:precommit :x-pack:plugin:ml:qa:native-multi-node-tests:precommit :server:precommit

@fzowl
Contributor Author

fzowl commented Feb 19, 2025

@jonathan-buttner Thanks! This solved the problem for me: ./gradlew :x-pack:plugin:inference:qa:inference-service-tests:spotlessApply

@jonathan-buttner
Contributor

@elasticmachine test this please

@fzowl
Contributor Author

fzowl commented Feb 20, 2025

@jonathan-buttner can you please help me here? I found that I can reproduce the test failure with:

./gradlew ":rest-api-spec:yamlRestCompatTest" --tests "org.elasticsearch.test.rest.ClientYamlTestSuiteIT" -Dtests.method="test {yaml=indices.create/21_synthetic_source_stored/index param - field ordering}" -Dtests.seed=29856D0D59D74338 -Dtests.locale=csw -Dtests.timezone=US/Pacific -Druntime.java=23

But honestly, I don't know why it fails.

Can you please point me in the right direction to resolve this? Also, I gave you write permission, so please feel free to make any adjustments required. Thank you!

@jonathan-buttner
Contributor

@elasticmachine merge upstream

@jonathan-buttner
Contributor

> @jonathan-buttner can you please help me here?

Yeah that's probably a flaky test. Try merging in main and I'll run the tests again.

@jonathan-buttner
Contributor

> Also, I gave you write permission so please feel free to make any adjustment required. Thank you!

Oh just saw that, thanks! Let me see if I can merge upstream for you

@jonathan-buttner
Contributor

jonathan-buttner commented Feb 20, 2025

@fzowl hmm, I'm having trouble pushing directly to the upstream. Normally I can run gh pr checkout 122134 and then push directly, but it doesn't seem like that's working:

❯ git push
fatal: The upstream branch of your current branch does not match
the name of your current branch.  To push to the upstream branch
on the remote, use

    git push upstream HEAD:refs/pull/122134/head

To push to the branch of the same name on the remote, use

    git push upstream HEAD

To avoid automatically configuring an upstream branch when its name
won't match the local branch, see option 'simple' of branch.autoSetupMerge
in 'git help config'.

I merged main and had to resolve the conflicts in the TransportVersions file. It looks like the number changed, so we needed to bump the definition to:

public static final TransportVersion VOYAGE_AI_INTEGRATION_ADDED_BACKPORT_8_X = def(8_841_0_05);

Note the 8_841_0_05 instead of 8_840_0_05.

Could you try merging upstream?
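
For anyone puzzled by the notation in the version bump above: the underscores in `8_841_0_05` are ordinary Java digit separators, so the literal is just the integer 8841005. A minimal sketch follows, using a hypothetical stand-in record rather than the real `org.elasticsearch.TransportVersion` class; the `def` helper here only mimics the shape of the call quoted above.

```java
// Illustrative sketch only; SketchVersion is a hypothetical stand-in,
// not Elasticsearch's TransportVersion.
class TransportVersionLiteralDemo {
    record SketchVersion(int id) {
        // Mimics the shape of the def(...) call quoted in the comment above.
        static SketchVersion def(int id) {
            return new SketchVersion(id);
        }

        boolean onOrAfter(SketchVersion other) {
            return id >= other.id;
        }
    }

    // Id the PR branch used before main was merged in.
    static final SketchVersion BEFORE_MERGE = SketchVersion.def(8_840_0_05);
    // Id used after merging main and resolving the conflict.
    static final SketchVersion AFTER_MERGE = SketchVersion.def(8_841_0_05);

    public static void main(String[] args) {
        System.out.println(BEFORE_MERGE.id());                    // prints 8840005
        System.out.println(AFTER_MERGE.id());                     // prints 8841005
        System.out.println(AFTER_MERGE.onOrAfter(BEFORE_MERGE));  // prints true
    }
}
```

The separators are purely visual, so a stale id compiles without complaint. The mismatch only shows up as a merge conflict or a failing check.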

@jonathan-buttner
Contributor

Never mind, I think I was able to do it!

@jonathan-buttner
Contributor

@elasticmachine test this please

@fzowl
Contributor Author

fzowl commented Feb 20, 2025

> I merged main, and had to resolve the conflicts in the TransportVersions file. Looks like the number changed so we needed to bump the public static final TransportVersion VOYAGE_AI_INTEGRATION_ADDED_BACKPORT_8_X = def(8_841_0_05); note the 8_841_0_05 instead of 8_840_0_05.

Yeah, I noticed that a new minor version is created very frequently (for every merged PR?). That's why I wanted to modify TransportVersions right before you merge my PR: #122134

@jonathan-buttner
Contributor

@elasticmachine test this please

@jonathan-buttner
Contributor

@elasticmachine test this please

@fzowl
Contributor Author

fzowl commented Feb 20, 2025

@jonathan-buttner This looks good now!

@jonathan-buttner merged commit 521f855 into elastic:main Feb 20, 2025
18 checks passed
@fzowl
Contributor Author

fzowl commented Feb 20, 2025

@jonathan-buttner Thank you very much for your help and support!

@jonathan-buttner
Contributor

Thanks for all the work to get this implemented! Glad I could help.

afoucret pushed a commit to afoucret/elasticsearch that referenced this pull request Feb 21, 2025
* VoyageAI embeddings and rerank:
 - embeddings works, tested
 - initial rerank code

What's missing:
 - unit and integration tests
 - rerank request/response mapping and verification

* VoyageAI embeddings and rerank:
 - embeddings works, tested
 - rerank works, tested (https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank)

What's missing:
 - unit and integration tests

* VoyageAI embeddings and rerank:
 - embeddings works, tested
 - rerank works, tested (https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank)

What's missing:
 - unit and integration tests

* VoyageAI embeddings and rerank:
 - embeddings works, tested
 - rerank works, tested (https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank)

What's missing:
 - unit and integration tests

* Adding initial tests
Moving dimensions to ServiceSettings

* Correcting the TransportVersions.java

* Correcting due to comments

* Adding BIT support

* Initial tests

* More tests

* More tests/corrections

* Removing warnings

* Further tests

* Transport version correction

* Adding changelog and correcting TransportVersions

* Spotless tests

* Changes due to the comments

* Changes due to the comments

* Correcting QA tests

* Correcting QA tests

---------

Co-authored-by: Jonathan Buttner <jonathan.buttner@elastic.co>
Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com>
@jonathan-buttner
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Questions?

Please refer to the Backport tool documentation

jonathan-buttner pushed a commit to jonathan-buttner/elasticsearch that referenced this pull request Feb 28, 2025
(cherry picked from commit 521f855)

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/results/TextEmbeddingResultsTests.java
jonathan-buttner added a commit that referenced this pull request Mar 4, 2025
* feat: VoyageAI integration (#122134)

(cherry picked from commit 521f855)

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
#	x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/results/TextEmbeddingResultsTests.java

* Using correct transport version and fixing errors

* Fixing errors and test failures

---------

Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
Labels
>enhancement, external-contributor, :ml, Team:ML, v8.19.0, v9.1.0
4 participants