So, based on the emphasis on "retrieval" in the blog title, are these models narrowly focused on retrieval tasks only? E.g. I shouldn't use them for clustering, STS, and so on?
How does voyage-3.5 compare against Gemini Embedding (GE)? I thought GE had the top spot for retrieval tasks on MMTEB. Is Voyage saying here that voyage-3.5 now has the top spot? Or is it just that, for the 10 specified datasets, voyage-3.5 outperforms the specified OpenAI and Cohere models?
There's an interesting question here - has embedding performance reached saturation? In practice, most people are pulling in 25 to 100 candidates and reranking the results. Does it really matter if a model is 1-3% better at pulling in the top 10 when the relevant document is probably going to be captured in the top 50 anyway? I think at this point the real frontier is making these models as small as possible to minimize hosting costs.
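The point above is that the embedding model only needs to land the relevant document somewhere in a generous candidate set, because the reranker does the fine-grained ordering. A minimal sketch of that two-stage pipeline (the random vectors and reranker scores here are stand-ins for real embedding-model and reranker calls, not any specific API):

```python
import numpy as np

# Stand-in corpus and query vectors; a real system would get these
# from an embedding model (voyage-3.5, OpenAI, Cohere, etc.).
rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(200, 16))
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)
query = rng.normal(size=16)
query /= np.linalg.norm(query)

# Stage 1: embedding retrieval pulls a generous candidate set (here top-50).
sims = doc_vecs @ query
candidates = np.argsort(-sims)[:50]

# Stage 2: a reranker (stand-in scores here) reorders only those candidates,
# so a document at embedding rank 40 can still surface in the final top 10.
rerank_scores = rng.normal(size=len(candidates))  # stand-in for reranker output
final = candidates[np.argsort(-rerank_scores)][:10]
```

Under this setup, a small difference in first-stage recall@50 matters far less than the reranker's quality, which is the crux of the saturation argument.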
I think it really depends on the use case. It is well known that most users only look at and engage with the top few (1-3) results in a search. If you can move the most relevant result from, say, position 7 to position 2, that can have a big impact on the user experience. And I know they market this for RAG, but I think that's just marketing and this is just as relevant for traditional search.
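To put a number on that rank-7-to-rank-2 improvement: in reciprocal-rank terms (the per-query component of MRR), it's roughly a 3.5x gain for that query. A back-of-the-envelope illustration, not a figure from the post:

```python
# Reciprocal rank: the per-query term of MRR, 1 / rank of the first relevant hit.
def reciprocal_rank(rank_of_first_relevant: int) -> float:
    return 1.0 / rank_of_first_relevant

print(round(reciprocal_rank(7), 3))  # 0.143
print(round(reciprocal_rank(2), 3))  # 0.5
```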
Voyage models are great in my experience and I am planning to test 3.5. Almost more interested in 3.5-lite though. Great price.
My concern: the Voyage API has been unreliable. They were bought by MongoDB, which makes me a little uneasy.
Gemini Embedding looks like a great model, but it's in preview and there haven't been any updates for a while (including at I/O). Also not sure how committed Google is to embedding models.
If a competitor model is conspicuously absent, it is likely better
How do these models compare to BGE M3?