Embedding Model Reference
| Model | Provider | Dims | Max Tokens | MTEB Score | Multilingual | Price (per 1M tokens) | Open Source |
|---|---|---|---|---|---|---|---|
| text-embedding-3-large | OpenAI | 3072 | 8,191 | 64.6 | Yes | $0.13 | No |
| text-embedding-3-small | OpenAI | 1536 | 8,191 | 62.3 | Yes | $0.02 | No |
| text-embedding-4 | OpenAI | 3072 | 8,191 | 66.4 | Yes | $0.10 | No |
| voyage-3 | Voyage AI | 1024 | 32,000 | 67.3 | Yes | $0.06 | No |
| embed-v3.5 | Cohere | 1024 | 512 | 65.0 | Yes | $0.10 | No |
| mistral-embed | Mistral | 1024 | 8,192 | 63.2 | Yes | $0.10 | No |
| jina-embeddings-v3 | Jina AI | 1024 | 8,192 | 65.5 | Yes | $0.02 | No |
| BGE-large-en-v1.5 | BAAI | 1024 | 512 | 64.2 | No | Free | Yes |
| BGE-M3 | BAAI | 1024 | 8,192 | 68.1 | Yes | Free | Yes |
| nomic-embed-text-v1.5 | Nomic | 768 | 8,192 | 62.3 | No | Free | Yes |
| e5-mistral-7b-instruct | Microsoft | 4096 | 32,768 | 66.6 | Yes | Free | Yes |
| e5-large-v2 | Microsoft | 1024 | 512 | 62.0 | No | Free | Yes |
| GTE-Qwen2-7B-instruct | Alibaba | 3584 | 32,768 | 67.2 | Yes | Free | Yes |
| GTE-large | Alibaba | 1024 | 8,192 | 63.1 | No | Free | Yes |
| all-MiniLM-L6-v2 | SBERT | 384 | 256 | 56.3 | No | Free | Yes |
About the Embedding Model Reference
The Embedding Model Reference is a searchable comparison table of 15 widely used text embedding models from OpenAI, Voyage AI, Cohere, Mistral, Jina AI, BAAI, Microsoft, Nomic, Alibaba, and SBERT. For each model it shows the provider, vector dimensions, maximum input token limit, MTEB benchmark score, multilingual support, pricing per million tokens, and whether the model is open-source. This makes it easy to compare models at a glance when choosing an embedding backend for a RAG pipeline, semantic search system, or similarity-based application.
The reference covers a carefully curated mix of commercial and open-source models: commercial APIs such as text-embedding-3-large (OpenAI, 3072 dimensions), voyage-3 (Voyage AI), embed-v3.5 (Cohere), mistral-embed (Mistral), and jina-embeddings-v3 (Jina AI), alongside open-source models such as BGE-large-en-v1.5 and BGE-M3 (BAAI), nomic-embed-text-v1.5 (Nomic), e5-large-v2 and e5-mistral-7b-instruct (Microsoft), GTE-large and GTE-Qwen2-7B-instruct (Alibaba), and all-MiniLM-L6-v2 (SBERT, 384 dimensions). This covers the full spectrum from lightweight models suitable for edge use to high-dimensional commercial models for enterprise search.
Use the search bar to filter models by name or provider, and use the All / Open Source / Commercial toggle to narrow results. All filtering is performed client-side — the dataset is embedded in the page bundle, so searches are instant and require no network request. The reference is particularly useful for ML engineers evaluating embedding cost and quality trade-offs, developers integrating vector databases, and researchers comparing open-weight models.
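The client-side filtering described above can be sketched as follows. This is an illustrative Python analogue of the browser logic, not the tool's actual code; the record fields and filter values (`"all"`, `"open"`, `"commercial"`) are assumptions:

```python
# A subset of the table above, as illustrative records.
MODELS = [
    {"name": "text-embedding-3-small", "provider": "OpenAI", "open_source": False},
    {"name": "BGE-M3", "provider": "BAAI", "open_source": True},
    {"name": "all-MiniLM-L6-v2", "provider": "SBERT", "open_source": True},
]

def filter_models(query: str = "", license_filter: str = "all"):
    """Case-insensitive search over name and provider, plus the
    All / Open Source / Commercial toggle."""
    q = query.lower()
    results = []
    for m in MODELS:
        if q and q not in m["name"].lower() and q not in m["provider"].lower():
            continue
        if license_filter == "open" and not m["open_source"]:
            continue
        if license_filter == "commercial" and m["open_source"]:
            continue
        results.append(m)
    return results

print(len(filter_models("bge")))       # 1 match: BGE-M3
print(len(filter_models("", "open")))  # 2 open-source models
```

Because the whole dataset lives in memory, every keystroke can re-run the filter with no network round-trip, which is what makes the search feel instant.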
Key Features
- Covers 15 embedding models: OpenAI, Voyage AI, Cohere, Mistral, Jina AI, BAAI (BGE), Microsoft (E5), Nomic, Alibaba (GTE), SBERT
- Displays vector dimensions, max input tokens, MTEB score, multilingual support, and per-million-token pricing for each model
- Open-source / commercial filter toggle to quickly narrow to free or paid models
- Real-time search by model name or provider name
- Result count shows how many models match the current filter
- Open-source badge (Yes/No) for quick identification of freely available models
- 100% client-side — all data is bundled in the page, no API calls needed
- Dark mode support and responsive table layout for desktop and mobile use
Frequently Asked Questions
What is a text embedding model?
A text embedding model converts text (words, sentences, or documents) into dense numerical vectors in a high-dimensional space. Texts with similar meanings produce vectors that are close together in this space. Embeddings are the foundation of semantic search, retrieval-augmented generation (RAG), clustering, classification, and recommendation systems.
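The "close together" notion is usually measured with cosine similarity. A minimal sketch with toy 3-dimensional vectors (real models output 384 to 4096 dimensions, and the values here are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": semantically similar texts get nearby vectors.
cat = [0.90, 0.10, 0.00]
kitten = [0.85, 0.15, 0.05]
invoice = [0.00, 0.20, 0.95]

# "cat" should be closer to "kitten" than to "invoice".
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```

Semantic search is essentially this comparison run between a query vector and every stored document vector, with a vector index making the nearest-neighbor lookup fast.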
What do the dimensions in the reference mean?
Dimensions refers to the length of the vector produced by the model. text-embedding-3-large produces 3072-dimensional vectors, while all-MiniLM-L6-v2 produces 384-dimensional vectors. Higher dimensions can capture more semantic nuance but require more storage and computation. For many applications, 768 or 1024 dimensions provide a good balance.
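The storage cost of a dimension choice is easy to estimate. A back-of-envelope sketch, assuming float32 vectors (4 bytes per dimension) and ignoring index metadata and graph overhead:

```python
def index_size_bytes(num_vectors: int, dims: int, bytes_per_dim: int = 4) -> int:
    """Raw vector storage for a float32 index (metadata and ANN graph excluded)."""
    return num_vectors * dims * bytes_per_dim

# Storing one million chunks at three dimension counts from the table:
for dims in (384, 1024, 3072):
    gb = index_size_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dims -> {gb:.2f} GB")  # 1.54 GB, 4.10 GB, 12.29 GB
```

So moving from all-MiniLM-L6-v2 (384 dims) to text-embedding-3-large (3072 dims) multiplies raw vector storage by 8x, which is the trade-off the paragraph above alludes to.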
What is the max tokens limit?
Max tokens is the maximum number of tokens the model can process in a single embedding request. OpenAI models support 8,191 tokens (~6,000 words), while some older models like all-MiniLM-L6-v2 support only 256. For long documents, you need to chunk them into segments that fit within the model's token limit before embedding.
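Chunking can be as simple as a fixed-size sliding window. A minimal sketch that approximates tokens with whitespace-split words (a real pipeline would count tokens with the model's own tokenizer, e.g. via a tokenizer library, and the window sizes here are arbitrary):

```python
def chunk_words(text: str, max_words: int = 200, overlap: int = 20):
    """Split text into word-based chunks so each fits the model's input limit.
    Overlapping the windows keeps sentences at chunk boundaries retrievable."""
    words = text.split()
    step = max_words - overlap
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), step)]

doc = ("lorem " * 500).strip()          # a 500-word stand-in document
chunks = chunk_words(doc, max_words=200, overlap=20)
print(len(chunks))                      # 3 chunks at a 180-word step
```

Each chunk is then embedded separately, and retrieval returns chunks rather than whole documents.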
Which model should I choose for a RAG application?
For production RAG systems, text-embedding-3-small (OpenAI, $0.02/1M tokens, 1536 dims) is a popular cost-effective choice. For open-source options, BGE-M3 (BAAI, free, 1024 dims, 8192 max tokens) is highly capable. For lightweight on-device use, all-MiniLM-L6-v2 (384 dims, free) is fast and resource-efficient.
What is the difference between open-source and commercial models?
Open-source models (BGE, E5, nomic-embed, all-MiniLM, GTE) are free to use and can be self-hosted on your own infrastructure. Commercial models (OpenAI, Voyage AI, Cohere) are accessed via API and charge per token. Open-source models give you data privacy and no per-query costs, while commercial models are often easier to integrate and may offer higher quality.
How does BGE-M3 differ from BGE-large-en-v1.5?
BGE-large-en-v1.5 is optimized for English with a 512 max token limit, while BGE-M3 is a multilingual model supporting 100+ languages with an 8,192 token limit. BGE-M3 is generally preferred for multilingual applications or when longer context windows are needed.
What is the pricing model for commercial embeddings?
Commercial embedding APIs charge per million input tokens. OpenAI's text-embedding-3-large costs $0.13/1M tokens and text-embedding-3-small costs $0.02/1M tokens; Voyage AI's voyage-3 costs $0.06/1M, Cohere's embed-v3.5 $0.10/1M, Mistral's mistral-embed $0.10/1M, and Jina AI's jina-embeddings-v3 $0.02/1M. The open-source models in this reference are free to use, though self-hosting them carries its own compute cost.
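Per-token pricing makes corpus-level costs straightforward to project. A small sketch using prices from the table above (the corpus size and average document length are made-up inputs):

```python
PRICE_PER_1M = {  # USD per million input tokens, from the reference table
    "text-embedding-3-large": 0.13,
    "text-embedding-3-small": 0.02,
    "voyage-3": 0.06,
}

def embedding_cost(model: str, total_tokens: int) -> float:
    """Cost in USD to embed total_tokens with the given model."""
    return PRICE_PER_1M[model] * total_tokens / 1_000_000

# Embedding 10,000 documents averaging 500 tokens each (5M tokens total):
tokens = 10_000 * 500
print(f"${embedding_cost('text-embedding-3-large', tokens):.2f}")  # $0.65
print(f"${embedding_cost('text-embedding-3-small', tokens):.2f}")  # $0.10
```

Note this covers one-time indexing only; query-time embedding of user searches is billed at the same per-token rate.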
Can I use this reference to decide on vector database dimensionality?
Yes. The dimensions column directly tells you what index size to configure in your vector database (Pinecone, Weaviate, Qdrant, Milvus, pgvector, etc.). For example, if you choose text-embedding-3-large, configure your index for 3072 dimensions. Choosing a model with fewer dimensions reduces storage and improves query speed.