Embedding Model Reference
| Model | Provider | Dims | Max Tokens | MTEB Score | Multilingual | Price (per 1M tokens) | Open Source |
|---|---|---|---|---|---|---|---|
| text-embedding-3-large | OpenAI | 3072 | 8,191 | 64.6 | Yes | $0.13 | No |
| text-embedding-3-small | OpenAI | 1536 | 8,191 | 62.3 | Yes | $0.02 | No |
| text-embedding-4 | OpenAI | 3072 | 8,191 | 66.4 | Yes | $0.10 | No |
| voyage-3 | Voyage AI | 1024 | 32,000 | 67.3 | Yes | $0.06 | No |
| embed-v3.5 | Cohere | 1024 | 512 | 65.0 | Yes | $0.10 | No |
| mistral-embed | Mistral | 1024 | 8,192 | 63.2 | Yes | $0.10 | No |
| jina-embeddings-v3 | Jina AI | 1024 | 8,192 | 65.5 | Yes | $0.02 | No |
| BGE-large-en-v1.5 | BAAI | 1024 | 512 | 64.2 | No | Free | Yes |
| BGE-M3 | BAAI | 1024 | 8,192 | 68.1 | Yes | Free | Yes |
| nomic-embed-text-v1.5 | Nomic | 768 | 8,192 | 62.3 | No | Free | Yes |
| e5-mistral-7b-instruct | Microsoft | 4096 | 32,768 | 66.6 | Yes | Free | Yes |
| e5-large-v2 | Microsoft | 1024 | 512 | 62.0 | No | Free | Yes |
| GTE-Qwen2-7B-instruct | Alibaba | 3584 | 32,768 | 67.2 | Yes | Free | Yes |
| GTE-large | Alibaba | 1024 | 8,192 | 63.1 | No | Free | Yes |
| all-MiniLM-L6-v2 | SBERT | 384 | 256 | 56.3 | No | Free | Yes |
About the Embedding Model Reference
The Embedding Model Reference is a searchable comparison table of 15 widely used text embedding models from OpenAI, Voyage AI, Cohere, Mistral, Jina AI, BAAI, Microsoft, Nomic, Alibaba, and SBERT. For each model it shows the provider, vector dimensions, maximum input token limit, MTEB benchmark score, multilingual support, pricing per million tokens, and whether the model is open-source. This makes it easy to compare models at a glance when choosing an embedding backend for a RAG pipeline, semantic search system, or similarity-based application.
The reference covers a carefully curated mix of commercial and open-source models: commercial APIs such as text-embedding-3-large (OpenAI, 3072 dimensions), voyage-3 (Voyage AI), embed-v3.5 (Cohere), mistral-embed (Mistral), and jina-embeddings-v3 (Jina AI), alongside open-source models such as BGE-large-en-v1.5 and BGE-M3 (BAAI), nomic-embed-text-v1.5 (Nomic), e5-large-v2 and e5-mistral-7b-instruct (Microsoft), GTE-large and GTE-Qwen2-7B-instruct (Alibaba), and all-MiniLM-L6-v2 (SBERT, 384 dimensions). This covers the full spectrum from lightweight models suitable for edge use to high-dimensional commercial models for enterprise search.
Use the search bar to filter models by name or provider, and use the All / Open Source / Commercial toggle to narrow results. All filtering is performed client-side — the dataset is embedded in the page bundle, so searches are instant and require no network request. The reference is particularly useful for ML engineers evaluating embedding cost and quality trade-offs, developers integrating vector databases, and researchers comparing open-weight models.
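The client-side filtering described above can be sketched as follows. This is an illustrative Python analogue of the browser logic, not the tool's actual code; the record fields and filter values (`"all"`, `"open"`, `"commercial"`) are assumptions:

```python
# A subset of the table above, as illustrative records.
MODELS = [
    {"name": "text-embedding-3-small", "provider": "OpenAI", "open_source": False},
    {"name": "BGE-M3", "provider": "BAAI", "open_source": True},
    {"name": "all-MiniLM-L6-v2", "provider": "SBERT", "open_source": True},
]

def filter_models(query: str = "", license_filter: str = "all"):
    """Case-insensitive search over name and provider, plus the
    All / Open Source / Commercial toggle."""
    q = query.lower()
    results = []
    for m in MODELS:
        if q and q not in m["name"].lower() and q not in m["provider"].lower():
            continue
        if license_filter == "open" and not m["open_source"]:
            continue
        if license_filter == "commercial" and m["open_source"]:
            continue
        results.append(m)
    return results

print(len(filter_models("bge")))       # 1 match: BGE-M3
print(len(filter_models("", "open")))  # 2 open-source models
```

Because the whole dataset lives in memory, every keystroke can re-run the filter with no network round-trip, which is what makes the search feel instant.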
Key Features
- Covers 15 embedding models: OpenAI, Voyage AI, Cohere, Mistral, Jina AI, BAAI (BGE), Microsoft (E5), Nomic, Alibaba (GTE), SBERT
- Displays vector dimensions, max input tokens, MTEB score, multilingual support, and per-million-token pricing for each model
- Open-source / commercial filter toggle to quickly narrow to free or paid models
- Real-time search by model name or provider name
- Result count shows how many models match the current filter
- Open-source badge (Yes/No) for quick identification of freely available models
- 100% client-side — all data is bundled in the page, no API calls needed
- Dark mode support and responsive table layout for desktop and mobile use
Frequently Asked Questions
What is a text embedding model?
A text embedding model converts text (words, sentences, or documents) into dense numerical vectors in a high-dimensional space. Texts with similar meanings produce vectors that are close together in this space. Embeddings are the foundation of semantic search, retrieval-augmented generation (RAG), clustering, classification, and recommendation systems.
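The "close together" notion is usually measured with cosine similarity. A minimal sketch with toy 3-dimensional vectors (real models output 384 to 4096 dimensions, and the values here are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": semantically similar texts get nearby vectors.
cat = [0.90, 0.10, 0.00]
kitten = [0.85, 0.15, 0.05]
invoice = [0.00, 0.20, 0.95]

# "cat" should be closer to "kitten" than to "invoice".
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```

Semantic search is essentially this comparison run between a query vector and every stored document vector, with a vector index making the nearest-neighbor lookup fast.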
What do the dimensions in the reference mean?
Dimensions refers to the length of the vector produced by the model. text-embedding-3-large produces 3072-dimensional vectors, while all-MiniLM-L6-v2 produces 384-dimensional vectors. Higher dimensions can capture more semantic nuance but require more storage and computation. For many applications, 768 or 1024 dimensions provide a good balance.
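The storage cost of a dimension choice is easy to estimate. A back-of-envelope sketch, assuming float32 vectors (4 bytes per dimension) and ignoring index metadata and graph overhead:

```python
def index_size_bytes(num_vectors: int, dims: int, bytes_per_dim: int = 4) -> int:
    """Raw vector storage for a float32 index (metadata and ANN graph excluded)."""
    return num_vectors * dims * bytes_per_dim

# Storing one million chunks at three dimension counts from the table:
for dims in (384, 1024, 3072):
    gb = index_size_bytes(1_000_000, dims) / 1e9
    print(f"{dims} dims -> {gb:.2f} GB")  # 1.54 GB, 4.10 GB, 12.29 GB
```

So moving from all-MiniLM-L6-v2 (384 dims) to text-embedding-3-large (3072 dims) multiplies raw vector storage by 8x, which is the trade-off the paragraph above alludes to.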
What is the max tokens limit?
Max tokens is the maximum number of tokens the model can process in a single embedding request. OpenAI models support 8,191 tokens (~6,000 words), while some older models like all-MiniLM-L6-v2 support only 256. For long documents, you need to chunk them into segments that fit within the model's token limit before embedding.
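Chunking can be as simple as a fixed-size sliding window. A minimal sketch that approximates tokens with whitespace-split words (a real pipeline would count tokens with the model's own tokenizer, e.g. via a tokenizer library, and the window sizes here are arbitrary):

```python
def chunk_words(text: str, max_words: int = 200, overlap: int = 20):
    """Split text into word-based chunks so each fits the model's input limit.
    Overlapping the windows keeps sentences at chunk boundaries retrievable."""
    words = text.split()
    step = max_words - overlap
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), step)]

doc = ("lorem " * 500).strip()          # a 500-word stand-in document
chunks = chunk_words(doc, max_words=200, overlap=20)
print(len(chunks))                      # 3 chunks at a 180-word step
```

Each chunk is then embedded separately, and retrieval returns chunks rather than whole documents.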
Which model should I choose for a RAG application?
For production RAG systems, text-embedding-3-small (OpenAI, $0.02/1M tokens, 1536 dims) is a popular cost-effective choice. For open-source options, BGE-M3 (BAAI, free, 1024 dims, 8192 max tokens) is highly capable. For lightweight on-device use, all-MiniLM-L6-v2 (384 dims, free) is fast and resource-efficient.
What is the difference between open-source and commercial models?
Open-source models (BGE, E5, nomic-embed, all-MiniLM, GTE) are free to use and can be self-hosted on your own infrastructure. Commercial models (OpenAI, Voyage AI, Cohere) are accessed via API and charge per token. Open-source models give you data privacy and no per-query costs, while commercial models are often easier to integrate and may offer higher quality.
How does BGE-M3 differ from BGE-large-en-v1.5?
BGE-large-en-v1.5 is optimized for English with a 512 max token limit, while BGE-M3 is a multilingual model supporting 100+ languages with an 8,192 token limit. BGE-M3 is generally preferred for multilingual applications or when longer context windows are needed.
What is the pricing model for commercial embeddings?
Commercial embedding APIs charge per million input tokens. OpenAI's text-embedding-3-large costs $0.13/1M tokens and text-embedding-3-small costs $0.02/1M tokens; Voyage AI's voyage-3 costs $0.06/1M, Cohere's embed-v3.5 $0.10/1M, Mistral's mistral-embed $0.10/1M, and Jina AI's jina-embeddings-v3 $0.02/1M. The open-source models in this reference are free to use, though self-hosting them carries its own compute cost.
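Per-token pricing makes corpus-level costs straightforward to project. A small sketch using prices from the table above (the corpus size and average document length are made-up inputs):

```python
PRICE_PER_1M = {  # USD per million input tokens, from the reference table
    "text-embedding-3-large": 0.13,
    "text-embedding-3-small": 0.02,
    "voyage-3": 0.06,
}

def embedding_cost(model: str, total_tokens: int) -> float:
    """Cost in USD to embed total_tokens with the given model."""
    return PRICE_PER_1M[model] * total_tokens / 1_000_000

# Embedding 10,000 documents averaging 500 tokens each (5M tokens total):
tokens = 10_000 * 500
print(f"${embedding_cost('text-embedding-3-large', tokens):.2f}")  # $0.65
print(f"${embedding_cost('text-embedding-3-small', tokens):.2f}")  # $0.10
```

Note this covers one-time indexing only; query-time embedding of user searches is billed at the same per-token rate.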
Can I use this reference to decide on vector database dimensionality?
Yes. The dimensions column directly tells you what index size to configure in your vector database (Pinecone, Weaviate, Qdrant, Milvus, pgvector, etc.). For example, if you choose text-embedding-3-large, configure your index for 3072 dimensions. Choosing a model with fewer dimensions reduces storage and improves query speed.