Embedding Providers
Configure an embedding provider in lucerna.config.ts to enable semantic (vector) search. Without one, only lexical (BM25) search is available.
Each example below shows the lucerna.config.ts snippet for that provider. Pass credentials via process.env to keep them out of source control.
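The snippets on this page use the non-null assertion (process.env.X!), which satisfies the type checker but lets a missing variable through as undefined. As a minimal sketch, a small helper can fail fast with a readable message instead. requireEnv is a hypothetical helper, not part of Lucerna; it takes the environment as a parameter so it can be exercised without touching the real process.env.

```typescript
// Hypothetical helper (not part of Lucerna): read a required variable or fail fast.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In lucerna.config.ts this could be used as, for example:
//   apiKey: requireEnv(process.env, "VOYAGE_API_KEY")
```

The error then surfaces when the config is loaded, rather than later as an authentication failure from the provider.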
Voyage ⭐ recommended for code
Best-in-class for code retrieval. voyage-code-3 is the recommended model for code search.

```typescript
import { defineConfig } from "@upstart.gg/lucerna";

export default defineConfig({
  embedding: { provider: "voyage", model: "voyage-code-3", apiKey: process.env.VOYAGE_API_KEY! },
});
```

All Voyage models are Matryoshka-trained. Where the table lists a Default dim below the Native dim, Lucerna requests the reduced size, giving a ~50% smaller index with negligible quality loss; pass dimensions to override.

| Model | Native dim | Default dim | Price per 1M tokens | Notes |
|---|---|---|---|---|
| voyage-4 | 1024 | 512 | — | Latest generation |
| voyage-4-lite | 512 | 512 | — | Faster, lower cost |
| voyage-code-3 | 1024 | 512 | $0.18 | Code-optimized, 32K ctx, top pick |
| voyage-3-large | 2048 | 1024 | — | Highest quality general-purpose |
| voyage-3.5 | 1024 | 512 | — | Improved general quality |
| voyage-3.5-lite | 1024 | 512 | — | Faster, lower cost |
| voyage-3 | 1024 | 512 | $0.06 | General text |
| voyage-3-lite | 512 | 512 | $0.02 | Faster, lower cost |

Lucerna uses asymmetric retrieval automatically: indexed chunks use input_type: "document" and queries use input_type: "query", matching Voyage’s recommendation for best retrieval quality.
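To make the asymmetry concrete, here is a rough sketch of the two request bodies involved, assuming Voyage's public embeddings endpoint (POST https://api.voyageai.com/v1/embeddings) and its input, model, input_type and output_dimension fields. buildVoyageRequest is a hypothetical name used for illustration; Lucerna's internal code may be organized quite differently.

```typescript
// Illustrative only: the shape of a Voyage embeddings request body.
// buildVoyageRequest is a hypothetical helper, not Lucerna's actual code.
type VoyageInputType = "document" | "query";

interface VoyageEmbeddingRequest {
  model: string;
  input: string[];
  input_type: VoyageInputType;
  output_dimension?: number;
}

function buildVoyageRequest(
  texts: string[],
  inputType: VoyageInputType,
  model = "voyage-code-3",
  dimensions?: number,
): VoyageEmbeddingRequest {
  const body: VoyageEmbeddingRequest = { model, input: texts, input_type: inputType };
  // Only send output_dimension when a reduced size was requested.
  if (dimensions !== undefined) body.output_dimension = dimensions;
  return body;
}

// Indexing: code chunks are embedded as documents.
const indexBody = buildVoyageRequest(["function add(a, b) { return a + b; }"], "document", "voyage-code-3", 512);

// Searching: the user's question is embedded as a query.
const queryBody = buildVoyageRequest(["where is addition implemented?"], "query", "voyage-code-3", 512);
```

The only difference between the two bodies is input_type, which is why no user configuration is needed to get asymmetric behaviour.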
OpenAI
```typescript
import { defineConfig } from "@upstart.gg/lucerna";

export default defineConfig({
  embedding: { provider: "openai", model: "text-embedding-3-small", apiKey: process.env.OPENAI_API_KEY! },
});
```

OpenAI’s text-embedding-3 models are Matryoshka-trained. By default Lucerna passes a reduced dimensions value to the API (512 for small, 768 for large). At these sizes the MTEB scores still beat text-embedding-ada-002 at its full 1536 dimensions, per OpenAI’s own benchmarks. Override via the dimensions option:

```typescript
embedding: { provider: "openai", model: "text-embedding-3-large", apiKey: process.env.OPENAI_API_KEY!, dimensions: 3072 },
```

| Model | Native dim | Default dim | Price per 1M tokens | Notes |
|---|---|---|---|---|
| text-embedding-3-small | 1536 | 512 | $0.02 | Good quality/cost balance |
| text-embedding-3-large | 3072 | 768 | $0.13 | Higher quality |
| text-embedding-ada-002 | 1536 | 1536 | $0.10 | Legacy, not Matryoshka |
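
The effect of a reduced dimension on index size is simple arithmetic. The sketch below assumes vectors are stored as 32-bit floats (4 bytes per dimension) with no compression or per-vector overhead, which is an assumption about storage rather than a documented Lucerna detail. vectorBytes, sizeRatio and the chunk count are hypothetical.

```typescript
const BYTES_PER_FLOAT32 = 4; // assumption: vectors stored as float32

// Raw vector storage for a given dimension and number of indexed chunks.
function vectorBytes(dimensions: number, vectorCount: number): number {
  return dimensions * BYTES_PER_FLOAT32 * vectorCount;
}

// Fraction of the native-size index that the default dimension needs.
function sizeRatio(defaultDim: number, nativeDim: number): number {
  return defaultDim / nativeDim;
}

const chunkCount = 100_000; // hypothetical number of indexed chunks

const smallNativeBytes = vectorBytes(1536, chunkCount); // 614,400,000 bytes
const smallDefaultBytes = vectorBytes(512, chunkCount); // 204,800,000 bytes

const smallRatio = sizeRatio(512, 1536); // one third of the native size
const largeRatio = sizeRatio(768, 3072); // one quarter of the native size
```

Under these assumptions, text-embedding-3-small at 512 dimensions needs about a third of the vector storage of its native 1536, and text-embedding-3-large at 768 about a quarter of its native 3072.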
Cohere
```typescript
import { defineConfig } from "@upstart.gg/lucerna";

export default defineConfig({
  embedding: { provider: "cohere", model: "embed-english-v3.0", apiKey: process.env.COHERE_API_KEY! },
});
```

| Model | Dimensions | Price per 1M tokens | Notes |
|---|---|---|---|
| embed-english-v3.0 | 1024 | $0.10 | English only |
| embed-multilingual-v3.0 | 1024 | $0.10 | Multilingual codebases |
Jina
First 10M tokens free per API key.

```typescript
import { defineConfig } from "@upstart.gg/lucerna";

export default defineConfig({
  embedding: { provider: "jina", model: "jina-embeddings-v3", apiKey: process.env.JINA_API_KEY! },
});
```

jina-embeddings-v3 is Matryoshka-trained and supports truncation down to 32 dimensions with minimal quality loss (~92% retention at 64 per Jina’s benchmarks). Lucerna defaults to 512 for a ~50% smaller index at near-full quality; override via dimensions. Asymmetric retrieval is used automatically: task: "retrieval.passage" for indexed chunks and task: "retrieval.query" for queries.

| Model | Native dim | Default dim | Price per 1M tokens | Notes |
|---|---|---|---|---|
| jina-embeddings-v3 | 1024 | 512 | ~$0.02 | 8K ctx, Matryoshka, good for long code chunks |
| jina-embeddings-v2-base-code | 768 | 768 | — | Code-specific, no Matryoshka |
| jina-embeddings-v2-base-en | 768 | 768 | — | English only, no Matryoshka |
Mistral
```typescript
import { defineConfig } from "@upstart.gg/lucerna";

export default defineConfig({
  embedding: { provider: "mistral", model: "codestral-embed", apiKey: process.env.MISTRAL_API_KEY! },
});
```

Both Mistral embedding models are Matryoshka-trained. Lucerna defaults to 512 for a ~50% smaller index at negligible quality loss; override via dimensions if you need more.

| Model | Native dim | Default dim | Price per 1M tokens | Notes |
|---|---|---|---|---|
| codestral-embed | 1024 | 512 | $0.15 | Code-specific, recommended |
| mistral-embed | 1024 | 512 | $0.10 | Mixed text + code |
Google Gemini
```typescript
import { defineConfig } from "@upstart.gg/lucerna";

export default defineConfig({
  embedding: {
    provider: "gemini",
    model: "gemini-embedding-2-preview",
    apiKey: process.env.GOOGLE_API_KEY!,
    // dimensions defaults to 256 for gemini-* models; override with e.g. `dimensions: 3072` for full native dim
  },
});
```

| Model | Native dim | Default dim | Price per 1M tokens | Notes |
|---|---|---|---|---|
| gemini-embedding-2-preview | 3072 (truncatable 128–3072) | 256 | — | 8K ctx, multi-modal, latest |
| gemini-embedding-001 | 3072 (truncatable 128–3072) | 256 | $0.15 (batch $0.075) | 2K ctx, text only |
| text-embedding-004 | 768 | 768 | $0.15 | Legacy, 2K ctx |

Lucerna uses asymmetric code retrieval automatically: indexed chunks are embedded with RETRIEVAL_DOCUMENT and queries with CODE_RETRIEVAL_QUERY. For gemini-embedding-2-preview — which does not accept task_type — the equivalent prompt prefix is applied automatically, so no configuration is needed.
Both gemini-embedding-* models support Matryoshka truncation via dimensions (128–3072), and Lucerna defaults to 256 to keep the index ~12× smaller and faster to search with minimal quality loss. When dimensions is below the native 3072, Lucerna L2-normalizes each returned vector as required by the API.
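For reference, L2 normalization rescales a vector to unit length, so that dot product and cosine similarity give the same ranking. A minimal sketch follows; l2Normalize is a hypothetical helper shown for illustration, not Lucerna's actual implementation.

```typescript
// Rescale a vector to unit Euclidean length.
function l2Normalize(vector: number[]): number[] {
  const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) return vector.slice(); // leave an all-zero vector unchanged
  return vector.map((x) => x / norm);
}

// A two-element stand-in for a truncated embedding; its length is 5.
const truncated = [3, 4];
const unit = l2Normalize(truncated); // approximately [0.6, 0.8]
```

A truncated Matryoshka vector is generally no longer unit length, which is why this step matters whenever dimensions is below the native size.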
GCP Vertex AI
Requires a GCP project. Authentication uses Application Default Credentials (ADC), so tokens are refreshed automatically and you do not need to manage expiry. location defaults to us-central1.
Local development: run this once; Lucerna then handles token refresh automatically.

```sh
gcloud auth application-default login
```

CI/CD: set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of a service account JSON key file. Lucerna picks it up automatically via ADC, or you can pass it explicitly with keyFile.

```typescript
import { defineConfig } from "@upstart.gg/lucerna";

export default defineConfig({
  embedding: {
    provider: "vertex",
    model: "text-embedding-005",
    project: process.env.GOOGLE_CLOUD_PROJECT!,
    location: "us-central1", // optional
    // keyFile: process.env.GOOGLE_APPLICATION_CREDENTIALS, // explicit service account key (optional)
  },
});
```

| Model | Native dim | Default dim | Price per 1M tokens | Notes |
|---|---|---|---|---|
| gemini-embedding-001 | 3072 (truncatable 128–3072) | 256 | — | Highest quality, 1 input per request |
| text-embedding-005 | 768 | 768 | $0.10 | Latest text model, up to 250 inputs per request |
| text-embedding-004 | 768 | 768 | $0.10 | Previous generation |
| text-multilingual-embedding-002 | 768 | 768 | $0.10 | Multilingual |

Lucerna uses asymmetric code retrieval automatically: indexed chunks are embedded with RETRIEVAL_DOCUMENT and queries with CODE_RETRIEVAL_QUERY, so no configuration is needed. When dimensions is set below the model’s native dim, Lucerna L2-normalizes returned vectors as required by the API.
Note: gemini-embedding-001 on Vertex accepts only one text per request (the API does not batch), so indexing is slower than text-embedding-005. gemini-embedding-2-preview is not available on Vertex — use the Gemini provider for it.
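The per-request limits translate into very different request counts during indexing. The sketch below is a generic batching helper under assumed names (toBatches is hypothetical, and the limits are the ones from the table: 250 inputs for text-embedding-005, 1 for gemini-embedding-001). It is not Lucerna's internal batching code.

```typescript
// Split a list into consecutive batches of at most batchSize items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// 1,000 hypothetical code chunks waiting to be embedded.
const pendingChunks = Array.from({ length: 1000 }, (_, i) => `chunk ${i}`);

const requestsFor005 = toBatches(pendingChunks, 250).length;  // 4 requests
const requestsForGemini = toBatches(pendingChunks, 1).length; // 1000 requests
```

For the same 1,000 chunks, the one-input limit means 250 times as many round trips, which is where the slower indexing comes from.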
Cloudflare Workers AI
Free tier: 10K neurons/day (~9M tokens/day for bge-m3).

```typescript
import { defineConfig } from "@upstart.gg/lucerna";

export default defineConfig({
  embedding: {
    provider: "cloudflare",
    accountId: process.env.CLOUDFLARE_ACCOUNT_ID!,
    apiKey: process.env.CLOUDFLARE_API_TOKEN!,
    model: "@cf/baai/bge-m3", // optional, this is the default
  },
});
```

| Model | Price per 1M tokens | Notes |
|---|---|---|
| @cf/baai/bge-m3 | ~$0.012 (free tier applies) | Multilingual, good baseline |
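
The ~9M tokens/day figure can be roughly reproduced from the price in the table. The sketch assumes Cloudflare bills Workers AI at $0.011 per 1,000 neurons, a rate taken from Cloudflare's published pricing rather than from this page, so treat it as an assumption that may change. The constant and function names are hypothetical.

```typescript
const FREE_NEURONS_PER_DAY = 10_000;    // free tier quoted for Workers AI
const USD_PER_1K_NEURONS = 0.011;       // assumption: Cloudflare's published neuron price
const USD_PER_1M_TOKENS_BGE_M3 = 0.012; // approximate bge-m3 price from the table

// Convert the daily free neuron allowance into an approximate token budget.
function freeTokensPerDay(): number {
  const freeUsdPerDay = (FREE_NEURONS_PER_DAY / 1000) * USD_PER_1K_NEURONS; // about $0.11
  const millionsOfTokens = freeUsdPerDay / USD_PER_1M_TOKENS_BGE_M3;        // about 9.17
  return millionsOfTokens * 1_000_000;
}

const freeTokens = freeTokensPerDay(); // a little over 9 million tokens per day
```

Because the table price is itself rounded, the result is only accurate to within a few percent.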
Ollama (local)
No API key required. Requires a running Ollama server. host defaults to http://localhost:11434.

```typescript
import { defineConfig } from "@upstart.gg/lucerna";

export default defineConfig({
  embedding: { provider: "ollama", model: "nomic-embed-text" },
  // embedding: { provider: "ollama", model: "nomic-embed-text", host: "http://my-server:11434" },
});
```

| Model | Dimensions | Price | Notes |
|---|---|---|---|
| nomic-embed-text | 768 | Free | Popular, fast |
| mxbai-embed-large | 1024 | Free | Higher quality, GPU recommended |

For custom models not in the table, add a dimensions field:
```typescript
embedding: { provider: "ollama", model: "my-custom-model", dimensions: 768 },
```

LM Studio (local)
No API key required. Requires LM Studio running with the local server enabled (Settings → Local Server → Start Server). baseUrl defaults to http://localhost:1234.
```typescript
import { defineConfig } from "@upstart.gg/lucerna";

export default defineConfig({
  embedding: { provider: "lmstudio", model: "nomic-embed-text" },
  // embedding: { provider: "lmstudio", model: "nomic-embed-text", baseUrl: "http://my-server:1234" },
});
```

| Model | Dimensions | Price | Notes |
|---|---|---|---|
| nomic-embed-text | 768 | Free | Popular, fast |
| mxbai-embed-large | 1024 | Free | Higher quality, GPU recommended |
| bge-m3 | 1024 | Free | Multilingual |