Search Modes

| Mode | When used |
| --- | --- |
| Hybrid (default) | Embedding function available: runs vector search and BM25, fuses via Reciprocal Rank Fusion |
| Semantic only | searchSemantic() |
| Lexical only | searchLexical() or embeddingFunction: false |
| Graph-context | searchWithContext(): hybrid search + BFS graph expansion |

In hybrid mode, vector search (top-K by cosine similarity) and BM25 (top-K by text relevance) run in parallel. The two result lists are merged with Reciprocal Rank Fusion:

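score(d) = Σ_i 1 / (k + rank_i(d))

where the sum runs over the ranked lists (vector and BM25) in which result d appears, and rank_i(d) is the position of d in list i.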

The default is k=45, tuned for code retrieval. The fused results can optionally be re-ranked by a cross-encoder.
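The fusion step can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the function name reciprocalRankFusion and the chunk IDs are hypothetical, ranks are assumed to be 1-based, and each input list is assumed to contain no duplicate IDs.

```typescript
// Reciprocal Rank Fusion over any number of ranked ID lists.
// `k` defaults to 45 to match the default stated in the text.
function reciprocalRankFusion(
  rankings: string[][],
  k = 45,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // assumption: ranks start at 1
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}

// Hypothetical chunk IDs, for illustration only.
const vectorHits = ["a", "b", "c"];
const bm25Hits = ["b", "d", "a"];
const fused = reciprocalRankFusion([vectorHits, bm25Hits]);
console.log(fused.map((r) => r.id)); // [ 'b', 'a', 'd', 'c' ]
```

Results found by both searches ("a" and "b") outrank those found by only one, even though neither list's raw scores are ever compared directly. That is the main reason rank-based fusion suits signals that live on different scales.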

Identifiers are expanded before indexing: getUserById becomes "getUserById get user by id" and parse_json becomes "parse_json parse json", so BM25 matches sub-word tokens without sacrificing exact-identifier recall.
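One way to produce this kind of expansion is sketched below. The function name expandIdentifier is hypothetical, and the handling of acronym boundaries and of single-word identifiers (returned unchanged) are assumptions that the text above does not specify.

```typescript
// Append the lower-cased sub-words of an identifier to the identifier itself.
function expandIdentifier(identifier: string): string {
  const words = identifier
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // camelCase boundary: getUser -> get User
    .replace(/([A-Z]+)([A-Z][a-z])/g, "$1 $2") // acronym boundary (assumption): JSONParser -> JSON Parser
    .split(/[\s_\-]+/) // snake_case and kebab-case separators
    .filter((w) => w.length > 0)
    .map((w) => w.toLowerCase());
  // Assumption: a single-word identifier has nothing to add.
  if (words.length <= 1) return identifier;
  return `${identifier} ${words.join(" ")}`;
}

console.log(expandIdentifier("getUserById")); // getUserById get user by id
console.log(expandIdentifier("parse_json")); // parse_json parse json
```

Keeping the original identifier as the first token preserves exact matches, while the appended words let a query such as "user id" hit the same chunk.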

searchWithContext() runs hybrid search and then expands the result set by following knowledge graph edges (BFS up to graphDepth hops). Direct search hits come first (in rank order); graph-expanded neighbour chunks follow in traversal order. Duplicates are collapsed.
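The ordering described above (direct hits first, then neighbours in breadth-first order, with duplicates collapsed) can be sketched as follows. The Graph type, the function name expandByGraph, and the chunk IDs are hypothetical; only the graphDepth parameter name comes from the text, and the real graph representation may differ.

```typescript
// Hypothetical adjacency list: chunk ID -> IDs of neighbouring chunks.
type Graph = Map<string, string[]>;

function expandByGraph(
  directHits: string[],
  graph: Graph,
  graphDepth: number,
): string[] {
  const seen = new Set<string>();
  const results: string[] = [];

  // Direct search hits come first, in rank order.
  for (const id of directHits) {
    if (!seen.has(id)) {
      seen.add(id);
      results.push(id);
    }
  }

  // Breadth-first expansion: each loop iteration follows one more hop.
  let frontier = results.slice();
  for (let depth = 0; depth < graphDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const neighbour of graph.get(id) ?? []) {
        if (seen.has(neighbour)) continue; // collapse duplicates
        seen.add(neighbour);
        results.push(neighbour);
        next.push(neighbour);
      }
    }
    frontier = next;
  }
  return results;
}

const exampleGraph: Graph = new Map<string, string[]>([
  ["a", ["b", "c"]],
  ["b", ["d"]],
  ["c", ["d", "e"]],
  ["d", ["f"]],
]);
console.log(expandByGraph(["a", "b"], exampleGraph, 1)); // [ 'a', 'b', 'c', 'd' ]
```

With graphDepth set to 2, the same call would also append "e" and "f", the chunks two hops away from the direct hits.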

Results are always returned sorted by relevance (best first). The internal ordering score is not exposed on the result type — it is not comparable across paths (semantic cosine similarity, BM25 rank, RRF fused score, reranker output all live on different scales). Callers should rely on the result order; the matchType field ('semantic' | 'lexical' | 'hybrid') indicates which path produced each result.
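On the caller side, this means truncating or paginating by position rather than by a score threshold. The sketch below assumes a hypothetical result shape: only the matchType field and its three values come from the text above, while ResultLike, its id field, and the summarize helper are illustrative.

```typescript
type MatchType = "semantic" | "lexical" | "hybrid";

// Hypothetical result shape: only `matchType` is taken from the text.
interface ResultLike {
  id: string;
  matchType: MatchType;
}

// Keep the first `limit` results in the order returned and tally
// which search path produced each of them.
function summarize(
  results: ResultLike[],
  limit: number,
): { top: ResultLike[]; counts: Record<MatchType, number> } {
  const top = results.slice(0, limit); // position is the relevance signal
  const counts: Record<MatchType, number> = { semantic: 0, lexical: 0, hybrid: 0 };
  for (const r of top) counts[r.matchType] += 1;
  return { top, counts };
}

const sample: ResultLike[] = [
  { id: "a", matchType: "hybrid" },
  { id: "b", matchType: "semantic" },
  { id: "c", matchType: "lexical" },
  { id: "d", matchType: "hybrid" },
];
const summary = summarize(sample, 3);
console.log(summary.counts); // { semantic: 1, lexical: 1, hybrid: 1 }
```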