CodeChunk

A CodeChunk is the atomic unit of the index. Every chunk represents a single logical unit of code or documentation extracted from a source file.

| Field | Description |
| --- | --- |
| `id` | Stable 16-char hex hash of `projectId` + `filePath` + `startLine` |
| `projectId` | Isolates chunks belonging to this project |
| `filePath` | Path relative to `projectRoot` |
| `language` | Any language supported by `@kreuzberg/tree-sitter-language-pack` (e.g. `typescript`, `python`, `rust`, `go`, …) |
| `type` | See Chunk types below |
| `name` | Symbol name (function/class/method/heading), if applicable |
| `content` | Raw source text |
| `contextContent` | Embedding input: `content` prefixed with breadcrumb, file imports, and (for methods) the class header |
| `startLine` / `endLine` | 1-based line range in the original file |
| `metadata` | Extra data, e.g. `{ className: "UserService", breadcrumb: "// Class: UserService" }` on method chunks |
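
As a rough illustration of the fields above, the shape of a chunk and a stable `id` could be sketched in TypeScript as follows. This is a minimal sketch, not the project's actual definition: the optionality of `name`, the `metadata` value type, the helper name `makeChunkId`, the choice of SHA-256, and the NUL separator are all assumptions. The source only states that the `id` is a 16-char hex hash of `projectId` + `filePath` + `startLine`.

```typescript
import { createHash } from "node:crypto";

// Field names follow the table above; the exact types are assumptions.
interface CodeChunk {
  id: string;
  projectId: string;
  filePath: string; // relative to projectRoot
  language: string;
  type: string; // one of the chunk types
  name?: string; // only when a symbol name applies
  content: string;
  contextContent: string;
  startLine: number; // 1-based
  endLine: number; // 1-based
  metadata: Record<string, unknown>;
}

// Hypothetical helper: derives a 16-char hex id from the three inputs.
// Assumption: SHA-256 truncated to 16 hex characters, with a NUL separator
// so that adjacent inputs cannot run together.
function makeChunkId(projectId: string, filePath: string, startLine: number): string {
  const input = [projectId, filePath, String(startLine)].join("\0");
  return createHash("sha256").update(input).digest("hex").slice(0, 16);
}

const example: CodeChunk = {
  id: makeChunkId("demo-project", "src/user-service.ts", 12),
  projectId: "demo-project",
  filePath: "src/user-service.ts",
  language: "typescript",
  type: "method",
  name: "getUser",
  content: "getUser(id: string) { return this.db.find(id); }",
  contextContent: "// Class: UserService\ngetUser(id: string) { return this.db.find(id); }",
  startLine: 12,
  endLine: 12,
  metadata: { className: "UserService", breadcrumb: "// Class: UserService" },
};
```

Because the id depends only on `projectId`, `filePath`, and `startLine`, hashing the same three values always yields the same id, while a chunk that moves to a different start line gets a new one.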
Chunk types

| Type | Produced from |
| --- | --- |
| `function` | Function declarations, generator functions, arrow functions, Rust/Go/Python/Bash functions |
| `class` | Class declarations; also Rust `struct` and `impl` blocks, Ruby/Python `module` |
| `method` | Individual methods inside a class (Java, C#, PHP, Go receiver methods, …) |
| `interface` | TypeScript `interface`; also Rust `trait`, Swift `protocol` |
| `type` | TypeScript `type` alias; also Rust/Java/Kotlin `enum` |
| `variable` | Variable declarations |
| `import` | All import statements in a file, grouped as one chunk |
| `section` | Markdown heading section (H1–H3) |
| `file` | Whole-file fallback for small or structureless files (e.g. SQL) |
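
The chunk types can be modeled as a closed set of string literals, and the embedding input for a `method` chunk can be assembled from its parts. The sketch below is illustrative only: `CHUNK_TYPES`, `isChunkType`, and `buildContextContent` are hypothetical names, and joining the parts with newlines is an assumption. The source only states that `contextContent` is the content prefixed with the breadcrumb, the file imports, and (for methods) the class header.

```typescript
// The nine chunk types from the table, as a literal union.
const CHUNK_TYPES = [
  "function", "class", "method", "interface", "type",
  "variable", "import", "section", "file",
] as const;
type ChunkType = (typeof CHUNK_TYPES)[number];

// Hypothetical type guard for validating untrusted strings.
function isChunkType(value: string): value is ChunkType {
  return (CHUNK_TYPES as readonly string[]).includes(value);
}

interface ContextParts {
  breadcrumb?: string;  // e.g. "// Class: UserService"
  imports?: string;     // the file's import statements
  classHeader?: string; // only relevant for method chunks
  content: string;      // raw source text of the chunk
}

// Hypothetical helper: prefixes the content with whichever parts are present.
// Assumption: parts are joined with a single newline, in the order listed.
function buildContextContent(parts: ContextParts): string {
  return [parts.breadcrumb, parts.imports, parts.classHeader, parts.content]
    .filter((p): p is string => p !== undefined && p.length > 0)
    .join("\n");
}

const methodContext = buildContextContent({
  breadcrumb: "// Class: UserService",
  imports: 'import { Db } from "./db";',
  classHeader: "class UserService {",
  content: "getUser(id: string) { return this.db.find(id); }",
});
```

For a chunk with no enclosing class, such as a top-level `function`, `classHeader` would simply be omitted and only the remaining parts are joined.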