CodeChunk

A CodeChunk is the atomic unit of the index. Every chunk represents a single logical unit of code or documentation extracted from a source file.

| Field | Description |
| --- | --- |
| id | Stable 16-char hex hash of projectId + filePath + startLine |
| projectId | Isolates chunks belonging to this project |
| filePath | Path relative to projectRoot |
| language | Any language supported by @kreuzberg/tree-sitter-language-pack (e.g. typescript, python, rust, go, …) |
| type | See Chunk types below |
| name | Symbol name (function/class/method/heading), if applicable |
| content | Raw source text |
| contextContent | Embedding input: content prefixed with breadcrumb, file imports, and (for methods) the class header |
| startLine / endLine | 1-based line range in the original file |
| metadata | Extra data, e.g. { className: "UserService", breadcrumb: "// Class: UserService" } on method chunks |
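
As a rough illustration of the fields above, the sketch below models a chunk as a TypeScript interface and builds one method chunk by hand. The interface shape, the `makeChunkId` helper, the choice of SHA-256 truncated to 16 hex characters, the `:` separators, and the exact layout of `contextContent` are all assumptions made for this example; the table only states which inputs feed the hash and which parts are prefixed to the content.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape inferred from the field table; the real type may differ.
interface CodeChunk {
  id: string;
  projectId: string;
  filePath: string; // relative to projectRoot
  language: string;
  type: string;
  name?: string; // absent when the chunk has no symbol name
  content: string;
  contextContent: string;
  startLine: number; // 1-based
  endLine: number; // 1-based
  metadata: Record<string, unknown>;
}

// Assumption: SHA-256 over "projectId:filePath:startLine", truncated to 16 hex chars.
// The documentation names the three inputs but not the hash function or separator.
function makeChunkId(projectId: string, filePath: string, startLine: number): string {
  return createHash("sha256")
    .update(`${projectId}:${filePath}:${startLine}`)
    .digest("hex")
    .slice(0, 16);
}

const methodSource = "  findUser(id: string) {\n    return this.db.get(id);\n  }";

// Assumed ordering: breadcrumb, file imports, class header, then the raw content.
const methodContext = [
  "// Class: UserService",
  'import { Db } from "./db";',
  "class UserService {",
  methodSource,
].join("\n");

const exampleChunk: CodeChunk = {
  id: makeChunkId("demo-project", "src/user-service.ts", 12),
  projectId: "demo-project",
  filePath: "src/user-service.ts",
  language: "typescript",
  type: "method",
  name: "findUser",
  content: methodSource,
  contextContent: methodContext,
  startLine: 12,
  endLine: 14,
  metadata: { className: "UserService", breadcrumb: "// Class: UserService" },
};

console.log(exampleChunk.id.length); // 16
```

Because the id depends only on the project, path, and start line, re-indexing an unchanged file would yield the same ids under this scheme, while moving a declaration to a different line would produce a new one.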

Chunk types

The type field distinguishes the kind of construct each chunk represents. Lucerna emits a fine-grained set of types so you can filter searches and repo-maps precisely (e.g. types: ["enum", "trait"]).

| Type | Produced from |
| --- | --- |
| function | Free-standing function declarations across all languages |
| class | Class declarations |
| method | Functions/methods defined inside a class/struct/protocol/trait |
| interface | Java/Kotlin/C#/TypeScript interface |
| type | TypeScript type alias and other one-off type definitions |
| variable | Top-level variable declarations |
| import | All import / use / using / open / library / require statements in a file, grouped into one chunk |
| section | Markdown heading section (H1–H3); SCSS @media blocks; large CSS rule groups |
| file | Whole-file fallback for small or structureless files |

| Type | Produced from |
| --- | --- |
| enum | TS/JS, Java, C#, Kotlin, Swift, Rust, Dart, Solidity, PHP 8.1, Scala 3 enums |
| struct | Rust, C, C++, Swift, Solidity, Julia, Zig structs (and C union) |
| record | Java, C#, Scala case-class, Clojure defrecord/deftype, Erlang records |
| protocol | Swift, Objective-C, Elixir defprotocol, Clojure defprotocol |
| trait | Rust, PHP, Scala traits |
| mixin | Dart mixin |
| extension | Swift, Dart extensions; Objective-C categories |
| object | Kotlin object, Scala object |
| actor | Swift actor |
| typealias | TS/JS type, Rust type, C typedef, Kotlin/Swift/Scala/Dart aliases, Julia/Haskell type aliases, OCaml module type, C++ using aliases |
| newtype | Haskell newtype |
| module | Rust mod, OCaml module, Elixir defmodule, Julia module, Ruby module |
| module_type | OCaml module types |
| functor | OCaml functors |
| namespace | TS/JS namespace, C# / C++ namespaces |
| library | Solidity library |

| Type | Produced from |
| --- | --- |
| const | Top-level constants (TS/JS export const, Python module assignment, Rust const/static, Go const/var, C #define, Clojure def, Julia const, Bash variable, SCSS variable). Filtered by minimum size to skip trivial values. |
| property | Class/struct properties (C#, Kotlin, Swift, Objective-C, PHP, Matlab, Perl Moose has) |
| macro | Rust macro_rules!, C #define, Elixir defmacro, Clojure defmacro, Julia macro |
| instance | Elixir defimpl, Scala 3 given, Haskell instance |
| event | Solidity events; C# events |
| modifier | Solidity modifiers |
| error | Solidity custom errors |
| state_variable | Solidity state variables |

| Type | Produced from |
| --- | --- |
| test | Zig test blocks |
| param_block | PowerShell script-scope param( ... ) blocks |
| dsl_call | Ruby Rails-style class-body DSLs (has_many, validates, before_action, attr_accessor, …) |
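
To make the effect of a type filter such as types: ["enum", "trait"] concrete, here is a minimal sketch that applies the same idea to an in-memory array. The `filterChunksByType` helper and the `TypedChunk` shape are hypothetical names invented for this example; they are not part of Lucerna's API, which presumably applies the filter inside its own search and repo-map calls.

```typescript
// Hypothetical minimal shape: only the fields this example needs.
type TypedChunk = { type: string; name?: string };

// Keep only the chunks whose type is in the requested list.
function filterChunksByType<T extends TypedChunk>(chunks: T[], types: string[]): T[] {
  const wanted = new Set(types);
  return chunks.filter((chunk) => wanted.has(chunk.type));
}

const sampleChunks: TypedChunk[] = [
  { type: "enum", name: "Color" },
  { type: "trait", name: "Drawable" },
  { type: "function", name: "render" },
  { type: "import" },
];

const typeHits = filterChunksByType(sampleChunks, ["enum", "trait"]);
console.log(typeHits.map((chunk) => chunk.name)); // the names Color and Drawable
```

The finer the type vocabulary, the more selective this kind of filter can be: with only class and function available, Rust traits and enums could not be separated from other declarations.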

Doc comments, decorators, annotations & attributes

Leading doc comments, decorators, annotations, and attributes are absorbed into the chunk that follows them; they are not emitted as standalone chunks. This means the content field of a function or class includes its preceding JSDoc, Python decorators, Rust #[derive(...)] attributes, Java annotations, C# XML doc comments, etc. The absorb scan is capped at 80 lines to avoid pulling in license headers.
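
The capped absorb scan can be pictured with a small line-based sketch. This is an illustration under loud assumptions: the `absorbLeadingLines` helper and its prefix regex are invented here, and the real implementation most likely walks syntax-tree nodes rather than matching line prefixes. Only the 80-line cap comes from the text above.

```typescript
// Assumed heuristic: a line counts as "leading" if it starts with a comment,
// decorator, annotation, or attribute marker. Blank lines end the scan.
const LEADING_LINE = /^\s*(\/\/|\/\*|\*|#|@)/;

// Returns the 0-based index where the chunk should start after absorbing
// leading lines above the declaration at declIndex, scanning at most maxLines up.
function absorbLeadingLines(lines: string[], declIndex: number, maxLines = 80): number {
  let start = declIndex;
  while (
    start > 0 &&
    declIndex - start < maxLines &&
    LEADING_LINE.test(lines[start - 1])
  ) {
    start--;
  }
  return start;
}

const sourceLines = [
  'import { log } from "./log";',
  "",
  "/** Adds two numbers. */",
  "@log",
  "function add(a: number, b: number) { return a + b; }",
];

// The doc comment and decorator are absorbed; the blank line stops the scan.
console.log(absorbLeadingLines(sourceLines, 4)); // 2
```

With the cap in place, a declaration that sits directly under a very long comment block picks up at most the nearest 80 lines of it, which bounds how much unrelated header text can end up in the chunk.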