Language Support
Lucerna uses tree-sitter grammars to parse files before indexing. Files in unsupported languages are skipped entirely.
.gitignore is always respected. Lucerna discovers all .gitignore files in the project tree (root and subdirectories) and applies their patterns to both indexing and file watching.
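The discovery step described above can be sketched roughly as follows. This is a minimal illustration, not Lucerna's implementation: the function names `discover_gitignores` and `is_ignored` are hypothetical, and the matcher is deliberately simplified (it compares single path components with `fnmatch` and does not handle negation, anchored patterns, or `**`).

```python
import fnmatch
import os
from pathlib import Path


def discover_gitignores(root: str) -> dict[str, list[str]]:
    """Map each directory (relative to root, "" for root) to its .gitignore patterns."""
    found: dict[str, list[str]] = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Never descend into the repository metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        if ".gitignore" in filenames:
            text = Path(dirpath, ".gitignore").read_text(encoding="utf-8")
            patterns = [
                line.strip()
                for line in text.splitlines()
                if line.strip() and not line.lstrip().startswith("#")
            ]
            rel = os.path.relpath(dirpath, root).replace(os.sep, "/")
            found["" if rel == "." else rel] = patterns
    return found


def is_ignored(rel_path: str, gitignores: dict[str, list[str]]) -> bool:
    """Simplified check: does any path component match a pattern in scope?"""
    parts = rel_path.split("/")
    for base, patterns in gitignores.items():
        base_parts = base.split("/") if base else []
        # A .gitignore only governs paths inside its own directory.
        if parts[: len(base_parts)] != base_parts:
            continue
        inner = parts[len(base_parts):]
        for pattern in patterns:
            name = pattern.strip("/")
            if any(fnmatch.fnmatch(part, name) for part in inner):
                return True
    return False
```

Sharing one predicate such as `is_ignored` between the indexer and the file watcher is one way to keep both in agreement about which paths are excluded.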
Chunking strategies
Programming languages: symbol-level chunks
For every supported programming language, Lucerna walks the AST and emits one chunk per logical symbol: functions, classes, methods, interfaces, type aliases, and a single grouped import chunk per file. This means search results map directly to reviewable, navigable code units rather than arbitrary line ranges.
Each chunk’s embedding input (contextContent) is enriched with:
- A breadcrumb prefix (e.g. // File: src/auth/middleware.ts / Class: AuthMiddleware / Method: verify)
- The file’s import block, so the model understands which external dependencies are in scope
- For method chunks: the parent class header
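As an illustration of the enrichment listed above, the sketch below assembles an embedding input from those three pieces. The function name `build_context_content`, its parameters, and the ordering of the parts are assumptions made for this example; only the breadcrumb format is taken from the text.

```python
from typing import Optional


def build_context_content(
    file_path: str,
    symbol_trail: list[tuple[str, str]],
    import_block: str,
    body: str,
    parent_header: Optional[str] = None,
) -> str:
    """Assemble the text that would be embedded for one chunk (assumed layout)."""
    # Breadcrumb: "// File: <path> / <Kind>: <name> / ..."
    crumbs = [f"File: {file_path}"]
    crumbs += [f"{kind}: {name}" for kind, name in symbol_trail]
    parts = ["// " + " / ".join(crumbs)]
    if import_block:
        # Imports show which external dependencies are in scope.
        parts.append(import_block)
    if parent_header:
        # Only method chunks carry the header of their parent class.
        parts.append(parent_header)
    parts.append(body)
    return "\n".join(parts)


content = build_context_content(
    "src/auth/middleware.ts",
    [("Class", "AuthMiddleware"), ("Method", "verify")],
    'import jwt from "jsonwebtoken";',
    "verify(token: string) { return jwt.decode(token); }",
    parent_header="class AuthMiddleware {",
)
```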
Graph edges (IMPORTS, DEFINES, EXTENDS, IMPLEMENTS, CALLS) are extracted at the same time and stored in a separate graph store for traversal.
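To make the idea of a traversable edge store concrete, here is a small in-memory sketch. The `Edge` and `GraphStore` classes and the `reachable` method are hypothetical names for this example; only the five edge kinds come from the text, and the real store's schema and query API may differ.

```python
from collections import defaultdict, deque
from dataclasses import dataclass

EDGE_KINDS = {"IMPORTS", "DEFINES", "EXTENDS", "IMPLEMENTS", "CALLS"}


@dataclass(frozen=True)
class Edge:
    source: str
    kind: str
    target: str


class GraphStore:
    """Adjacency-list store kept separate from the chunk index."""

    def __init__(self) -> None:
        self._out: dict[str, list[Edge]] = defaultdict(list)

    def add(self, source: str, kind: str, target: str) -> None:
        if kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {kind}")
        self._out[source].append(Edge(source, kind, target))

    def reachable(self, start: str, kinds: set[str], max_depth: int = 2) -> list[str]:
        """Breadth-first walk over edges of the given kinds, up to max_depth hops."""
        seen = {start}
        order: list[str] = []
        queue = deque([(start, 0)])
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue
            for edge in self._out.get(node, []):
                if edge.kind in kinds and edge.target not in seen:
                    seen.add(edge.target)
                    order.append(edge.target)
                    queue.append((edge.target, depth + 1))
        return order
```

A query such as `store.reachable("AuthMiddleware.verify", {"CALLS"})` would then list the symbols a method calls directly or through one intermediate call.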
Data, config, and markup formats: structure-level chunks
For non-code files, Lucerna uses the file’s inherent structure instead of a symbol table:
| Format | Chunking approach |
|---|---|
| JSON, XML, HTML, YAML | Top-level keys or elements; single chunk for small files |
| TOML | [section] / [[array]] table headers |
| Markdown / MDX | Heading-based sections (H1–H3) with breadcrumb trail |
| CSS / SCSS | One chunk per rule-set or @mixin/@function definition |
| Vue / Svelte | <script> blocks re-parsed as TS/JS into per-symbol chunks; template markup and <style> kept as single sections |
| SQL | One chunk per statement (CREATE TABLE, SELECT, etc.) |
No graph edges are emitted for these formats.
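As one example of structure-level chunking, the sketch below splits Markdown into heading-based sections (H1 to H3) and attaches a breadcrumb trail to each. It is a line-based approximation rather than a tree-sitter walk; the function name `chunk_markdown`, the output shape, and the " > " breadcrumb separator are assumptions for this example.

```python
import re

HEADING = re.compile(r"^(#{1,3})\s+(.*\S)\s*$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the example readable


def chunk_markdown(text: str) -> list[dict]:
    """Split Markdown into sections that start at H1-H3 headings."""
    chunks: list[dict] = []
    trail: list[tuple[int, str]] = []  # (level, title) of enclosing headings
    current = None
    in_fence = False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines inside fenced code are never treated as headings.
        match = None if in_fence else HEADING.match(line)
        if match:
            level, title = len(match.group(1)), match.group(2)
            # Drop headings at the same or deeper level, then push this one.
            trail = [(l, t) for l, t in trail if l < level] + [(level, title)]
            current = {"breadcrumb": " > ".join(t for _, t in trail), "lines": [line]}
            chunks.append(current)
        elif current is not None:
            current["lines"].append(line)
        elif line.strip():
            # Content before the first heading becomes its own chunk.
            current = {"breadcrumb": "", "lines": [line]}
            chunks.append(current)
    return [
        {"breadcrumb": c["breadcrumb"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```

Headings deeper than H3 do not start a new chunk in this sketch; they stay inside the section of their nearest H1 to H3 ancestor.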
Full language table
Programming languages
All produce symbol-level chunks and graph edges.
Leading doc comments, decorators, annotations, and attributes are absorbed into the chunk that follows them, so the natural-language signal stays attached to the symbol it describes.
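The absorption rule can be approximated on raw lines as shown below. Lucerna works on the AST, so this line-based version is only a stand-in to show the idea; the function name `absorb_leading_lines` and the prefix list are assumptions, and the prefix test is crude (for instance, it would also absorb a C `#include` line sitting directly above a function).

```python
# Prefixes that mark a comment, decorator, annotation, or attribute line
# in many languages (a rough heuristic, not a per-language grammar).
LEADING_PREFIXES = ("//", "/*", "*", "#", "@")


def absorb_leading_lines(lines: list[str], symbol_start: int) -> int:
    """Return the index of the first line that should belong to the chunk."""
    start = symbol_start
    while start > 0:
        candidate = lines[start - 1].strip()
        # Stop at a blank line or at anything that is not a leading marker.
        if not candidate or not candidate.startswith(LEADING_PREFIXES):
            break
        start -= 1
    return start


source = [
    "import { Injectable } from './di';",
    "",
    "/** Verifies bearer tokens. */",
    "@Injectable()",
    "class AuthMiddleware {",
    "}",
]
chunk_start = absorb_leading_lines(source, 4)  # symbol begins at the class line
```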
| Language | Extensions | Chunks |
|---|---|---|
| TypeScript / JavaScript | .ts, .tsx, .js, .jsx | Imports, functions, generators, arrow functions, classes, methods, interfaces, type aliases, enums, namespaces, top-level const (objects/arrays/Zod schemas/route tables) |
| Python | .py, .pyw | Imports, classes, functions (including class methods), PEP 695 type aliases, module-level const |
| Java | .java | Imports, classes, interfaces, enums, records, methods |
| Go | .go | Imports, functions, receiver methods, structs, interfaces, type declarations, top-level const/var |
| Rust | .rs | Imports (use), structs, enums, traits, impl blocks (class), functions, modules, macro_rules! macros, consts/statics, type aliases |
| C# | .cs | Imports (using), classes, records, structs, interfaces, enums, methods, properties, events, delegates (typealias) |
| Swift | .swift | Imports, classes, structs, protocols, extensions, actors, enums, functions/methods, type aliases, properties |
| Kotlin | .kt, .kts | Imports, classes, interfaces, objects, enums, functions/methods, type aliases, top-level properties |
| Ruby | .rb | Imports, classes, modules, methods, DSL calls (has_many, validates, before_action, attr_accessor, …) |
| PHP | .php | Imports (use), functions, classes, interfaces, traits, methods, enums, consts, properties |
| C | .c, .h | Includes, functions, structs, enums, unions, typedefs, #define macros |
| C++ | .cpp, .cc, .cxx, .hpp | Includes, classes, functions/methods, structs, enums, unions, namespaces, type aliases, template declarations |
| Bash / Shell | .sh, .bash | source/. imports, functions, top-level variable assignments (const) |
| Scala | .scala | Imports, classes, objects, traits, methods, case-class records, enums (Scala 3), type aliases, given instances (Scala 3) |
| Lua | .lua | require() imports, functions, local functions |
| R | .r, .R | library()/require() imports, functions, S4 setClass/setMethod/setGeneric |
| Dart | .dart | Imports, classes, functions/methods, mixins, extensions, enums, type aliases |
| Haskell | .hs | Imports, functions, data types, type classes, instances, newtypes, type aliases |
| Elixir | .ex, .exs | Aliases/imports, modules, def/defp functions/methods, defmacro, defprotocol, defimpl instances, defstruct structs |
| Clojure | .clj, .cljs, .cljc | ns form (import chunk), defn/defn- functions, defmacro, defprotocol, defrecord/deftype, defmulti/defmethod, def consts |
| Groovy | .groovy, .gradle | Imports, classes, methods, top-level closures (def name = { ... }) |
| Perl | .pl, .pm | use imports, subroutines, Moose has properties, POD absorbed |
| PowerShell | .ps1, .psm1 | using imports, functions, classes/methods, script-scope param( ... ) blocks |
| MATLAB | .m | Functions, classdefs and methods, properties blocks |
| Zig | .zig | @import declarations, functions, container declarations (struct/enum/union), test blocks |
| Solidity | .sol | Imports, contracts (class), interfaces, functions/methods, libraries, structs, enums, events, modifiers, errors, state variables |
| Julia | .jl | Imports, functions, modules, structs, macros, abstract types, consts |
| OCaml | .ml, .mli | Imports (open), let bindings, type definitions, modules, module types, functors |
| Erlang | .erl | Module/export attributes (import), functions, records |
| Objective-C | .m, .mm | Imports (#import), interfaces, implementations, methods, protocols, categories (extensions), properties |
Graph edges: DEFINES links the import chunk to every named symbol in the file. EXTENDS/IMPLEMENTS are emitted when inheritance syntax is present in the grammar. CALLS tracks direct invocations.
Data, config, and markup
No graph edges are emitted for these formats.
| Language | Extensions | Chunking |
|---|---|---|
| JSON | .json | Top-level key splitting for large files; single chunk for small files |
| YAML | .yaml, .yml | Top-level key splitting |
| TOML | .toml | Table splitting by [section] / [[array]] headers |
| XML | .xml, .xsd, .xsl | Top-level element splitting for large files; single chunk for small files |
| HTML | .html, .htm | Top-level element splitting |
| SQL | .sql | One chunk per statement (CREATE TABLE, CREATE VIEW, SELECT, etc.) |
| Markdown | .md, .mdx | Heading-based sections (H1–H3) with full breadcrumb trail |
| CSS | .css | One chunk per rule-set |
| SCSS | .scss | Rule-set splitting; named @mixin/@function definitions; $variable constants; @keyframes; @media sections |
| Vue | .vue | <script> blocks are re-parsed as TS/JS so component props (defineProps), composables (useFoo), and top-level functions become individual chunks. <template> and <style> remain single sections. |
| Svelte | .svelte | <script> blocks are re-parsed as TS/JS for per-symbol chunks. <style> is a single section; remaining markup becomes the template chunk. |
No other file types are indexed.
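A lookup from file extension to language is one plausible way to decide whether a file is indexed at all. The sketch below covers only a handful of the extensions from the tables; the mapping name, the language identifiers, and the case-sensitive comparison are assumptions for this example. It also leaves out .m, which the tables list for both MATLAB and Objective-C without saying how the two are told apart.

```python
from pathlib import Path
from typing import Optional

# Partial, illustrative mapping; the identifiers on the right are hypothetical.
EXTENSION_TO_LANGUAGE = {
    ".ts": "typescript",
    ".tsx": "typescript",
    ".js": "javascript",
    ".jsx": "javascript",
    ".py": "python",
    ".pyw": "python",
    ".go": "go",
    ".rs": "rust",
    ".json": "json",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".md": "markdown",
    ".mdx": "markdown",
}


def language_for(path: str) -> Optional[str]:
    """Return the language for a path, or None when the file type is unsupported."""
    return EXTENSION_TO_LANGUAGE.get(Path(path).suffix)


def indexable(paths: list[str]) -> list[str]:
    """Keep only files whose extension maps to a supported language."""
    return [p for p in paths if language_for(p) is not None]
```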
Lazy grammar loading
Grammar modules are loaded on first encounter, so no configuration is needed. Lucerna automatically initializes a language's grammar the first time it sees a file of that type.
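Load-on-first-use behavior of this kind is commonly built as a small cache in front of a set of loader functions. The sketch below shows that pattern; `GrammarRegistry` and its methods are hypothetical names, and the loaders are placeholders rather than real tree-sitter calls.

```python
from typing import Callable


class GrammarRegistry:
    """Load each grammar at most once, the first time it is requested."""

    def __init__(self, loaders: dict[str, Callable[[], object]]) -> None:
        self._loaders = loaders
        self._loaded: dict[str, object] = {}

    def get(self, language: str) -> object:
        if language not in self._loaded:
            # First file of this type: run the loader and cache the result.
            self._loaded[language] = self._loaders[language]()
        return self._loaded[language]

    def loaded_languages(self) -> list[str]:
        return sorted(self._loaded)


def make_loader(name: str, log: list[str]) -> Callable[[], object]:
    """Placeholder loader that records when it runs (stands in for a real grammar load)."""
    def load() -> object:
        log.append(name)
        return {"grammar": name}
    return load
```

With this shape, a project that contains only Python files never pays the cost of initializing the other grammars.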