
Language Support

Lucerna uses tree-sitter grammars to parse files before indexing. Files in unsupported languages are skipped entirely.

.gitignore is always respected. Lucerna discovers all .gitignore files in the project tree (root and subdirectories) and applies their patterns to both indexing and file watching.
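A minimal sketch of the discovery step, assuming a plain top-down directory walk. The helper name `find_gitignore_files` is hypothetical and is not part of Lucerna's API. The sketch only locates the files; actual .gitignore pattern matching (negation, anchoring, directory-only rules) is deliberately left out.

```python
import os
from pathlib import Path


def find_gitignore_files(project_root: str) -> list[Path]:
    """Return every .gitignore under project_root, root first (hypothetical helper)."""
    found: list[Path] = []
    for dirpath, dirnames, filenames in os.walk(project_root):
        # Never descend into the .git directory; sort for a stable order.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        if ".gitignore" in filenames:
            found.append(Path(dirpath) / ".gitignore")
    return found
```

Each discovered file would then scope its patterns to its own directory and below, which is how Git itself treats nested .gitignore files.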

Programming languages — symbol-level chunks

For every supported programming language, Lucerna walks the AST and emits one chunk per logical symbol: functions, classes, methods, interfaces, type aliases, and a single grouped import chunk per file. This means search results map directly to reviewable, navigable code units rather than arbitrary line ranges.
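To make the idea concrete, here is a rough sketch of symbol-level chunking. It uses Python's standard-library `ast` module as a stand-in for a tree-sitter grammar, and the `Chunk` type and `chunk_python_source` function are illustrative names, not Lucerna's own. It assumes the imports sit together at the top of the file.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    kind: str        # "import", "function", "class", or "method"
    name: str
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive


def chunk_python_source(source: str) -> list[Chunk]:
    """Emit one chunk per top-level symbol plus one grouped import chunk."""
    tree = ast.parse(source)
    chunks: list[Chunk] = []
    defs = (ast.FunctionDef, ast.AsyncFunctionDef)

    imports = [n for n in tree.body if isinstance(n, (ast.Import, ast.ImportFrom))]
    if imports:
        # All import statements collapse into a single chunk for the file.
        chunks.append(Chunk("import", "imports", imports[0].lineno, imports[-1].end_lineno))

    for node in tree.body:
        if isinstance(node, defs):
            chunks.append(Chunk("function", node.name, node.lineno, node.end_lineno))
        elif isinstance(node, ast.ClassDef):
            chunks.append(Chunk("class", node.name, node.lineno, node.end_lineno))
            for member in node.body:
                if isinstance(member, defs):
                    name = f"{node.name}.{member.name}"
                    chunks.append(Chunk("method", name, member.lineno, member.end_lineno))
    return chunks
```

Because every chunk carries its own line range, a search hit can point at a whole function or method rather than at a window of lines that happens to contain the match.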

Each chunk’s embedding input (contextContent) is enriched with:

  • A breadcrumb prefix (e.g. // File: src/auth/middleware.ts / Class: AuthMiddleware / Method: verify)
  • The file’s import block, so the model understands which external dependencies are in scope
  • For method chunks: the parent class header
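The enrichment listed above could be assembled roughly as follows. This is a sketch only: the function name, the ordering of the parts, and the blank-line separator are assumptions, and the `./jwt` import in the example is made up for illustration. Only the breadcrumb format is taken from the example above.

```python
from typing import Optional, Sequence


def build_context_content(
    file_path: str,
    symbol_path: Sequence[str],
    chunk_text: str,
    import_block: str = "",
    parent_class_header: Optional[str] = None,
) -> str:
    """Assemble the embedding input for one chunk (layout is an assumption)."""
    breadcrumb = "// " + " / ".join([f"File: {file_path}", *symbol_path])
    parts = [breadcrumb]
    if import_block:
        parts.append(import_block)          # dependencies in scope
    if parent_class_header:
        parts.append(parent_class_header)   # only passed for method chunks
    parts.append(chunk_text)
    return "\n\n".join(parts)


example = build_context_content(
    file_path="src/auth/middleware.ts",
    symbol_path=["Class: AuthMiddleware", "Method: verify"],
    chunk_text="verify(token: string): boolean { return token.length > 0; }",
    import_block='import { decode } from "./jwt";',  # hypothetical import
    parent_class_header="export class AuthMiddleware {",
)
print(example.splitlines()[0])
# prints: // File: src/auth/middleware.ts / Class: AuthMiddleware / Method: verify
```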

Graph edges (IMPORTS, DEFINES, EXTENDS, IMPLEMENTS, CALLS) are extracted at the same time and stored in a separate graph store for traversal.
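The page does not describe how the graph store is laid out, so the following is an in-memory stand-in that shows what traversal over typed edges can look like. `GraphStore` and its methods are hypothetical names; only the five edge types come from the text above.

```python
from collections import defaultdict, deque
from typing import Iterable, Optional

EDGE_TYPES = {"IMPORTS", "DEFINES", "EXTENDS", "IMPLEMENTS", "CALLS"}


class GraphStore:
    """Toy adjacency-list store for typed, directed edges."""

    def __init__(self) -> None:
        self._out: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add_edge(self, source: str, edge_type: str, target: str) -> None:
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        self._out[source].append((edge_type, target))

    def reachable(self, start: str, edge_types: Optional[Iterable[str]] = None) -> list[str]:
        """Breadth-first walk from start, following only the given edge types."""
        allowed = EDGE_TYPES if edge_types is None else set(edge_types)
        seen, order, queue = {start}, [], deque([start])
        while queue:
            node = queue.popleft()
            for edge_type, target in self._out.get(node, []):
                if edge_type in allowed and target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order
```

Restricting the walk to `CALLS` edges, for example, yields the transitive callees of a function, while `EXTENDS` alone walks up an inheritance chain.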

Data, config, and markup formats — structure-level chunks

For non-code files, Lucerna uses the file’s inherent structure instead of a symbol table:

| Format | Chunking approach |
| --- | --- |
| JSON, XML, HTML, YAML | Top-level keys or elements; single chunk for small files |
| TOML | [section] / [[array]] table headers |
| Markdown / MDX | Heading-based sections (H1–H3) with breadcrumb trail |
| CSS / SCSS | One chunk per rule-set or @mixin/@function definition |
| Vue / Svelte | One chunk per top-level block (<script>, <template>, <style>) |
| SQL | One chunk per statement (CREATE TABLE, SELECT, etc.) |

No graph edges are emitted for these formats.
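As an illustration of structure-level chunking, the sketch below splits Markdown into heading-based sections (H1 to H3) and attaches a breadcrumb trail to each one. The function name, the " > " separator, and the output shape are assumptions. The sketch is also naive about fenced code blocks, so a line starting with `#` inside a fence would be treated as a heading.

```python
import re

# Matches H1-H3 only; deeper headings stay inside the enclosing section.
HEADING = re.compile(r"^(#{1,3})\s+(.*\S)\s*$")


def chunk_markdown(text: str) -> list[dict]:
    """Split Markdown into sections keyed by their heading breadcrumb."""
    chunks: list[dict] = []
    trail: list[tuple[int, str]] = []  # (level, title) of the open headings
    current = None
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            # Drop headings at the same or a deeper level, then push this one.
            trail = [(lvl, t) for lvl, t in trail if lvl < level] + [(level, match.group(2))]
            current = {"breadcrumb": " > ".join(t for _, t in trail), "lines": [line]}
            chunks.append(current)
        elif current is not None:
            current["lines"].append(line)
        elif line.strip():
            # Content before the first heading becomes its own chunk.
            current = {"breadcrumb": "", "lines": [line]}
            chunks.append(current)
    return [
        {"breadcrumb": c["breadcrumb"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```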


All of the programming languages listed below produce symbol-level chunks and graph edges.

Leading doc comments, decorators, annotations, and attributes are absorbed into the chunk that follows them, so the natural-language signal stays attached to the symbol it describes.
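One way to picture this absorption is to compute where a chunk should start. The sketch below again uses Python's `ast` module as a stand-in for tree-sitter, and `chunk_start_line` is a hypothetical helper: it moves the start of a definition up to its first decorator, then further up across any contiguous comment lines directly above.

```python
import ast


def chunk_start_line(source: str, node: ast.AST) -> int:
    """Return the 1-indexed line where the chunk for node should begin."""
    lines = source.splitlines()
    decorators = getattr(node, "decorator_list", [])
    start = min([node.lineno] + [d.lineno for d in decorators])
    # Walk upward while the previous line is a comment (lines[] is 0-indexed).
    while start > 1 and lines[start - 2].lstrip().startswith("#"):
        start -= 1
    return start


source = (
    "import os\n"
    "\n"
    "# Verify a token.\n"
    "# Returns a bool.\n"
    "@cache\n"
    "def verify(token):\n"
    "    return bool(token)\n"
)
func = ast.parse(source).body[1]
print(chunk_start_line(source, func))  # prints 3
```

The blank line above the comment block stops the upward walk, so unrelated comments earlier in the file are not pulled into the chunk.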

| Language | Extensions | Chunks |
| --- | --- | --- |
| TypeScript / JavaScript | .ts, .tsx, .js, .jsx | Imports, functions, generators, arrow functions, classes, methods, interfaces, type aliases, enums, namespaces, top-level const (objects/arrays/Zod schemas/route tables) |
| Python | .py, .pyw | Imports, classes, functions (including class methods), PEP 695 type aliases, module-level const |
| Java | .java | Imports, classes, interfaces, enums, records, methods |
| Go | .go | Imports, functions, receiver methods, structs, interfaces, type declarations, top-level const/var |
| Rust | .rs | Imports (use), structs, enums, traits, impl blocks (class), functions, modules, macro_rules! macros, consts/statics, type aliases |
| C# | .cs | Imports (using), classes, records, structs, interfaces, enums, methods, properties, events, delegates (typealias) |
| Swift | .swift | Imports, classes, structs, protocols, extensions, actors, enums, functions/methods, type aliases, properties |
| Kotlin | .kt, .kts | Imports, classes, interfaces, objects, enums, functions/methods, type aliases, top-level properties |
| Ruby | .rb | Imports, classes, modules, methods, DSL calls (has_many, validates, before_action, attr_accessor, …) |
| PHP | .php | Imports (use), functions, classes, interfaces, traits, methods, enums, consts, properties |
| C | .c, .h | Includes, functions, structs, enums, unions, typedefs, #define macros |
| C++ | .cpp, .cc, .cxx, .hpp | Includes, classes, functions/methods, structs, enums, unions, namespaces, type aliases, template declarations |
| Bash / Shell | .sh, .bash | source/. imports, functions, top-level variable assignments (const) |
| Scala | .scala | Imports, classes, objects, traits, methods, case-class records, enums (Scala 3), type aliases, given instances (Scala 3) |
| Lua | .lua | require() imports, functions, local functions |
| R | .r, .R | library()/require() imports, functions, S4 setClass/setMethod/setGeneric |
| Dart | .dart | Imports, classes, functions/methods, mixins, extensions, enums, type aliases |
| Haskell | .hs | Imports, functions, data types, type classes, instances, newtypes, type aliases |
| Elixir | .ex, .exs | Aliases/imports, modules, def/defp functions/methods, defmacro, defprotocol, defimpl instances, defstruct structs |
| Clojure | .clj, .cljs, .cljc | ns form (import chunk), defn/defn- functions, defmacro, defprotocol, defrecord/deftype, defmulti/defmethod, def consts |
| Groovy | .groovy, .gradle | Imports, classes, methods, top-level closures (def name = { ... }) |
| Perl | .pl, .pm | use imports, subroutines, Moose has properties, POD absorbed |
| PowerShell | .ps1, .psm1 | using imports, functions, classes/methods, script-scope param( ... ) blocks |
| MATLAB | .m | Functions, classdefs and methods, properties blocks |
| Zig | .zig | @import declarations, functions, container declarations (struct/enum/union), test blocks |
| Solidity | .sol | Imports, contracts (class), interfaces, functions/methods, libraries, structs, enums, events, modifiers, errors, state variables |
| Julia | .jl | Imports, functions, modules, structs, macros, abstract types, consts |
| OCaml | .ml, .mli | Imports (open), let bindings, type definitions, modules, module types, functors |
| Erlang | .erl | Module/export attributes (import), functions, records |
| Objective-C | .m, .mm | Imports (#import), interfaces, implementations, methods, protocols, categories (extensions), properties |

Graph edges: DEFINES links the import chunk to every named symbol in the file. EXTENDS/IMPLEMENTS are emitted when inheritance syntax is present in the grammar. CALLS tracks direct invocations.
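The three rules above can be sketched for a single Python file as follows. As before, Python's `ast` module stands in for a tree-sitter grammar, and `extract_edges` plus the `"imports"` chunk id are illustrative assumptions. Only plain-name base classes and plain-name calls are handled; attribute calls such as `obj.method()` are ignored in this sketch.

```python
import ast


def extract_edges(source: str, import_chunk_id: str = "imports") -> list[tuple[str, str, str]]:
    """Return (source, edge_type, target) triples for one file."""
    tree = ast.parse(source)
    edges: list[tuple[str, str, str]] = []
    defs = (ast.FunctionDef, ast.AsyncFunctionDef)
    for node in tree.body:
        if isinstance(node, defs + (ast.ClassDef,)):
            # DEFINES: import chunk -> every named top-level symbol.
            edges.append((import_chunk_id, "DEFINES", node.name))
        if isinstance(node, ast.ClassDef):
            # EXTENDS: only when inheritance syntax is present.
            for base in node.bases:
                if isinstance(base, ast.Name):
                    edges.append((node.name, "EXTENDS", base.id))
        if isinstance(node, defs):
            # CALLS: direct invocations by plain name inside the function body.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.append((node.name, "CALLS", inner.func.id))
    return edges
```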

No graph edges are emitted for the data, config, and markup formats listed below.

| Language | Extensions | Chunking |
| --- | --- | --- |
| JSON | .json | Top-level key splitting for large files; single chunk for small files |
| YAML | .yaml, .yml | Top-level key splitting |
| TOML | .toml | Table splitting by [section] / [[array]] headers |
| XML | .xml, .xsd, .xsl | Top-level element splitting for large files; single chunk for small files |
| HTML | .html, .htm | Top-level element splitting |
| SQL | .sql | One chunk per statement (CREATE TABLE, CREATE VIEW, SELECT, etc.) |
| Markdown | .md, .mdx | Heading-based sections (H1–H3) with full breadcrumb trail |
| CSS | .css | One chunk per rule-set |
| SCSS | .scss | Rule-set splitting; named @mixin/@function definitions; $variable constants; @keyframes; @media sections |
| Vue | .vue | <script> blocks are re-parsed as TS/JS so component props (defineProps), composables (useFoo), and top-level functions become individual chunks. <template> and <style> remain single sections. |
| Svelte | .svelte | <script> blocks are re-parsed as TS/JS for per-symbol chunks. <style> is a single section; remaining markup becomes the template chunk. |
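The JSON behavior described above (one chunk for small files, top-level key splitting for large ones) might look like the sketch below. The size threshold is not documented on this page, so `small_file_bytes` and its default of 2048 are placeholders, and `chunk_json` is a hypothetical name.

```python
import json


def chunk_json(text: str, small_file_bytes: int = 2048) -> list[dict]:
    """Split a JSON document by top-level key unless it is small."""
    data = json.loads(text)
    is_small = len(text.encode("utf-8")) <= small_file_bytes
    if is_small or not isinstance(data, dict):
        # Small files, and documents whose root is not an object, stay whole.
        return [{"key": None, "text": text}]
    return [
        {"key": key, "text": json.dumps({key: value}, indent=2)}
        for key, value in data.items()
    ]
```

Keeping the key wrapped around its value in each chunk means a search hit still shows which top-level setting the fragment belongs to.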

All other file types are not indexed.

Grammar modules are loaded on first encounter — no configuration needed. Lucerna automatically initializes any language the first time it sees a file of that type.
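The load-on-first-encounter behavior is a standard lazy-initialization pattern. The sketch below shows the general shape with a hypothetical `GrammarRegistry`; the loader callables stand in for whatever actually initializes a tree-sitter grammar, which this page does not specify.

```python
from typing import Callable, Optional


class GrammarRegistry:
    """Load each grammar at most once, the first time its extension is seen."""

    def __init__(self, loaders: dict[str, Callable[[], object]]) -> None:
        self._loaders = loaders                # extension -> loader function
        self._loaded: dict[str, object] = {}   # extension -> initialized grammar

    def get(self, extension: str) -> Optional[object]:
        if extension not in self._loaders:
            return None  # unsupported file type: the caller skips the file
        if extension not in self._loaded:
            self._loaded[extension] = self._loaders[extension]()
        return self._loaded[extension]
```

A project that contains only Go and Markdown files would therefore never pay the cost of initializing the other grammars.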