Each language is owned by a single per-language analyzer crate at crates/mehen-<lang>/. The analyzer:
- Pins its own parser (a language-specific parser like Ruff / Oxc / Mago / Prism / ra_ap_syntax, or a tree-sitter grammar).
- Owns metric interpretation for that language’s syntax: what counts as a decision, an operator, a method, a comment, etc.
- Returns LanguageAnalysis (owned, Send + 'static) so mehen-engine can analyze files in parallel and never holds onto parser arenas.
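To illustrate why an owned, Send + 'static result matters for parallel analysis, here is a minimal sketch using only the standard library. The LanguageAnalysis struct, the analyze function, and analyze_in_parallel are hypothetical stand-ins, not mehen's actual types or scheduling code.

```rust
use std::thread;

/// Hypothetical stand-in for mehen's `LanguageAnalysis`: plain owned data,
/// with no references into parser arenas or source buffers.
#[derive(Debug, Clone, PartialEq)]
pub struct LanguageAnalysis {
    pub path: String,
    pub lines: usize,
}

/// Hypothetical analyzer entry point. The borrowed `source` is only read
/// inside this call; the returned value owns everything it needs.
pub fn analyze(path: &str, source: &str) -> LanguageAnalysis {
    LanguageAnalysis {
        path: path.to_owned(),
        lines: source.lines().count(),
    }
}

/// Compile-time check: this only type-checks if `T` is `Send + 'static`.
fn assert_send_static<T: Send + 'static>() {}

/// Analyze each `(path, source)` pair on its own thread and collect the
/// results in input order.
pub fn analyze_in_parallel(files: Vec<(String, String)>) -> Vec<LanguageAnalysis> {
    assert_send_static::<LanguageAnalysis>();

    let handles: Vec<_> = files
        .into_iter()
        // `thread::spawn` requires the closure's return type to be
        // `Send + 'static`, which is exactly the bound described above.
        .map(|(path, source)| thread::spawn(move || analyze(&path, &source)))
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().expect("analysis thread panicked"))
        .collect()
}
```

If the result type borrowed from a parser arena, the thread::spawn call would not compile, which is the kind of mistake the Send + 'static requirement is meant to catch early.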
Tree-sitter-backed analyzers also carry a grammar.rs kind enum, generated from the pinned grammar’s node-kind table.
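As a rough picture of what a grammar.rs kind enum could look like, the sketch below maps numeric node-kind ids to named variants. The enum name Toy, its variants, and the id values are invented for illustration; a real generated file is derived from the pinned grammar's node-kind table and will differ.

```rust
/// Hypothetical shape of a generated kind enum. The variants and ids are
/// invented; real ones come from the grammar's node-kind table.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u16)]
pub enum Toy {
    Identifier = 1,
    PLUS = 2,
    EQ_EQ = 3,
    BinaryExpression = 4,
    /// Fallback for ids this sketch does not know about.
    Unknown = u16::MAX,
}

impl From<u16> for Toy {
    /// Convert a raw node-kind id (tree-sitter exposes these as `u16`)
    /// into the enum, so analyzers can match on names instead of numbers.
    fn from(kind_id: u16) -> Self {
        match kind_id {
            1 => Toy::Identifier,
            2 => Toy::PLUS,
            3 => Toy::EQ_EQ,
            4 => Toy::BinaryExpression,
            _ => Toy::Unknown,
        }
    }
}

/// Example metric predicate: does this node kind count as an operator?
pub fn is_operator(kind_id: u16) -> bool {
    matches!(Toy::from(kind_id), Toy::PLUS | Toy::EQ_EQ)
}
```

Matching on named variants keeps metric code readable and lets the compiler flag variants that disappear when the grammar changes.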
Choosing a parser
The first decision is the parser. mehen prefers a language-specific parser when one exists with mature Rust bindings and rich AST/semantic coverage; tree-sitter is the default fallback when no such parser is available.

| Parser style | Examples in mehen | When to pick |
|---|---|---|
| Language-specific | Ruff (Python), Oxc (TS/JS), Mago (PHP), Prism (Ruby), ra_ap_syntax (Rust), pulldown-cmark (Markdown) | A maintained Rust crate with a typed AST, semantic model, or rich grammar coverage exists. |
| Tree-sitter | Go, Kotlin, C, PowerShell | No mature language-specific parser is available, or tree-sitter’s grammar quality is the best fit. |
Adding a language-specific parser
Add the parser to the workspace
Add the parser crate(s) to crates/mehen-<lang>/Cargo.toml. Pin an exact version (or a git revision tagged for the release) so mehen’s behavior is reproducible.

Implement the analyzer
In crates/mehen-<lang>/src/lib.rs:
- Define <Lang>Analyzer and implement mehen_core::LanguageAnalyzer for it.
- Walk the parser’s typed AST and emit metrics through mehen_metrics::{State, MetricTreeBuilder, …} and the per-metric helpers.
- Make sure the parser’s arena, source buffer, and parser state do not escape the analyze call: LanguageAnalysis must be Send + 'static.
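The steps above can be sketched with a toy analyzer. Everything here is an assumption made for illustration: the trait is a stand-in for mehen_core::LanguageAnalyzer (whose real methods and signatures may differ), the line-based "parser" replaces a real typed AST, and the metric helpers from mehen_metrics are not used.

```rust
/// Hypothetical stand-in for `mehen_core::LanguageAnalyzer`.
pub trait LanguageAnalyzer {
    fn analyze(&self, source: &str) -> LanguageAnalysis;
}

/// Owned result with no lifetimes, so it is `Send + 'static`.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct LanguageAnalysis {
    pub functions: usize,
    pub decisions: usize,
}

/// Toy AST node that borrows from the source text, the way an
/// arena-backed parser's nodes typically do.
#[allow(dead_code)]
enum Node<'src> {
    Function(&'src str),
    Decision(&'src str),
}

/// Toy parser: classifies lines by their leading keyword.
fn parse(source: &str) -> Vec<Node<'_>> {
    source
        .lines()
        .map(str::trim)
        .filter_map(|line| {
            if line.starts_with("fn ") {
                Some(Node::Function(line))
            } else if line.starts_with("if ") {
                Some(Node::Decision(line))
            } else {
                None
            }
        })
        .collect()
}

pub struct ToyAnalyzer;

impl LanguageAnalyzer for ToyAnalyzer {
    fn analyze(&self, source: &str) -> LanguageAnalysis {
        // The borrowed AST lives only inside this call.
        let ast = parse(source);

        let mut analysis = LanguageAnalysis::default();
        for node in &ast {
            match node {
                Node::Function(_) => analysis.functions += 1,
                Node::Decision(_) => analysis.decisions += 1,
            }
        }
        // Only owned counters leave the function; `ast` is dropped here.
        analysis
    }
}
```

Because the return type has no lifetime parameter, the compiler rejects any attempt to store a borrowed node in the result, which enforces the "nothing escapes the analyze call" rule structurally.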
Register the analyzer
Add it to mehen-engine’s registry (crates/mehen-engine/src/registry.rs) so Language::<YourLang> dispatches to it.

Adding a tree-sitter-backed language
Prerequisite: a tree-sitter-<lang> crate compatible with the tree-sitter version pinned in the workspace (Cargo.toml [workspace.dependencies]).
Pin the grammar
Two files must stay in sync; both reference the grammar at compile time:
- xtask/Cargo.toml: the kind-enum generator imports the grammar at codegen time.
- crates/mehen-<lang>/Cargo.toml: the analyzer imports the grammar at runtime to drive tree_sitter::Parser.
Grammars pinned at the workspace level (Cargo.toml [workspace.dependencies]) can be referenced as { workspace = true } from both places. Inline-pinned grammars must be kept in lockstep manually.

Implement the analyzer
Same pattern as a language-specific parser, but use the generated crate::grammar::<Lang> enum for kind-id matching: it deduplicates positional kinds and exposes mnemonic identifiers (PLUS, EQ_EQ, etc.).

Bumping a pinned grammar
When dependabot bumps a tree-sitter-<lang> version (or you do it manually):
- Update both xtask/Cargo.toml and crates/mehen-<lang>/Cargo.toml to the new version. The regenerate-grammars workflow does this automatically for inline-pinned grammars.
- Run cargo xtask tree-sitter generate --all and commit the regenerated grammar.rs files.
- CI’s cargo xtask tree-sitter check-generated will fail until the regenerated files are committed.
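Conceptually, a check of this kind compares freshly generated output with the committed file. The sketch below shows that idea only; check_generated and GeneratedStatus are hypothetical names, and the actual xtask implementation may work differently.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Outcome of comparing a committed generated file with fresh generator output.
#[derive(Debug, PartialEq, Eq)]
pub enum GeneratedStatus {
    /// The committed file matches the fresh output byte for byte.
    UpToDate,
    /// The committed file exists but differs; it needs regenerating.
    Stale,
    /// No committed file exists at the given path.
    Missing,
}

/// Compare the file at `committed` with `fresh`, the text the generator
/// would produce now. I/O errors other than "not found" are propagated.
pub fn check_generated(committed: &Path, fresh: &str) -> io::Result<GeneratedStatus> {
    match fs::read_to_string(committed) {
        Ok(on_disk) if on_disk == fresh => Ok(GeneratedStatus::UpToDate),
        Ok(_) => Ok(GeneratedStatus::Stale),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(GeneratedStatus::Missing),
        Err(err) => Err(err),
    }
}
```

A CI step built on this would exit non-zero for Stale or Missing, which matches the behavior described above: the check keeps failing until the regenerated files are committed.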
Validation
See also
- Update grammars — bumping pinned tree-sitter versions.
- Implement LoC — example of a metric trait implementation.