Each language is owned by a single per-language analyzer crate at crates/mehen-<lang>/. The analyzer:
  • Pins its own parser (a language-specific parser like Ruff / Oxc / Mago / Prism / ra_ap_syntax, or a tree-sitter grammar).
  • Owns metric interpretation for that language’s syntax — what counts as a decision, an operator, a method, a comment, etc.
  • Returns LanguageAnalysis (owned, Send + 'static) so mehen-engine can analyze files in parallel and never holds onto parser arenas.
For tree-sitter-backed languages the analyzer also owns its grammar.rs kind enum, generated from the pinned grammar’s node-kind table.
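The ownership rule above can be illustrated with a minimal, std-only sketch. The trait and struct here are hypothetical stand-ins, not the real mehen_core definitions, whose signatures may differ; the point is only that a result type that owns all of its data can be produced on worker threads and handed back to the caller.

```rust
use std::thread;

/// Hypothetical stand-in for mehen's `LanguageAnalysis`: it owns all of its data.
#[derive(Debug, Clone, PartialEq)]
pub struct LanguageAnalysis {
    pub path: String,
    pub lines: usize,
}

/// Hypothetical analyzer trait; the real `mehen_core::LanguageAnalyzer` may differ.
pub trait LanguageAnalyzer {
    fn analyze(&self, path: &str, source: &str) -> LanguageAnalysis;
}

/// Toy analyzer that only counts lines.
pub struct LineCounter;

impl LanguageAnalyzer for LineCounter {
    fn analyze(&self, path: &str, source: &str) -> LanguageAnalysis {
        // Everything placed in the result is copied, so nothing borrows from `source`.
        LanguageAnalysis {
            path: path.to_owned(),
            lines: source.lines().count(),
        }
    }
}

/// Compile-time check that a type can cross threads and outlive its inputs.
fn assert_send_static<T: Send + 'static>() {}

/// Analyzes each (path, source) pair on its own thread and returns results in input order.
pub fn analyze_in_parallel(files: Vec<(String, String)>) -> Vec<LanguageAnalysis> {
    assert_send_static::<LanguageAnalysis>();
    let handles: Vec<_> = files
        .into_iter()
        .map(|(path, source)| thread::spawn(move || LineCounter.analyze(&path, &source)))
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("analysis thread panicked"))
        .collect()
}
```

If `LanguageAnalysis` held a reference into the source buffer or a parser arena, the `assert_send_static` call and the `thread::spawn` closure would both fail to compile, which is the property the engine relies on.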

Choosing a parser

The first decision is the parser. mehen prefers a language-specific parser when one exists with mature Rust bindings and rich AST/semantic coverage; tree-sitter is the default fallback when no such parser is available.
| Parser style | Examples in mehen | When to pick |
| --- | --- | --- |
| Language-specific | Ruff (Python), Oxc (TS/JS), Mago (PHP), Prism (Ruby), ra_ap_syntax (Rust), pulldown-cmark (Markdown) | A maintained Rust crate with a typed AST, semantic model, or rich grammar coverage exists. |
| Tree-sitter | Go, Kotlin, C, PowerShell | No mature language-specific parser is available, or tree-sitter’s grammar quality is the best fit. |

Adding a language-specific parser

1. Add the parser to the workspace

Add the parser crate(s) to crates/mehen-<lang>/Cargo.toml. Pin an exact version (or a git revision tagged for the release) so mehen’s behavior is reproducible.

2. Implement the analyzer

In crates/mehen-<lang>/src/lib.rs:
  • Define <Lang>Analyzer and implement mehen_core::LanguageAnalyzer for it.
  • Walk the parser’s typed AST and emit metrics through mehen_metrics::{State, MetricTreeBuilder, …} and the per-metric helpers.
  • Make sure the parser’s arena, source buffer, and parser state do not escape the analyze call — LanguageAnalysis must be Send + 'static.
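As a rough illustration of the last bullet, the sketch below walks a toy AST that borrows from the source text and copies out only owned values. `Node`, `FunctionMetrics`, and `collect_metrics` are hypothetical names invented for this example; a real analyzer would walk its parser's own AST and emit metrics through the mehen_metrics helpers instead.

```rust
/// Toy AST that borrows from the source text, standing in for an arena-backed parser tree.
pub enum Node<'src> {
    Function { name: &'src str, body: Vec<Node<'src>> },
    If(Vec<Node<'src>>),
    Loop(Vec<Node<'src>>),
    Statement,
}

/// Owned per-function result. It has no lifetime parameter, so it is `Send + 'static`.
#[derive(Debug, PartialEq)]
pub struct FunctionMetrics {
    pub name: String,
    pub cyclomatic: u32,
}

/// Counts decision points (`If` and `Loop`) in a subtree.
/// Simplification: nested functions contribute to their enclosing function.
fn count_decisions(nodes: &[Node<'_>]) -> u32 {
    nodes
        .iter()
        .map(|node| match node {
            Node::If(children) | Node::Loop(children) => 1 + count_decisions(children),
            Node::Function { body, .. } => count_decisions(body),
            Node::Statement => 0,
        })
        .sum()
}

/// Walks top-level functions and copies everything it needs out of the borrowed tree.
pub fn collect_metrics(tree: &[Node<'_>]) -> Vec<FunctionMetrics> {
    tree.iter()
        .filter_map(|node| match node {
            Node::Function { name, body } => Some(FunctionMetrics {
                // `to_owned` detaches the name from the source buffer.
                name: (*name).to_owned(),
                // Cyclomatic complexity here is 1 plus the number of decision points.
                cyclomatic: 1 + count_decisions(body),
            }),
            _ => None,
        })
        .collect()
}
```

Because `FunctionMetrics` stores a `String` rather than a `&str`, the tree and the source buffer can both be dropped as soon as `collect_metrics` returns.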

3. Register the analyzer

Add it to mehen-engine’s registry (crates/mehen-engine/src/registry.rs) so Language::<YourLang> dispatches to it.
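The actual contents of registry.rs are not shown here, so the following is only a sketch of what enum-based dispatch could look like. The `Language` variants, the `name` method, and `analyzer_for` are assumptions made for illustration.

```rust
/// Hypothetical language enum; the real `Language` type lives in mehen and has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    Go,
    Python,
}

/// Hypothetical, minimal analyzer trait used only to show dispatch.
pub trait LanguageAnalyzer {
    fn name(&self) -> &'static str;
}

struct GoAnalyzer;
struct PythonAnalyzer;

impl LanguageAnalyzer for GoAnalyzer {
    fn name(&self) -> &'static str {
        "go"
    }
}

impl LanguageAnalyzer for PythonAnalyzer {
    fn name(&self) -> &'static str {
        "python"
    }
}

/// Maps each language to its analyzer. The match is exhaustive, so adding a
/// `Language` variant without registering an analyzer fails to compile.
pub fn analyzer_for(language: Language) -> Box<dyn LanguageAnalyzer> {
    match language {
        Language::Go => Box::new(GoAnalyzer),
        Language::Python => Box::new(PythonAnalyzer),
    }
}
```

An exhaustive `match` with no wildcard arm is one way to make a forgotten registration a compile error rather than a runtime surprise.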

4. Add tests

Add per-metric integration tests under crates/mehen-<lang>/tests/, typically one file per metric family, each snapshotting the rendered metric JSON via insta.

Adding a tree-sitter-backed language

Prerequisite: a tree-sitter-<lang> crate compatible with the tree-sitter version pinned in the workspace (Cargo.toml [workspace.dependencies]).

1. Pin the grammar

Two manifests must stay in sync, because each declares the grammar crate as a dependency:
  • xtask/Cargo.toml — the kind-enum generator imports the grammar at codegen time.
  • crates/mehen-<lang>/Cargo.toml — the analyzer imports the grammar at runtime to drive tree_sitter::Parser.
Workspace-managed grammars (those listed in root Cargo.toml [workspace.dependencies]) can be referenced as { workspace = true } from both places. Inline-pinned grammars must be kept in lockstep manually.

2. Register the language for codegen

Add a GeneratorTarget to xtask/src/tree_sitter.rs::TARGETS:
GeneratorTarget {
    slug: "go",
    enum_name: "Go",
    crate_dir: "crates/mehen-go/src",
    language: || tree_sitter_go::LANGUAGE.into(),
},

3. Generate the kind enum

cargo xtask tree-sitter generate go
The result lands at crates/mehen-go/src/grammar.rs.
Never edit grammar.rs directly. cargo xtask tree-sitter check-generated runs in CI and fails the build if the checked-in file doesn’t match a fresh render.
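Conceptually, a check of this kind re-renders the file in memory and compares it with what is checked in. The sketch below shows that idea with a deliberately simplified output format; `render_kind_enum` and `check_generated` are hypothetical functions and do not reflect the real xtask implementation.

```rust
/// Renders a kind enum from a node-kind table (hypothetical, simplified output format).
pub fn render_kind_enum(enum_name: &str, kinds: &[&str]) -> String {
    let mut out = String::from("// @generated: do not edit by hand.\n");
    out.push_str(&format!("pub enum {enum_name} {{\n"));
    for kind in kinds {
        out.push_str(&format!("    {kind},\n"));
    }
    out.push_str("}\n");
    out
}

/// Succeeds only if the checked-in text is byte-for-byte identical to a fresh render.
pub fn check_generated(checked_in: &str, enum_name: &str, kinds: &[&str]) -> Result<(), String> {
    if checked_in == render_kind_enum(enum_name, kinds) {
        Ok(())
    } else {
        Err(format!("{enum_name}: checked-in grammar.rs is stale; regenerate it"))
    }
}
```

Under this model, both a hand edit to the file and a grammar bump that changes the kind table produce a mismatch, which is why regenerating is the only supported way to change grammar.rs.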

4. Implement the analyzer

Same pattern as a language-specific parser, but use the generated crate::grammar::<Lang> enum for kind-id matching — it deduplicates positional kinds and exposes mnemonic identifiers (PLUS, EQ_EQ, etc.).
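To make kind-id matching concrete, here is a small sketch with a hand-written enum in place of the generated one. The variants and numeric ids are invented for illustration; only the mnemonic names PLUS and EQ_EQ come from the text above, and the real enum should always be taken from the generated grammar.rs. In tree-sitter's Rust bindings the id would come from `Node::kind_id()`, which returns a `u16`.

```rust
/// Hypothetical excerpt of a generated kind enum. The ids below are made up.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u16)]
pub enum Go {
    Identifier = 1,
    PLUS = 2,
    EQ_EQ = 3,
    IfStatement = 4,
    ForStatement = 5,
    Error = 6,
}

impl From<u16> for Go {
    /// Converts a raw kind id into the enum, mapping unknown ids to `Error`.
    fn from(id: u16) -> Self {
        match id {
            1 => Go::Identifier,
            2 => Go::PLUS,
            3 => Go::EQ_EQ,
            4 => Go::IfStatement,
            5 => Go::ForStatement,
            _ => Go::Error,
        }
    }
}

/// Returns true for node kinds this sketch treats as decision points.
pub fn is_decision(kind_id: u16) -> bool {
    matches!(Go::from(kind_id), Go::IfStatement | Go::ForStatement)
}

/// Returns true for node kinds this sketch treats as operators.
pub fn is_operator(kind_id: u16) -> bool {
    matches!(Go::from(kind_id), Go::PLUS | Go::EQ_EQ)
}
```

Matching on enum variants instead of kind-name strings lets the compiler catch a variant that disappears after a grammar bump.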

5. Register the analyzer

Same step as above: register in crates/mehen-engine/src/registry.rs.

6. Add tests

Add per-metric integration tests under crates/mehen-<lang>/tests/.

Bumping a pinned grammar

When Dependabot bumps a tree-sitter-<lang> version (or you bump it manually):
  1. Update both xtask/Cargo.toml and crates/mehen-<lang>/Cargo.toml to the new version. The regenerate-grammars workflow does this automatically for inline-pinned grammars.
  2. Run cargo xtask tree-sitter generate --all and commit the regenerated grammar.rs files.
  3. CI’s cargo xtask tree-sitter check-generated will fail until the regenerated files are committed.

Validation

cargo check --workspace
cargo nextest run --all-features
cargo insta test --workspace --all-features --check --unreferenced reject \
    --test-runner nextest --no-test-runner-fallback --disable-nextest-doctest
cargo clippy --all-targets --all-features --locked
cargo xtask tree-sitter check-generated
