Prose layer overview

The structural Markdown layer (Markdown Metrics) is deliberately language-opaque. The prose layer adds language-aware signals — readability formulas, lexical diversity, wording quality, Japanese script composition and JTF conformance — on top of the same AST.

Architectural constraints

Layered, not folded. Prose metrics are a separate top-level section in the output schema. They never silently modify the DMI, MCC, MRPC, or Filler / Lazy Risk weights.
Per-block language tag. Language detection runs per Markdown block (paragraph, heading, list item, blockquote), not per document.
Structural artifacts stay excluded. Code fences, inline code, link destinations, image alt-text, YAML/TOML/JSON front matter, HTML/MDX, and table delimiters are stripped before any readability or wording calculation.
Short-text refusal. Grade-level formulas are suppressed when words < 100 OR sentences < 5.
Feature-gated dictionaries. Dictionary-dependent features ship behind Cargo `--features` flags so the default binary stays small.
Deterministic and reproducible. No network, no cloud, no sampling.
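
The short-text refusal rule can be sketched as a guard placed in front of any grade-level formula. The following is a minimal Rust sketch, not the project's actual code: the `GradeLevel` enum and the `fkgl` function are hypothetical names, and Flesch-Kincaid Grade Level is used only as a representative formula. The thresholds are the ones stated in the constraint.

```rust
/// Thresholds from the short-text refusal constraint.
const MIN_WORDS: usize = 100;
const MIN_SENTENCES: usize = 5;

/// Hypothetical result type: either a score or an explicit refusal.
#[derive(Debug, PartialEq)]
enum GradeLevel {
    Score(f64),
    Suppressed,
}

/// Flesch-Kincaid Grade Level, guarded by the short-text rule.
fn fkgl(words: usize, sentences: usize, syllables: usize) -> GradeLevel {
    // Refuse when EITHER count is below its threshold (logical OR).
    if words < MIN_WORDS || sentences < MIN_SENTENCES {
        return GradeLevel::Suppressed;
    }
    let w = words as f64;
    let s = sentences as f64;
    let syl = syllables as f64;
    GradeLevel::Score(0.39 * (w / s) + 11.8 * (syl / w) - 15.59)
}
```

Returning an explicit `Suppressed` variant rather than a sentinel number keeps a refused score from being mistaken for a real grade level downstream.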

Tier model

| Tier | Cargo features | Adds | Binary cost |
|---|---|---|---|
| 0 (default) | none | Unicode-block language detection; UAX #29 segmentation; vowel-group syllables; Tateishi RS; basic wording heuristics; JTF mechanical checks | ~100–300 KB |
| 1a | `syllables-cmu` | CMU Pronouncing Dictionary | +1–2 MB |
| 1b | `japanese-jouyou` | Jōyō grade proxy, hyōgai ratio | +10 KB |
| 1c | `japanese-jlpt` | JLPT N5–N1 word and kanji bands | +300 KB |
| 1d | `lingua` | High-accuracy trigram language detection | +2–5 MB |
| 2a | `japanese-morph` | Lindera + IPADIC, bunsetsu, POS, Shibasaki grade | +50 MB |
| 2b | `japanese-unidic` | Vibrato + UniDic; jReadability | external dict |
| 2c | `lexical-diversity` | MTLD, HD-D, Yule’s K | +50 KB |
| 2d | `vale-rules` | Parse vale-compatible YAML rule packs | +200 KB |
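
To illustrate how a Tier 0 heuristic and a feature-gated dictionary might coexist, here is a minimal Rust sketch. It assumes a design in which the `syllables-cmu` feature supplies an optional dictionary lookup and the vowel-group counter is the fallback. The function names are hypothetical, and the dictionary lookup is a stub rather than a real CMU Pronouncing Dictionary integration.

```rust
/// Tier 0 fallback: count runs of consecutive vowels (a, e, i, o, u, y).
/// This is a rough heuristic and over-counts words with a silent "e".
fn vowel_group_syllables(word: &str) -> usize {
    let mut count = 0;
    let mut prev_vowel = false;
    for c in word.chars() {
        let is_vowel = matches!(c.to_ascii_lowercase(), 'a' | 'e' | 'i' | 'o' | 'u' | 'y');
        if is_vowel && !prev_vowel {
            count += 1;
        }
        prev_vowel = is_vowel;
    }
    // Treat any token as having at least one syllable.
    count.max(1)
}

/// Hypothetical stub: with the feature enabled, a real build would
/// consult an embedded pronunciation dictionary here.
#[cfg(feature = "syllables-cmu")]
fn dictionary_syllables(_word: &str) -> Option<usize> {
    None
}

/// Without the feature, no dictionary is compiled in at all.
#[cfg(not(feature = "syllables-cmu"))]
fn dictionary_syllables(_word: &str) -> Option<usize> {
    None
}

/// Prefer the dictionary when it has an entry, else use the heuristic.
fn count_syllables(word: &str) -> usize {
    dictionary_syllables(word).unwrap_or_else(|| vowel_group_syllables(word))
}
```

In a default build, `count_syllables("readability")` takes the heuristic path and counts five vowel groups. Because the gated code is excluded at compile time, the dictionary adds nothing to the default binary.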

Pages

| Page | Purpose |
|---|---|
| Block-level language detection | Per-block English/Japanese/other tagging. |
| English readability ensemble | Flesch, FKGL, Fog, SMOG, ARI, Coleman-Liau, Dale-Chall, FORCAST, LIX/RIX. |
| Lexical diversity | MATTR, hapax, density, sentence/word moments. |
| Wording quality | Passive, hedges, weasels, wordy, adverbs, nominalizations, cliches, illusions. |
| Inclusive language | alex / retext-equality flags. |
| Japanese script composition | Kanji/hiragana/katakana ratios, registers. |
| Tateishi RS + Jōyō grade | Japanese readability formulas. |
| JTF rules | The 12 Japan Translation Federation (JTF) rules. |
| textlint-ja subset | Selected `textlint-rule-preset-ja-technical-writing` rules. |
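
As an illustration of what a script-composition signal might look like, the following Rust sketch counts characters by Unicode block. It is an assumption-laden sketch, not the project's implementation: `ScriptCounts` and `script_counts` are hypothetical names, whitespace is skipped, and only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) is treated as kanji, so extension blocks fall into `other`.

```rust
/// Hypothetical per-block tally of Japanese script classes.
#[derive(Debug, Default, PartialEq)]
struct ScriptCounts {
    kanji: usize,
    hiragana: usize,
    katakana: usize,
    other: usize,
}

impl ScriptCounts {
    fn total(&self) -> usize {
        self.kanji + self.hiragana + self.katakana + self.other
    }

    /// Share of kanji among all counted characters (0.0 for empty input).
    fn kanji_ratio(&self) -> f64 {
        match self.total() {
            0 => 0.0,
            n => self.kanji as f64 / n as f64,
        }
    }
}

/// Classify each non-whitespace character by its Unicode block.
fn script_counts(text: &str) -> ScriptCounts {
    let mut counts = ScriptCounts::default();
    for c in text.chars().filter(|c| !c.is_whitespace()) {
        match c {
            '\u{4E00}'..='\u{9FFF}' => counts.kanji += 1,    // CJK Unified Ideographs
            '\u{3040}'..='\u{309F}' => counts.hiragana += 1, // Hiragana block
            '\u{30A0}'..='\u{30FF}' => counts.katakana += 1, // Katakana block
            _ => counts.other += 1,
        }
    }
    counts
}
```

The hiragana and katakana ratios follow the same pattern as `kanji_ratio`. A block-range check like this needs no dictionary, which is consistent with the Tier 0 goal of a small default binary.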
