PR comment design

The primary consumption surface for mehen on a pull request is a sticky GitHub comment posted by the GitHub Action. This page is the user-facing reference for that surface.

[Screenshot: mehen sticky PR comment with a Source Code Metrics table and a Documentation Metrics table.]

When a PR touches one or more Markdown files, mehen diff emits a Documentation Metrics section below the existing source-code table. The non-negotiable constraint: every character of output is mechanically derivable from the AST, the metric tables, and the threshold bands defined in Markdown metrics and Markdown prose metrics. No LLM call, no “likely cause” inference, no speculative phrasing. Re-running mehen diff on the same commits produces byte-identical output.

Anchor and when it fires

The Documentation section is a sibling of the source-code section inside the existing sticky comment, demarcated by an HTML comment anchor so upserts replace the right region:

```markdown
<!-- mehen-docs -->
## Documentation Metrics (this PR vs `main`)

…table…

…callouts…

…drill-down <details>…

> Generated by [mehen](https://github.com/ophi-dev/mehen) — the code quality watcher.
```

The section is emitted only when at least one Markdown file is present in the PR diff. On code-only PRs the section is suppressed entirely.
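The upsert behaviour described above can be sketched as follows. This is a minimal illustration, not mehen's implementation: the function name `upsert_docs_section` is hypothetical, and it assumes the Documentation section runs from the anchor to the end of the comment body.

```python
from typing import Optional

ANCHOR = "<!-- mehen-docs -->"


def upsert_docs_section(comment_body: str, docs_section: Optional[str]) -> str:
    """Replace, append, or drop the Documentation section of a sticky comment.

    Assumption: the section spans from the anchor to the end of the body.
    Pass docs_section=None when the PR diff contains no Markdown file.
    """
    # Everything before the anchor belongs to the source-code section.
    head, _found, _old_region = comment_body.partition(ANCHOR)
    head = head.rstrip("\n")
    if docs_section is None:
        # Code-only PR: the section is suppressed entirely.
        return head + "\n" if head else ""
    new_region = f"{ANCHOR}\n{docs_section.strip()}\n"
    return f"{head}\n\n{new_region}" if head else new_region
```

Calling it twice with different section bodies leaves exactly one anchor in the comment, which is the property a sticky-comment upsert needs.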

Headline table — five columns

Column	Source	Why it’s in the headline
DMI (0–100)	Documentation Maintainability Index	Single overall maintainability score.
Words	LOC family (W)	Size sanity.
FKGL (English) or Tateishi RS (Japanese)	English readability / Tateishi RS	Most recognizable readability number.
Link Debt (0–1)	Link Debt	Objective defects.
Filler Risk (0–1)	Filler / Lazy Risk	AI-era flag for “big but vacuous”.

RCI, MCC, MRPC, WQS, Evidence Coverage, and Repository Grounding all matter but are second-glance signals. They live in the <details> drill-down. For Japanese-dominant docs, the third column header flips to Tateishi RS and the value uses the simplified formula. Mixed-language docs report the dominant-language score and tag the file with a 🌏 suffix.

Cell format — four canonical shapes

Shape	Example	Meaning
`new (main: old) indicator`	`74 (main: 71) 🟢`	Modified file: before/after + delta category.
`value 🆕`	`58 🆕`	New file: no `main` baseline exists.
`value ⚪`	`0 ⚪`	Deleted metric or undefined for this file type.
`— footnote-mark`	`— ²`	Suppressed by guard.

Fixed precision per column:

DMI, RCI: integer.
Words, sentence counts, diagram/table/link counts: integer with thousands separators.
Ratios and scores (0–1): 2 decimal places.
FKGL, Fog, ARI, Tateishi RS: 1 decimal place.
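A small sketch of how the four cell shapes and the fixed precision rules might combine. The function `format_cell` and the precision class names (`integer`, `ratio`, `grade`) are hypothetical labels for illustration, not names taken from mehen.

```python
# Precision classes from the list above; the keys are hypothetical names.
FORMATS = {
    "integer": lambda v: f"{round(v):,}",  # DMI, RCI, words, counts
    "ratio": lambda v: f"{v:.2f}",         # ratios and scores in 0-1
    "grade": lambda v: f"{v:.1f}",         # FKGL, Fog, ARI, Tateishi RS
}


def format_cell(kind, new, old=None, indicator="⚪", is_new_file=False, footnote=None):
    """Render one table cell in one of the four canonical shapes."""
    if footnote is not None:
        # Suppressed by guard: a dash plus the footnote mark.
        return f"— {footnote}"
    fmt = FORMATS[kind]
    if is_new_file:
        # New file: no `main` baseline exists.
        return f"{fmt(new)} 🆕"
    if old is None:
        # Deleted metric or undefined for this file type.
        return f"{fmt(new)} ⚪"
    # Modified file: new value, baseline, delta indicator.
    return f"{fmt(new)} (main: {fmt(old)}) {indicator}"
```

For example, `format_cell("integer", 1240, 1180, "⚪")` yields `1,240 (main: 1,180) ⚪`, matching the Words column in the reference mock.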

Delta indicators

Indicator	Rule
🟢 improvement	Delta crosses a band boundary in the “better” direction, OR `|delta| ≥ noticeable_threshold` in the “better” direction.
🔴 regression	Delta crosses a band boundary in the “worse” direction, OR `|delta| ≥ noticeable_threshold` in the “worse” direction.
⚠️ attention	Value is in a “warn” or worse band AND did not improve.
🆕 new	File is new in the PR.
⚪ unchanged	None of the above.

Per-metric noticeable thresholds:

Metric	Direction	`noticeable_threshold`
DMI	↑ better	3 points
RCI	informational	never emits 🟢 / 🔴 by itself
FKGL	profile target	0.5 grade
Tateishi RS	↑ better	2 points
Fog	profile target	0.5 grade
Link Debt	↓ better	0.05 OR any new broken link
Filler Risk	↓ better	0.05; ⚠️ when ≥ 0.60 regardless of delta
Evidence Coverage	↑ better	0.05
MCC	↓ better	5 points OR band crossing
MRPC	profile target	3 points AND profile-exceeded
Passive ratio	profile target	0.05 absolute
Long-sentence count	↓ better	any new instance is 🔴
Inclusive-language flags	↓ better	any new flag is 🔴
Repository Grounding	↑ better	0.05
Jukugo / kanji-run warnings (JA)	profile target	any new violation is 🔴

Word count never emits 🟢 or 🔴 — it is informational only.
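The generic indicator rules can be sketched as a single function. Several things here are assumptions rather than documented behaviour: the rules are evaluated in table order with 🆕 checked first, bands are passed in as integer indices where 0 is the best band, and the per-metric special cases (any new broken link, Filler Risk ≥ 0.60, profile targets, informational metrics) are left out.

```python
def delta_indicator(old, new, threshold, higher_is_better,
                    old_band, new_band, warn_band):
    """Classify a metric change using the generic indicator rules.

    Assumptions: band arguments are integer indices (0 = best band),
    warn_band is the index of the first "warn" band, and old is None
    for a file that is new in the PR.
    """
    if old is None:
        return "🆕"
    # gain > 0 means the value moved in the "better" direction.
    gain = (new - old) if higher_is_better else (old - new)
    if new_band < old_band or gain >= threshold:
        return "🟢"
    if new_band > old_band or -gain >= threshold:
        return "🔴"
    if new_band >= warn_band and gain <= 0:
        # In a "warn" or worse band and did not improve.
        return "⚠️"
    return "⚪"
```

With the values from the reference mock, DMI 71 → 74 (threshold 3, higher is better) classifies as 🟢 and Link Debt 0.15 → 0.22 (threshold 0.05, lower is better) as 🔴.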

Callout templates

The callout block tells reviewers what specifically to look at, with exact document locations. Every callout must come from the template catalog. No free-text generation is permitted. Callouts are ranked by severity class, then by magnitude within class. Default cap: 8 callouts; overflow goes into a <details> expander.
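The ranking and cap could look like the sketch below. The `Callout` record and its `magnitude` field are hypothetical, and sorting larger magnitudes first within a severity class is an assumption; the text above only says callouts are ranked by magnitude within class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Callout:
    severity: int     # 1 (objective defects) through 6 (new file summary)
    magnitude: float  # hypothetical per-rule size, e.g. a count or |delta|
    text: str         # already rendered from a catalog template


def rank_and_cap(callouts, cap=8):
    """Return (visible, overflow): ranked callouts split at the cap."""
    # Lower severity number first, then larger magnitude first (assumed).
    ranked = sorted(callouts, key=lambda c: (c.severity, -c.magnitude))
    return ranked[:cap], ranked[cap:]
```

Python's sort is stable, so callouts that tie on both keys keep their input order, which helps keep the output reproducible as long as the input order is itself deterministic.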

Severity 1 — objective defects

`rule_id`	Template
`broken_relative_link_added`	`🔴 {file} — {n} unresolved relative link(s) added: {s₁} ({L:N₁}){, s₂ (L:N₂)…}`
`broken_anchor_added`	`🔴 {file} — {n} unresolved internal anchor(s) added: …`
`broken_external_link_added`	`🔴 {file} — {n} broken external link(s) added (link-check enabled): …`
`diagram_parse_error_added`	`🔴 {file} — {lang} diagram parse error at {L:N}`
`inclusive_language_flag_added`	`🔴 {file} — {n} inclusive-language flag(s) added: …`
`nonword_added`	`🔴 {file} — non-word {s} at {L:N} (suggest: {replacement})`
`lexical_illusion_added`	`🔴 {file} — doubled word {s} at {L:N}`

Severity 2 — band crossings

`rule_id`	Template
`filler_risk_high`	`⚠️ {file} — filler/lazy risk {new} ({band}); top contributors: {label₁} {v₁}, {label₂} {v₂}, {label₃} {v₃}`
`dmi_band_drop`	`🔴 {file} — DMI {old} → {new}, crossed {old_band} → {new_band}`
`evidence_band_drop`	`🔴 {file} — evidence coverage {old} → {new}, crossed {old_band} → {new_band}`
`repo_grounding_band_drop`	`🔴 {file} — repository grounding {old} → {new}, crossed {old_band} → {new_band}`

Severity 3 — readability / wording

`rule_id`	Template
`long_sentences_added`	`🔴 {file} — {n} sentence(s) exceed {threshold} words (new): {L:N₁}{, L:N₂…}`
`readability_target_breach`	`🔴 {file} — {formula} {old} → {new}, above {profile} target {target}`
`tateishi_band_drop`	`🔴 {file} — Tateishi RS {old} → {new} (harder)`
`passive_ratio_breach`	`🔴 {file} — passive ratio {old} → {new}, above {profile} max {max}`
`heading_skip_added`	`🔴 {file} — heading skip {old_level} → {new_level} at {L:N}`
`table_burden_hard`	`⚠️ {file} — table at {L:N} has {cells} cells / {cols} columns / {rows} rows (hard warning)`

Severity 4 — artifact hygiene

`rule_id`	Template
`code_fence_unlabeled_added`	`⚠️ {file} — unlabelled code fence at {L:N}`
`diagram_missing_caption_added`	`⚠️ {file} — {lang} diagram at {L:N} has no caption or nearby explanation`
`image_missing_alt_added`	`⚠️ {file} — image {s} at {L:N} has no alt text`
`artifact_unexplained_added`	`⚠️ {file} — {artifact_type} at {L:N} has no explanatory prose within ±2 blocks`

Severity 5 — improvements

`rule_id`	Template
`dmi_band_improve`	`🟢 {file} — DMI {old} → {new}, crossed {old_band} → {new_band}`
`filler_risk_band_improve`	`🟢 {file} — filler/lazy risk {old} → {new}, crossed {old_band} → {new_band}`
`broken_links_resolved`	`🟢 {file} — {n} previously broken link(s) resolved`
`long_sentences_resolved`	`🟢 {file} — {n} sentence(s) previously over {threshold} words now under`
`readability_target_recovered`	`🟢 {file} — {formula} {old} → {new}, now within {profile} target {target}`

Severity 6 — new file summary

`rule_id`	Template
`new_file_summary`	`🆕 {file} — {words} words, {headings} headings, {code_fences} code fence(s), {diagrams} diagram(s), {tables} table(s); DMI {dmi}, filler risk {filler} ({band})`

Permitted and forbidden language

The callout grammar is deliberately thin. Permitted verbs and connectors: added, resolved, exceed, crossed, above, below, has, missing, unresolved, broken, previously, now, within, no caption, no alt text, plus the symbols → ; , ( and ). Forbidden: after, because, due to, caused by, following, since, likely, probably, appears to, seems, may indicate, suggests, possibly, and any other wording that implies causation or intent about the author’s edits. This rule is the hard line between “CI metrics report” and “automated review feedback that over-reaches”.
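One way to enforce the forbidden list is a check over the template catalog, sketched below. The function name `forbidden_phrases` is hypothetical. The check runs on template text with placeholders removed, on the assumption that substituted values such as file paths should not trigger it.

```python
import re
from typing import List

FORBIDDEN = [
    "after", "because", "due to", "caused by", "following", "since",
    "likely", "probably", "appears to", "seems", "may indicate",
    "suggests", "possibly",
]

# Whole-word, case-insensitive match on any forbidden phrase.
_FORBIDDEN_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in FORBIDDEN) + r")\b",
    re.IGNORECASE,
)
_PLACEHOLDER_RE = re.compile(r"\{[^}]*\}")


def forbidden_phrases(template: str) -> List[str]:
    """Return the forbidden phrases found in a callout template, in order."""
    # Drop {placeholders} so field names are not scanned as prose.
    prose = _PLACEHOLDER_RE.sub("", template)
    return [m.group(0).lower() for m in _FORBIDDEN_RE.finditer(prose)]
```

A catalog test could assert that `forbidden_phrases` returns an empty list for every template, so a wording change that introduces causal language fails before it ships.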

Drill-down tables

Below the callouts, a collapsed <details> block holds deeper tables for reviewers who want them.

```markdown
<details>
<summary>Full metric breakdown (structural · wording · lexical · readability)</summary>
```

Inside, four tables in this order:

Structural / review — RCI, MCC, MRPC, Evidence Coverage, Repository Grounding.
English wording quality (suppressed if no English file) — WQS, passive %, hedges/100w, long-sentence count, nominalization density.
English lexical & readability ensemble — MATTR₅₀, hapax ratio, Fog, SMOG, ARI, Coleman-Liau.
Japanese composition & register (suppressed if no Japanese file) — kanji %, hiragana %, katakana %, avg sentence chars, comma/period ratio, politeness dominant.

Reference mock

Canonical shape for a PR that modifies one README.md, adds one architecture doc, regresses one API reference, leaves one generated file unchanged but on-alert, and touches the changelog:

```markdown
<!-- mehen-docs -->
## Documentation Metrics (this PR vs `main`)

| File | DMI | Words | FKGL | Link Debt | Filler Risk |
|---|---:|---:|---:|---:|---:|
| [README.md](…) | 74 (main: 71) 🟢 | 1,240 (main: 1,180) ⚪ | 9.4 (main: 10.1) 🟢 | 0.08 (main: 0.12) 🟢 | 0.15 (main: 0.18) 🟢 |
| [docs/architecture/runtime.md](…) | 58 🆕 | 2,840 🆕 | 11.8 🆕 | 0.04 🆕 | 0.09 🆕 |
| [docs/api/auth.md](…) | 62 (main: 68) 🔴 | 1,670 (main: 1,540) ⚪ | 12.1 (main: 11.6) 🔴 | 0.22 (main: 0.15) 🔴 | 0.14 (main: 0.12) ⚪ |

**Callouts**

- 🔴 **docs/api/auth.md** — 2 unresolved relative link(s) added: `../../guide/sessions.md` (L47), `./tokens.md#refresh` (L112)
- 🔴 **docs/api/auth.md** — 3 sentence(s) exceed 35 words (new): L83, L104, L156
- 🔴 **docs/api/auth.md** — FKGL 11.6 → 12.1, above API-reference target 12.0
- ⚠️ **docs/architecture/runtime.md** — mermaid diagram at L171 has no caption or nearby explanation
- 🟢 **README.md** — DMI 71 → 74 (within "Good"); FKGL 10.1 → 9.4 (≥ 0.5 grade improvement)

> Legend: 🟢 improvement · 🔴 regression · ⚠️ attention · 🆕 new file · ⚪ no material change

> Generated by [mehen](https://github.com/ophi-dev/mehen) — the code quality watcher.
```

What is deliberately NOT in scope

No causal explanations. The report never says “after X”, “because of Y”, “due to Z”.
No author-intent inference. Never speculates about what the author “meant” or “should have done”.
No LLM summaries. Not now, not behind a flag, not as a plugin.
No trend lines or history. Per-metric trendlines deserve a separate design pass.
No suggested edits. Belongs in a separate mehen doc lint command.
No scoring gates by default. The PR comment is advisory. Gating requires explicit --fail-on.

`--fail-on` gating

The PR comment is advisory by default. To turn documentation regressions into a CI failure, pass --fail-on to mehen diff with one or more rule IDs: dmi-drop, new-broken-link, filler-high, or all. The command then exits non-zero (code 2) when any selected rule fires. These rule IDs align with the severity-1 / severity-2 callouts above. See Concepts → Thresholds and diffs for the source-code threshold path.
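The gating decision can be sketched as a pure function. This is illustrative only: the function name `exit_code` is hypothetical, the mapping from callout `rule_id` values to --fail-on rule IDs is not specified here and is assumed to happen upstream, and returning 0 when nothing selected has fired is an assumption.

```python
FAIL_ON_RULES = ("dmi-drop", "new-broken-link", "filler-high")


def exit_code(fail_on, fired):
    """Return 2 if any selected --fail-on rule fired, else 0.

    fail_on: rule IDs passed on the command line (may include "all").
    fired:   --fail-on rule IDs that fired for this PR (assumed to be
             derived from the callouts elsewhere).
    """
    selected = set(FAIL_ON_RULES) if "all" in fail_on else set(fail_on)
    unknown = selected - set(FAIL_ON_RULES)
    if unknown:
        raise ValueError(f"unknown --fail-on rule(s): {sorted(unknown)}")
    return 2 if selected & set(fired) else 0
```

With an empty `fail_on` the function always returns 0, which mirrors the advisory default.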
