This metric addresses a documentation problem common in the AI era: a document that is mostly filler, with lazy structure, no references, and a large word count but little useful content.
This is not AI-authorship detection. It reports structural evidence — unanchored prose, low artifact density, weak repository grounding, lazy sectioning, repetition, specificity scarcity, hollow references, and placeholder density — without making any claim about how the text was written.

Sub-scores (17.1–17.8)

| Sub-score | What it captures |
| --- | --- |
| UnanchoredProseMass | Fraction of words living in sections with no evidence anchors. |
| LowArtifactDensity | 1 − sat(A / (W/800); 0.5, 2.0): too few code, tables, diagrams. |
| LowRepoGrounding | 1 − RepositoryGroundingScore. |
| LazySectioning | Heading density, large-section rate, “shallow big doc” flag (W > 2,500 AND max heading depth ≤ 2). |
| RepetitionDensity | Token-shingle Jaccard > 0.82 detects near-duplicate paragraphs. |
| SpecificityScarcity | Identifiers + paths + version tokens + inline code tokens relative to W. |
| ReferenceHollowness | Bibliography entries without verifiable DOI/arXiv/RFC/URL anchors. |
| PlaceholderDensity | TODO/TBD/FIXME/XXX/lorem and empty links per 1,000 words. |
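
Two of the rows above are concrete enough to sketch in Python. This is a minimal illustration, not the tool's implementation: the shingle size `k = 3`, the regex tokenization, and all function names are assumptions made here. Only the 0.82 Jaccard threshold and the `W > 2,500 AND max heading depth ≤ 2` rule come from the table.

```python
import re


def shingles(text, k=3):
    """Return the set of k-token shingles for one paragraph.

    k=3 and the \\w+ tokenization are assumptions; the source does not state them.
    """
    tokens = re.findall(r"\w+", text.lower())
    if len(tokens) < k:
        # Short paragraphs yield a single shingle (or none if empty).
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined here as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def near_duplicate_pairs(paragraphs, threshold=0.82, k=3):
    """Return index pairs (i, j), i < j, whose shingle Jaccard exceeds the threshold."""
    sets = [shingles(p, k) for p in paragraphs]
    return [
        (i, j)
        for i in range(len(sets))
        for j in range(i + 1, len(sets))
        if jaccard(sets[i], sets[j]) > threshold
    ]


def is_shallow_big_doc(word_count, max_heading_depth):
    """The "shallow big doc" flag: W > 2,500 AND max heading depth <= 2."""
    return word_count > 2500 and max_heading_depth <= 2


paragraphs = [
    "Install the tool and run it on the docs folder.",
    "Install the tool and run it on the docs folder.",
    "Weights sum to one across all eight sub-scores.",
]
print(near_duplicate_pairs(paragraphs))  # [(0, 1)]
print(is_shallow_big_doc(3420, 2))       # True
```

The pairwise comparison is quadratic in the number of paragraphs, which is acceptable for a single document; how near-duplicate pairs are turned into the final RepetitionDensity value is not specified in the table and is left out of the sketch.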

Formula

FillerLazyRisk = clamp01(
    0.20 · UnanchoredProseMass
  + 0.15 · LowArtifactDensity
  + 0.20 · LowRepoGrounding
  + 0.15 · LazySectioning
  + 0.12 · RepetitionDensity
  + 0.12 · SpecificityScarcity
  + 0.04 · ReferenceHollowness
  + 0.02 · PlaceholderDensity
)
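
The weighted sum can be written out directly. The sketch below assumes each sub-score has already been computed and normalized to the range 0 to 1; the names `WEIGHTS`, `clamp01`, and `filler_lazy_risk`, and the choice to raise on a missing sub-score, are illustrative assumptions rather than the tool's API.

```python
# Weights copied from the formula above; they sum to 1.00.
WEIGHTS = {
    "UnanchoredProseMass": 0.20,
    "LowArtifactDensity": 0.15,
    "LowRepoGrounding": 0.20,
    "LazySectioning": 0.15,
    "RepetitionDensity": 0.12,
    "SpecificityScarcity": 0.12,
    "ReferenceHollowness": 0.04,
    "PlaceholderDensity": 0.02,
}


def clamp01(x):
    """Clamp a value into the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def filler_lazy_risk(sub_scores):
    """Weighted sum of the eight sub-scores, clamped to [0, 1].

    Raising on a missing sub-score is an assumption; the source does not
    say how absent inputs are handled.
    """
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise KeyError(f"missing sub-scores: {sorted(missing)}")
    return clamp01(sum(w * sub_scores[name] for name, w in WEIGHTS.items()))


# Hypothetical inputs, chosen only to exercise the function.
only_unanchored = {name: 0.0 for name in WEIGHTS}
only_unanchored["UnanchoredProseMass"] = 1.0
print(filler_lazy_risk(only_unanchored))  # 0.2
```

Because the weights sum to 1.00 and each sub-score is assumed to lie in [0, 1], the clamp only matters when an upstream sub-score falls outside that range.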

Bands

| Score | Band |
| --- | --- |
| 0.00 – 0.20 | Low |
| 0.21 – 0.40 | Mild |
| 0.41 – 0.60 | Review |
| 0.61 – 0.80 | High |
| 0.81 – 1.00 | Severe |
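
A lookup for these bands might look like the following sketch. The bands are listed at two-decimal precision, so the code assumes the score is rounded to two decimals before lookup; that rounding step, and the function names `band` and `header_line`, are assumptions. The header format mirrors the first line of the example output on this page.

```python
def band(score):
    """Map a FillerLazyRisk score to its band name.

    Assumes rounding to two decimals so values between listed bands
    (for example 0.204) land in a defined band.
    """
    s = round(score, 2)
    if not 0.0 <= s <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if s <= 0.20:
        return "Low"
    if s <= 0.40:
        return "Mild"
    if s <= 0.60:
        return "Review"
    if s <= 0.80:
        return "High"
    return "Severe"


def header_line(score):
    """Render the report header with the band name in upper case."""
    return f"Filler / Lazy Structure Risk: {score:.2f} {band(score).upper()}"


print(header_line(0.73))  # Filler / Lazy Structure Risk: 0.73 HIGH
```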

Diagnostic labels

High scores attach stable string labels reviewers can act on:
  • large-unanchored-prose
  • low-repository-grounding
  • lazy-sectioning
  • low-artifact-density
  • near-duplicate-paragraphs
  • specificity-scarcity
  • hollow-references
  • placeholder-heavy
The PR comment quotes these labels verbatim instead of paraphrasing.
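
One way to attach these labels is a fixed mapping from sub-score to label string. In the sketch below, the pairing of each label with a sub-score is inferred from the names, and the `threshold` of 0.6 is purely hypothetical: this page does not state the condition under which a label is attached.

```python
# Pairing inferred from the names; label strings copied verbatim from the list above.
LABELS = {
    "UnanchoredProseMass": "large-unanchored-prose",
    "LowRepoGrounding": "low-repository-grounding",
    "LazySectioning": "lazy-sectioning",
    "LowArtifactDensity": "low-artifact-density",
    "RepetitionDensity": "near-duplicate-paragraphs",
    "SpecificityScarcity": "specificity-scarcity",
    "ReferenceHollowness": "hollow-references",
    "PlaceholderDensity": "placeholder-heavy",
}


def diagnostic_labels(sub_scores, threshold=0.6):
    """Return labels for sub-scores at or above a hypothetical threshold.

    Labels come back in the order of the list above, so output is stable
    across runs for the same inputs.
    """
    return [
        label
        for name, label in LABELS.items()
        if sub_scores.get(name, 0.0) >= threshold
    ]


# Hypothetical sub-score values for illustration only.
example = {"UnanchoredProseMass": 0.71, "LazySectioning": 0.9, "RepetitionDensity": 0.1}
print(diagnostic_labels(example))  # ['large-unanchored-prose', 'lazy-sectioning']
```

Keeping the strings in one table means a PR comment can quote them verbatim without any formatting step that might alter them.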

Example output

Filler / Lazy Structure Risk: 0.73 HIGH

Top contributors:
  - 71% of prose is in sections without evidence anchors
  - 3,420 words, only 1 relative link and 0 code examples
  - max heading depth = 2 with 4 sections > 1,200 words
  - specificity density = 1.8% (threshold: 3%-15%)

References

  • Pirolli, P. & Card, S. (1999). Information Foraging. Psychological Review 106(4): 643–675 — motivates the evidence-anchor and specificity-scarcity sub-scores. DOI.
  • Halliday, M. A. K. (1985). Spoken and Written Language. Oxford University Press — lexical-density basis used by SpecificityScarcity.
  • Manning, C. D., Raghavan, P. & Schütze, H. (2008). Introduction to Information Retrieval, ch. 6. Cambridge University Press — Jaccard / token-shingle methods used by RepetitionDensity. Stanford online edition.

See also