mehen emits every formula’s raw score with provenance rather than averaging. Two formulas on the same text routinely disagree by 2–4 grade levels because they target different comprehension thresholds (SMOG ~100%, FKGL ~75%, Dale-Chall in between). Averaging them is statistically wrong.
Formulas
| Formula | Needs syllable counts | Formula and notes |
|---|---|---|
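| Flesch Reading Ease | yes | 206.835 − 1.015·ASL − 84.6·ASW, with ASL = words per sentence and ASW = syllables per word. Higher = easier. |
| Flesch-Kincaid Grade | yes | 0.39·ASL + 11.8·ASW − 15.59. MIL-M-38784A standard. |
| Gunning Fog | yes | 0.4·(ASL + 100·P_complex), with P_complex = share of words with three or more syllables. Target grade 7–12 for business writing. |
| SMOG | yes | 1.0430·sqrt(poly·30/sentences) + 3.1291, with poly = count of words with three or more syllables. null below 30 sentences. |
| ARI | no | 4.71·CPW + 0.5·ASL − 21.43, with CPW = characters per word. Syllable-free. |
| Coleman-Liau | no | 0.0588·L − 0.296·S − 15.8, with L = letters per 100 words and S = sentences per 100 words. Syllable-free. |
| New Dale-Chall | no | 0.1579·PDW + 0.0496·ASL (+ 3.6365 if PDW > 5%), with PDW = percentage of words not on the familiar-word list. |
| FORCAST | counts 1-syllable | 20 − (N/10), with N = one-syllable words in a 150-word sample. Non-narrative text. |
| LIX | no | ASL + 100·(long_words/words), where long words have more than 6 letters. |
| RIX | no | long_words / sentences, with the same long-word definition as LIX. |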
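
As a rough illustration of the table above, the sketch below evaluates four of the formulas from precomputed ratios. The function names and argument layout are assumptions made for this example and are not mehen's API. ASL is words per sentence, ASW is syllables per word, and CPW is characters per word.

```python
import math
from typing import Optional


def flesch_reading_ease(asl: float, asw: float) -> float:
    """Flesch Reading Ease: higher means easier."""
    return 206.835 - 1.015 * asl - 84.6 * asw


def flesch_kincaid_grade(asl: float, asw: float) -> float:
    """Flesch-Kincaid Grade Level."""
    return 0.39 * asl + 11.8 * asw - 15.59


def ari(cpw: float, asl: float) -> float:
    """Automated Readability Index (needs no syllable counts)."""
    return 4.71 * cpw + 0.5 * asl - 21.43


def smog(polysyllables: int, sentences: int) -> Optional[float]:
    """SMOG grade, or None when the sample has fewer than 30 sentences."""
    if sentences < 30:
        return None
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291


# Example: 600 words, 30 sentences, 900 syllables, 3000 characters.
asl, asw, cpw = 600 / 30, 900 / 600, 3000 / 600
print(round(flesch_kincaid_grade(asl, asw), 2))  # 9.91
print(round(ari(cpw, asl), 2))                   # 12.12
print(smog(12, 29))                              # None
```

On this sample FKGL and ARI already differ by more than two grade levels, which is the kind of disagreement that makes a single averaged score misleading.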
Ensemble reporting
- Emit every formula with provenance.
- Compute an ensemble grade band as `[min(FKGL, Fog, ARI, CLI), max(…)]`, the interval spanned by those four “running-prose” formulas.
- Emit FORCAST separately as the preferred single score for non-narrative docs.
- Suppress SMOG when `sentences < 30`.
- Report Dale-Chall only with an explicit `list:` provenance tag (NGSL 1.2 by default; Browne et al., 2013).
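
The reporting rules above can be sketched as a small function. The dictionary keys, the `ensemble_report` name, and the report layout are hypothetical and chosen only for this example; the Dale-Chall provenance tag is left out to keep the sketch short.

```python
from typing import Dict, Optional

# The four "running-prose" formulas that define the grade band.
RUNNING_PROSE = ("fkgl", "fog", "ari", "cli")


def ensemble_report(scores: Dict[str, float], sentences: int) -> dict:
    """Build a report with a [min, max] grade band instead of an average."""
    emitted: Dict[str, Optional[float]] = dict(scores)  # copy, keep every raw score
    if sentences < 30:
        emitted["smog"] = None  # SMOG is suppressed on short samples
    band = [scores[name] for name in RUNNING_PROSE]
    return {
        "scores": emitted,
        "grade_band": (min(band), max(band)),
        "forcast": scores.get("forcast"),  # reported on its own
    }


report = ensemble_report(
    {"fkgl": 9.9, "fog": 12.4, "ari": 12.1, "cli": 10.8, "smog": 11.0, "forcast": 10.2},
    sentences=12,
)
print(report["grade_band"])      # (9.9, 12.4)
print(report["scores"]["smog"])  # None
```

Reporting the band keeps the disagreement between formulas visible rather than hiding it behind one number.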
Syllable counting
The Tier 0 default is a vowel-group heuristic (~85% agreement with CMU on open-domain text). Behind `--features syllables-cmu`, mehen links the CMU Pronouncing Dictionary for exact counts on ~134k words, with the heuristic as an OOV fallback.
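
For readers unfamiliar with vowel-group counting, here is one plausible version of such a heuristic. It is an illustrative sketch, not mehen's implementation: the silent-e rule and the `count_syllables` name are assumptions, and the ~85% agreement figure above does not apply to this code.

```python
import re

# A run of consecutive vowels (including "y") counts as one syllable nucleus.
VOWEL_GROUP = re.compile(r"[aeiouy]+")


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups, with a silent-e adjustment."""
    w = re.sub(r"[^a-z]", "", word.lower())
    if not w:
        return 0
    count = len(VOWEL_GROUP.findall(w))
    # Treat a trailing "e" as silent ("make"), but keep "-le" endings ("table").
    if w.endswith("e") and not w.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


print(count_syllables("readability"))  # 5
print(count_syllables("make"))         # 1
print(count_syllables("table"))        # 2
```

A heuristic like this miscounts words such as "rhythm", which is why a dictionary lookup with the heuristic as fallback gives better results.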
Sentence segmentation
UAX #29 (unicode-segmentation) plus:
- A bundled ~150-entry English abbreviation list (`Mr.`, `e.g.`, `i.e.`, `U.S.`, `v1.2.3`).
- No split when the period is followed by a lowercase letter, a digit, or `<space><digit>`.
- Markdown block boundaries (blank line, heading, fence open/close, list item start) are hard terminators regardless of punctuation.
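
The abbreviation and no-split rules can be sketched as a filter over period positions. This is a simplified stand-in under stated assumptions: it scans for `.` directly instead of using UAX #29 boundaries, the four-entry `ABBREVIATIONS` set replaces the bundled list, and Markdown block boundaries are not handled.

```python
# Tiny stand-in for the bundled abbreviation list (hypothetical contents).
ABBREVIATIONS = {"mr.", "e.g.", "i.e.", "u.s."}


def is_sentence_break(text: str, i: int) -> bool:
    """Decide whether the period at text[i] ends a sentence."""
    # Find the whitespace-delimited chunk that contains this period.
    start, end = i, i + 1
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    while end < len(text) and not text[end].isspace():
        end += 1
    chunk = text[start:end].rstrip(",;:)").lower()
    if chunk in ABBREVIATIONS:
        return False
    rest = text[i + 1:]
    if not rest:
        return True
    # No split before a lowercase letter or a digit, or before <space><digit>.
    if rest[0].islower() or rest[0].isdigit():
        return False
    if len(rest) >= 2 and rest[0] == " " and rest[1].isdigit():
        return False
    return True


def split_sentences(text: str) -> list:
    """Split text at the periods that pass is_sentence_break."""
    sentences, start = [], 0
    for i, ch in enumerate(text):
        if ch == "." and is_sentence_break(text, i):
            sentences.append(text[start:i + 1].strip())
            start = i + 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


print(split_sentences("Mr. Smith pinned v1.2.3 today. He upgraded later."))
# ['Mr. Smith pinned v1.2.3 today.', 'He upgraded later.']
```

Over-splitting inflates the sentence count and lowers ASL, so guards like these directly affect every formula that uses sentence length.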
Doc-type thresholds
| Doc type | FKGL | Fog | Passive max | Max sentence words |
|---|---|---|---|---|
| README / overview | ≤ 10 | ≤ 12 | 15 % | 30 |
| Tutorial | ≤ 9 | ≤ 11 | 10 % | 25 |
| API reference | ≤ 12 | ≤ 14 | 20 % | 35 |
| ADR / design | ≤ 12 | ≤ 14 | 25 % | 40 |
| Error messages | ≤ 7 | ≤ 9 | 5 % | 15 |
| Release notes | ≤ 11 | ≤ 13 | 15 % | 30 |
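
One way to apply the table is a lookup keyed by doc type. The values below are copied from the table; the key names, the `Thresholds` class, and the `violations` function are hypothetical and only illustrate the check.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Thresholds:
    fkgl_max: float
    fog_max: float
    passive_max_pct: float
    max_sentence_words: int


# Values taken from the doc-type table above.
THRESHOLDS = {
    "readme": Thresholds(10, 12, 15, 30),
    "tutorial": Thresholds(9, 11, 10, 25),
    "api_reference": Thresholds(12, 14, 20, 35),
    "adr": Thresholds(12, 14, 25, 40),
    "error_messages": Thresholds(7, 9, 5, 15),
    "release_notes": Thresholds(11, 13, 15, 30),
}


def violations(doc_type: str, fkgl: float, fog: float,
               passive_pct: float, longest_sentence_words: int) -> List[str]:
    """Return the names of the limits that the measured values exceed."""
    t = THRESHOLDS[doc_type]
    failed = []
    if fkgl > t.fkgl_max:
        failed.append("fkgl")
    if fog > t.fog_max:
        failed.append("fog")
    if passive_pct > t.passive_max_pct:
        failed.append("passive")
    if longest_sentence_words > t.max_sentence_words:
        failed.append("max_sentence_words")
    return failed


print(violations("tutorial", fkgl=9.9, fog=10.5, passive_pct=8.0, longest_sentence_words=31))
# ['fkgl', 'max_sentence_words']
```

The same measurements pass as an ADR, which shows why the limits are tied to doc type rather than applied globally.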
References
- Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology 32(3): 221–233.
- Kincaid, J. P., Fishburne, R. P., Rogers, R. L. & Chissom, B. S. (1975). Derivation of new readability formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) for Navy enlisted personnel. Research Branch Report 8-75, Naval Technical Training Command.
- McLaughlin, G. H. (1969). SMOG grading: a new readability formula. Journal of Reading 12(8): 639–646.
- Gunning, R. (1952). The Technique of Clear Writing. McGraw-Hill.
- Coleman, M. & Liau, T. L. (1975). A computer readability formula designed for machine scoring. Journal of Applied Psychology 60(2): 283–284.
- Senter, R. J. & Smith, E. A. (1967). Automated Readability Index. AMRL-TR-66-220.
- Chall, J. S. & Dale, E. (1995). Readability Revisited: The New Dale-Chall Readability Formula. Brookline Books.
- Caylor, J. S. & Sticht, T. G. (1973). Development of a Simple Readability Index for Job Reading Material. HumRRO Professional Paper 1-73 (FORCAST).
- Anderson, J. (1983). Lix and Rix: Variations on a little-known readability index. Journal of Reading 26(6): 490–496.
See also
- Lexical diversity — formula-independent vocabulary measures.
- Wording quality — passive voice, hedges, wordy phrases.