Add Diacritics to Text
Add diacritical marks to plain text: accents, umlauts, cedillas, carons, tildes, and more. Supports 15 diacritic types and language presets.
About
Stripping diacritics during data migration or ASCII normalization is common; restoring them is not. Missing or incorrect diacritical marks change meaning: résumé becomes resume, café becomes cafe, and the Polish łódź becomes the unrecognizable lodz. This tool maps each base Latin character (a-z, A-Z) to its diacritical variant using a dictionary of over 200 Unicode code points across 15 diacritic categories. You select a diacritic type or a language preset, and the tool applies the transformation character by character. Characters without a known diacritical form for the selected type pass through unchanged. The output is standard Unicode text, encoded as UTF-8, that is safe for HTML, databases, and print. Note that the tool applies diacritics uniformly or by preset rules: it does not perform linguistic analysis and cannot determine which marks are contextually correct within a sentence.
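The character-by-character behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the names `DIACRITIC_TABLE` and `add_diacritics` are hypothetical, and the table holds only two of the 15 categories (ring and double acute, with the letters listed for them in the reference table).

```python
# Hypothetical, truncated lookup table: diacritic type -> {base letter: variant}.
# Only two categories are shown; the real dictionary is described as having 200+ entries.
DIACRITIC_TABLE = {
    "ring": {"a": "å", "u": "ů", "A": "Å", "U": "Ů"},
    "double_acute": {"o": "ő", "u": "ű", "O": "Ő", "U": "Ű"},
}


def add_diacritics(text: str, diacritic: str) -> str:
    """Apply one diacritic type to every character that has a known variant."""
    mapping = DIACRITIC_TABLE[diacritic]
    # dict.get(ch, ch) returns the variant if one exists, otherwise the
    # original character, so digits, punctuation, and unsupported letters
    # pass through unchanged.
    return "".join(mapping.get(ch, ch) for ch in text)


print(add_diacritics("Hello, you!", "double_acute"))  # Hellő, yőű!
print(add_diacritics("xyz 123", "ring"))              # xyz 123
```

The first call also shows the limitation noted above: every o and u is converted, whether or not the result is a real word.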
Formulas
The transformation applies a deterministic character-level mapping function:
$$\text{output}_i = f(\text{input}_i, D)$$
where $\text{input}_i$ is the $i$-th character of the source string and $D$ is the selected diacritic type. The mapping function is defined as:
$$f(c, D) = \begin{cases} \text{dict}[D][c] & \text{if } c \in \text{dict}[D] \\ c & \text{otherwise} \end{cases}$$
where $\text{dict}$ is the complete lookup table containing over 200 mappings across 15 diacritic categories. For language presets, a composite mapping is used. Each preset defines a set of character-specific diacritic rules $R_{\text{lang}} = \{(c_1, D_1), (c_2, D_2), \ldots\}$ that maps each character $c$ to its language-appropriate diacritic $D$. The total character space is the Latin alphabet: $|A| = 52$ (uppercase + lowercase).
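A language preset can be read as a per-character choice of diacritic type followed by the same dictionary lookup. The sketch below assumes that reading. `LOOKUP`, `PRESETS`, `apply_preset`, and the two preset names are hypothetical, and the rules are illustrative rather than the tool's real preset definitions.

```python
# Hypothetical, truncated lookup table: diacritic type -> {base letter: variant}.
LOOKUP = {
    "umlaut": {"a": "ä", "o": "ö", "u": "ü", "A": "Ä", "O": "Ö", "U": "Ü"},
    "ring": {"a": "å", "A": "Å"},
}

# Hypothetical presets: each entry is one (c, D) rule, keyed by the lowercase base letter.
PRESETS = {
    "german_like": {"a": "umlaut", "o": "umlaut", "u": "umlaut"},
    "swedish_like": {"a": "ring", "o": "umlaut"},
}


def apply_preset(text: str, lang: str) -> str:
    """Apply the composite mapping for one preset, character by character."""
    rules = PRESETS[lang]
    out = []
    for ch in text:
        diacritic = rules.get(ch.lower())  # which D applies to this character, if any
        if diacritic is None:
            out.append(ch)  # no rule for this character: pass through
        else:
            out.append(LOOKUP[diacritic].get(ch, ch))
    return "".join(out)


print(apply_preset("Uber Zurich", "german_like"))  # Über Zürich
print(apply_preset("Hamburg", "german_like"))      # Hämbürg
print(apply_preset("Malmo", "swedish_like"))       # Målmö
```

The second and third calls produce misspellings (the correct forms are Hamburg and Malmö) because the rules are applied to every matching letter without linguistic analysis.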
Reference Data
| Diacritic Name | Symbol Example | Unicode Range | Languages | Affected Letters |
|---|---|---|---|---|
| Acute (´) | á é ó | U+00C1 - U+01FF | French, Spanish, Portuguese, Hungarian, Czech, Polish | a, c, e, i, l, n, o, r, s, u, y, z |
| Grave (`) | à è ù | U+00C0 - U+01F9 | French, Italian, Portuguese, Vietnamese | a, e, i, o, u |
| Circumflex (^) | â ê ô | U+00C2 - U+0176 | French, Portuguese, Romanian, Welsh, Vietnamese | a, c, e, g, h, i, j, o, s, u, w, y |
| Tilde (~) | ã ñ õ | U+00C3 - U+0169 | Spanish, Portuguese, Estonian, Vietnamese | a, e, i, n, o, u, v, y |
| Umlaut / Diaeresis (¨) | ä ö ü | U+00C4 - U+0178 | German, Turkish, Finnish, Swedish, Hungarian | a, e, i, o, u, y |
| Cedilla (¸) | ç ş ţ | U+00C7 - U+0163 | French, Portuguese, Turkish, Romanian | c, d, e, g, h, k, l, n, r, s, t |
| Ring (˚) | å ů | U+00C5 - U+016F | Swedish, Danish, Norwegian, Czech | a, u |
| Caron / Háček (ˇ) | č š ž ř | U+010C - U+017E | Czech, Slovak, Slovenian, Croatian, Lithuanian | a, c, d, e, g, h, i, j, k, l, n, o, r, s, t, u, z |
| Macron (¯) | ā ē ī ō ū | U+0100 - U+0233 | Latvian, Lithuanian, Maori, Japanese Romaji, Hawaiian | a, e, g, i, o, u, y |
| Breve (˘) | ă ĕ ğ | U+0102 - U+016D | Romanian, Turkish, Vietnamese, Esperanto | a, e, g, i, o, u |
| Ogonek (˛) | ą ę į ų | U+0104 - U+0173 | Polish, Lithuanian, Navajo | a, e, i, o, u |
| Dot Above (˙) | ċ ė ġ İ ż | U+010A - U+017C | Polish, Lithuanian, Turkish, Maltese | a, b, c, d, e, f, g, h, i, m, n, o, p, r, s, t, w, x, y, z |
| Stroke (Đ/đ) | đ ħ ł ø ŧ | U+00D0 - U+0167 | Polish, Danish, Norwegian, Vietnamese, Croatian | d, h, l, o, t |
| Double Acute (˝) | ő ű | U+0150 - U+0171 | Hungarian | o, u |
| Horn (ơ/ư) | ơ ư | U+01A0 - U+01B0 | Vietnamese | o, u |
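One way to build or cross-check entries like those in the table is to combine a base letter with a Unicode combining mark and let NFC normalization produce the precomposed character. The sketch below uses Python's standard `unicodedata` module. The `COMBINING` table and the `compose` helper are hypothetical names, and nothing in the source says the tool builds its dictionary this way.

```python
import unicodedata

# Combining marks for a few of the categories in the table (standard Unicode code points).
COMBINING = {
    "acute": "\u0301",
    "caron": "\u030C",
    "ogonek": "\u0328",
    "horn": "\u031B",
}


def compose(base: str, diacritic: str):
    """Return the precomposed character, or None if Unicode defines none."""
    composed = unicodedata.normalize("NFC", base + COMBINING[diacritic])
    # If NFC cannot merge the pair, the result stays two code points long.
    return composed if len(composed) == 1 else None


for base, kind in [("e", "acute"), ("z", "caron"), ("a", "ogonek"),
                   ("o", "horn"), ("q", "caron")]:
    ch = compose(base, kind)
    label = f"{ch} (U+{ord(ch):04X})" if ch else "no precomposed form"
    print(f"{base} + {kind}: {label}")
```

Stroke letters such as đ, ł, and ø have no canonical decomposition in Unicode, so they cannot be derived this way and would need explicit dictionary entries.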