About

Greek uppercasing is not a trivial toUpperCase call. The Unicode standard and the Greek language impose specific orthographic rules that most software gets wrong. When a lowercase accented vowel such as α with tonos (ά, U+03AC) is converted to uppercase, the accent must be removed entirely - text set in all capitals does not carry a tonos in modern monotonic orthography. This tool implements those rules correctly. It also handles the critical diphthong case: when a sequence like αι with a tonos on the first vowel is uppercased, the second vowel must receive a dialytika (e.g., άι → ΑΪ) to preserve the pronunciation distinction. Mozilla Bug #307039 documented this browser-level failure in 2005. Most browsers still produce incorrect results for CSS text-transform: uppercase on Greek text.

This converter processes final sigma (ς → Σ), iota subscript (ypogegrammeni) promotion, and both composed and decomposed Unicode forms. It does not handle polytonic (ancient) Greek with multiple diacritics - that requires a separate normalization pipeline. Input is limited to 100,000 characters. Results match the behavior specified in the Unicode Common Locale Data Repository (CLDR) Greek casing rules.

Formulas

The conversion follows a two-pass algorithm. First pass scans for diphthong sequences; second pass converts remaining characters individually.

Pass 1 - Diphthong scan: For each position i in input string S, check whether S[i] + S[i + 1] ∈ D, where D is the set of Greek diphthong pairs (accented vowel + ι/υ). If a match is found, replace the pair with upper(S[i]) + dialytika(S[i + 1]) and advance i by 2; otherwise advance i by 1.

Pass 2 - Single character mapping: For each remaining character c, if c ∈ M (accent map), replace it with M[c]. Otherwise apply native toUpperCase(c).

Where: S = input string (NFC-normalized), D = diphthong lookup table (14 entries), M = single-character accent map (70+ entries), upper(c) = the uppercase base letter of c with the tonos removed (e.g., ά → Α), dialytika(c) = function that adds dialytika to ι or υ and uppercases it (e.g., ι → Ϊ, υ → Ϋ).
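The two passes can be sketched in JavaScript roughly as follows. This is a minimal illustration, not the tool's actual source: the names ACCENT_MAP, DIALYTIKA, DIPHTHONGS, diphthongPass, singleCharPass and greekUpper are hypothetical, and the tables hold only the entries listed under Reference Data rather than the full 14-entry and 70+-entry tables.

```javascript
// Sketch only: all names and table contents below are illustrative assumptions.

// Accent map M (subset): accented lowercase vowel -> bare uppercase vowel.
const ACCENT_MAP = new Map([
  ["\u03AC", "\u0391"], // ά -> Α
  ["\u03AD", "\u0395"], // έ -> Ε
  ["\u03AE", "\u0397"], // ή -> Η
  ["\u03AF", "\u0399"], // ί -> Ι
  ["\u03CC", "\u039F"], // ό -> Ο
  ["\u03CD", "\u03A5"], // ύ -> Υ
  ["\u03CE", "\u03A9"], // ώ -> Ω
  ["\u0390", "\u03AA"], // ΐ -> Ϊ (tonos removed, dialytika kept)
  ["\u03B0", "\u03AB"], // ΰ -> Ϋ (tonos removed, dialytika kept)
]);

// dialytika(c): uppercase form of ι or υ carrying a dialytika.
const DIALYTIKA = new Map([
  ["\u03B9", "\u03AA"], // ι -> Ϊ
  ["\u03C5", "\u03AB"], // υ -> Ϋ
]);

// Diphthong table D (subset): accented first vowel followed by ι or υ.
const DIPHTHONGS = new Set([
  "\u03AC\u03B9", // άι
  "\u03AC\u03C5", // άυ
  "\u03AD\u03B9", // έι
  "\u03CC\u03B9", // όι
  "\u03CC\u03C5", // όυ
  "\u03AE\u03C5", // ήυ
  "\u03CD\u03B9", // ύι
]);

// Pass 1: replace each diphthong pair, copy everything else unchanged.
function diphthongPass(s) {
  let out = "";
  let i = 0;
  while (i < s.length) {
    const pair = s.slice(i, i + 2);
    if (DIPHTHONGS.has(pair)) {
      out += ACCENT_MAP.get(pair[0]) + DIALYTIKA.get(pair[1]);
      i += 2;
    } else {
      out += s[i];
      i += 1;
    }
  }
  return out;
}

// Pass 2: map accented characters via M, fall back to native toUpperCase.
// Letters already uppercased in pass 1 are left as they are by toUpperCase.
function singleCharPass(s) {
  let out = "";
  for (const c of s) {
    out += ACCENT_MAP.get(c) ?? c.toUpperCase();
  }
  return out;
}

function greekUpper(input) {
  const s = input.normalize("NFC"); // collapse decomposed sequences first
  return singleCharPass(diphthongPass(s));
}

console.log(greekUpper("νεράιδα")); // ΝΕΡΑΪΔΑ
```

Running pass 1 on the whole string before pass 2 works here because the characters it emits are already uppercase, so the second pass leaves them unchanged.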

Reference Data

| Lowercase | Code point(s) | Correct Uppercase | Code point(s) | Rule Applied |
|---|---|---|---|---|
| ά (α with tonos) | U+03AC | Α | U+0391 | Tonos removal |
| έ (ε with tonos) | U+03AD | Ε | U+0395 | Tonos removal |
| ή (η with tonos) | U+03AE | Η | U+0397 | Tonos removal |
| ί (ι with tonos) | U+03AF | Ι | U+0399 | Tonos removal |
| ό (ο with tonos) | U+03CC | Ο | U+039F | Tonos removal |
| ύ (υ with tonos) | U+03CD | Υ | U+03A5 | Tonos removal |
| ώ (ω with tonos) | U+03CE | Ω | U+03A9 | Tonos removal |
| άι (diphthong) | U+03AC U+03B9 | ΑΪ | U+0391 U+03AA | Tonos removal + dialytika on ι |
| άυ (diphthong) | U+03AC U+03C5 | ΑΫ | U+0391 U+03AB | Tonos removal + dialytika on υ |
| έι (diphthong) | U+03AD U+03B9 | ΕΪ | U+0395 U+03AA | Tonos removal + dialytika on ι |
| όι (diphthong) | U+03CC U+03B9 | ΟΪ | U+039F U+03AA | Tonos removal + dialytika on ι |
| όυ (diphthong) | U+03CC U+03C5 | ΟΫ | U+039F U+03AB | Tonos removal + dialytika on υ |
| ήυ (diphthong) | U+03AE U+03C5 | ΗΫ | U+0397 U+03AB | Tonos removal + dialytika on υ |
| ύι (diphthong) | U+03CD U+03B9 | ΥΪ | U+03A5 U+03AA | Tonos removal + dialytika on ι |
| ς (final sigma) | U+03C2 | Σ | U+03A3 | Final sigma → capital sigma |
| ϊ (ι with dialytika) | U+03CA | Ϊ | U+03AA | Dialytika preserved |
| ϋ (υ with dialytika) | U+03CB | Ϋ | U+03AB | Dialytika preserved |
| ΐ (ι with dialytika + tonos) | U+0390 | Ϊ | U+03AA | Tonos removed, dialytika kept |
| ΰ (υ with dialytika + tonos) | U+03B0 | Ϋ | U+03AB | Tonos removed, dialytika kept |
| ᾳ (α with ypogegrammeni) | U+1FB3 | ΑΙ | U+0391 U+0399 | Iota subscript promoted |
| ῃ (η with ypogegrammeni) | U+1FC3 | ΗΙ | U+0397 U+0399 | Iota subscript promoted |
| ῳ (ω with ypogegrammeni) | U+1FF3 | ΩΙ | U+03A9 U+0399 | Iota subscript promoted |

Frequently Asked Questions

JavaScript's String.prototype.toUpperCase follows the Unicode Default Case Algorithm, which is locale-independent. Greek uppercasing requires locale-specific rules defined in the Unicode CLDR. Specifically, the default algorithm uppercases ά (U+03AC) to Ά (U+0386) - preserving the tonos - when correct modern Greek typography demands the tonos be removed entirely, producing plain Α (U+0391). The toLocaleUpperCase('el') method should handle this, but browser support is inconsistent. This tool implements the rules directly to guarantee correctness.
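The difference is easy to observe in a console. The snippet below is a small sketch; the variable names are illustrative, and the result of toLocaleUpperCase("el") depends on the engine and its locale data, so it is printed rather than assumed.

```javascript
const accented = "\u03AC"; // ά (alpha with tonos)

// Locale-independent default mapping: the tonos is preserved.
const native = accented.toUpperCase();
console.log(native === "\u0386"); // true: Ά (capital alpha with tonos)
console.log(native === "\u0391"); // false: not the plain Α that Greek typography expects

// Locale-aware mapping: engine-dependent, so inspect the code point.
const localized = accented.toLocaleUpperCase("el");
console.log(localized.codePointAt(0).toString(16));
```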
In Greek, certain vowel combinations (αι, αυ, ει, οι, ου, ηυ, υι) are normally read together as a single unit. A tonos (accent) on the first vowel of such a pair, as in άι, tells the reader that the two vowels are instead pronounced separately. When the text is uppercased the accent is removed, and that signal disappears with it. To preserve the distinction, a dialytika (diaeresis, ¨) is placed on the second vowel. Example: άι → ΑΪ, not ΑΙ (νεράιδα → ΝΕΡΑΪΔΑ). Without the dialytika, a reader would take ΑΙ for the ordinary vowel combination and misread the word.
This tool is designed for modern monotonic Greek, which uses only the tonos (acute accent) and dialytika. Polytonic Greek includes additional diacritics: spiritus asper (rough breathing ἁ), spiritus lenis (smooth breathing ἀ), circumflex/perispomeni (ᾶ), and iota subscript (ᾳ). The tool does handle iota subscript (ypogegrammeni) by promoting it to a full iota on uppercase (e.g., ᾳ → ΑΙ). For full polytonic support with all breathing marks, a more comprehensive normalization pipeline is required.
Greek has two forms of lowercase sigma: medial σ (U+03C3, used within words) and final ς (U+03C2, used at word end). Both map to the single uppercase Σ (U+03A3). This tool correctly maps both forms. Note that when converting back to lowercase, context-dependent logic would be needed to restore the correct sigma form - but this tool only performs the uppercase direction.
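As a quick check, native toUpperCase already maps both sigma forms to the same capital, which is why the uppercase direction loses the medial/final distinction. The variable names below are illustrative.

```javascript
const medialSigma = "\u03C3"; // σ, used within words
const finalSigma = "\u03C2";  // ς, used at word end

const upperMedial = medialSigma.toUpperCase();
const upperFinal = finalSigma.toUpperCase();

console.log(upperMedial === "\u03A3"); // true: Σ
console.log(upperFinal === "\u03A3");  // true: Σ
console.log(upperMedial === upperFinal); // true: the original form cannot be recovered from Σ alone
```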
Unicode allows the same visual character to be encoded in multiple ways. For example, ά can be a single composed codepoint (U+03AC, NFC form) or a base α (U+03B1) followed by a combining acute accent (U+0301, NFD form). The tool calls String.prototype.normalize("NFC") to collapse decomposed sequences into their composed equivalents before applying the lookup table. Without this step, decomposed characters would slip through the mapping unmodified, producing incorrect mixed-case output.
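The effect of normalization on a lookup can be shown in a few lines. This is a sketch with a hypothetical one-entry table named accentMap; only String.prototype.normalize and Map are standard.

```javascript
// Hypothetical one-entry lookup table keyed by the composed form.
const accentMap = new Map([["\u03AC", "\u0391"]]); // ά -> Α

const decomposed = "\u03B1\u0301"; // α followed by combining acute accent (NFD form)
const composed = decomposed.normalize("NFC");

console.log(decomposed.length); // 2 code units
console.log(composed.length);   // 1 code unit
console.log(composed === "\u03AC");     // true: single composed code point
console.log(accentMap.has(decomposed)); // false: would slip past the table
console.log(accentMap.has(composed));   // true: matched after normalization
```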
Non-Greek characters (Latin, Cyrillic, CJK, punctuation, numbers, emoji) receive no Greek-specific handling, because the tool's lookup map only contains Greek codepoints. Everything outside the map falls back to the native toUpperCase, which uppercases cased letters such as Latin and Cyrillic correctly and leaves caseless characters (CJK, punctuation, digits, emoji) unchanged. This means you can safely paste mixed-language text and only the Greek portions will receive the specialized uppercasing rules.
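The fallback behaviour for non-Greek text is simply that of native toUpperCase, which can be checked directly. The sample string below is an arbitrary illustration.

```javascript
const mixed = "caf\u00E9 42 \u65E5\u672C\u8A9E!"; // "café 42 日本語!"
const upper = mixed.toUpperCase();

// Latin letters are uppercased; digits, CJK and punctuation are unchanged.
console.log(upper === "CAF\u00C9 42 \u65E5\u672C\u8A9E!"); // true
```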