Check Unicode Version
Analyze text characters to determine their Unicode version, codepoint, block, and category. Supports Unicode 1.0 through 16.0.
| # | Char | Codepoint | Name | Block | Category | Version |
|---|---|---|---|---|---|---|
About
Every character in digital text belongs to a specific Unicode version. A document mixing characters from Unicode 1.0 (1991) and Unicode 16.0 (2024) may render correctly on modern systems but fail on older terminals, embedded devices, or legacy databases. This tool extracts each codepoint U+XXXX from your input and maps it to the exact Unicode standard version that introduced it. It identifies the assigned block (e.g., "Latin Extended-B", "CJK Unified Ideographs") and general category (L for Letter, N for Number, S for Symbol). Use it to audit compatibility before deploying multilingual content or embedding special symbols in systems with constrained font support.
Limitation: this tool covers assigned codepoint ranges per version. Private Use Area characters (U+E000–U+F8FF) are reported as version 1.1 per their original allocation but carry no standard glyph. Unassigned codepoints return "Unassigned". Surrogate code points (U+D800–U+DFFF) are reserved for UTF-16 encoding and are not valid characters on their own.
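The special cases in this limitation can be handled with a range pre-check before any version lookup. The following is a minimal sketch; the function name `classifySpecial` and its return labels are hypothetical and not taken from the tool's source.

```javascript
// Hypothetical pre-check for the two special ranges named above.
function classifySpecial(cp) {
  // U+D800–U+DFFF: surrogate code points, not characters on their own.
  if (cp >= 0xd800 && cp <= 0xdfff) return "Surrogate";
  // U+E000–U+F8FF: Private Use Area in the Basic Multilingual Plane.
  if (cp >= 0xe000 && cp <= 0xf8ff) return "Private Use";
  // Anything else continues to the ordinary range lookup.
  return null;
}

// A lone high surrogate in a string is returned as-is by codePointAt.
console.log(classifySpecial("\uD83D".codePointAt(0))); // Surrogate
```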
Formulas
Each character in the input string is decomposed into its Unicode codepoint using JavaScript's codePointAt method, which correctly handles surrogate pairs for astral plane characters (codepoints above U+FFFF).
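As a minimal sketch of this decomposition step, the loop below advances by two UTF-16 code units whenever `codePointAt` returns an astral codepoint. The function name `toCodePoints` is hypothetical; the tool's actual implementation is not shown in this page.

```javascript
// Hypothetical helper: split a string into an array of Unicode codepoints.
function toCodePoints(text) {
  const codePoints = [];
  let i = 0;
  while (i < text.length) {
    const cp = text.codePointAt(i);
    codePoints.push(cp);
    // Astral codepoints (above U+FFFF) occupy two UTF-16 code units.
    i += cp > 0xffff ? 2 : 1;
  }
  return codePoints;
}

console.log(toCodePoints("A\u{1F600}")); // [ 65, 128512 ]
```

Iterating the string with `for...of` yields the same codepoints, because the built-in string iterator also steps by codepoint rather than by UTF-16 code unit.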
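The codepoint cp is then looked up against a sorted array of Unicode version assignment ranges. Each range is a tuple:
$$(\text{start},\ \text{end},\ \text{version})$$
A character belongs to Unicode version $v$ if some range with $\text{version} = v$ satisfies:
$$\text{start} \le cp \le \text{end}$$
Where cp is the integer codepoint value, start and end define the inclusive range boundary, and version is a string like "6.0". The hex representation displayed is the codepoint written in uppercase hexadecimal, zero-padded to at least four digits and prefixed with "U+" (for example, U+0041 for "A").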
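The lookup and the hex formatting can be sketched as follows. This is an illustration under stated assumptions: `SAMPLE_RANGES` is a tiny hand-picked subset whose versions follow the reference table in this document, the names `findVersion` and `formatCodePoint` are hypothetical, and binary search is only one reasonable way to search a sorted array (the tool's actual strategy is not stated).

```javascript
// Illustrative subset only: [start, end, version], sorted by start.
// Versions follow the reference table in this document.
const SAMPLE_RANGES = [
  [0x0000, 0x007f, "1.0"],  // Basic Latin
  [0x20b9, 0x20b9, "6.0"],  // Indian Rupee sign
  [0x20ba, 0x20ba, "6.2"],  // Turkish Lira sign
  [0x20bd, 0x20bd, "7.0"],  // Ruble sign
  [0x20bf, 0x20bf, "10.0"], // Bitcoin sign
];

// Binary search for the range that satisfies start <= cp <= end.
function findVersion(cp, ranges = SAMPLE_RANGES) {
  let lo = 0;
  let hi = ranges.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [start, end, version] = ranges[mid];
    if (cp < start) hi = mid - 1;
    else if (cp > end) lo = mid + 1;
    else return version;
  }
  // With this sample table, "Unassigned" only means "not covered here".
  return "Unassigned";
}

// Format as U+XXXX: uppercase hex, padded to at least four digits.
function formatCodePoint(cp) {
  return "U+" + cp.toString(16).toUpperCase().padStart(4, "0");
}

console.log(formatCodePoint(0x20bf), findVersion(0x20bf)); // U+20BF 10.0
```

Because the ranges are inclusive and non-overlapping, at most one tuple can match a given codepoint, so the search can stop at the first hit.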
The general category is determined by testing the codepoint against category ranges (e.g., L for letters matching alphabetic ranges, N for numbers matching digit/numeric ranges). Block membership is resolved similarly against the ~330 named Unicode blocks.
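A rough sketch of both lookups is given below. Two assumptions apply: `SAMPLE_BLOCKS` lists only three of the named blocks, and the category test swaps the tool's range tables for JavaScript's built-in regex Unicode property escapes (`\p{...}` with the `u` flag), which is a different technique from the one described above. The function names are hypothetical.

```javascript
// Illustrative subset of the named blocks: [start, end, name].
const SAMPLE_BLOCKS = [
  [0x0000, 0x007f, "Basic Latin"],
  [0x0180, 0x024f, "Latin Extended-B"],
  [0x4e00, 0x9fff, "CJK Unified Ideographs"],
];

// Returns the block name, or null if the sample table does not cover cp.
function blockOf(cp, blocks = SAMPLE_BLOCKS) {
  const hit = blocks.find(([start, end]) => cp >= start && cp <= end);
  return hit ? hit[2] : null;
}

// Major general categories, tested with regex property escapes.
const MAJOR_CATEGORIES = [
  ["L", /^\p{L}$/u], // Letter
  ["M", /^\p{M}$/u], // Mark
  ["N", /^\p{N}$/u], // Number
  ["P", /^\p{P}$/u], // Punctuation
  ["S", /^\p{S}$/u], // Symbol
  ["Z", /^\p{Z}$/u], // Separator
  ["C", /^\p{C}$/u], // Other (control, format, unassigned, ...)
];

function majorCategory(cp) {
  const ch = String.fromCodePoint(cp);
  const hit = MAJOR_CATEGORIES.find(([, re]) => re.test(ch));
  return hit ? hit[0] : null;
}

console.log(majorCategory(0x41), blockOf(0x41)); // L Basic Latin
```

The property-escape approach reflects whatever Unicode version the JavaScript engine ships with, so a character newer than the engine's data is reported as C; explicit range tables avoid that dependency.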
Reference Data
| Unicode Version | Release Year | Total Characters Added | Notable Additions |
|---|---|---|---|
| 1.0 | 1991 | ~7,129 | Basic Latin, Greek, Cyrillic, CJK core |
| 1.1 | 1993 | ~28,327 | CJK Unified Ideographs (20,902) |
| 2.0 | 1996 | ~6,516 | Surrogate mechanism, Hangul Syllables re-encoded at U+AC00–U+D7A3, Tibetan |
| 3.0 | 1999 | ~10,307 | Cherokee, Ethiopic, Khmer, Mongolian, Myanmar, Sinhala |
| 3.1 | 2001 | ~44,946 | CJK Extension B (42,711), Deseret, Gothic, Old Italic, Musical Symbols |
| 3.2 | 2002 | ~1,016 | Philippine scripts (Buhid, Hanunoo, Tagalog, Tagbanwa) |
| 4.0 | 2003 | ~1,226 | Cypriot, Limbu, Linear B, Osmanya, Shavian, Tai Le, Ugaritic |
| 4.1 | 2005 | ~1,273 | Buginese, Coptic, Glagolitic, New Tai Lue, Old Persian |
| 5.0 | 2006 | ~1,369 | N'Ko, Phags-pa, Phoenician, currency symbols |
| 5.1 | 2008 | ~1,624 | Carian, Lycian, Lydian, Vai, Sundanese |
| 5.2 | 2009 | ~6,648 | CJK Extension C, Egyptian Hieroglyphs, Tai Tham |
| 6.0 | 2010 | ~2,088 | Emoji first batch, Indian Rupee sign ₹, Mandaic, Batak |
| 6.1 | 2012 | ~732 | Chakma, Miao, Sharada, Sora Sompeng, Takri |
| 6.2 | 2012 | 1 | Turkish Lira sign ₺ |
| 6.3 | 2013 | 5 | Bidirectional formatting characters |
| 7.0 | 2014 | ~2,834 | Ruble sign ₽, Bassa Vah, Duployan, Grantha, Khojki, Pau Cin Hau |
| 8.0 | 2015 | ~7,716 | CJK Extension E, Cherokee lowercase, Emoji skin tones (Fitzpatrick) |
| 9.0 | 2016 | ~7,500 | Adlam, Bhaiksuki, Tangut (6,881 chars), 72 new emoji |
| 10.0 | 2017 | ~8,518 | CJK Extension F, Bitcoin sign ₿, Zanabazar Square, Soyombo, 56 emoji |
| 11.0 | 2018 | ~684 | Dogra, Georgian Mtavruli, Hanifi Rohingya, 66 emoji |
| 12.0 | 2019 | ~554 | Elymaic, Nandinagari, Wancho, 61 emoji |
| 12.1 | 2019 | 1 | Reiwa era square character (Japanese) |
| 13.0 | 2020 | ~5,930 | CJK Extension G, Chorasmian, Dives Akuru, Yezidi, 55 emoji |
| 14.0 | 2021 | ~838 | Toto, Cypro-Minoan, Vithkuqi, Old Uyghur, Tangsa, 37 emoji |
| 15.0 | 2022 | ~4,489 | CJK Extension H, Kawi, Nag Mundari, 31 emoji |
| 15.1 | 2023 | 627 | CJK Extension I (622 ideographs), 5 ideographic description characters |
| 16.0 | 2024 | ~5,185 | Egyptian Hieroglyphs Extended-A, Garay, Gurung Khema, Kirat Rai, Todhri, Tulu-Tigalari, 7 emoji |
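When auditing text against a target system, the version strings in this table should be compared numerically, since plain string comparison sorts "10.0" before "6.0". The sketch below shows one way to do that; `compareVersions` and `versionsAbove` are hypothetical helper names, and the "major.minor" format is assumed from the table.

```javascript
// Compare two "major.minor" version strings numerically.
// Returns a negative number, zero, or a positive number.
function compareVersions(a, b) {
  const [aMajor, aMinor] = a.split(".").map(Number);
  const [bMajor, bMinor] = b.split(".").map(Number);
  return aMajor - bMajor || aMinor - bMinor;
}

// Hypothetical audit helper: keep only versions newer than the
// highest version the target system is assumed to support.
function versionsAbove(versions, maxSupported) {
  return versions.filter((v) => compareVersions(v, maxSupported) > 0);
}

console.log(versionsAbove(["1.0", "6.0", "10.0", "15.1"], "6.3"));
// [ '10.0', '15.1' ]
```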