Code Points to Text Converter
Convert Unicode code points (U+0041, 0x41, &#65;, \u0041) to readable text. Supports all 1,114,112 Unicode code points including emoji and CJK.
About
Unicode assigns a unique numerical identifier - a code point - to every character across all writing systems, technical symbols, and emoji. The full range spans from U+0000 to U+10FFFF, covering 1,114,112 possible positions across 17 planes. Misinterpreting a code point format or ignoring supplementary plane characters (anything above U+FFFF) leads to corrupted output, replacement characters (U+FFFD), or silent data loss in databases and APIs. This tool parses six common code point notations - U+XXXX, 0xXXXX, decimal integers, &#xHHHH;, &#DDD;, and \uXXXX - validates each against the legal Unicode range, rejects lone surrogates (U+D800 - U+DFFF), and reconstructs the original text using String.fromCodePoint. Correct rendering depends on your browser and OS having the required fonts installed; missing glyphs display as placeholder boxes, which are not conversion errors.
Formulas
Each input token is matched against format-specific regular expressions. The extracted hexadecimal or decimal string is parsed to an integer code point value cp. The conversion rule is:
cp = parseInt(hexStr, 16) for hexadecimal formats
cp = parseInt(decStr, 10) for decimal formats
Validation requires:
0 ≤ cp ≤ 0x10FFFF (must lie within the legal Unicode range)
cp ∉ [0xD800, 0xDFFF] (surrogate range is illegal)
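The two validation conditions can be sketched as a single predicate. This is a minimal illustration, not the tool's actual source; the function name `isValidCodePoint` is hypothetical.

```javascript
// Hypothetical helper (not taken from the tool's source).
// Returns true when cp is an integer in the Unicode range that is not a surrogate.
function isValidCodePoint(cp) {
  if (!Number.isInteger(cp)) return false;         // rejects NaN and fractions
  if (cp < 0 || cp > 0x10FFFF) return false;       // outside U+0000 - U+10FFFF
  if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // lone surrogate
  return true;
}

console.log(isValidCodePoint(0x1F600)); // true
console.log(isValidCodePoint(0xD800));  // false
```

Checking `Number.isInteger` first matters because `parseInt` returns NaN for unparseable input, and NaN would otherwise slip past both range comparisons.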
Valid code points are converted to characters via String.fromCodePoint(cp). For supplementary plane characters (cp > 0xFFFF), this function internally creates a UTF-16 surrogate pair:
cp′ = cp - 0x10000
hi = 0xD800 + (cp′ >> 10)
lo = 0xDC00 + (cp′ & 0x3FF)
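Here hi is the high surrogate and lo is the low surrogate. The UTF-8 byte count per code point follows the encoding scheme:
1 byte for U+0000 - U+007F
2 bytes for U+0080 - U+07FF
3 bytes for U+0800 - U+FFFF
4 bytes for U+10000 - U+10FFFF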
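As a rough illustration of the surrogate-pair arithmetic and the UTF-8 length rule, the sketch below computes both by hand. The function names `toSurrogatePair` and `utf8Length` are hypothetical, and the code assumes its input has already been validated as a legal code point.

```javascript
// Hypothetical helpers for illustration; they assume cp is already validated.

// Split a supplementary-plane code point into its UTF-16 surrogate pair.
function toSurrogatePair(cp) {
  if (cp < 0x10000 || cp > 0x10FFFF) {
    throw new RangeError("not a supplementary-plane code point");
  }
  const offset = cp - 0x10000;          // 20-bit value
  const hi = 0xD800 + (offset >> 10);   // top 10 bits
  const lo = 0xDC00 + (offset & 0x3FF); // bottom 10 bits
  return [hi, lo];
}

// Number of bytes UTF-8 needs for a single code point.
function utf8Length(cp) {
  if (cp <= 0x7F) return 1;
  if (cp <= 0x7FF) return 2;
  if (cp <= 0xFFFF) return 3;
  return 4;
}

const [hi, lo] = toSurrogatePair(0x1F600);
console.log(hi.toString(16), lo.toString(16)); // d83d de00
console.log(String.fromCharCode(hi, lo) === String.fromCodePoint(0x1F600)); // true
console.log(utf8Length(0x20AC)); // 3
```

In practice String.fromCodePoint does the surrogate split for you; the manual version is only useful for checking intermediate values.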
Reference Data
| Unicode Plane | Range | Name | Characters | Common Content |
|---|---|---|---|---|
| 0 | U+0000 - U+FFFF | Basic Multilingual Plane (BMP) | 65,536 | Latin, Cyrillic, Greek, CJK, common symbols |
| 1 | U+10000 - U+1FFFF | Supplementary Multilingual Plane | 65,536 | Emoji, historic scripts, musical symbols |
| 2 | U+20000 - U+2FFFF | Supplementary Ideographic Plane | 65,536 | CJK Unified Ideographs Extension B |
| 3 | U+30000 - U+3FFFF | Tertiary Ideographic Plane | 65,536 | CJK Extension G, H |
| 4-13 | U+40000 - U+DFFFF | Unassigned | 655,360 | Reserved for future use |
| 14 | U+E0000 - U+EFFFF | Supplementary Special-purpose Plane | 65,536 | Tag characters, variation selectors |
| 15 | U+F0000 - U+FFFFF | Supplementary Private Use Area-A | 65,536 | Private-use characters |
| 16 | U+100000 - U+10FFFF | Supplementary Private Use Area-B | 65,536 | Private-use characters |
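Because each plane holds exactly 65,536 (0x10000) positions, the plane number is the code point shifted right by 16 bits. A minimal sketch, with the hypothetical name `planeOf` and assuming a valid code point as input:

```javascript
// Hypothetical helper: plane index (0-16) of a valid code point.
function planeOf(cp) {
  return cp >> 16; // same as Math.floor(cp / 0x10000) for values in range
}

console.log(planeOf(0x0041));  // 0
console.log(planeOf(0x1F600)); // 1
console.log(planeOf(0xE0001)); // 14
```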

| Input Format | Example | Regex Pattern | Base | Notes |
|---|---|---|---|---|
| U+XXXX | U+0041 | U\+[0-9A-Fa-f]{1,6} | Hexadecimal | Most common Unicode notation |
| 0xXXXX | 0x0041 | 0x[0-9A-Fa-f]{1,6} | Hexadecimal | Programming hex literal |
| Decimal | 65 | [0-9]+ | Decimal | Raw integer code point value |
| &#xHHHH; | &#x41; | &#x[0-9A-Fa-f]+; | Hexadecimal | HTML hex character reference |
| &#DDD; | &#65; | &#[0-9]+; | Decimal | HTML decimal character reference |
| \uXXXX | \u0041 | \\u[0-9A-Fa-f]{4} | Hexadecimal | JavaScript/Java escape (BMP only) |
| \u{XXXXX} | \u{1F600} | \\u\{[0-9A-Fa-f]{1,6}\} | Hexadecimal | ES6+ extended escape (all planes) |
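The patterns in this table can be combined into a small parser. The sketch below is one possible arrangement rather than the tool's real implementation: the names `FORMATS`, `parseToken`, and `codePointsToText` are hypothetical, each regex is anchored and given a capture group for the digits, and tokens are assumed to be separated by whitespace or commas (so back-to-back escapes such as \u0048\u0069 with no separator are not handled).

```javascript
// Hypothetical sketch built from the table's patterns; not the tool's source.
// Each pattern is anchored and captures the digit string in group 1.
const FORMATS = [
  { re: /^U\+([0-9A-Fa-f]{1,6})$/, base: 16 },     // U+0041
  { re: /^0x([0-9A-Fa-f]{1,6})$/, base: 16 },      // 0x0041
  { re: /^&#x([0-9A-Fa-f]+);$/, base: 16 },        // &#x41;
  { re: /^&#([0-9]+);$/, base: 10 },               // &#65;
  { re: /^\\u\{([0-9A-Fa-f]{1,6})\}$/, base: 16 }, // \u{1F600}
  { re: /^\\u([0-9A-Fa-f]{4})$/, base: 16 },       // \u0041
  { re: /^([0-9]+)$/, base: 10 },                  // 65
];

// Returns the integer code point for a token, or null if no format matches.
function parseToken(token) {
  for (const { re, base } of FORMATS) {
    const m = re.exec(token);
    if (m) return parseInt(m[1], base);
  }
  return null;
}

// Assumption: tokens are separated by whitespace and/or commas.
function codePointsToText(input) {
  let out = "";
  for (const token of input.trim().split(/[\s,]+/)) {
    if (token === "") continue; // empty input
    const cp = parseToken(token);
    const valid =
      cp !== null && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
    if (!valid) throw new RangeError(`Invalid code point token: ${token}`);
    out += String.fromCodePoint(cp);
  }
  return out;
}

console.log(codePointsToText("U+0048 0x69 33 &#x1F600;")); // Hi!😀
```

Anchoring every pattern keeps the bare decimal rule from matching the leading 0 of a token like 0x41, so the order of the entries does not affect the result.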

| Code Point | Character | Name | Block | UTF-8 Bytes |
|---|---|---|---|---|
| U+0041 | A | Latin Capital Letter A | Basic Latin | 1 |
| U+00E9 | é | Latin Small Letter E with Acute | Latin-1 Supplement | 2 |
| U+4E16 | 世 | CJK Unified Ideograph | CJK Unified Ideographs | 3 |
| U+0410 | А | Cyrillic Capital Letter A | Cyrillic | 2 |
| U+2603 | ☃ | Snowman | Miscellaneous Symbols | 3 |
| U+1F600 | 😀 | Grinning Face | Emoticons (Plane 1) | 4 |
| U+1F4A9 | 💩 | Pile of Poo | Miscellaneous Symbols and Pictographs (Plane 1) | 4 |
| U+0000 | NUL | Null Character | Basic Latin (C0 Controls) | 1 |
| U+FEFF | BOM | Byte Order Mark | Arabic Presentation Forms-B | 3 |
| U+FFFD | οΏ½ | Replacement Character | Specials | 3 |
| U+200B | (invisible) | Zero Width Space | General Punctuation | 3 |
| U+20AC | € | Euro Sign | Currency Symbols | 3 |
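The UTF-8 Bytes column can be spot-checked by actually encoding each character. The sketch below uses the standard TextEncoder API (available in modern browsers and Node.js); the helper name `utf8ByteCount` is hypothetical.

```javascript
// Hypothetical helper: measure the UTF-8 length by encoding the character.
const encoder = new TextEncoder(); // always encodes to UTF-8

function utf8ByteCount(cp) {
  return encoder.encode(String.fromCodePoint(cp)).length;
}

for (const cp of [0x41, 0xE9, 0x4E16, 0x1F600]) {
  const label = "U+" + cp.toString(16).toUpperCase().padStart(4, "0");
  console.log(label, utf8ByteCount(cp));
}
// U+0041 1
// U+00E9 2
// U+4E16 3
// U+1F600 4
```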