Any Base to Unicode Converter
Convert numbers between any base (2-36) and Unicode characters. Supports binary, octal, decimal, hex, and code points with real-time conversion.
About
Handling raw code points across numbering systems is error-prone. A single misread hex digit turns U+0041 (Latin A) into U+0141 (Polish Ł), corrupting localization files or protocol payloads. This tool converts integer values between any radix from 2 (binary) to 36 and maps them to their corresponding Unicode code points via String.fromCodePoint. It handles values up to U+10FFFF (1,114,111 in decimal), covering the full Unicode 15.0 range including supplementary planes. Input is validated per-radix: digits exceeding the base are flagged immediately rather than silently producing wrong output.
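The per-radix validation described above can be sketched as follows. This is a minimal illustration, not the tool's actual source: the names `RADIX_DIGITS`, `invalidDigits`, and `tokenToChar` are hypothetical, and the choice to throw on bad input is an assumption.

```javascript
// Hypothetical digit alphabet shared by radices 2 to 36.
const RADIX_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz";

// Returns every character in `token` that is not a valid digit for `radix`.
function invalidDigits(token, radix) {
  const allowed = RADIX_DIGITS.slice(0, radix);
  return [...token.toLowerCase()].filter((ch) => !allowed.includes(ch));
}

// Parses `token` in `radix` and maps the result to a character.
// Throws instead of returning a silently wrong value.
function tokenToChar(token, radix) {
  if (!Number.isInteger(radix) || radix < 2 || radix > 36) {
    throw new RangeError(`radix must be an integer from 2 to 36, got ${radix}`);
  }
  if (token.length === 0) {
    throw new SyntaxError("empty token");
  }
  const bad = invalidDigits(token, radix);
  if (bad.length > 0) {
    throw new SyntaxError(`invalid digit(s) for base ${radix}: ${bad.join(", ")}`);
  }
  const cp = parseInt(token, radix);
  // Reject values above U+10FFFF and the surrogate range U+D800 to U+DFFF.
  if (cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
    throw new RangeError(`not a valid Unicode scalar value: ${cp}`);
  }
  return String.fromCodePoint(cp);
}
```

Validating before calling `parseInt` matters because `parseInt("12z", 10)` stops at the first invalid character and returns 12 without reporting an error. With the sketch above, `tokenToChar("41", 16)` yields "A" and `tokenToChar("141", 16)` yields "Ł".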
The converter accepts space-separated or comma-separated tokens, auto-detects common prefixes (0x, 0b, 0o, U+), and outputs in your chosen target format. Conversion is bidirectional: paste Unicode text to extract its code points in any base. Note: the surrogate range U+D800 - U+DFFF is intentionally rejected, because the Unicode standard reserves those values for UTF-16 encoding and they do not represent characters on their own. Noncharacters and unassigned code points convert without error but may render as replacement glyphs depending on your font stack.
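Tokenizing, prefix detection, and the reverse direction might look like the sketch below. The names `PREFIX_RADIX`, `splitTokens`, `stripPrefix`, and `textToCodePoints` are hypothetical, and the prefix-to-radix mapping is an assumption based on the prefixes listed above.

```javascript
// Assumed prefix table: each prefix and the radix it implies.
const PREFIX_RADIX = [
  ["0x", 16],
  ["u+", 16],
  ["0b", 2],
  ["0o", 8],
];

// Splits input on commas and/or whitespace, dropping empty tokens.
function splitTokens(input) {
  return input.split(/[\s,]+/).filter((t) => t.length > 0);
}

// Returns { digits, radix }; falls back to `fallbackRadix` when no prefix matches.
function stripPrefix(token, fallbackRadix) {
  const lower = token.toLowerCase();
  for (const [prefix, radix] of PREFIX_RADIX) {
    if (lower.startsWith(prefix) && lower.length > prefix.length) {
      return { digits: token.slice(prefix.length), radix };
    }
  }
  return { digits: token, radix: fallbackRadix };
}

// Reverse direction: text to code points rendered in `radix`.
// for...of iterates by code point, so supplementary characters stay whole.
function textToCodePoints(text, radix) {
  const out = [];
  for (const ch of text) {
    out.push(ch.codePointAt(0).toString(radix).toUpperCase());
  }
  return out;
}
```

One caveat of this sketch: with a fallback radix of 16, a token such as `0b11` is read as binary 11 rather than hex 0B11. How the tool itself resolves that ambiguity is not stated here, so treat the prefix-first rule as an assumption.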
Formulas
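Base conversion relies on positional notation. A digit string s of length n in base b represents the decimal value:
$$\text{value} = \sum_{i=0}^{n-1} d_i \cdot b^{i}$$
Where d_i is the digit value at position i (rightmost = 0), and b is the source radix. For hex, digits A - F map to 10 - 15. The decimal integer is then converted to target base t via repeated division:
$$r = \text{value} \bmod t, \qquad \text{value} \leftarrow \left\lfloor \frac{\text{value}}{t} \right\rfloor$$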
Each remainder is the next digit, produced least-significant first, so digits are collected in reverse order until value = 0. For Unicode mapping, the decimal value is treated as a code point cp. Valid range: 0 ≤ cp ≤ 1,114,111 excluding surrogates 55,296 - 57,343. The character is produced by String.fromCodePoint(cp). Reverse extraction uses codePointAt(0) to obtain the integer from a character.
Where: b = source base (radix 2 - 36), t = target base, d_i = digit value at position i, cp = Unicode code point (decimal integer), n = number of digits in source string.
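The positional formula and the repeated-division step can be written out by hand, as a sketch, without relying on the built-in `parseInt` or `toString`. The names `ALPHABET`, `toDecimal`, `fromDecimal`, and `isValidCodePoint` are hypothetical. The sum is evaluated with Horner's rule (value = value * b + digit, left to right), which gives the same result as summing each digit times its power of b.

```javascript
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Positional evaluation of digit string `s` in base `b`, via Horner's rule.
function toDecimal(s, b) {
  if (s.length === 0) throw new SyntaxError("empty digit string");
  let value = 0;
  for (const ch of s.toUpperCase()) {
    const d = ALPHABET.indexOf(ch);
    if (d < 0 || d >= b) throw new SyntaxError(`'${ch}' is not a base-${b} digit`);
    value = value * b + d;
  }
  return value;
}

// Repeated division: take value mod t as the next digit, then divide,
// until value reaches 0. Digits come out least-significant first.
function fromDecimal(value, t) {
  if (value === 0) return "0";
  const digits = [];
  while (value > 0) {
    digits.push(ALPHABET[value % t]);
    value = Math.floor(value / t);
  }
  return digits.reverse().join("");
}

// Valid range: 0 to 1,114,111, excluding surrogates 55,296 to 57,343.
function isValidCodePoint(cp) {
  return Number.isInteger(cp) && cp >= 0 && cp <= 1114111 &&
    !(cp >= 55296 && cp <= 57343);
}
```

For example, `toDecimal("41", 16)` is 4 * 16 + 1 = 65, and `fromDecimal(65, 2)` collects the remainders 1, 0, 0, 0, 0, 0, 1 and reverses them to "1000001".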
Reference Data
| Base | Name | Digits Used | Prefix | Example (A = 65) | Common Use |
|---|---|---|---|---|---|
| 2 | Binary | 0 - 1 | 0b | 1000001 | CPU instructions, bitfields |
| 8 | Octal | 0 - 7 | 0o | 101 | Unix file permissions |
| 10 | Decimal | 0 - 9 | - | 65 | HTML entities (`&#65;`) |
| 16 | Hexadecimal | 0 - 9, A - F | 0x / U+ | 41 | Unicode, CSS colors, memory |
| 32 | Base32 | 0 - 9, A - V | - | 21 | Base32 encodings (Crockford and GeoHash use different alphabets) |
| 36 | Base36 | 0 - 9, A - Z | - | 1T | URL shorteners, compact IDs |

Key Unicode Ranges
| Block | Range | Notes |
|---|---|---|
| Basic Latin | U+0000 - 007F | ASCII (128 chars) |
| Latin Extended-A | U+0100 - 017F | European diacritics |
| Greek & Coptic | U+0370 - 03FF | Math symbols (α, β, π) |
| Cyrillic | U+0400 - 04FF | Russian, Ukrainian, etc. |
| Arabic | U+0600 - 06FF | RTL script |
| CJK Unified | U+4E00 - 9FFF | Chinese/Japanese/Korean |
| Emoticons (Emoji) | U+1F600 - 1F64F | Emoticons (supplementary plane) |
| Math Operators | U+2200 - 22FF | ∀, ∑, √, ∞ |
| Box Drawing | U+2500 - 257F | Terminal UI borders |
| Private Use Area | U+E000 - F8FF | Custom glyphs (icon fonts) |
| Surrogates | U+D800 - DFFF | Reserved (invalid as code points) |
| Max Code Point | U+10FFFF | 1,114,111 decimal |
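The reference data can be checked with a short sketch. `TABLE_BASES`, `renderInBases`, `KNOWN_BLOCKS`, and `blockOf` are hypothetical names; the block list is a partial copy of the ranges listed above, not a full Unicode block database.

```javascript
// Bases listed in the reference table.
const TABLE_BASES = [2, 8, 10, 16, 32, 36];

// Renders `cp` in each listed base with the built-in Number.prototype.toString.
function renderInBases(cp) {
  const row = {};
  for (const base of TABLE_BASES) {
    row[base] = cp.toString(base).toUpperCase();
  }
  return row;
}

// Partial block list copied from the ranges above: [name, first, last].
const KNOWN_BLOCKS = [
  ["Basic Latin", 0x0000, 0x007f],
  ["Latin Extended-A", 0x0100, 0x017f],
  ["Greek & Coptic", 0x0370, 0x03ff],
  ["Cyrillic", 0x0400, 0x04ff],
  ["Arabic", 0x0600, 0x06ff],
  ["Math Operators", 0x2200, 0x22ff],
  ["Box Drawing", 0x2500, 0x257f],
  ["CJK Unified", 0x4e00, 0x9fff],
  ["Private Use Area", 0xe000, 0xf8ff],
  ["Emoticons", 0x1f600, 0x1f64f],
];

// Returns the block name for `cp`, or null if it is outside the listed ranges.
function blockOf(cp) {
  for (const [name, first, last] of KNOWN_BLOCKS) {
    if (cp >= first && cp <= last) return name;
  }
  return null;
}
```

Calling `renderInBases("A".codePointAt(0))` reproduces the Example (A = 65) column: 1000001, 101, 65, 41, 21, and 1T for bases 2, 8, 10, 16, 32, and 36.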