About

Handling raw code points across numbering systems is error-prone. A single misread hex digit turns U+0041 (Latin A) into U+0141 (Polish Ł), corrupting localization files or protocol payloads. This tool converts integer values between any two radices from 2 (binary) to 36 and maps them to their corresponding Unicode code points via String.fromCodePoint. It handles values up to U+10FFFF (1,114,111 in decimal), covering the full Unicode 15.0 codespace including the supplementary planes. Input is validated per radix: a digit that is not valid in the selected base is flagged immediately rather than silently producing wrong output.

The converter accepts space-separated or comma-separated tokens, auto-detects common prefixes (0x, 0b, 0o, U+), and outputs in your chosen target format. Conversion is bidirectional: paste Unicode text to extract its code points in any base. Note: the surrogate range U+D800 - DFFF is intentionally rejected per the Unicode Standard. Noncharacters and unassigned code points convert without error but may render as replacement glyphs depending on your font stack.

Formulas

Base conversion relies on positional notation. A number string s of length n in base b represents the decimal value:

$$\text{value} = \sum_{i=0}^{n-1} d_i \cdot b^{i}$$

Where d_i is the digit value at position i (rightmost = 0), and b is the source radix. For hex, digits A - F map to 10 - 15. The decimal integer is then converted to target base t via repeated division:

$$\text{digit} = \text{value} \bmod t, \qquad \text{value} = \left\lfloor \frac{\text{value}}{t} \right\rfloor$$

Digits are produced least significant first, so they are collected until value = 0 and then reversed. For Unicode mapping, the decimal value is treated as a code point cp. Valid range: 0 ≤ cp ≤ 1,114,111 excluding surrogates 55,296 - 57,343. The character is produced by String.fromCodePoint(cp). Reverse extraction uses codePointAt(0) to obtain the integer from a character.

Where: b = source base (radix 2 - 36), t = target base, d_i = digit value at position i, cp = Unicode code point (decimal integer), n = number of digits in source string.
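The two formulas above can be sketched in JavaScript as follows. This is a minimal illustration, not the tool's actual source: the names `digitValue`, `parseInBase`, `formatInBase`, and `convertBase` are hypothetical, and uppercase output digits are an assumption.

```javascript
// Digit alphabet for radices 2 through 36 (uppercase output is an assumption).
const DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Map one character to its digit value, or -1 if it is not 0-9 / A-Z / a-z.
function digitValue(ch) {
  const c = ch.charCodeAt(0);
  if (c >= 48 && c <= 57) return c - 48;       // '0'..'9'
  if (c >= 65 && c <= 90) return c - 65 + 10;  // 'A'..'Z'
  if (c >= 97 && c <= 122) return c - 97 + 10; // 'a'..'z'
  return -1;
}

function checkRadix(r) {
  if (!Number.isInteger(r) || r < 2 || r > 36) {
    throw new RangeError(`radix must be an integer from 2 to 36, got ${r}`);
  }
}

// Positional notation: value = sum of d_i * b^i, evaluated left to right
// with Horner's rule (value = value * b + d), which gives the same result.
function parseInBase(s, b) {
  checkRadix(b);
  if (s.length === 0) throw new SyntaxError("empty digit string");
  let value = 0;
  for (const ch of s) {
    const d = digitValue(ch);
    if (d < 0 || d >= b) {
      throw new SyntaxError(`"${ch}" is not a valid digit in base ${b}`);
    }
    value = value * b + d;
  }
  if (!Number.isSafeInteger(value)) {
    throw new RangeError("value exceeds the safe integer range");
  }
  return value;
}

// Repeated division: take value mod t, then floor(value / t), until value is 0.
function formatInBase(value, t) {
  checkRadix(t);
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError("value must be a non-negative safe integer");
  }
  if (value === 0) return "0";
  const out = [];
  while (value > 0) {
    out.push(DIGITS[value % t]); // least significant digit first
    value = Math.floor(value / t);
  }
  return out.reverse().join("");
}

function convertBase(s, b, t) {
  return formatInBase(parseInBase(s, b), t);
}
```

For example, `convertBase("1000001", 2, 16)` evaluates the binary string to 65 and returns `"41"`, while `parseInBase("8", 8)` throws because 8 is not an octal digit.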

Reference Data

Base	Name	Digits Used	Prefix	Example (A = 65)	Common Use
2	Binary	0 - 1	0b	1000001	CPU instructions, bitfields
8	Octal	0 - 7	0o	101	Unix file permissions
10	Decimal	0 - 9	-	65	HTML entities (&#65;)
16	Hexadecimal	0 - 9, A - F	0x / U+	41	Unicode, CSS colors, memory
32	Base32	0 - 9, A - V	-	21	Base32 variants (Crockford and Geohash use different alphabets)
36	Base36	0 - 9, A - Z	-	1T	URL shorteners, compact IDs

Key Unicode Ranges

Block	Range	Notes
Basic Latin	U+0000 - 007F	ASCII (128 chars)
Latin Extended-A	U+0100 - 017F	European diacritics
Greek & Coptic	U+0370 - 03FF	Math symbols (α, β, π)
Cyrillic	U+0400 - 04FF	Russian, Ukrainian, etc.
Arabic	U+0600 - 06FF	RTL script
CJK Unified Ideographs	U+4E00 - 9FFF	Chinese/Japanese/Korean
Emoticons (Emoji)	U+1F600 - 1F64F	Emoji faces (supplementary plane)
Math Operators	U+2200 - 22FF	∀, ∃, ∞, ∇
Box Drawing	U+2500 - 257F	Terminal UI borders
Private Use Area	U+E000 - F8FF	Custom glyphs (icon fonts)
Surrogates	U+D800 - DFFF	Reserved for UTF-16 (not valid scalar values)
Max Code Point	U+10FFFF	1,114,111 decimal

Frequently Asked Questions

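Code points D800 - DFFF are reserved for UTF-16 surrogate pairs. They are not valid Unicode scalar values: the Unicode Standard (Chapter 3, D76) excludes them, so they cannot be encoded as characters on their own. Note that String.fromCodePoint(0xD800) does not throw in JavaScript. It returns a string containing a lone surrogate, and a RangeError is raised only for values that are negative, non-integer, or above 0x10FFFF. This tool therefore checks for the surrogate range itself and reports the error before conversion.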
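A range check of this kind might look like the sketch below. The names `isScalarValue` and `codePointToChar` are hypothetical and the error messages are illustrative. The surrogate check is done explicitly, so the result does not depend on how the runtime treats such values.

```javascript
const MAX_CODE_POINT = 0x10ffff;

// True for integers in 0..0x10FFFF that are not surrogates (0xD800..0xDFFF).
function isScalarValue(cp) {
  return (
    Number.isInteger(cp) &&
    cp >= 0 &&
    cp <= MAX_CODE_POINT &&
    !(cp >= 0xd800 && cp <= 0xdfff)
  );
}

// Validate first, then build the character.
function codePointToChar(cp) {
  if (!isScalarValue(cp)) {
    throw new RangeError(`not a Unicode scalar value: ${cp}`);
  }
  return String.fromCodePoint(cp);
}
```

With this check in place, `codePointToChar(0x41)` returns `"A"`, while `codePointToChar(0xD800)` and `codePointToChar(0x110000)` both throw a RangeError.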

The parser checks for standard prefixes: 0b indicates binary (base 2), 0o indicates octal (base 8), 0x or U+ indicates hexadecimal (base 16). If no prefix is found, the tool uses the base selected in the source dropdown. Prefix detection is case-insensitive, and the prefix is stripped before digit parsing.
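The prefix rules described here could be implemented roughly as follows. This is a sketch under assumptions: `PREFIXES`, `tokenize`, and `detectPrefix` are hypothetical names, and the tool's real parser may resolve ambiguous input differently.

```javascript
// Prefix table: case-insensitive patterns and the radix each one implies.
const PREFIXES = [
  { pattern: /^0b/i, radix: 2 },
  { pattern: /^0o/i, radix: 8 },
  { pattern: /^0x/i, radix: 16 },
  { pattern: /^u\+/i, radix: 16 },
];

// Split input on whitespace and/or commas, dropping empty tokens.
function tokenize(input) {
  return input.split(/[\s,]+/).filter((t) => t.length > 0);
}

// Return the radix implied by a prefix (or the fallback) and the bare digits.
function detectPrefix(token, fallbackRadix) {
  for (const { pattern, radix } of PREFIXES) {
    if (pattern.test(token)) {
      return { radix, digits: token.replace(pattern, "") };
    }
  }
  return { radix: fallbackRadix, digits: token };
}
```

In this sketch a prefix always wins over the fallback base, so a token such as `0B1` is read as binary 1 even though it is also a valid hex or base-36 digit string. Whether that matches the tool's behavior for such inputs is not stated in the text above.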

The converter outputs the correct character regardless of font support. If your system lacks a glyph for that code point, the browser renders a replacement character (often a box or U+FFFD ◊). The underlying data is still correct. Copy-pasting into an application with the required font will display it properly. Control characters (U+0000 - 001F) are intentionally shown with their Unicode Control Picture equivalents (U+2400 range) for visibility.
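The control-character substitution can be expressed as a fixed offset, because the Control Pictures block assigns U+2400 - 241F to the symbols for U+0000 - 001F in the same order. The helper name `displayForm` below is hypothetical, and this sketch covers only the range mentioned above.

```javascript
// Return a visible stand-in for C0 control characters, else the character itself.
function displayForm(cp) {
  if (cp >= 0x00 && cp <= 0x1f) {
    // U+0000..U+001F map to U+2400..U+241F (Control Pictures).
    return String.fromCodePoint(0x2400 + cp);
  }
  return String.fromCodePoint(cp);
}
```

For instance, `displayForm(0x0A)` returns U+240A (the symbol for line feed) instead of an actual line break, while `displayForm(0x41)` returns `"A"` unchanged.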

Yes. The tool uses String.fromCodePoint and codePointAt, which correctly handle code points above U+FFFF (the Basic Multilingual Plane limit). For example, entering hex 1F600 produces 😀. In reverse mode, pasting an emoji extracts its full 21-bit code point, not the individual 16-bit surrogate values that older charCodeAt would return.
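The difference between code points and UTF-16 code units can be seen in a short sketch. `combineSurrogates` is a hypothetical helper that applies the standard UTF-16 decoding formula. It is included only to show how the two 16-bit values relate to the single code point.

```javascript
// Combine a UTF-16 high/low surrogate pair into one code point.
function combineSurrogates(high, low) {
  return (high - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
}

const grinning = String.fromCodePoint(0x1f600); // U+1F600

console.log(grinning.length);                      // 2 (two UTF-16 code units)
console.log(grinning.codePointAt(0).toString(16)); // 1f600
console.log(grinning.charCodeAt(0).toString(16));  // d83d (high surrogate)
console.log(grinning.charCodeAt(1).toString(16));  // de00 (low surrogate)
console.log(combineSurrogates(0xd83d, 0xde00) === 0x1f600); // true
```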

Base 36 is the maximum. It uses digits 0 - 9 plus letters A - Z, exhausting the standard alphanumeric set. This matches the limit of JavaScript's parseInt and Number.prototype.toString. Bases beyond 36 would require non-standard digit symbols, introducing ambiguity.
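The built-in limit is easy to observe directly. In the sketch below, `isSupportedRadix` is a hypothetical helper that probes `Number.prototype.toString`. It is meant only for explicit integer radices.

```javascript
// Probe whether Number.prototype.toString accepts the given radix.
function isSupportedRadix(radix) {
  try {
    (0).toString(radix);
    return true;
  } catch (err) {
    if (err instanceof RangeError) return false;
    throw err;
  }
}

console.log((65).toString(36));    // 1t (built-in output is lowercase)
console.log(parseInt("1T", 36));   // 65
console.log(parseInt("1T", 37));   // NaN (radix out of range)
console.log(isSupportedRadix(36)); // true
console.log(isSupportedRadix(37)); // false
```

Note that the two built-ins fail differently: `toString` throws a RangeError for an unsupported radix, whereas `parseInt` quietly returns NaN.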

In Text → Code Points mode, the tool iterates using a code-point-aware loop (spread operator or Array.from), which correctly segments characters by their actual code points rather than UTF-16 code units. A single emoji like 👨‍👩‍👧‍👦 (family ZWJ sequence) is decomposed into its constituent code points: 1F468, 200D, 1F469, 200D, 1F467, 200D, 1F466.
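A code-point-aware extraction along these lines might look as follows. `textToCodePoints` is a hypothetical name, and uppercase output is an assumption. `Array.from` iterates a string by code point, so each surrogate pair arrives as one element.

```javascript
// List the code points of a string in the given radix (default: hex).
function textToCodePoints(text, radix = 16) {
  return Array.from(text, (ch) =>
    ch.codePointAt(0).toString(radix).toUpperCase()
  );
}

// Family emoji: four people joined by three ZERO WIDTH JOINERs (U+200D).
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";

console.log(family.length);                   // 11 (UTF-16 code units)
console.log(textToCodePoints(family).length); // 7 (code points)
console.log(textToCodePoints(family).join(" "));
// 1F468 200D 1F469 200D 1F467 200D 1F466
```

The sequence displays as a single glyph on platforms that support it, yet it still yields seven entries because the split is by code point, not by what the user perceives as one character.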