About

Handling raw code points across numbering systems is error-prone. A single misread hex digit turns U+0041 (Latin A) into U+0141 (Polish Ł), corrupting localization files or protocol payloads. This tool converts integer values between any two radices from 2 (binary) to 36 and maps them to their corresponding Unicode code points via String.fromCodePoint. It handles values up to U+10FFFF (1,114,111 in decimal), covering the full Unicode code space (as of Unicode 15.0), including the supplementary planes. Input is validated per radix: digits that exceed the base are flagged immediately rather than silently producing wrong output.

The converter accepts space-separated or comma-separated tokens, auto-detects common prefixes (0x, 0b, 0o, U+), and outputs in your chosen target format. Conversion is bidirectional: paste Unicode text to extract its code points in any base. Note: the surrogate range U+D800–U+DFFF is intentionally rejected, because the Unicode Standard does not treat surrogates as characters. Noncharacters and unassigned code points convert without error but may render as placeholder glyphs depending on your font stack.
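
As a rough illustration of the token handling and per-radix validation described above, here is a minimal sketch. The helper names (splitTokens, invalidDigits) and the DIGITS constant are hypothetical and are not taken from the tool's source; prefixes are assumed to have been stripped already.

```javascript
// Hypothetical helpers for illustration; not the tool's actual code.
const DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Split raw input on commas and/or whitespace, dropping empty tokens.
function splitTokens(input) {
  return input.split(/[\s,]+/).filter((token) => token.length > 0);
}

// Return the characters of `token` that are not valid digits in `radix`.
// An empty array means the token is valid for that radix.
function invalidDigits(token, radix) {
  const allowed = DIGITS.slice(0, radix);
  return [...token.toUpperCase()].filter((ch) => !allowed.includes(ch));
}

console.log(splitTokens("41, 42  43,44")); // [ '41', '42', '43', '44' ]
console.log(invalidDigits("1021", 2));     // [ '2' ]
console.log(invalidDigits("ff", 16));      // []
```

Returning the offending characters, rather than a plain true/false, makes it easy to tell the user exactly which digit exceeds the base.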

Formulas

Base conversion relies on positional notation. A number string $s$ of length $n$ in base $b$ represents the integer value:

$$\text{value} = \sum_{i=0}^{n-1} d_i \cdot b^{i}$$

Where $d_i$ is the digit value at position $i$ (rightmost = 0), and $b$ is the source radix. For hex, digits A–F map to 10–15. The resulting integer is then converted to target base $t$ via repeated division:

$$\text{digit} = \text{value} \bmod t, \qquad \text{value} = \left\lfloor \frac{\text{value}}{t} \right\rfloor$$

Each step yields the next least-significant digit, so the digits are collected until value = 0 and then read in reverse order. For Unicode mapping, the integer value is treated as a code point $cp$. Valid range: $0 \le cp \le 1{,}114{,}111$, excluding surrogates 55,296–57,343. The character is produced by String.fromCodePoint(cp). Reverse extraction uses codePointAt(0) to obtain the integer from a character.

Where: $b$ = source base (radix 2–36), $t$ = target base, $d_i$ = digit value at position $i$, $cp$ = Unicode code point (as an integer), $n$ = number of digits in the source string.
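
The two formulas can be written out by hand as follows. This is a minimal sketch for illustration only: the function names (parseInBase, formatInBase) and the ALPHABET constant are hypothetical, and in practice JavaScript's built-in parseInt and Number.prototype.toString perform the same conversions.

```javascript
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Positional notation: value = sum of d_i * b^i, with i = 0 at the rightmost digit.
function parseInBase(str, b) {
  const chars = [...str.toUpperCase()];
  const n = chars.length;
  let value = 0;
  chars.forEach((ch, k) => {
    const d = ALPHABET.indexOf(ch);
    if (d < 0 || d >= b) {
      throw new RangeError(`Digit "${ch}" is not valid in base ${b}`);
    }
    const i = n - 1 - k; // position counted from the right
    value += d * b ** i;
  });
  return value;
}

// Repeated division: take value mod t, then replace value with floor(value / t).
// Digits come out least-significant first, so they are reversed at the end.
function formatInBase(value, t) {
  if (value === 0) return "0";
  const digits = [];
  while (value > 0) {
    digits.push(ALPHABET[value % t]);
    value = Math.floor(value / t);
  }
  return digits.reverse().join("");
}

console.log(parseInBase("1000001", 2));                  // 65
console.log(formatInBase(65, 36));                       // "1T"
console.log(formatInBase(parseInBase("1F600", 16), 10)); // "128512"
```

Because the largest code point is 1,114,111, every intermediate value stays far below Number.MAX_SAFE_INTEGER, so plain numbers are sufficient here.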

Reference Data

| Base | Name | Digits Used | Prefix | Example (A = 65) | Common Use |
|---|---|---|---|---|---|
| 2 | Binary | 0–1 | 0b | 1000001 | CPU instructions, bitfields |
| 8 | Octal | 0–7 | 0o | 101 | Unix file permissions |
| 10 | Decimal | 0–9 | (none) | 65 | HTML entities (&#65;) |
| 16 | Hexadecimal | 0–9, A–F | 0x / U+ | 41 | Unicode, CSS colors, memory |
| 32 | Base32 | 0–9, A–V | (none) | 21 | Crockford encoding, Geohash (both use their own alphabets) |
| 36 | Base36 | 0–9, A–Z | (none) | 1T | URL shorteners, compact IDs |

Key Unicode Ranges

| Block | Range | Notes |
|---|---|---|
| Basic Latin | U+0000–007F | ASCII (128 chars) |
| Latin Extended-A | U+0100–017F | European diacritics |
| Greek & Coptic | U+0370–03FF | Math symbols (α, β, π) |
| Cyrillic | U+0400–04FF | Russian, Ukrainian, etc. |
| Arabic | U+0600–06FF | RTL script |
| CJK Unified | U+4E00–9FFF | Chinese/Japanese/Korean |
| Emoji (Emoticons) | U+1F600–1F64F | Emoticons (supplementary plane) |
| Math Operators | U+2200–22FF | ∀, ∃, ∞, ∇ |
| Box Drawing | U+2500–257F | Terminal UI borders |
| Private Use Area | U+E000–F8FF | Custom glyphs (icon fonts) |
| Surrogates | U+D800–DFFF | Reserved for UTF-16 (not valid scalar values) |
| Max Code Point | U+10FFFF | 1,114,111 decimal |

Frequently Asked Questions

Code points U+D800–U+DFFF are reserved for UTF-16 surrogates. They are not valid Unicode scalar values (Unicode Standard, Chapter 3, D76), so they cannot be encoded as characters on their own. Note that String.fromCodePoint(0xD800) does not throw in JavaScript: a RangeError is raised only for non-integers or values outside 0 to 0x10FFFF, and the call returns a lone surrogate that is not well-formed text. This tool therefore checks this range itself and reports the error before conversion.
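
An explicit range check of this kind might look like the sketch below. The names isScalarValue and toCharacter are hypothetical and only illustrate the idea; the tool's own validation code is not shown on this page.

```javascript
const MAX_CODE_POINT = 0x10ffff;

// True for integers in 0..0x10FFFF that are not surrogates (U+D800..U+DFFF).
function isScalarValue(cp) {
  return (
    Number.isInteger(cp) &&
    cp >= 0 &&
    cp <= MAX_CODE_POINT &&
    !(cp >= 0xd800 && cp <= 0xdfff)
  );
}

// Convert a code point to a string, rejecting anything that is not a scalar value.
function toCharacter(cp) {
  if (!isScalarValue(cp)) {
    throw new RangeError(`Not a Unicode scalar value: ${cp}`);
  }
  return String.fromCodePoint(cp);
}

console.log(toCharacter(0x41));       // "A"
console.log(isScalarValue(0xd800));   // false
console.log(isScalarValue(0x110000)); // false
```
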
The parser checks for standard prefixes: 0b indicates binary (base 2), 0o indicates octal (base 8), and 0x or U+ indicates hexadecimal (base 16). If no prefix is found, the tool uses the base selected in the source dropdown. Prefix detection is case-insensitive, and the prefix is stripped before the digits are parsed.
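
The prefix rules above could be sketched like this. The PREFIXES table and the detectPrefix function are hypothetical names chosen for the example, not the tool's real parser.

```javascript
// Hypothetical prefix table: lowercase prefix -> radix.
const PREFIXES = [
  ["0b", 2],
  ["0o", 8],
  ["0x", 16],
  ["u+", 16],
];

// Return { radix, digits } for a token, falling back to `defaultRadix`
// (the base chosen in the dropdown) when no known prefix is present.
function detectPrefix(token, defaultRadix) {
  const lower = token.toLowerCase();
  for (const [prefix, radix] of PREFIXES) {
    if (lower.startsWith(prefix)) {
      return { radix, digits: token.slice(prefix.length) };
    }
  }
  return { radix: defaultRadix, digits: token };
}

console.log(detectPrefix("U+1F600", 10));   // { radix: 16, digits: '1F600' }
console.log(detectPrefix("0b1000001", 10)); // { radix: 2, digits: '1000001' }
console.log(detectPrefix("65", 10));        // { radix: 10, digits: '65' }
```

One consequence of checking prefixes first, at least in this sketch: with hexadecimal as the default base, a token such as 0B1 is read as binary 1 rather than as hex B1.
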
The converter outputs the correct character regardless of font support. If your system lacks a glyph for that code point, the browser renders a placeholder glyph instead (typically an empty box, sometimes a mark resembling U+FFFD REPLACEMENT CHARACTER). The underlying data is still correct: copying the output into an application that has the required font will display it properly. Control characters (U+0000–001F) are intentionally shown as their Unicode Control Pictures equivalents (U+2400 range) for visibility.
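
The control-character substitution can be illustrated with a small sketch. It assumes the input code point has already been validated, and the function name visibleForm is hypothetical.

```javascript
// Map C0 controls (U+0000..U+001F) to Control Pictures (U+2400..U+241F);
// every other code point is returned unchanged.
function visibleForm(cp) {
  const shown = cp <= 0x1f ? 0x2400 + cp : cp;
  return String.fromCodePoint(shown);
}

// U+000A LINE FEED becomes U+240A SYMBOL FOR LINE FEED.
console.log(visibleForm(0x0a).codePointAt(0).toString(16).toUpperCase()); // "240A"
console.log(visibleForm(0x41)); // "A"
```
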
Emoji and other supplementary-plane characters are supported. The tool uses String.fromCodePoint and codePointAt, which correctly handle code points above U+FFFF (the Basic Multilingual Plane limit). For example, entering hex 1F600 produces 😀. In reverse mode, pasting an emoji extracts its full 21-bit code point, not the individual 16-bit surrogate values that the older charCodeAt would return.
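
The difference between the two methods is easy to see in a short example. This sketch builds the emoji from its code point so that it does not depend on how the character survives copy and paste.

```javascript
const smiley = String.fromCodePoint(0x1f600); // grinning face, U+1F600

// codePointAt reads the whole supplementary code point.
console.log(smiley.codePointAt(0).toString(16).toUpperCase()); // "1F600"

// charCodeAt only sees the two UTF-16 code units (a surrogate pair).
console.log(smiley.charCodeAt(0).toString(16).toUpperCase()); // "D83D"
console.log(smiley.charCodeAt(1).toString(16).toUpperCase()); // "DE00"
console.log(smiley.length); // 2
```
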
Base 36 is the maximum. It uses digits 0–9 plus letters A–Z, exhausting the standard alphanumeric set. This matches the limit of JavaScript's parseInt and Number.prototype.toString. Bases beyond 36 would require non-standard digit symbols, introducing ambiguity.
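
The built-in limit can be checked directly. The helper supportsRadix below is a hypothetical name used only for this illustration.

```javascript
// Number.prototype.toString throws a RangeError for radices outside 2..36.
function supportsRadix(radix) {
  try {
    (0).toString(radix);
    return true;
  } catch (err) {
    return false;
  }
}

console.log((65).toString(36));  // "1t"
console.log(parseInt("1T", 36)); // 65
console.log(parseInt("1T", 37)); // NaN
console.log(supportsRadix(36));  // true
console.log(supportsRadix(37));  // false
```

Note that toString returns lowercase letters, so output is usually uppercased when a hex-style display is wanted.
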
In Text → Code Points mode, the tool iterates using a code-point-aware loop (spread operator or Array.from), which splits the string by actual code points rather than by UTF-16 code units. A single emoji like 👨‍👩‍👧‍👦 (family ZWJ sequence) is decomposed into its constituent code points: 1F468, 200D, 1F469, 200D, 1F467, 200D, 1F466.
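
A code-point-aware extraction along these lines can be sketched as follows. The function name codePointsOf is hypothetical, and the family emoji is assembled from its code points so that the example does not rely on the page's text encoding.

```javascript
// Family emoji: man, ZWJ, woman, ZWJ, girl, ZWJ, boy.
const family = String.fromCodePoint(
  0x1f468, 0x200d, 0x1f469, 0x200d, 0x1f467, 0x200d, 0x1f466
);

// Array.from iterates by code point, not by UTF-16 code unit.
function codePointsOf(text, radix = 16) {
  return Array.from(text, (ch) =>
    ch.codePointAt(0).toString(radix).toUpperCase()
  );
}

console.log(codePointsOf(family));
// [ '1F468', '200D', '1F469', '200D', '1F467', '200D', '1F466' ]
console.log(family.length); // 11 (UTF-16 code units)
```

The string has 11 UTF-16 code units but only 7 code points, which is why indexing by position or using charCodeAt would split the emoji incorrectly.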