About

Character encoding errors are a common cause of data corruption in file transfers, serial communication, and legacy system integrations. This converter maps each character to its numeric code point and renders it in binary (base 2), hexadecimal (base 16), octal (base 8), or decimal (base 10). Standard ASCII uses 7 bits per character, covering code points 0 to 127. Extended ASCII encodings such as ISO 8859-1 use 8 bits per character, while UTF-8 (RFC 3629) uses sequences of 1 to 4 bytes. The tool converts in both directions: paste text to get binary, or paste binary to recover the original string.

Misidentifying the encoding (e.g., treating UTF-8 multi-byte sequences as ISO-8859-1) produces garbled output known as mojibake. This tool validates every input group against the constraints of the selected encoding before conversion. The display is most reliable for printable ASCII; control characters (code points below 32) are converted but may not render visibly. Tip: when debugging serial protocols, use the space delimiter and compare the output byte by byte against your device specification sheet.
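To make the mojibake effect concrete, here is a minimal sketch that encodes a string as UTF-8 and then deliberately misreads each byte as one ISO-8859-1 character. The function name misreadUtf8AsLatin1 is hypothetical and not part of this tool; the sketch relies on the fact that ISO-8859-1 maps byte value n directly to code point n.

```javascript
// Encode text as UTF-8, then (incorrectly) interpret every byte as a
// separate ISO-8859-1 character. Hypothetical helper for illustration.
function misreadUtf8AsLatin1(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 byte sequence
  // ISO-8859-1 maps byte value n directly to code point n.
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

console.log(misreadUtf8AsLatin1("é")); // Ã©
console.log(misreadUtf8AsLatin1("abc")); // abc (pure ASCII is unaffected)
```

The two-byte UTF-8 sequence C3 A9 for "é" turns into two unrelated characters, which is the typical signature of this mistake.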

Formulas

Each character is mapped to its Unicode code point, then converted to the target base. The fundamental conversion for a single character c to binary proceeds as follows:

B = pad(toString(charCodeAt(c), 2), 8)

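Where charCodeAt(c) returns the numeric code of character c (in JavaScript this is the UTF-16 code unit, which equals the code point for every character in the ASCII and ISO 8859-1 ranges), toString(n, 2) converts the integer n to a base-2 string, and pad left-pads the result with zeros to 8 bits. For the reverse direction: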

c = fromCharCode(parseInt(B, 2))
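The two formulas above can be sketched in JavaScript roughly as follows. The function names textToBinary and binaryToText are assumptions made for this example, and the sketch only handles single-byte characters (codes 0 to 255) with a space delimiter.

```javascript
// Text -> binary: one 8-bit group per character.
// Assumes every character has a code in the range 0-255.
function textToBinary(text) {
  return Array.from(text, (ch) =>
    ch.charCodeAt(0).toString(2).padStart(8, "0")
  ).join(" ");
}

// Binary -> text: parse each whitespace-separated group as a base-2 integer.
function binaryToText(binary) {
  const trimmed = binary.trim();
  if (trimmed === "") return ""; // avoid parsing an empty group
  return trimmed
    .split(/\s+/)
    .map((group) => String.fromCharCode(parseInt(group, 2)))
    .join("");
}

console.log(textToBinary("AB")); // 01000001 01000010
console.log(binaryToText("01000001 01000010")); // AB
```

Note that this sketch does no validation; a group such as "0102" would be parsed only up to the first invalid digit by parseInt.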

For UTF-8 multi-byte characters (code points > 127), the encoder applies the RFC 3629 scheme:

0xxxxxxx if cp ≤ 0x007F (1 byte)
110xxxxx 10xxxxxx if cp ≤ 0x07FF (2 bytes)
1110xxxx 10xxxxxx 10xxxxxx if cp ≤ 0xFFFF (3 bytes)
11110xxx 10xxxxxx 10xxxxxx 10xxxxxx if cp ≤ 0x10FFFF (4 bytes)

Where cp is the Unicode code point and the x bits are filled with the bits of the code point value, most significant bit first; the first row whose condition holds applies. The conversion to hexadecimal uses base 16 with digits 0-9 and A-F, padded to 2 characters per byte. Octal uses base 8, padded to 3 characters per byte.
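As an illustration of the byte layout above, the following sketch encodes a single code point by hand using shifts and masks. The names utf8Encode and toBits are hypothetical, and this is not necessarily how the tool itself is implemented; in practice the built-in TextEncoder does the same job.

```javascript
// Encode one Unicode code point into UTF-8 bytes following the
// RFC 3629 bit layout. Hypothetical helper for illustration.
function utf8Encode(cp) {
  // Surrogates (U+D800-U+DFFF) and values above U+10FFFF are not encodable.
  if (cp < 0 || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
    throw new RangeError("not a Unicode scalar value: " + cp);
  }
  if (cp <= 0x7f) return [cp];
  if (cp <= 0x7ff) return [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)];
  if (cp <= 0xffff) {
    return [0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f)];
  }
  return [
    0xf0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3f),
    0x80 | ((cp >> 6) & 0x3f),
    0x80 | (cp & 0x3f),
  ];
}

// Render a byte array as space-separated 8-bit groups.
const toBits = (bytes) =>
  bytes.map((b) => b.toString(2).padStart(8, "0")).join(" ");

console.log(toBits(utf8Encode(0xe9))); // 11000011 10101001 (é, 2 bytes)
```

Each continuation byte carries 6 payload bits (mask 0x3F) behind the fixed 10 prefix, which is why the shifts step down by 6.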

Reference Data

Character	Decimal	Hex	Octal	Binary (8-bit)	Description
NUL	0	00	000	00000000	Null
SP	32	20	040	00100000	Space
!	33	21	041	00100001	Exclamation mark
0	48	30	060	00110000	Digit zero
9	57	39	071	00111001	Digit nine
A	65	41	101	01000001	Uppercase A
Z	90	5A	132	01011010	Uppercase Z
a	97	61	141	01100001	Lowercase a
z	122	7A	172	01111010	Lowercase z
@	64	40	100	01000000	At sign
#	35	23	043	00100011	Hash / Number sign
+	43	2B	053	00101011	Plus sign
=	61	3D	075	00111101	Equals sign
/	47	2F	057	00101111	Forward slash
\	92	5C	134	01011100	Backslash
{	123	7B	173	01111011	Left curly brace
}	125	7D	175	01111101	Right curly brace
[	91	5B	133	01011011	Left square bracket
]	93	5D	135	01011101	Right square bracket
~	126	7E	176	01111110	Tilde
DEL	127	7F	177	01111111	Delete
LF	10	0A	012	00001010	Line feed
CR	13	0D	015	00001101	Carriage return
TAB	9	09	011	00001001	Horizontal tab
ESC	27	1B	033	00011011	Escape
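
Any row of the table can be reproduced with a few lines of JavaScript. The sketch below uses a hypothetical helper named toRow and applies the same padding widths as the table (2 hex digits, 3 octal digits, 8 bits).

```javascript
// Render one code point in the four bases used by the reference table.
function toRow(codePoint) {
  return {
    decimal: String(codePoint),
    hex: codePoint.toString(16).toUpperCase().padStart(2, "0"),
    octal: codePoint.toString(8).padStart(3, "0"),
    binary: codePoint.toString(2).padStart(8, "0"),
  };
}

const row = toRow("Z".charCodeAt(0));
console.log(row.decimal, row.hex, row.octal, row.binary); // 90 5A 132 01011010
```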

Frequently Asked Questions

Standard ASCII defines 128 characters using code points 0 to 127, requiring only 7 bits per character. Historically, the 8th bit was often used as a parity bit. Extended ASCII encodings (ISO 8859-1, Windows-1252) use the full 8 bits to represent up to 256 characters, adding accented letters and symbols in positions 128 to 255. This converter defaults to 8-bit representation for consistency. When working with strict 7-bit systems (some legacy serial protocols), verify that all input characters fall within the 0 to 127 range.
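
A range check of this kind might look like the sketch below. The names findNonAscii and toSevenBit are hypothetical; the first reports every character above 127 with its index, the second renders 7-bit groups and refuses input that does not fit.

```javascript
// List every UTF-16 code unit above 127 together with its index.
function findNonAscii(text) {
  const offenders = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 127) offenders.push({ index: i, code });
  }
  return offenders;
}

// Render text as 7-bit groups; reject anything outside 0-127.
function toSevenBit(text) {
  const offenders = findNonAscii(text);
  if (offenders.length > 0) {
    throw new RangeError("non-ASCII character at index " + offenders[0].index);
  }
  return Array.from(text, (ch) =>
    ch.charCodeAt(0).toString(2).padStart(7, "0")
  ).join(" ");
}

console.log(toSevenBit("Hi")); // 1001000 1101001
console.log(findNonAscii("caf\u00e9")); // one offender: index 3, code 233
```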

Control characters (code points 0 to 31 and 127) are valid ASCII and are converted to their binary equivalents. For example, a line feed (LF, code point 10) becomes 00001010 and a carriage return (CR, code point 13) becomes 00001101. These characters will not render visibly in the character breakdown table but their binary representations are correct. When converting binary back to text, the decoded control characters will be inserted into the output string and may affect formatting.

Characters outside the ASCII range (code points above 127) require multi-byte encoding under UTF-8. A character like € (U+20AC) encodes to 3 bytes (24 bits): 11100010 10000010 10101100 (hex E2 82 AC). Emoji can require 4 bytes (32 bits). Switch the encoding mode to UTF-8 to handle these correctly. In ASCII-only mode, the converter will flag any character with a code point above 127 as out of range.
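
The byte counts can be checked with the standard TextEncoder API, which always produces UTF-8. In this sketch the helper name utf8Bits is an assumption made for the example.

```javascript
// Render the UTF-8 encoding of a string as space-separated 8-bit groups.
function utf8Bits(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => b.toString(2).padStart(8, "0")).join(" ");
}

console.log(utf8Bits("\u20AC")); // 11100010 10000010 10101100
console.log(new TextEncoder().encode("\u20AC").length); // 3
console.log(new TextEncoder().encode("\u{1F600}").length); // 4 (an emoji)
```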

The converter must know where one byte ends and the next begins. Without delimiters, a string like 0100000101000010 is ambiguous unless you enforce fixed 8-bit grouping. Using a space delimiter (01000001 01000010) is the most common convention. This tool supports space, comma, dash, pipe, and no-delimiter modes. When using no delimiter, the total bit count must be divisible by 8 (or 7 in strict ASCII mode), otherwise the conversion will fail with a validation error.
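
As a rough sketch of the grouping logic described here, the function below splits input either on a delimiter or into fixed-width chunks. The name splitGroups and its parameters are hypothetical and do not describe this tool's actual interface.

```javascript
// Split binary input into groups.
// delimiter "" means no-delimiter mode with fixed-width grouping.
function splitGroups(input, delimiter = " ", width = 8) {
  if (delimiter === "") {
    if (input.length % width !== 0) {
      throw new Error(
        "bit count " + input.length + " is not divisible by " + width
      );
    }
    const groups = [];
    for (let i = 0; i < input.length; i += width) {
      groups.push(input.slice(i, i + width));
    }
    return groups;
  }
  // Drop empty groups caused by repeated or trailing delimiters.
  return input.split(delimiter).filter((g) => g.length > 0);
}

console.log(splitGroups("0100000101000010", "")); // two 8-bit groups
console.log(splitGroups("01000001|01000010", "|")); // same two groups
```

Passing width = 7 covers the strict ASCII case, where the total bit count must be divisible by 7 instead.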

Yes. Enter text to produce binary, then paste that binary output back in reverse mode to recover the original text. If the round-trip produces identical output, the conversion is lossless. Data integrity issues arise when: (1) non-binary characters appear in the input during binary-to-text mode, (2) bit groups are not properly delimited, or (3) a code point falls outside the selected encoding range. The converter validates each group and reports specific error positions.
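
The validation step can be pictured with the sketch below, which decodes space-delimited binary and collects errors by group position instead of stopping at the first one. The function name decodeWithReport, the maxCode parameter, and the shape of the error objects are all assumptions for illustration.

```javascript
// Decode space-delimited binary and report problems per group (1-based).
// maxCode is the highest code allowed by the selected encoding (127 = ASCII).
function decodeWithReport(binary, maxCode = 127) {
  const groups = binary.trim().split(/\s+/);
  const errors = [];
  let text = "";
  groups.forEach((group, i) => {
    if (!/^[01]+$/.test(group)) {
      errors.push({ group: i + 1, reason: "non-binary characters" });
      return;
    }
    const code = parseInt(group, 2);
    if (code > maxCode) {
      errors.push({ group: i + 1, reason: "code point out of range" });
      return;
    }
    text += String.fromCharCode(code);
  });
  return { text, errors };
}

const ok = decodeWithReport("01001000 01101001");
console.log(ok.text, ok.errors.length); // Hi 0

const bad = decodeWithReport("01001000 0110100x 11111111");
console.log(bad.errors.map((e) => e.group)); // [ 2, 3 ]
```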

The converter accepts up to 100,000 characters of input. Conversion is an O(n) operation, where n is the character count. For 100,000 single-byte characters, the output binary string (with space delimiters) will be approximately 900,000 characters: 8 bits per character plus one delimiter between groups. Performance remains under 50 ms on modern hardware for this size. For extremely large payloads, consider splitting the input into chunks.
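
The output size follows directly from the group width and the delimiter count. A small sketch, using a hypothetical helper named spacedBinaryLength and assuming one group per character:

```javascript
// Length of the delimited binary output for a given number of characters:
// bitsPerChar digits per character plus one delimiter between neighbours.
function spacedBinaryLength(charCount, bitsPerChar = 8) {
  if (charCount === 0) return 0;
  return charCount * bitsPerChar + (charCount - 1);
}

console.log(spacedBinaryLength(100000)); // 899999
```

Multi-byte UTF-8 characters produce more than one group each, so the real output can be longer than this estimate.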