Byte to String Converter
Convert byte arrays to readable strings and back. Supports decimal, hex, octal, binary input with UTF-8, ASCII, Latin-1, UTF-16 decoding.
About
Decoding bytes with the wrong encoding corrupts text. A file saved as UTF-8 but read as Latin-1 shows every multibyte character as garbage (mojibake), and if the misread text is then saved, the original content may be unrecoverable. This tool converts raw byte sequences, expressed as decimal, hexadecimal, octal, or binary values, into human-readable strings using the exact encoding you specify: UTF-8, ASCII (0 - 127), ISO-8859-1 (Latin-1, 0 - 255), or UTF-16. It also operates in reverse: paste a string and receive its byte representation in any of those bases.
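The mojibake effect is easy to reproduce. The sketch below is illustrative only: `decodeLatin1` and `encodeLatin1` are hypothetical helpers (not part of this tool) that treat each byte as the code point with the same value, which is how ISO-8859-1 is defined.

```javascript
// Hypothetical helper: decode ISO-8859-1 by mapping each byte (0 - 255)
// to the code point with the same value.
function decodeLatin1(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

// Hypothetical helper: the inverse mapping, valid only for code points 0 - 255.
function encodeLatin1(text) {
  return Uint8Array.from(text, (ch) => ch.charCodeAt(0));
}

const original = "é";                                  // U+00E9
const utf8Bytes = new TextEncoder().encode(original);  // bytes C3 A9
const mojibake = decodeLatin1(utf8Bytes);              // "Ã©"

// While the misread text is still intact, the mistake can be reversed:
// turn the characters back into bytes and decode them as UTF-8.
const recovered = new TextDecoder("utf-8").decode(encodeLatin1(mojibake));

console.log(mojibake, recovered);
```

Recovery works here because the Latin-1 mapping is lossless. A decoder that substitutes or drops bytes along the way would not allow this round trip.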
The converter uses the browser's native TextDecoder and TextEncoder APIs, which implement the WHATWG Encoding Standard. Results match the behavior of production runtimes. Note: ASCII strict mode rejects any byte above 127. UTF-8 sequences with invalid continuation bytes produce the replacement character U+FFFD rather than silent corruption. For UTF-16, byte order matters - select the correct endianness or expect swapped characters.
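The behaviors described above can be observed directly with `TextDecoder`. This is a minimal sketch, not the tool's source; the variable names are illustrative. It also shows why a strict ASCII mode presumably needs its own range check: in the Encoding Standard the `ascii` label is an alias for windows-1252, so the decoder alone does not reject bytes above 127.

```javascript
// 0xC3 opens a two-byte UTF-8 sequence, but 0x28 is not a continuation byte.
const broken = new Uint8Array([0xc3, 0x28]);

// Default mode: the bad sequence becomes U+FFFD and decoding continues.
const lenient = new TextDecoder("utf-8").decode(broken); // "\uFFFD("

// With fatal: true the same input raises a TypeError instead.
let strictFailed = false;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(broken);
} catch (err) {
  strictFailed = err instanceof TypeError;
}

// UTF-16: the same two bytes decode differently depending on byte order.
const pair = new Uint8Array([0x48, 0x00]);
const le = new TextDecoder("utf-16le").decode(pair); // "H" (U+0048)
const be = new TextDecoder("utf-16be").decode(pair); // U+4800

// The "ascii" label resolves to windows-1252, not to a 7-bit decoder.
const asciiLabel = new TextDecoder("ascii").encoding; // "windows-1252"

console.log({ lenient, strictFailed, le, be, asciiLabel });
```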
Formulas
Byte-to-string conversion applies a decoding function D that maps an ordered sequence of byte values to Unicode code points, then renders those code points as characters.
$$S = D_{enc}(b_1, b_2, \ldots, b_n)$$
Where S is the output string, $D_{enc}$ is the decoder for encoding enc, and each $b_i$ is a byte value in range 0 - 255.
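One way to picture the decoder D is a small dispatcher keyed on the encoding. The function below is a sketch under assumptions: `decodeBytes` and its encoding keys (`ascii`, `latin1`, and so on) are hypothetical names chosen for illustration, and the ASCII and Latin-1 branches are written by hand rather than delegated to `TextDecoder`.

```javascript
// Hypothetical dispatcher standing in for the decoder D; not the tool's actual code.
function decodeBytes(bytes, enc) {
  // Every b_i must be an integer byte value in 0 - 255.
  for (const b of bytes) {
    if (!Number.isInteger(b) || b < 0 || b > 255) {
      throw new RangeError(`not a byte value: ${b}`);
    }
  }
  const data = Uint8Array.from(bytes);

  switch (enc) {
    case "ascii":
      // Strict ASCII: reject anything above 127 before mapping.
      data.forEach((b, i) => {
        if (b > 127) throw new RangeError(`byte ${b} at index ${i} is not ASCII`);
      });
      return Array.from(data, (b) => String.fromCharCode(b)).join("");
    case "latin1":
      // ISO-8859-1: byte value equals code point.
      return Array.from(data, (b) => String.fromCharCode(b)).join("");
    case "utf-8":
    case "utf-16le":
    case "utf-16be":
      return new TextDecoder(enc).decode(data);
    default:
      throw new Error(`unsupported encoding: ${enc}`);
  }
}

console.log(decodeBytes([72, 101, 108, 108, 111], "utf-8")); // "Hello"
```

Note that `TextDecoder` strips a leading byte order mark by default, so `decodeBytes([0xEF, 0xBB, 0xBF, 0x48], "utf-8")` yields `"H"`.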
Input parsing converts a text token t in base r to its integer byte value:
$$b = \operatorname{parseInt}(t, r), \quad 0 \le b \le 255$$
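A minimal sketch of this parsing step follows. The function name `parseByteTokens`, the choice of whitespace or commas as separators, and the error types are assumptions made for illustration. Each token is checked against the digits allowed in its base first, because `parseInt` on its own silently ignores trailing invalid characters.

```javascript
// Digits permitted in each supported base.
const DIGITS = {
  2: /^[01]+$/,
  8: /^[0-7]+$/,
  10: /^[0-9]+$/,
  16: /^[0-9a-fA-F]+$/,
};

// Hypothetical parser: splits on whitespace or commas (an assumption),
// validates each token, and returns an array of byte values.
function parseByteTokens(input, radix) {
  const pattern = DIGITS[radix];
  if (!pattern) throw new Error(`unsupported radix: ${radix}`);

  return input
    .trim()
    .split(/[\s,]+/)
    .filter(Boolean) // drop the empty token produced by empty input
    .map((token) => {
      if (!pattern.test(token)) {
        throw new SyntaxError(`"${token}" is not a base-${radix} number`);
      }
      const value = parseInt(token, radix);
      if (value > 255) {
        throw new RangeError(`"${token}" exceeds the byte range 0 - 255`);
      }
      return value;
    });
}

console.log(parseByteTokens("48 65 6C 6C 6F", 16)); // [72, 101, 108, 108, 111]
```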
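For UTF-8, a code point U determines how many bytes encode it:
$$
\mathrm{bytes}(U) =
\begin{cases}
1 & \text{if } \text{U+0000} \le U \le \text{U+007F} \\
2 & \text{if } \text{U+0080} \le U \le \text{U+07FF} \\
3 & \text{if } \text{U+0800} \le U \le \text{U+FFFF} \\
4 & \text{if } \text{U+10000} \le U \le \text{U+10FFFF}
\end{cases}
$$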
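The byte-count rule translates into a short range check. In this sketch, `utf8ByteLength` and `utf8Length` are hypothetical names; the second function sums the per-code-point counts so the result can be compared with what `TextEncoder` actually produces.

```javascript
// Number of UTF-8 bytes needed for one Unicode code point.
function utf8ByteLength(codePoint) {
  if (!Number.isInteger(codePoint) || codePoint < 0 || codePoint > 0x10ffff) {
    throw new RangeError(`not a Unicode code point: ${codePoint}`);
  }
  if (codePoint <= 0x7f) return 1;
  if (codePoint <= 0x7ff) return 2;
  if (codePoint <= 0xffff) return 3;
  return 4;
}

// Total UTF-8 length of a string, iterating by code point (not UTF-16 unit).
function utf8Length(text) {
  let total = 0;
  for (const ch of text) {
    total += utf8ByteLength(ch.codePointAt(0));
  }
  return total;
}

const sample = "A\u00E9\u20AC\u{1F600}"; // 1 + 2 + 3 + 4 bytes
console.log(utf8Length(sample), new TextEncoder().encode(sample).length); // 10 10
```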
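For the reverse operation (String → Bytes), the TextEncoder API produces UTF-8 byte sequences. For single-byte encodings, each character of a string s maps directly to one byte, b = s.charCodeAt(i), provided its code point lies within the encoding's range (0 - 127 for ASCII, 0 - 255 for Latin-1).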
Where b = byte value, t = input token string, r = radix (number base), U = Unicode code point, n = total byte count, S = decoded output string, enc = encoding label.
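The reverse direction can be sketched the same way. Both `stringToBytes` and `formatBytes` are hypothetical helpers, and the zero-padding widths (8 binary digits, 3 octal, 2 hex) are an assumption about presentation rather than something the formulas require.

```javascript
// Hypothetical encoder: UTF-8 via TextEncoder, single-byte encodings via charCodeAt.
function stringToBytes(text, enc) {
  if (enc === "utf-8") return Array.from(new TextEncoder().encode(text));

  const limit = enc === "ascii" ? 127 : enc === "latin1" ? 255 : -1;
  if (limit < 0) throw new Error(`unsupported encoding: ${enc}`);

  const out = [];
  for (let i = 0; i < text.length; i++) {
    const b = text.charCodeAt(i);
    if (b > limit) {
      throw new RangeError(`character at index ${i} does not fit in ${enc}`);
    }
    out.push(b);
  }
  return out;
}

// Render byte values in the chosen base, zero-padded to a fixed width.
function formatBytes(bytes, radix) {
  const width = { 2: 8, 8: 3, 10: 1, 16: 2 }[radix];
  if (!width) throw new Error(`unsupported radix: ${radix}`);
  return bytes
    .map((b) => b.toString(radix).toUpperCase().padStart(width, "0"))
    .join(" ");
}

console.log(formatBytes(stringToBytes("Hello", "utf-8"), 16)); // "48 65 6C 6C 6F"
```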
Reference Data
| Encoding | Byte Range | Max Bytes/Char | BOM | Standard | Typical Use |
|---|---|---|---|---|---|
| ASCII | 0 - 127 | 1 | None | ANSI X3.4-1968 | Legacy protocols, RFC headers |
| ISO-8859-1 (Latin-1) | 0 - 255 | 1 | None | ISO/IEC 8859-1:1998 | Western European text, historical HTTP default |
| UTF-8 | 0 - 255 | 4 | EF BB BF (optional) | RFC 3629 | Web (93%+ of pages), JSON, XML |
| UTF-16 LE | 0 - 255 | 4 | FF FE | RFC 2781 | Windows internals, Java strings |
| UTF-16 BE | 0 - 255 | 4 | FE FF | RFC 2781 | Network byte order, older Mac OS |
| Windows-1252 | 0 - 255 | 1 | None | Microsoft | Legacy Windows apps, emails |

Common Byte Representations

| Format | Base | Example |
|---|---|---|
| Decimal | Base 10 | 72 101 108 108 111 → "Hello" |
| Hexadecimal | Base 16 | 48 65 6C 6C 6F → "Hello" |
| Octal | Base 8 | 110 145 154 154 157 → "Hello" |
| Binary | Base 2 | 01001000 01100101 → "He" |

UTF-8 Multibyte Structure

| Byte Pattern | Length | Code Point Range |
|---|---|---|
| 0xxxxxxx | 1 byte | U+0000 - U+007F (ASCII compatible) |
| 110xxxxx 10xxxxxx | 2 bytes | U+0080 - U+07FF (Latin, Greek, Cyrillic) |
| 1110xxxx 10xxxxxx 10xxxxxx | 3 bytes | U+0800 - U+FFFF (CJK, most BMP) |
| 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | 4 bytes | U+10000 - U+10FFFF (Emoji, rare scripts) |

Printable ASCII Quick Reference

| Byte Values | Characters | Byte Values | Characters |
|---|---|---|---|
| 32 (20h) | Space | 48 - 57 | Digits 0-9 |
| 65 - 90 | A - Z uppercase | 97 - 122 | a - z lowercase |
| 33 - 47 | Punctuation set 1 | 123 - 126 | { \| } ~ |
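
The UTF-8 multibyte bit patterns in the reference data can be applied by hand with shifts and masks. The function below is a sketch for illustration: `encodeCodePoint` is a hypothetical name, and in practice `TextEncoder` does this work. Surrogate code points (U+D800 - U+DFFF) are rejected because they have no valid UTF-8 encoding.

```javascript
// Encode one Unicode code point into its UTF-8 bytes using the bit patterns:
// 0xxxxxxx / 110xxxxx 10xxxxxx / 1110xxxx 10xxxxxx 10xxxxxx / 11110xxx 10xxxxxx x3
function encodeCodePoint(cp) {
  const isSurrogate = cp >= 0xd800 && cp <= 0xdfff;
  if (!Number.isInteger(cp) || cp < 0 || cp > 0x10ffff || isSurrogate) {
    throw new RangeError(`cannot encode code point: ${cp}`);
  }
  if (cp <= 0x7f) {
    return [cp];
  }
  if (cp <= 0x7ff) {
    return [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)];
  }
  if (cp <= 0xffff) {
    return [0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f)];
  }
  return [
    0xf0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3f),
    0x80 | ((cp >> 6) & 0x3f),
    0x80 | (cp & 0x3f),
  ];
}

// U+20AC (euro sign) falls in the 3-byte range.
console.log(encodeCodePoint(0x20ac).map((b) => b.toString(16))); // ["e2", "82", "ac"]
```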