About

Base64 encoding maps arbitrary binary data onto a 64-character ASCII subset (A - Z, a - z, 0 - 9, +, /) plus = padding. Every 3 input bytes produce 4 output characters, inflating size by approximately 33%. The browser-native btoa function only handles strings whose code points all fall in the Latin-1 range (0 - 255). Passing it any character above U+00FF (emoji, CJK, Cyrillic) throws an InvalidCharacterError DOMException. This tool first converts the input to UTF-8 bytes via encodeURIComponent and percent-escape expansion, so full Unicode survives the round-trip.

Incorrect Base64 handling can silently corrupt payloads in JWTs, data URIs, SMTP attachments, and API request bodies. A single missing = pad or an illegal character like # is enough to break a strict parser downstream. This tool validates the input character set and padding length before decoding, reports the exact error position, and shows byte-level statistics so you can verify data integrity before pasting the result into production code.

Formulas

Base64 encodes every group of 3 input bytes (24 bits) into 4 output characters (6 bits each). The output length formula:

L_out = 4 ⋅ ceil(L_in / 3)

where L_in = input byte length and L_out = output character count including padding.

The relative size overhead approaches one third as the input grows (it equals exactly 1/3 whenever L_in is a multiple of 3):

overhead = (L_out − L_in) / L_in ≈ 0.333
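
As a quick check of the two formulas, here is a minimal sketch. The helper names base64Length and base64Overhead are made up for illustration and are not part of this tool.

```javascript
// Hypothetical helpers that mirror the formulas above.
function base64Length(inputBytes) {
  // 4 output characters per (possibly partial) 3-byte group.
  return 4 * Math.ceil(inputBytes / 3);
}

function base64Overhead(inputBytes) {
  // Relative growth (L_out - L_in) / L_in; not meaningful for empty input.
  return (base64Length(inputBytes) - inputBytes) / inputBytes;
}

console.log(base64Length(1), base64Overhead(1)); // 4 3  (300%)
console.log(base64Length(2), base64Overhead(2)); // 4 1  (100%)
console.log(base64Length(57));                   // 76   (one MIME line)
```

The overhead starts at 300% for a single byte and is exactly one third whenever the input length is a multiple of 3.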

The UTF-8 shim used in this tool performs the following transformation for encoding:

encoded = btoa(unescape(encodeURIComponent(input)))

And for decoding:

decoded = decodeURIComponent(escape(atob(input)))
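
escape and unescape are deprecated, although browsers still ship them. One possible alternative, assuming TextEncoder, TextDecoder, btoa and atob are available (current browsers and recent Node.js), is sketched below. The function names utf8ToBase64 and base64ToUtf8 are hypothetical, and this is not necessarily how this tool is implemented.

```javascript
// Hypothetical alternative to the escape/unescape shim.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text); // string -> UTF-8 bytes
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);        // one Latin-1 character per byte
  }
  return btoa(binary);
}

function base64ToUtf8(b64) {
  const binary = atob(b64);                     // one character per decoded byte
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);       // UTF-8 bytes -> string
}

console.log(utf8ToBase64("€"));    // "4oKs"
console.log(base64ToUtf8("4oKs")); // "€"
```

For well-formed input both approaches should produce the same Base64 text. The loop builds the intermediate string one byte at a time, which is simple but not tuned for very large inputs.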

Padding rules: when L_in mod 3 = 1, two = pads are appended. When L_in mod 3 = 2, one = pad is appended. When divisible by 3, no padding is needed.
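
The padding rule reduces to a single expression. The sketch below uses a hypothetical helper, paddingCount, and compares it with what btoa emits for ASCII input of length 1, 2 and 3.

```javascript
// Hypothetical helper: number of "=" characters for a given input byte length.
function paddingCount(inputBytes) {
  // Remainders 0, 1, 2 map to 0, 2, 1 padding characters.
  return (3 - (inputBytes % 3)) % 3;
}

console.log(paddingCount(1), btoa("a"));   // 2 "YQ=="
console.log(paddingCount(2), btoa("ab"));  // 1 "YWI="
console.log(paddingCount(3), btoa("abc")); // 0 "YWJj"
```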

Reference Data

Character Range	Index	Binary Pattern	Count	Notes
A - Z	0 - 25	000000 - 011001	26	Uppercase Latin
a - z	26 - 51	011010 - 110011	26	Lowercase Latin
0 - 9	52 - 61	110100 - 111101	10	Decimal digits
+	62	111110	1	Standard alphabet
/	63	111111	1	Standard alphabet
=	-	-	1	Padding character
-	62	111110	1	URL-safe replaces +
_	63	111111	1	URL-safe replaces /
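
To illustrate how the index column is used, the sketch below encodes one full 3-byte group by hand with the standard alphabet. encodeTriplet is a hypothetical helper written for this example; it does not handle partial groups or padding.

```javascript
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: three bytes (0 - 255 each) in, four characters out.
function encodeTriplet(b0, b1, b2) {
  const group = (b0 << 16) | (b1 << 8) | b2; // pack into one 24-bit value
  return (
    ALPHABET[(group >> 18) & 63] +           // top 6 bits
    ALPHABET[(group >> 12) & 63] +
    ALPHABET[(group >> 6) & 63] +
    ALPHABET[group & 63]                     // bottom 6 bits
  );
}

console.log(encodeTriplet(77, 97, 110)); // "TWFu" (the bytes of "Man")
```

Swapping the last two alphabet characters for - and _ would give the URL-safe variant.
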

Size Ratios

Input bytes	Output chars	Overhead
1	4	300%
2	4	100%
3	4	33.3%
57	76	33.3% (MIME line)
768	1024	33.3%

Common MIME Contexts

Email (SMTP)	RFC 2045 - line length ≤ 76 chars, CRLF line endings
Data URI	data:[mime];base64,... - no line breaks
JWT	Base64url (no padding, - and _)
PEM Certificate	64-char lines between BEGIN/END markers
HTTP Basic Auth	Authorization: Basic + Base64(user:pass)
XML (MTOM)	xsd:base64Binary - whitespace permitted
S/MIME	RFC 8551 - signed or encrypted payloads
Git	Not Base64 - pack files are compressed binary, and binary patches use a Base85 encoding
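
The line-length limits in the table above (76 characters for MIME, 64 for PEM) can be applied after encoding. Below is a minimal sketch with a hypothetical helper, wrapBase64, that splits an already encoded string into CRLF-separated lines. It is an illustration only and says nothing about how this tool formats its output.

```javascript
// Hypothetical helper: wrap a Base64 string at a fixed line length.
function wrapBase64(b64, lineLength = 76) {
  const lines = [];
  for (let i = 0; i < b64.length; i += lineLength) {
    lines.push(b64.slice(i, i + lineLength));
  }
  return lines.join("\r\n"); // CRLF between lines, none after the last
}

const wrapped = wrapBase64(btoa("x".repeat(114))); // 114 bytes -> 152 characters
console.log(wrapped.split("\r\n").map((line) => line.length)); // [ 76, 76 ]
```

Passing 64 as the second argument gives PEM-style lines. A data URI would skip this step, since it must not contain line breaks.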

Frequently Asked Questions

The native btoa function only accepts strings where every character's code point falls in the range 0 - 255 (Latin-1). Emoji and Chinese, Arabic, or Cyrillic characters have code points above U+00FF, so btoa rejects them. This tool pre-processes the input through encodeURIComponent, which percent-encodes each character's UTF-8 bytes, and then through unescape, which turns each percent escape into a single character in the 0 - 255 range. The result is safe to pass to btoa, giving a lossless round-trip for any well-formed Unicode string.

Standard Base64 (RFC 4648 §4) uses + (index 62) and / (index 63) with = padding. Base64url (RFC 4648 §5) replaces these with - and _ respectively and typically omits padding. JWTs require Base64url, and it is the safer choice for URL query parameters and filenames, because +, /, and = are reserved or problematic there. This tool supports both variants via a toggle.
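
Converting between the two variants is a character swap plus padding handling. The sketch below uses hypothetical helper names, toBase64Url and fromBase64Url, and assumes the input is otherwise valid. It is not the code behind this tool's toggle.

```javascript
// Hypothetical helpers for switching between the two alphabets.
function toBase64Url(b64) {
  // Swap the two special characters and drop trailing padding.
  return b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

function fromBase64Url(b64url) {
  const b64 = b64url.replace(/-/g, "+").replace(/_/g, "/");
  // Restore padding so the length is a multiple of 4 again.
  return b64 + "=".repeat((4 - (b64.length % 4)) % 4);
}

const standard = btoa(String.fromCharCode(251, 255)); // bytes FB FF
console.log(standard);                                // "+/8="
console.log(toBase64Url(standard));                   // "-_8"
console.log(fromBase64Url("-_8"));                    // "+/8="
```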

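A canonical, padded Base64 string has a length divisible by 4. If it does not, the trailing group is ambiguous - the decoder cannot determine whether the missing characters represent data or padding - so strict decoders reject the input. The browser's atob is more forgiving: it accepts input whose padding is omitted entirely, but throws an InvalidCharacterError for a partially padded string or when the length leaves a remainder of 1 modulo 4. A string with 1 character modulo 4 is always invalid, because a single character carries only 6 bits and cannot form a byte. This tool checks padding correctness before attempting the decode and reports the specific violation.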
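
A strict pre-decode check along these lines might look like the sketch below. validateBase64 and the shape of its result are assumptions made for illustration; the tool's actual validator is not shown in this document. The sketch covers the standard alphabet only and does not check for non-zero leftover bits in the final group.

```javascript
// Hypothetical strict validator: returns { ok: true } or the first problem found.
function validateBase64(input) {
  const padStart = input.indexOf("=");
  const body = padStart === -1 ? input : input.slice(0, padStart);
  const pad = padStart === -1 ? "" : input.slice(padStart);

  const illegal = body.search(/[^A-Za-z0-9+\/]/);
  if (illegal !== -1) {
    return { ok: false, position: illegal, reason: "illegal character" };
  }
  const afterPad = pad.search(/[^=]/);
  if (afterPad !== -1) {
    return { ok: false, position: padStart + afterPad, reason: "data after padding" };
  }
  if (body.length % 4 === 1) {
    // A lone trailing character holds only 6 bits.
    return { ok: false, position: body.length - 1, reason: "dangling character" };
  }
  const expectedPad = (4 - (body.length % 4)) % 4;
  if (pad.length !== expectedPad) {
    return {
      ok: false,
      position: body.length,
      reason: `expected ${expectedPad} padding character(s)`,
    };
  }
  return { ok: true };
}

console.log(validateBase64("YQ=="));  // { ok: true }
console.log(validateBase64("Y#Q=").position); // 1
```

Unlike atob, this check rejects unpadded input such as YQ, which is the stricter behaviour described above.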

No. Base64 is a reversible encoding scheme, not encryption. It provides zero confidentiality. Any party with access to the encoded string can decode it instantly. It is commonly mistaken for obfuscation in contexts like HTTP Basic Authentication headers (Authorization: Basic dXNlcjpwYXNz), but the credentials travel in what is effectively cleartext. Use TLS in transit and real encryption (for example AES-256 or RSA) for sensitive data.
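
To see how little protection this offers, the header from the example above can be decoded in one call. decodeBasicAuth is a hypothetical helper name used only for this sketch.

```javascript
// Hypothetical helper: recover "user:password" from a Basic auth header line.
function decodeBasicAuth(headerLine) {
  const token = headerLine.trim().split(/\s+/).pop(); // last whitespace-separated field
  return atob(token);
}

console.log(decodeBasicAuth("Authorization: Basic dXNlcjpwYXNz")); // "user:pass"
```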

This tool runs entirely in the browser. The practical limit is determined by the JavaScript engine's maximum string size, which varies by engine and is roughly 256 MB to 1 GB in modern browsers. However, textarea rendering becomes slow above approximately 5 MB of text. For very large payloads, performance degrades in the UI layer, not the encoding algorithm. The tool displays a warning above 1 MB of input.

This occurs when the decoder assumes a different character encoding than the one used before Base64 encoding. If the text was converted to UTF-8 bytes before it was encoded, plain atob returns one Latin-1 character per byte, so every multi-byte sequence shows up as mojibake (é becomes Ã©). Toggle the UTF-8 mode in this tool to apply the decodeURIComponent(escape(...)) wrapper. If the original data was binary (an image, a protobuf), the output is not meant to be interpreted as text at all.
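
The effect is easy to reproduce. The sketch below decodes the Base64 form of the UTF-8 bytes C3 A9 (the letter é) twice: once with plain atob, and once with the bytes reinterpreted as UTF-8. TextDecoder is used here as an assumption for illustration; the tool's own wrapper is the one named above.

```javascript
const encoded = "w6k=";                         // Base64 of the UTF-8 bytes C3 A9
const asLatin1 = atob(encoded);                 // one character per byte
const bytes = Uint8Array.from(asLatin1, (ch) => ch.charCodeAt(0));
const asUtf8 = new TextDecoder().decode(bytes); // bytes read as UTF-8

console.log(asLatin1); // "Ã©" (mojibake)
console.log(asUtf8);   // "é"
```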