About

Base64 is a binary-to-text encoding scheme defined in RFC 4648. It maps every 3 input bytes to 4 ASCII characters from a 64-character alphabet (A-Z, a-z, 0-9, +, /), with = as padding. A naive atob call in JavaScript returns a binary string in which each character holds one byte (effectively Latin-1), not UTF-8 text. Any input containing multi-byte sequences (accented characters, CJK ideographs, emoji) therefore decodes to mojibake unless the bytes are collected and passed through a TextDecoder configured for UTF-8. This tool performs the correct pipeline: Base64 → binary octets → UTF-8 code points. Getting this wrong silently corrupts data in API payloads, JWT claims, email attachments, and data URIs.

The converter also supports URL-safe Base64 (RFC 4648 §5), which replaces + with - and / with _, and usually omits the trailing = padding. This variant is mandatory in JWTs and is used by many cloud storage APIs. Limitation: this tool operates entirely in-browser and is bounded by available heap memory. Inputs above approximately 5 MB may cause slowdowns on low-end devices.

Formulas

Base64 encoding maps each group of 3 input octets (24 bits) into 4 output characters (6 bits each). The encoded length is computed as:

L_encoded = 4 ⋅ ceil(n / 3)

where n = number of input bytes. The size overhead is 1/3 (33.3%) when n is a multiple of 3 and slightly higher otherwise, because of padding. Padding characters (=) are appended so that L_encoded mod 4 = 0.
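The length formula can be checked with a few lines of JavaScript. This is a minimal sketch; the helper names encodedLength and unpaddedLength are hypothetical and not part of the tool. The unpadded variant is an assumption about how the length behaves when the trailing = characters are dropped.

```javascript
// Hypothetical helper: padded Base64 length for n input bytes.
function encodedLength(n) {
  return 4 * Math.ceil(n / 3);
}

// Hypothetical helper: length when "=" padding is omitted
// (each byte contributes 8 bits, each character carries 6).
function unpaddedLength(n) {
  return Math.ceil((4 * n) / 3);
}

console.log(encodedLength(1));  // 4, e.g. "A" encodes to "QQ=="
console.log(encodedLength(5));  // 8, e.g. "Hello" encodes to "SGVsbG8="
console.log(unpaddedLength(5)); // 7, e.g. "SGVsbG8"
```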

The decoding pipeline implemented in this tool:

base64String → atob() → binaryString → Uint8Array → TextDecoder("utf-8") → utf8String
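The pipeline above might look roughly like this in code. The function name base64ToUtf8 is an assumption for illustration; the tool's actual implementation is not shown in this document. The sketch expects standard-alphabet input.

```javascript
// Hypothetical helper mirroring the pipeline:
// base64String -> atob() -> binaryString -> Uint8Array -> TextDecoder -> utf8String
function base64ToUtf8(base64String) {
  // atob yields a binary string: one character per decoded byte (0-255).
  const binaryString = atob(base64String);
  // Copy each character's code into a byte array.
  const bytes = Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));
  // Interpret the bytes as UTF-8.
  return new TextDecoder("utf-8").decode(bytes);
}

console.log(base64ToUtf8("SGVsbG8=")); // Hello
console.log(base64ToUtf8("5pel5pys")); // 日本
```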

For the URL-safe variant, a pre-processing step restores the standard alphabet and padding before decoding:

- → + , _ → / , pad with = until len mod 4 = 0
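A minimal sketch of that pre-processing step follows. The name normalizeUrlSafe is hypothetical, and rejecting a length of 1 mod 4 is an added assumption (no valid Base64 string has that length), not something the rule above states.

```javascript
// Hypothetical helper: convert URL-safe Base64 to the standard, padded form.
function normalizeUrlSafe(input) {
  // Map the URL-safe characters back to the standard alphabet.
  const standard = input.replace(/-/g, "+").replace(/_/g, "/");
  const remainder = standard.length % 4;
  // Assumption: treat an impossible length as an error instead of padding it.
  if (remainder === 1) {
    throw new Error("Invalid Base64 length");
  }
  // Pad with "=" until the length is a multiple of 4.
  return remainder === 0 ? standard : standard + "=".repeat(4 - remainder);
}

console.log(normalizeUrlSafe("8J-YgA")); // 8J+YgA==
console.log(normalizeUrlSafe("77u_"));   // 77u/
```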

In the encoded-length formula above, L_encoded is the final output length in characters (including padding), n is the raw byte count of the input, and ceil is the ceiling function.

Reference Data

Property	Standard Base64	URL-Safe Base64
RFC	RFC 4648 §4	RFC 4648 §5
Alphabet	A-Z, a-z, 0-9, +, /	A-Z, a-z, 0-9, -, _
Padding	= (mandatory)	Optional / stripped
Size Overhead	33.3%	33.3%
Encoding Ratio	3 bytes → 4 chars	3 bytes → 4 chars
Use Case	Email (MIME), PEM certs, XML	JWT, URLs, filenames
Line Breaks	MIME: every 76 chars	None
Input: "A"	QQ==	QQ
Input: "AB"	QUI=	QUI
Input: "ABC"	QUJD	QUJD
Input: "Hello"	SGVsbG8=	SGVsbG8
Input: "€"	4oKs	4oKs
Input: "日本"	5pel5pys	5pel5pys
Input: "😀"	8J+YgA==	8J-YgA
Empty String	(empty)	(empty)
UTF-8 BOM	77u/	77u_
Max Line (MIME)	76 chars + CRLF	N/A
Padding Chars Needed	0, 1, or 2	None
Invalid Characters	Any outside alphabet + =	Any outside alphabet
Bit Mapping	6 bits per character	6 bits per character
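
The sample rows in the table can be reproduced by encoding in the opposite direction. The sketch below is illustrative only; utf8ToBase64 is a hypothetical helper built on the standard TextEncoder and btoa functions, and encoding is not described as a feature of this tool.

```javascript
// Hypothetical helper: UTF-8 text -> bytes -> binary string -> standard Base64.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes
  let binaryString = "";
  for (const b of bytes) {
    binaryString += String.fromCharCode(b);     // one character per byte
  }
  return btoa(binaryString);
}

console.log(utf8ToBase64("A"));     // QQ==
console.log(utf8ToBase64("Hello")); // SGVsbG8=
console.log(utf8ToBase64("€"));     // 4oKs
```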

Frequently Asked Questions

The native atob() function returns a binary string where each character represents one byte (code point 0-255). For multi-byte UTF-8 sequences (any character above U+007F), the bytes must be collected into a Uint8Array and decoded with TextDecoder('utf-8'). Without this step, a 3-byte character like "€" (bytes 0xE2 0x82 0xAC) is misinterpreted as three separate Latin-1 characters.
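
The difference is easy to observe with the "€" example. This is a small demonstration using only standard browser APIs; the variable names are arbitrary.

```javascript
// "4oKs" is the Base64 form of "€" (UTF-8 bytes 0xE2 0x82 0xAC).
const rawBinaryString = atob("4oKs");

// atob alone produces three characters, one per byte, not one "€".
const byteValues = Array.from(rawBinaryString, (ch) => ch.charCodeAt(0));
console.log(rawBinaryString.length);                  // 3
console.log(byteValues.map((v) => v.toString(16)));   // [ 'e2', '82', 'ac' ]

// Collecting the bytes and decoding them as UTF-8 recovers the character.
const euroText = new TextDecoder("utf-8").decode(Uint8Array.from(byteValues));
console.log(euroText);                                // €
```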

Standard Base64 (RFC 4648 §4) uses "+" and "/" as characters 62 and 63, with "=" padding. URL-safe Base64 (RFC 4648 §5) replaces "+" with "-" and "/" with "_", and typically omits padding. URL-safe Base64 is required in JWTs and is the usual choice for Base64 values embedded in URLs, query strings, and any other context where "+", "/", and "=" have reserved meanings.
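
Converting a standard string to the URL-safe form is a pure character substitution. A minimal sketch, assuming padding should be dropped; toUrlSafe is a hypothetical name.

```javascript
// Hypothetical helper: standard Base64 -> URL-safe Base64 without padding.
function toUrlSafe(standardBase64) {
  return standardBase64
    .replace(/\+/g, "-")   // character 62
    .replace(/\//g, "_")   // character 63
    .replace(/=+$/, "");   // drop trailing padding
}

console.log(toUrlSafe("8J+YgA==")); // 8J-YgA
console.log(toUrlSafe("77u/"));     // 77u_
```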

MIME-encoded Base64 (RFC 2045) wraps its output into lines of at most 76 characters, separated by CRLF. This converter strips all whitespace (spaces, tabs, CR, LF) before decoding, so it accepts both single-line and MIME-wrapped input without errors.
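
The whitespace stripping could be as simple as one regular expression. This is a sketch under that assumption; stripWhitespace is a hypothetical name and the tool's own code may differ.

```javascript
// Hypothetical helper: remove spaces, tabs, CR, and LF before decoding.
function stripWhitespace(input) {
  return input.replace(/[ \t\r\n]+/g, "");
}

// "Hello" split across two lines, as MIME wrapping might produce.
const wrappedInput = "SGVs\r\nbG8=";
console.log(stripWhitespace(wrappedInput));       // SGVsbG8=
console.log(atob(stripWhitespace(wrappedInput))); // Hello
```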

The converter validates input against the Base64 alphabet before attempting to decode. Characters outside A-Z, a-z, 0-9, +, /, = (or -, _ in URL-safe mode) trigger an explicit error message identifying the invalid characters. This replaces the generic "InvalidCharacterError" that atob() throws natively, which does not say which character was at fault.
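
One way to build such a check is to scan the input and collect every character that falls outside the allowed set. The sketch below is an illustration, not the tool's code: findInvalidChars and its urlSafe flag are hypothetical, and allowing "=" in URL-safe mode is an assumption.

```javascript
// Hypothetical helper: return the distinct characters that are not valid
// for the selected Base64 variant.
function findInvalidChars(input, urlSafe = false) {
  // Assumption: "=" is tolerated in both modes.
  const allowed = urlSafe ? /[A-Za-z0-9\-_=]/ : /[A-Za-z0-9+\/=]/;
  const invalid = new Set();
  for (const ch of input) {
    if (!allowed.test(ch)) {
      invalid.add(ch);
    }
  }
  return [...invalid];
}

console.log(findInvalidChars("SGVs*bG8!"));    // [ '*', '!' ]
console.log(findInvalidChars("8J-YgA"));       // [ '-' ]
console.log(findInvalidChars("8J-YgA", true)); // []
```

An empty result means the input can be handed to the decoder; a non-empty one can be turned into the error message.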

When the Base64 input represents binary data rather than UTF-8 text, a TextDecoder in fatal mode reports a decoding error, because arbitrary binary bytes are rarely valid UTF-8 sequences. For binary payloads, use a dedicated Base64-to-file converter that writes raw bytes to a Blob download instead of interpreting them as text.
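
Fatal mode is enabled through the TextDecoder options object. The following sketch shows how a decoder might distinguish text from binary input; decodeUtf8Strict and its result shape are hypothetical.

```javascript
// Hypothetical helper: decode bytes as UTF-8, reporting failure instead of
// silently substituting U+FFFD replacement characters.
function decodeUtf8Strict(bytes) {
  try {
    const decoder = new TextDecoder("utf-8", { fatal: true });
    return { ok: true, text: decoder.decode(bytes) };
  } catch (err) {
    // In fatal mode, invalid byte sequences raise an exception.
    return { ok: false, error: err };
  }
}

// Valid UTF-8: the three bytes of "€".
console.log(decodeUtf8Strict(Uint8Array.of(0xe2, 0x82, 0xac)).ok); // true
// Not valid UTF-8: the first bytes of a JPEG file.
console.log(decodeUtf8Strict(Uint8Array.of(0xff, 0xd8, 0xff)).ok); // false
```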

The practical limit is browser heap memory. The maximum JavaScript string length is engine-dependent (on the order of hundreds of megabytes in modern browsers), but performance degrades above roughly 5 MB of Base64 input (which decodes to about 3.75 MB of binary data). For inputs exceeding this, consider a streaming or chunked approach outside the browser.
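
As a rough illustration of the chunked idea, the input can be decoded in slices whose length is a multiple of 4, with TextDecoder's stream option carrying incomplete multi-byte sequences across slice boundaries. This is a sketch, not the tool's behavior: decodeInChunks and chunkChars are hypothetical, the input is assumed to be padded, whitespace-free standard Base64, and the result is still accumulated in a single string, so only the intermediate buffers stay small.

```javascript
// Hypothetical helper: decode Base64 to UTF-8 text one slice at a time.
function decodeInChunks(base64String, chunkChars = 4096) {
  // Each group of 4 characters decodes independently, so slices must
  // start and end on a 4-character boundary.
  if (chunkChars <= 0 || chunkChars % 4 !== 0) {
    throw new Error("chunkChars must be a positive multiple of 4");
  }
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (let i = 0; i < base64String.length; i += chunkChars) {
    const binaryString = atob(base64String.slice(i, i + chunkChars));
    const bytes = Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));
    // stream: true keeps a partial multi-byte sequence for the next call.
    text += decoder.decode(bytes, { stream: true });
  }
  // Final call without arguments flushes any buffered bytes.
  return text + decoder.decode();
}

// A 4-character slice splits the 4-byte emoji across two chunks.
console.log(decodeInChunks("8J+YgA==", 4)); // 😀
```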