About

Base32 encoding transforms arbitrary binary data into a restricted alphabet of 32 ASCII characters (A - Z and 2 - 7) plus the padding character =, as specified in RFC 4648 §6. Each group of 5 input bytes (40 bits) maps to 8 output characters, so the output is 160% of the input size (a 60% overhead) before any final-block padding. The encoding is case-insensitive by specification, which makes it suitable for environments that are not binary-safe or that fold letter case: DNS labels, file systems, TOTP secret keys (RFC 6238), and onion addresses all rely on Base32 for this reason. An incorrect implementation causes silent data corruption, because a single misplaced bit shifts all subsequent output. This tool performs byte-accurate RFC 4648 encoding with proper = padding and full UTF-8 support via the TextEncoder API.
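
As a quick illustration of these properties, the sketch below uses Python's standard base64 module (chosen only for this example; the tool itself runs in the browser) to encode a short value and decode it again from a lowercased copy.

```python
import base64

data = b"foobar"  # 6 bytes: one full 5-byte block plus 1 extra byte
encoded = base64.b32encode(data).decode("ascii")

# casefold=True tells the decoder to accept lowercase letters as well.
decoded = base64.b32decode(encoded.lower(), casefold=True)

print(encoded)          # MZXW6YTBOI======
print(len(encoded))     # 16
print(decoded == data)  # True
```

The 6 input bytes span two 5-byte blocks, so the output is 16 characters, the last 6 of which are padding.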

Formulas

Base32 encoding operates on 5-byte input blocks. Each block of 40 bits is partitioned into 8 groups of 5 bits. Each 5-bit group indexes into the alphabet A - Z, 2 - 7.

output length = 8 × ceil(n / 5)

where n = number of input bytes. The padding character count depends on the remainder r = n mod 5:

r = 0: 0 pad chars
r = 1: 6 pad chars
r = 2: 4 pad chars
r = 3: 3 pad chars
r = 4: 1 pad char

For decoding, each Base32 character maps back to its 5-bit value. The bits are concatenated, then split into 8-bit bytes. Trailing bits that do not form a complete byte are discarded. The per-character index function is:

index(c) = c − 65 if A ≤ c ≤ Z
index(c) = c − 24 if 2 ≤ c ≤ 7

where c is the ASCII code point of the (uppercase) character, so A (65) maps to 0 and 2 (50) maps to 26. Characters outside this set (except =) indicate malformed input.
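
The decoding steps can be sketched in Python as below. The function names b32_index and b32_decode are hypothetical, and the sketch assumes uppercase input with any padding at the end; it does not check that the padding count is valid.

```python
def b32_index(ch: str) -> int:
    """Map one Base32 character to its 5-bit value."""
    code = ord(ch)
    if 65 <= code <= 90:   # 'A'..'Z' -> 0..25
        return code - 65
    if 50 <= code <= 55:   # '2'..'7' -> 26..31
        return code - 24
    raise ValueError(f"invalid Base32 character {ch!r}")

def b32_decode(text: str) -> bytes:
    """Decode Base32 text; trailing '=' padding is ignored."""
    acc = 0     # bit accumulator
    nbits = 0   # number of bits currently held in acc
    out = bytearray()
    for ch in text.rstrip("="):
        acc = (acc << 5) | b32_index(ch)
        nbits += 5
        if nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1  # keep only the bits not yet emitted
    # Any bits left in acc do not form a complete byte and are discarded.
    return bytes(out)

print(b32_decode("MZXW6YTBOI======"))  # b'foobar'
```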

Reference Data

Encoding	Alphabet Size	Characters Used	Bits per Char	Expansion Ratio	Case Sensitive	Padding	RFC	Common Use
Base16 (Hex)	16	0 - 9, A - F	4	200%	No	None	RFC 4648 §8	MAC addresses, hashes
Base32	32	A - Z, 2 - 7	5	160%	No	=	RFC 4648 §6	TOTP, onion addresses
Base32hex	32	0 - 9, A - V	5	160%	No	=	RFC 4648 §7	NSEC3 (DNSSEC)
Base64	64	A - Z, a - z, 0 - 9, +/	6	133%	Yes	=	RFC 4648 §4	Email (MIME), data URIs
Base64url	64	A - Z, a - z, 0 - 9, -_	6	133%	Yes	Optional	RFC 4648 §5	JWT, URL parameters
Ascii85	85	! - u	6.4	125%	Yes	Special	-	PDF, PostScript
z-base-32	32	Human-friendly set	5	160%	No	None	-	Mnet, Tahoe-LAFS
Crockford Base32	32	0 - 9, A - Z (excl. I,L,O,U)	5	160%	No	None	-	ULID identifiers
Base58	58	Alphanumeric (excl. 0,O,I,l)	5.86	137%	Yes	None	-	Bitcoin addresses
UUencode	64	ASCII 32 - 95	6	133%	Yes	Length prefix	-	Legacy Unix mail
BinHex	64	Custom set	6	133%	Yes	Special	-	Classic Mac OS
Base2 (Binary)	2	0, 1	1	800%	N/A	None	-	Debugging, education
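
The Bits per Char and Expansion Ratio columns follow from the alphabet size. The sketch below recomputes them as log2(size) and 8 ÷ bits; the function names are hypothetical, and for alphabets that are not a power of two (Base58, Ascii85) the result is a theoretical figure that real encoders only approximate.

```python
import math

def bits_per_char(alphabet_size: int) -> float:
    """Information carried by one output character."""
    return math.log2(alphabet_size)

def expansion_percent(alphabet_size: int) -> float:
    """Output size as a percentage of input size, ignoring padding."""
    return 100 * 8 / bits_per_char(alphabet_size)

for name, size in [("Base16", 16), ("Base32", 32), ("Base58", 58),
                   ("Base64", 64), ("Ascii85", 85)]:
    print(f"{name:8} {bits_per_char(size):5.2f} bits/char "
          f"{expansion_percent(size):6.1f}%")
```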

Frequently Asked Questions

The digits 0 and 1 are excluded because they are visually ambiguous with the letters O and I/l. RFC 4648 selects 2 - 7 to minimize transcription errors when humans read or type encoded values. This is critical for TOTP secret keys where a single wrong character generates invalid one-time passwords.

Base32 operates on raw bytes, not characters. A UTF-8 emoji like 😀 occupies 4 bytes. The TextEncoder API converts the string to its UTF-8 byte representation first, then Base32 encodes those bytes normally. Decoding reverses the process: Base32 to bytes, then TextDecoder reconstructs the original string. No data is lost.
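
The same round trip can be sketched in Python, where str.encode and bytes.decode stand in for the browser's TextEncoder and TextDecoder (an assumption made for this example only):

```python
import base64

text = "😀"
raw = text.encode("utf-8")  # 4 bytes: f0 9f 98 80
encoded = base64.b32encode(raw).decode("ascii")
restored = base64.b32decode(encoded).decode("utf-8")

print(len(raw))          # 4
print(len(encoded))      # 8
print(restored == text)  # True
```

Four input bytes produce 7 data characters plus 1 padding character, matching the r = 4 case in the padding rule.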

Standard Base32 (RFC 4648 §6) uses A - Z, 2 - 7. Base32hex (RFC 4648 §7) uses 0 - 9, A - V. The hex variant preserves sort order of the original binary data, which is why DNSSEC NSEC3 records use it. Standard Base32 does not preserve sort order.
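
A small Python sketch of the sort-order difference, assuming equal-length inputs (as with fixed-length hashes). The helper names b32 and b32hex are hypothetical; b32hex simply re-maps the standard alphabet to the hex alphabet, since both variants group bits identically.

```python
import base64

STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
TO_HEX = str.maketrans(STD, HEX)

def b32(data: bytes) -> str:
    return base64.b32encode(data).decode("ascii")

def b32hex(data: bytes) -> str:
    # Same bit grouping as standard Base32; only the alphabet differs.
    return b32(data).translate(TO_HEX)

values = sorted([b"\x00\x10", b"\x7f\xff", b"\x80\x00", b"\xff\x01"])

hex_sorted = sorted(values, key=b32hex)
std_sorted = sorted(values, key=b32)

print(hex_sorted == values)  # True
print(std_sorted == values)  # False
```

In the standard alphabet the digits 2 - 7 represent the highest values (26 to 31) but sort before the letters in ASCII, which is what breaks the ordering.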

RFC 4648 §3.2 states that padding is required unless the specification referencing Base32 explicitly says otherwise. Padding keeps every encoded value a multiple of 8 characters long (a final block of 1 byte carries 6 padding chars, a final block of 4 bytes carries 1), which makes truncated input easier to detect and lets concatenated values be separated. In practice, TOTP (RFC 6238) implementations commonly strip padding because the secret length is known. This tool includes padding by default with an option to remove it.
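
Stripping and restoring padding can be sketched in Python as follows. The helper names are hypothetical; the restore step relies only on the fact that a padded value is always a multiple of 8 characters long.

```python
import base64

def strip_padding(encoded: str) -> str:
    return encoded.rstrip("=")

def restore_padding(unpadded: str) -> str:
    # Pad up to the next multiple of 8 characters.
    return unpadded + "=" * (-len(unpadded) % 8)

padded = base64.b32encode(b"foo").decode("ascii")
bare = strip_padding(padded)

print(padded)                                    # MZXW6===
print(bare)                                      # MZXW6
print(base64.b32decode(restore_padding(bare)))   # b'foo'
```

Valid unpadded lengths leave a remainder of 0, 2, 4, 5, or 7 when divided by 8; any other remainder cannot come from a complete encoding.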

Base32 encodes 5 bits per character while Base64 encodes 6 bits per character. For n input bytes, Base32 produces ceil(8n ÷ 5) characters before padding versus ceil(4n ÷ 3) for Base64, so Base32 output is about 20% longer. The tradeoff is that Base32 avoids case sensitivity and problematic characters (+, /), making it safer for case-insensitive systems.
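
The length comparison can be worked through with a short Python sketch. The function names are hypothetical; integer arithmetic is used in place of ceil so that large n stays exact.

```python
def base32_len(n: int, padded: bool = True) -> int:
    """Base32 output length for n input bytes."""
    return 8 * ((n + 4) // 5) if padded else (8 * n + 4) // 5

def base64_len(n: int, padded: bool = True) -> int:
    """Base64 output length for n input bytes."""
    return 4 * ((n + 2) // 3) if padded else (4 * n + 2) // 3

for n in (1, 10, 20):
    print(n,
          base32_len(n), base32_len(n, padded=False),
          base64_len(n), base64_len(n, padded=False))
```

For a 20-byte value such as a 160-bit secret, this gives 32 Base32 characters against 28 padded (27 unpadded) Base64 characters.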

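Characters outside the Base32 alphabet (A - Z, 2 - 7, =) cause the decoder to reject the input, as RFC 4648 §3.3 requires by default. This tool performs strict validation and reports the exact position and character that is invalid. Whitespace is silently stripped before validation; RFC 4648 §3.3 allows non-alphabet characters to be ignored only where the referencing specification permits it, so this is a deliberate leniency for pasted, line-wrapped input.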
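
A validation pass of this kind might look like the Python sketch below. It is an illustration under stated assumptions, not the tool's actual code: validate_base32 is a hypothetical name, positions are 0-based offsets into the original input (whitespace included), and lowercase letters are folded to uppercase.

```python
ALPHABET = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ234567")

def validate_base32(text: str) -> str:
    """Return the cleaned, uppercased input or raise ValueError."""
    cleaned = []
    seen_pad = False
    for pos, ch in enumerate(text):
        if ch.isspace():
            continue  # whitespace is dropped, not reported
        upper = ch.upper()
        if upper == "=":
            seen_pad = True
        elif seen_pad or upper not in ALPHABET:
            # Either an unknown character or data after padding has begun.
            raise ValueError(f"invalid character {ch!r} at position {pos}")
        cleaned.append(upper)
    result = "".join(cleaned)
    pad = len(result) - len(result.rstrip("="))
    if len(result) % 8 != 0 or pad not in (0, 1, 3, 4, 6):
        raise ValueError(f"invalid length {len(result)} or padding count {pad}")
    return result

print(validate_base32("MZXW 6YTB\nOI======"))  # MZXW6YTBOI======
```

Rejecting data characters that follow a padding character catches inputs where = appears in the middle of a value.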