About

Base62 encoding maps arbitrary binary data onto a 62-character alphabet: 0-9, A-Z, a-z. Unlike Base64, it avoids +, /, and = padding, producing URL-safe, filename-safe output without percent-encoding overhead. This matters in systems where non-alphanumeric characters cause parsing failures: REST path segments, short-link services, database keys with collation constraints, and distributed trace IDs. An incorrect encoding choice can silently corrupt data when it passes through middleware that strips or escapes special characters.

This tool converts UTF-8 strings to their Base62 representation by treating the byte sequence as a big-endian unsigned integer and repeatedly dividing by 62, collecting the remainders as digits. The result is deterministic and reversible. Limitation: because the algorithm operates on arbitrary-precision integers, large inputs incur noticeable latency, and the tool caps input at 100 KB. For bulk binary payloads, a chunked approach or the URL-safe Base64 variant (Base64url) may be more practical.


Formulas

Base62 encoding treats the input byte array as a single big-endian unsigned integer N and converts it to base 62 using the alphabet A = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".

$$N = \sum_{i=0}^{n-1} b_i \cdot 256^{\,n-1-i}$$

where $b_i$ is the i-th byte of the UTF-8 encoded input and $n$ is the total byte count. The encoded digits $d_k$ are extracted by repeated division:

$$d_k = N \bmod 62, \qquad N \leftarrow \left\lfloor \frac{N}{62} \right\rfloor$$

The output string is the sequence of $A[d_k]$ characters, reversed (most-significant digit first). Decoding reverses the process: each character maps to its index $d_k$, accumulated as $N \leftarrow N \cdot 62 + d_k$, then $N$ is converted back to bytes.
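The encode and decode steps described above can be sketched in Python, whose built-in integers are arbitrary precision. This is a minimal illustration, not the tool's actual implementation: the function names `base62_encode` and `base62_decode` are hypothetical, and the sketch deliberately omits the tool's length header, so leading zero bytes are not preserved here.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(data: bytes) -> str:
    """Encode bytes by treating them as one big-endian unsigned integer."""
    n = int.from_bytes(data, "big")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, d = divmod(n, 62)          # d is the next least-significant digit
        digits.append(ALPHABET[d])
    return "".join(reversed(digits))  # most-significant digit first


def base62_decode(text: str) -> bytes:
    """Accumulate N = N * 62 + d for each character, then convert N to bytes."""
    n = 0
    for ch in text:
        n = n * 62 + ALPHABET.index(ch)  # raises ValueError on a foreign character
    length = (n.bit_length() + 7) // 8   # minimal byte count for N
    return n.to_bytes(length, "big")


# 0x0100 = 256 = 4 * 62 + 8, so the digits are "4" and "8".
print(base62_encode(b"\x01\x00"))  # 48
print(base62_decode("48"))         # b'\x01\x00'
```

Round-tripping works for any input that does not begin with a zero byte, for example `base62_decode(base62_encode("héllo".encode("utf-8")))`.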

The expansion ratio is $\log(256) / \log(62) \approx 1.344$, meaning each input byte produces roughly 1.344 Base62 characters.
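The ratio can be checked numerically. The short Python sketch below computes it for any alphabet size and estimates an upper bound on the output length for a given byte count; the helper names `expansion_ratio` and `estimated_length` are illustrative only, and the estimate ignores any header the tool adds.

```python
import math


def expansion_ratio(alphabet_size: int) -> float:
    """Output characters per input byte for a given alphabet size."""
    return math.log(256) / math.log(alphabet_size)


def estimated_length(byte_count: int, alphabet_size: int = 62) -> int:
    """Upper bound on the number of digits needed for byte_count bytes."""
    return math.ceil(byte_count * expansion_ratio(alphabet_size))


print(round(expansion_ratio(62), 3))  # 1.344
print(round(expansion_ratio(64), 2))  # 1.33
print(estimated_length(16))           # 22 characters for a 128-bit value
```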

Where: $N$ = integer representation of input bytes, $b_i$ = i-th byte value (0-255), $A$ = Base62 alphabet string, $d_k$ = k-th digit in base 62.

Reference Data

| Encoding | Alphabet Size | Characters Used | URL Safe | Padding | Expansion Ratio (approx.) | Common Use Cases |
|---|---|---|---|---|---|---|
| Base62 | 62 | 0-9, A-Z, a-z | Yes | None | 1.34× | Short URLs, trace IDs, tokens |
| Base64 | 64 | A-Z, a-z, 0-9, +, / | No | = | 1.33× | Email (MIME), data URIs |
| Base64url | 64 | A-Z, a-z, 0-9, -, _ | Yes | Optional | 1.33× | JWT, URL parameters |
| Base58 | 58 | Base62 minus 0, O, I, l | Yes | None | 1.37× | Bitcoin addresses, IPFS |
| Base32 | 32 | A-Z, 2-7 | Yes | = | 1.60× | TOTP secrets, Crockford IDs |
| Base16 (Hex) | 16 | 0-9, A-F | Yes | None | 2.00× | Hash digests, MAC addresses |
| Base85 (Ascii85) | 85 | ASCII 33-117 | No | None | 1.25× | PDF streams, PostScript |
| Base36 | 36 | 0-9, A-Z | Yes | None | 1.55× | Case-insensitive short IDs |
| Base91 | 91 | ASCII printable subset | No | None | 1.23× | Compact binary-to-text |
| Z85 (ZeroMQ) | 85 | Printable ASCII subset | Partial | None | 1.25× | ZeroMQ frames, CurveZMQ |
| Base128 | 128 | Full 7-bit ASCII | No | None | 1.14× | Protobuf varints |
| UUencode | 64 | ASCII 32-95 | No | Length byte | 1.37× | Legacy Unix email |

Frequently Asked Questions

Base64 uses the + and / characters, which have special meanings in URLs (space in form-encoded query strings and path separator, respectively). They therefore need percent-encoding (%2B, %2F), which turns each special character into three. Base62 uses only 0-9, A-Z, a-z, all of which are unreserved URI characters per RFC 3986, so no escaping is ever needed. Base64url exists as a compromise but still optionally uses = padding.
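The percent-encoding overhead is easy to reproduce with Python's standard library. In this small sketch the two-byte input is chosen only because its Base64 form happens to contain +, /, and =; the sample alphanumeric string is arbitrary.

```python
import base64
from urllib.parse import quote

raw = b"\xfb\xff"
b64 = base64.b64encode(raw).decode("ascii")
escaped = quote(b64, safe="")  # escape everything that is not unreserved

print(b64)      # +/8=
print(escaped)  # %2B%2F8%3D

# A purely alphanumeric string passes through unchanged.
print(quote("4fT9xZ", safe=""))  # 4fT9xZ
```

The four-character Base64 string grows to ten characters once escaped, while the alphanumeric string keeps its length.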
The tool caps input at 100 KB (102,400 bytes). The encoding algorithm converts the full byte array into a single BigInt, meaning computational complexity grows quadratically with input size due to BigInt division. For inputs under 10 KB, encoding completes in under 100 ms. Beyond 50 KB, expect multi-second latency. For large payloads, consider chunked encoding or Base64.
Base62 strings work well as database keys, and this is one of the most common applications. Base62 IDs are shorter than UUID hex representations (22 Base62 characters vs 32 hex characters for 128 bits) and safe under case-sensitive collations. However, verify that your database collation really is case-sensitive (utf8_bin in MySQL, C collation in PostgreSQL). A case-insensitive collation will treat aB and Ab as identical, causing key collisions.
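The 22-versus-32 character comparison can be illustrated with Python's `uuid` module. The helper `uuid_to_base62` below is a hypothetical name; left-padding to a fixed width of 22 is an assumption made here so that every key has the same length, not something the tool is stated to do.

```python
import uuid

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def uuid_to_base62(u: uuid.UUID) -> str:
    """Render a 128-bit UUID as a fixed-width, 22-character Base62 string."""
    n = u.int
    digits = []
    while n > 0:
        n, d = divmod(n, 62)
        digits.append(ALPHABET[d])
    # 62**22 > 2**128 > 62**21, so 22 digits always suffice and 21 do not.
    return "".join(reversed(digits)).rjust(22, ALPHABET[0])


u = uuid.uuid4()
print(len(u.hex), len(uuid_to_base62(u)))  # 32 22

# Two distinct keys that a case-insensitive comparison would merge.
print("aB" != "Ab", "aB".lower() == "Ab".lower())  # True True
```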
When the input byte array starts with zero bytes (e.g., \x00\x00Hello), the BigInt representation loses that leading-zero information, since a zero byte contributes $0 \cdot 256^{m} = 0$ at any position m. To compensate, this tool prepends a 4-byte length header (the original byte count as a 4-byte big-endian integer) to the input bytes, so the header is part of the BigInt that is converted to Base62. The decoder reads this length to restore the exact original byte sequence, including any leading zeros.
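The leading-zero problem, and why a length prefix resolves it, can be seen directly in Python. This sketch only demonstrates the integer values involved; `with_length_header` is a hypothetical helper, and how the tool's decoder locates the header after decoding is not described in the text, so it is not modelled here.

```python
def with_length_header(data: bytes) -> bytes:
    """Prepend the byte count as a 4-byte big-endian integer (assumed layout)."""
    return len(data).to_bytes(4, "big") + data


padded = b"\x00\x00Hello"
plain = b"Hello"

# Without a header, both inputs collapse to the same integer.
print(int.from_bytes(padded, "big") == int.from_bytes(plain, "big"))  # True

# With the header, the two inputs map to different integers.
a = int.from_bytes(with_length_header(padded), "big")
b = int.from_bytes(with_length_header(plain), "big")
print(a == b)  # False
```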
Base62 is not the same as JavaScript's toString(36), which produces Base36 (digits plus lowercase letters only, 36 characters). Base62 adds uppercase letters, giving 62 distinct symbols. This reduces output length by roughly 13% compared to Base36 for the same input. JavaScript's Number-based radix functions are also only exact up to Number.MAX_SAFE_INTEGER ($2^{53} - 1$), whereas this tool uses BigInt for arbitrary precision.
The decoder validates every character against the 62-character alphabet before processing. Any character not in 0-9, A-Z, a-z triggers an immediate error that identifies the invalid character and its position. This prevents the silent data corruption that would occur if invalid characters were simply skipped or mapped to zero.
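A validation pass of this kind might look like the following Python sketch. The function name `validate_base62`, the error wording, and the zero-based position are all assumptions for illustration; the tool's actual messages may differ.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}  # character -> digit value


def validate_base62(text: str) -> None:
    """Raise ValueError naming the first character outside the alphabet."""
    for position, ch in enumerate(text):
        if ch not in INDEX:
            raise ValueError(
                f"invalid Base62 character {ch!r} at position {position}"
            )


validate_base62("4fT9xZ")  # passes silently

try:
    validate_base62("4fT-9xZ")
except ValueError as err:
    print(err)  # invalid Base62 character '-' at position 3
```

Rejecting the input up front means the decoding loop never has to decide what digit value a foreign character should take.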