About

URLs transmitted over HTTP must conform to RFC 3986. The only characters that never need encoding are the 66 unreserved characters: A-Z, a-z, 0-9, and the symbols - _ . ~. Every other character, including spaces, non-ASCII glyphs, and reserved delimiters like & or = when they appear as data rather than as delimiters, must be percent-encoded as %HH, where HH is the two-digit uppercase hexadecimal value of one byte of the character's UTF-8 encoding. Failing to encode query parameters correctly causes broken links, corrupted form submissions, injection vulnerabilities, and silent data loss in analytics pipelines. This tool performs RFC 3986 percent-encoding on arbitrary input, including multi-byte UTF-8 sequences, and provides three encoding strictness modes.

Limitation: in Component mode this tool approximates browser-native encodeURIComponent behavior; it does not reproduce it exactly. Full URL mode preserves structural delimiters (: / ? # & = @) and encodes everything else outside the unreserved set. Encode All mode encodes every character, including unreserved ones. Pro tip: always encode individual parameter values, never the entire URL string, or you will encode the delimiters themselves and double-encode any values that were already encoded.
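The three modes can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the function name `encode` and the mode names `"component"`, `"full"`, and `"all"` are hypothetical, and the sketch assumes Full URL mode keeps unreserved characters as-is in addition to the delimiters listed above.

```python
# Hypothetical sketch of the three strictness modes described above.
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)
# Structural delimiters that Full URL mode leaves untouched (list taken from the text).
STRUCTURAL = frozenset(":/?#&=@")


def encode(text: str, mode: str = "component") -> str:
    """Percent-encode text; mode is 'component', 'full', or 'all' (assumed names)."""
    if mode == "component":
        keep = UNRESERVED
    elif mode == "full":
        keep = UNRESERVED | STRUCTURAL
    elif mode == "all":
        keep = frozenset()
    else:
        raise ValueError(f"unknown mode: {mode}")

    out = []
    for ch in text:
        if ch in keep:
            out.append(ch)
        else:
            # One %HH triplet per UTF-8 byte, uppercase hex.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


print(encode("a b&c"))                                # a%20b%26c
print(encode("https://x.test/a b?q=1&r=2", "full"))   # https://x.test/a%20b?q=1&r=2
print(encode("a~", "all"))                            # %61%7E
```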

Formulas

RFC 3986 defines the percent-encoding transformation. For each character c in the input string, the encoder determines whether c belongs to the unreserved set. If not, it converts c to its UTF-8 byte sequence and emits each byte as a percent-encoded triplet.

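$$
\mathrm{encode}(c) =
\begin{cases}
c & \text{if } c \in U \\
\texttt{\%} \cdot \mathrm{hex}(b_i) \ \text{for each byte } b_i \in \mathrm{UTF8}(c) & \text{otherwise}
\end{cases}
$$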

Where U is the unreserved character set defined as:

U = { A-Z, a-z, 0-9, -, _, ., ~ }

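For multi-byte characters (code point > 127), the character is first encoded into its UTF-8 byte representation. A character with code point n produces 1 to 4 bytes depending on the range:

$$
\text{length}(n) =
\begin{cases}
1 \text{ byte} & \text{if } n \le \text{0x7F} \\
2 \text{ bytes} & \text{if } \text{0x80} \le n \le \text{0x7FF} \\
3 \text{ bytes} & \text{if } \text{0x800} \le n \le \text{0xFFFF} \\
4 \text{ bytes} & \text{if } \text{0x10000} \le n \le \text{0x10FFFF}
\end{cases}
$$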

Where c = input character, U = unreserved set, b_i = i-th byte of the UTF-8 encoding of c, hex = uppercase hexadecimal conversion function.
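The definitions above can be traced by hand in Python. The sketch below builds the UTF-8 bytes with explicit bit operations so the four byte-length ranges are visible; in practice `str.encode("utf-8")` does the same job. The function names `utf8_bytes` and `encode_char` are illustrative, and the sketch does not reject surrogate code points (U+D800 to U+DFFF), which a strict encoder should.

```python
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)


def utf8_bytes(code_point: int) -> bytes:
    """Return the UTF-8 bytes for a code point, following the four ranges above."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    if code_point <= 0x7F:
        return bytes([code_point])
    if code_point <= 0x7FF:
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0x10FFFF:
        return bytes([0xF0 | (code_point >> 18),
                      0x80 | ((code_point >> 12) & 0x3F),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    raise ValueError("code point out of Unicode range")


def encode_char(c: str) -> str:
    """encode(c): c itself if unreserved, otherwise one %HH triplet per UTF-8 byte."""
    if c in UNRESERVED:
        return c
    return "".join(f"%{b:02X}" for b in utf8_bytes(ord(c)))


print(encode_char("a"))   # a
print(encode_char("é"))   # %C3%A9
print(encode_char("€"))   # %E2%82%AC
```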

Reference Data

| Dec | Hex | Char | Encoded | Category | Description |
|---|---|---|---|---|---|
| 0 | 00 | NUL | %00 | Control | Null character |
| 9 | 09 | TAB | %09 | Control | Horizontal tab |
| 10 | 0A | LF | %0A | Control | Line feed (newline) |
| 13 | 0D | CR | %0D | Control | Carriage return |
| 32 | 20 | SP | %20 | Unsafe | Space (also + in forms) |
| 33 | 21 | ! | %21 | Sub-delimiter | Exclamation mark |
| 34 | 22 | " | %22 | Unsafe | Double quote |
| 35 | 23 | # | %23 | Reserved (gen-delim) | Fragment identifier |
| 36 | 24 | $ | %24 | Sub-delimiter | Dollar sign |
| 37 | 25 | % | %25 | Reserved (encoding) | Percent sign (escape char itself) |
| 38 | 26 | & | %26 | Sub-delimiter | Ampersand (query separator) |
| 39 | 27 | ' | %27 | Sub-delimiter | Single quote / apostrophe |
| 40 | 28 | ( | %28 | Sub-delimiter | Opening parenthesis |
| 41 | 29 | ) | %29 | Sub-delimiter | Closing parenthesis |
| 42 | 2A | * | %2A | Sub-delimiter | Asterisk |
| 43 | 2B | + | %2B | Sub-delimiter | Plus sign (space in forms) |
| 44 | 2C | , | %2C | Sub-delimiter | Comma |
| 47 | 2F | / | %2F | Reserved (gen-delim) | Forward slash (path separator) |
| 58 | 3A | : | %3A | Reserved (gen-delim) | Colon (scheme separator) |
| 59 | 3B | ; | %3B | Sub-delimiter | Semicolon |
| 60 | 3C | < | %3C | Unsafe | Less-than sign |
| 61 | 3D | = | %3D | Sub-delimiter | Equals sign (key-value separator) |
| 62 | 3E | > | %3E | Unsafe | Greater-than sign |
| 63 | 3F | ? | %3F | Reserved (gen-delim) | Question mark (query start) |
| 64 | 40 | @ | %40 | Reserved (gen-delim) | At sign (userinfo separator) |
| 91 | 5B | [ | %5B | Reserved (gen-delim) | Opening bracket (IPv6) |
| 93 | 5D | ] | %5D | Reserved (gen-delim) | Closing bracket (IPv6) |
| 123 | 7B | { | %7B | Unsafe | Opening brace |
| 124 | 7C | \| | %7C | Unsafe | Pipe / vertical bar |
| 125 | 7D | } | %7D | Unsafe | Closing brace |
| 126 | 7E | ~ | ~ | Unreserved | Tilde (not encoded) |
| 127 | 7F | DEL | %7F | Control | Delete character |

Frequently Asked Questions

RFC 3986 specifies %20 as the correct percent-encoding for a space character (ASCII 32). The + convention comes from the older application/x-www-form-urlencoded format (HTML form submissions), defined in the W3C HTML specification. In query strings submitted by HTML forms, spaces become +. In all other URL components (path, fragment, non-form queries), spaces must be %20. This tool uses %20 by default since it is universally valid.
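The difference between the two space conventions is easy to see with Python's standard library, used here only as an illustration: `urllib.parse.quote` follows the %20 convention and `quote_plus` follows the form-encoding convention. The sample value is made up.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

value = "rock & roll"

# RFC 3986 style: space becomes %20 (valid in any URL component).
print(quote(value, safe=""))      # rock%20%26%20roll

# application/x-www-form-urlencoded style: space becomes +.
print(quote_plus(value))          # rock+%26+roll

# Decoding must match the convention: a generic decoder leaves + untouched.
print(unquote("rock+roll"))       # rock+roll
print(unquote_plus("rock+roll"))  # rock roll
```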
Most emoji have Unicode code points above 0xFFFF, placing them in the Supplementary Multilingual Plane. UTF-8 encodes these as 4 bytes. For example, the character U+1F600 (😀) becomes the byte sequence F0 9F 98 80, which percent-encodes as %F0%9F%98%80. Each %HH triplet represents one byte, not one character.
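The byte sequence in that example can be checked directly. The snippet below is a small Python illustration, not part of the tool; the variable names are arbitrary.

```python
from urllib.parse import quote

emoji = "\U0001F600"  # U+1F600, grinning face

raw = emoji.encode("utf-8")
hex_bytes = " ".join(f"{b:02X}" for b in raw)
print(len(raw), hex_bytes)      # 4 F0 9F 98 80

# One %HH triplet per byte: four triplets for a single character.
print(quote(emoji, safe=""))    # %F0%9F%98%80
```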
Double-encoding occurs when you encode an already-encoded string. The % character (ASCII 37) in %20 gets re-encoded to %25, producing %2520. The server then decodes it once, yielding %20 as a literal string instead of a space. This is a common bug in URL construction. Always encode raw values before assembling them into a URL; never encode the final assembled URL.
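A short Python sketch reproduces the %2520 effect described above. It uses `urllib.parse.quote` and `unquote` as stand-ins for whatever encoder and server-side decoder are actually involved.

```python
from urllib.parse import quote, unquote

raw = "hello world"

once = quote(raw, safe="")    # correct: encode the raw value once
twice = quote(once, safe="")  # bug: the % from %20 is itself encoded as %25

print(once)    # hello%20world
print(twice)   # hello%2520world

# A single decode of the double-encoded value yields a literal "%20", not a space.
print(unquote(twice))            # hello%20world
print(unquote(unquote(twice)))   # hello world
```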
RFC 3986 Section 2.3 defines exactly 66 unreserved characters: uppercase A-Z (26), lowercase a-z (26), digits 0-9 (10), and four symbols: hyphen -, underscore _, period ., tilde ~. Every other character, including commonly assumed safe ones like !, *, and ', should be percent-encoded for maximum interoperability.
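The count of 66 can be verified mechanically. The check below is an illustrative Python snippet; it assumes Python 3.7 or later, where `urllib.parse.quote` treats ~ as unreserved.

```python
import string
from urllib.parse import quote

unreserved = (
    string.ascii_uppercase   # 26
    + string.ascii_lowercase # 26
    + string.digits          # 10
    + "-_.~"                 # 4
)
print(len(unreserved))  # 66

# With no extra safe characters, only the unreserved set passes through unchanged.
print(quote(unreserved, safe="") == unreserved)  # True
print(quote("!*'", safe=""))                     # %21%2A%27
```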
Component mode is comparable to JavaScript's encodeURIComponent but stricter: it encodes everything except the 66 unreserved characters, whereas encodeURIComponent also leaves ! ' ( ) * unencoded. It is designed for encoding individual query parameter values or path segments. Full URL mode preserves the structural delimiters that define URL syntax: : / ? # & = @. Use Full URL mode when encoding an entire URL that already has correct structure. Use Component mode when encoding a value that will be inserted into a URL template.
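The two use cases can be illustrated with Python's `urllib.parse.quote`, whose `safe` parameter controls which extra characters are left alone. The helper `build_search_url` and the example.com URLs are hypothetical, chosen only to show a value being inserted into a template versus a whole URL being cleaned up.

```python
from urllib.parse import quote


def build_search_url(base: str, term: str) -> str:
    """Insert a raw value into a URL template, encoding only the value."""
    return f"{base}?q={quote(term, safe='')}"


# Component-style: & and = inside the value must not act as delimiters.
print(build_search_url("https://example.com/search", "a&b=c d"))
# https://example.com/search?q=a%26b%3Dc%20d

# Full-URL-style: keep the structural delimiters, encode the rest.
print(quote("https://example.com/my page?x=1&y=2", safe=":/?#&=@"))
# https://example.com/my%20page?x=1&y=2
```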
Any valid Unicode code point from U+0000 to U+10FFFF, excluding the surrogate range U+D800 to U+DFFF, can be percent-encoded. The character is first converted to its UTF-8 byte sequence (which can be 1 to 4 bytes), and each byte is individually percent-encoded. JavaScript's TextEncoder API handles surrogate pair resolution automatically, so characters like musical symbols (U+1D11E, 𝄞) or rare CJK ideographs encode correctly as 4-byte sequences.
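Surrogate pair resolution can be spelled out explicitly. The Python sketch below combines the UTF-16 pair for U+1D11E into a single code point and then percent-encodes its UTF-8 bytes; `combine_surrogates` is a hypothetical helper written for this illustration, not an API of TextEncoder or of this tool.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one Unicode code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# U+1D11E (musical symbol G clef) is stored in UTF-16 as the pair D834 DD1E.
code_point = combine_surrogates(0xD834, 0xDD1E)
print(hex(code_point))   # 0x1d11e

# The resolved code point encodes as a single 4-byte UTF-8 sequence.
encoded = "".join(f"%{b:02X}" for b in chr(code_point).encode("utf-8"))
print(encoded)           # %F0%9D%84%9E
```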