URL Encoding: What %20 Actually Means and Why It Matters
If you have ever copied a URL and seen sequences like %20, %2F, or %3D scattered throughout it, you have encountered URL encoding. These percent sequences are not random — they follow a precise specification that ensures any character can travel safely in a URL without being misinterpreted. Here is the full explanation.
Why URLs Need Encoding
A URL (Uniform Resource Locator) is made up of several components: scheme, host, path, query string, and fragment. Each component uses specific characters as structural delimiters. The slash / separates path segments. The question mark ? begins the query string. The ampersand & separates query parameters. The equals sign = connects parameter names to values.
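As a quick illustration of those components, the standard URL class (available in browsers and Node.js) can split a URL apart. The address below is a made-up example, not one taken from this article.

```javascript
// Hypothetical URL used only for illustration.
const url = new URL("https://example.com/docs/guide?topic=encoding&lang=en#intro");

const parts = {
  scheme: url.protocol,  // the URL API includes the trailing colon: "https:"
  host: url.host,        // "example.com"
  path: url.pathname,    // "/docs/guide"
  query: url.search,     // "?topic=encoding&lang=en"
  fragment: url.hash,    // "#intro"
};

console.log(parts);
```

Note that the API returns the delimiters (the colon, question mark, and hash) as part of some fields, which is a quirk of this interface rather than of URLs themselves.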
This creates a problem: what happens when the data you want to put in a URL contains these same delimiter characters? If you want to pass the value a=b&c=d as a query parameter value, a naive URL would be completely ambiguous — parsers would not know where the value ends and where structure begins.
URL encoding (also called percent-encoding) solves this by replacing any character that has special meaning, or that is not safe in a URL, with a percent sign followed by the two-digit hexadecimal value of each of its bytes. For ASCII characters, that value is simply the character's ASCII code.
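The ambiguity is easy to reproduce. The sketch below uses a hypothetical endpoint and a hypothetical parameter name, filter, to show what a parser sees with and without encoding.

```javascript
const value = "a=b&c=d";

// Naive concatenation: the & and = inside the value look like URL structure.
const naive = "https://example.com/search?filter=" + value;
const naiveParams = new URL(naive).searchParams;
console.log(naiveParams.get("filter")); // "a=b"  (value cut short at the &)
console.log(naiveParams.get("c"));      // "d"    (an unintended extra parameter)

// Percent-encoding the value first keeps it in one piece.
const safe = "https://example.com/search?filter=" + encodeURIComponent(value);
console.log(safe); // https://example.com/search?filter=a%3Db%26c%3Dd
const safeParams = new URL(safe).searchParams;
console.log(safeParams.get("filter")); // "a=b&c=d"
```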
The Percent-Encoding Format
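The format is simple: %XX, where XX is the two-digit hexadecimal value of a byte. RFC 3986 treats the hex digits as case-insensitive but recommends uppercase, so %2f and %2F are equivalent. Any character that is not an "unreserved character" must be percent-encoded when it appears as data; reserved characters such as / and ? stay unencoded only when they act as delimiters.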
RFC 3986 (the standard governing URIs) defines the unreserved characters as: A–Z, a–z, 0–9, hyphen (-), period (.), underscore (_), and tilde (~). These 66 characters can appear in any URL component without encoding. Everything else must be encoded when used as data.
Format: % followed by two hex digits
Example: space (ASCII 32, hex 20) → %20
Unreserved chars: A-Z a-z 0-9 - . _ ~
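To make the rule concrete, here is a minimal sketch of a strict encoder that keeps only the RFC 3986 unreserved characters and escapes every other byte. The function name percentEncode is invented for this example; in practice you would normally reach for a built-in.

```javascript
// The unreserved set from RFC 3986: A-Z a-z 0-9 - . _ ~
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

// Hypothetical helper: escape everything except unreserved characters.
function percentEncode(text) {
  const encoder = new TextEncoder(); // always produces UTF-8 bytes
  let out = "";
  for (const ch of text) {           // iterates by code point, not by UTF-16 unit
    if (UNRESERVED.test(ch)) {
      out += ch;
      continue;
    }
    for (const byte of encoder.encode(ch)) {
      // "%" followed by two uppercase hex digits
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("hello world")); // hello%20world
console.log(percentEncode("a/b?c"));       // a%2Fb%3Fc
console.log(percentEncode("safe-._~"));    // safe-._~
```

This sketch is stricter than JavaScript's encodeURIComponent(), which also leaves ! * ' ( ) unescaped.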
Common Encoded Characters
Here are the most frequently encountered percent-encoded characters and what they represent:
- %20 — Space (ASCII 32). The most common encoding you will see.
- %2F — Forward slash (/). Encoded when a slash appears as data rather than a path delimiter.
- %3D — Equals sign (=). Encoded in query parameter values so parsers do not confuse it with the key=value separator.
- %26 — Ampersand (&). Encoded in values to avoid being mistaken for a parameter separator.
- %3F — Question mark (?). Encoded in path segments to avoid triggering the query string.
- %23 — Hash (#). Encoded to prevent it from being treated as a fragment identifier.
- %40 — At sign (@). Common in email addresses used as URL components.
- %2B — Plus sign (+). Has a special meaning in query strings (represents a space in form submissions), so literal plus signs must be encoded.
- %3A — Colon (:). Encoded in certain URL components where it would be ambiguous.
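Each entry in this list can be derived from the character's code, and the short check below does that by hand and compares the result with encodeURIComponent(). It is only a sanity check on the list, written under the assumption that all nine characters are single ASCII bytes (which they are).

```javascript
// The characters from the list above, as literal text.
const common = [" ", "/", "=", "&", "?", "#", "@", "+", ":"];

// Build each escape by hand: "%" plus the character code as two uppercase hex digits.
const byHand = common.map(
  (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0")
);

// encodeURIComponent() escapes all nine of these characters, so it should agree.
const builtIn = common.map((ch) => encodeURIComponent(ch));

console.log(byHand.join(" "));
// %20 %2F %3D %26 %3F %23 %40 %2B %3A
console.log(byHand.join(" ") === builtIn.join(" ")); // true
```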
Query Strings vs Path Encoding
Encoding rules differ slightly between URL components. In a path segment, the slash is a delimiter, so any literal slash in data must be encoded as %2F. But in query strings, slash is not a delimiter and can appear unencoded (though encoding it is always safe).
There is also a historical quirk: HTML forms submitting via GET encode spaces as + rather than %20 in the application/x-www-form-urlencoded format. This means %2B (the encoding of a literal plus) and + (representing a space) mean different things. When parsing form data, a server must decode + as a space — but when parsing a raw URL path, + is just a literal plus sign. This inconsistency trips up many developers.
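Both differences show up in a few lines of JavaScript. The sketch below relies on URLSearchParams applying the form-encoding rules while decodeURIComponent() applies plain percent-decoding; the URL and values are made up for illustration.

```javascript
// Parsed with form rules: + becomes a space, %2B becomes a literal plus.
const params = new URLSearchParams("q=C%2B%2B+primer");
const formValue = params.get("q");
console.log(formValue); // "C++ primer"

// Decoded as a plain percent-encoded component: + is left untouched.
const rawValue = decodeURIComponent("C%2B%2B+primer");
console.log(rawValue); // "C+++primer"

// A slash used as data inside one path segment must be escaped,
// while a slash inside a query value can stay as it is.
const segment = encodeURIComponent("reports/2024"); // "reports%2F2024"
const url = new URL("https://example.com/files/" + segment + "?next=/home");
console.log(url.pathname);                 // "/files/reports%2F2024"
console.log(url.searchParams.get("next")); // "/home"
```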
encodeURIComponent vs encodeURI in JavaScript
JavaScript provides two built-in functions for URL encoding, and they behave differently in important ways:
encodeURIComponent()
encodeURIComponent() encodes all characters except: A–Z, a–z, 0–9, -, _, ., !, ~, *, ', (, and ). This makes it aggressive — it encodes slash, question mark, ampersand, equals sign, and everything else that has structural meaning in a URL.
Use encodeURIComponent() when encoding individual query parameter names or values. It ensures your data will never be misinterpreted as URL structure.
encodeURI()
encodeURI() is less aggressive. In addition to everything encodeURIComponent() leaves alone, it preserves the characters that give a URL its structure: ; , / ? : @ & = + $ #. It is designed to encode a complete URL that already has valid structure, so it will not break the colon and slashes in https:// or the slashes in paths.
Use encodeURI() when you have a full URL and want to make it safe for embedding in another context, without destroying its structure. Never use encodeURI() to encode a parameter value — it will leave & and = unencoded, which will corrupt your query string.
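A side-by-side comparison makes the difference visible. The input strings here are invented for the example.

```javascript
// A parameter value that contains structural characters.
const value = "rock & roll=fun?";

const asComponent = encodeURIComponent(value);
console.log(asComponent); // rock%20%26%20roll%3Dfun%3F

const asUri = encodeURI(value);
console.log(asUri);       // rock%20&%20roll=fun?   (& = ? survive, unsafe as a value)

// A complete URL that only needs its spaces fixed.
const full = "https://example.com/a path/?q=hello world";

const fullAsUri = encodeURI(full);
console.log(fullAsUri);   // https://example.com/a%20path/?q=hello%20world

const fullAsComponent = encodeURIComponent(full);
console.log(fullAsComponent);
// https%3A%2F%2Fexample.com%2Fa%20path%2F%3Fq%3Dhello%20world
```

The last result is what you want when the whole URL is itself a parameter value (for example a redirect target), and exactly what you do not want when the URL is meant to be followed directly.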
Double Encoding: A Common Bug
Double encoding happens when you encode a value that has already been encoded. A space becomes %20 after the first encoding. If encoded again, the % itself (ASCII 37, hex 25) becomes %25, turning %20 into %2520. A server that decodes this only once will receive %20 as a literal string instead of a space.
Double encoding is a frequent source of bugs in URL construction and can also be a security concern. Some systems decode URLs more than once in sequence, and attackers can exploit this to slip input past validation that only inspects the first decoded form (a technique known as a double-encoding attack).
The fix is straightforward: encode each value exactly once at construction time, and decode each component exactly once at parse time.
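The bug is easy to reproduce. The sketch below shows the raw effect and then one common way it sneaks in: passing an already-encoded value to URLSearchParams, which encodes it again. The parameter name q is just an example.

```javascript
const once = encodeURIComponent("hello world");
console.log(once);  // hello%20world

const twice = encodeURIComponent(once);
console.log(twice); // hello%2520world

// Decoding once only undoes the outer layer.
const decodedOnce = decodeURIComponent(twice);
console.log(decodedOnce); // hello%20world

// A typical accidental double encode: URLSearchParams already encodes values.
const buggy = new URLSearchParams();
buggy.set("q", encodeURIComponent("hello world"));
console.log(buggy.toString()); // q=hello%2520world

// Correct: hand over the raw value and let it be encoded exactly once.
const correct = new URLSearchParams();
correct.set("q", "hello world");
console.log(correct.toString()); // q=hello+world
```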
Non-ASCII Characters and UTF-8 Encoding
Characters outside the ASCII range — such as accented letters, Chinese characters, or emoji — require encoding as UTF-8 byte sequences where each byte is then percent-encoded. For example, the é character (U+00E9) encodes as the two-byte UTF-8 sequence 0xC3 0xA9, which becomes %C3%A9 in a URL. Modern browsers handle this automatically when you type non-ASCII characters into a URL bar, but server-side code must explicitly use UTF-8 when encoding and decoding.
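In JavaScript, encodeURIComponent() always works from UTF-8 bytes, so the é example can be checked directly. The escape \u00E9 is used below instead of a literal é so the example does not depend on how this page or your editor stores the character.

```javascript
const eAcute = "\u00E9"; // é as a single code point, U+00E9

// Step 1: the UTF-8 bytes of the character.
const bytes = Array.from(new TextEncoder().encode(eAcute));
console.log(bytes.map((b) => b.toString(16).toUpperCase())); // [ 'C3', 'A9' ]

// Step 2: each byte is percent-encoded.
const encoded = encodeURIComponent(eAcute);
console.log(encoded); // %C3%A9

// Decoding reverses both steps.
const decoded = decodeURIComponent("%C3%A9");
console.log(decoded === eAcute); // true
```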
Form Submission Encoding
When an HTML form uses the default application/x-www-form-urlencoded encoding and submits via GET, the browser encodes form field names and values using a modified form of URL encoding where spaces become + rather than %20. The resulting query string is appended to the URL. This is why you might see ?q=hello+world in a search URL — it means the user typed "hello world" in a form field.
Server frameworks decode this automatically, but if you are manually parsing query strings or building form data, remember that + means space in this context.
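For the manual case, here is a minimal sketch of both directions. URLSearchParams produces the form encoding; parseFormQuery is a hypothetical hand-rolled parser included only to show the order of operations, and it makes simplifying assumptions (the last value wins for repeated names, and malformed escapes throw).

```javascript
// Building: URLSearchParams writes spaces as + and literal pluses as %2B.
const built = new URLSearchParams({ q: "hello world", tag: "c++" }).toString();
console.log(built); // q=hello+world&tag=c%2B%2B

// Hypothetical manual parser for application/x-www-form-urlencoded data.
function parseFormQuery(query) {
  // Turn + into a space BEFORE percent-decoding, so %2B still yields a real plus.
  const decode = (s) => decodeURIComponent(s.replace(/\+/g, " "));
  const result = {};
  for (const pair of query.replace(/^\?/, "").split("&")) {
    if (pair === "") continue;             // skip empty pieces such as "a=1&&b=2"
    const eq = pair.indexOf("=");
    const rawName = eq === -1 ? pair : pair.slice(0, eq);
    const rawValue = eq === -1 ? "" : pair.slice(eq + 1);
    result[decode(rawName)] = decode(rawValue);
  }
  return result;
}

const parsed = parseFormQuery("?q=hello+world&tag=c%2B%2B");
console.log(parsed); // { q: 'hello world', tag: 'c++' }
```

In real code, new URLSearchParams(query) already does this parsing and handles repeated names, so a hand-written parser is rarely needed.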
The Bottom Line
URL encoding is a compact system with a small set of rules: unreserved characters travel as-is, reserved characters stay literal only when they act as delimiters, and everything else is written as %XX. The main practical points to remember are: always use encodeURIComponent() for query parameter names and values in JavaScript, be aware of the +-for-space convention in HTML forms, avoid double encoding, and handle UTF-8 correctly for non-ASCII data. Getting these right prevents a large class of URL-related bugs that are surprisingly common in production systems.