URL Encoding: What %20 Actually Means and Why It Matters

If you have ever copied a URL and seen sequences like %20, %2F, or %3D scattered throughout it, you have encountered URL encoding. These percent sequences are not random — they follow a precise specification that ensures any character can travel safely in a URL without being misinterpreted. Here is the full explanation.

Why URLs Need Encoding

A URL (Uniform Resource Locator) is made up of several components: scheme, host, path, query string, and fragment. Each component uses specific characters as structural delimiters. The slash / separates path segments. The question mark ? begins the query string. The ampersand & separates query parameters. The equals sign = connects parameter names to values.

This creates a problem: what happens when the data you want to put in a URL contains these same delimiter characters? If you want to pass the value a=b&c=d as a query parameter value, a naive URL would be completely ambiguous — parsers would not know where the value ends and where structure begins.

URL encoding (also called percent-encoding) solves this by replacing any character that has special meaning, or that is not safe in a URL, with a percent sign followed by the two-digit hexadecimal value of that character's byte (for ASCII characters, this is simply the ASCII code).
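
To make the ambiguity concrete, here is a minimal sketch using the a=b&c=d value from above. The parameter name filter is a hypothetical one chosen for illustration.

```javascript
// Hypothetical parameter name "filter"; the value is the example from the text.
const rawValue = "a=b&c=d";

// Naive concatenation: the & and = inside the value look like URL structure.
const naiveQuery = "filter=" + rawValue;

// Percent-encoding the value first keeps it a single opaque piece of data.
const safeQuery = "filter=" + encodeURIComponent(rawValue);

console.log(naiveQuery); // filter=a=b&c=d
console.log(safeQuery);  // filter=a%3Db%26c%3Dd

// A standard parser sees two parameters in the naive version, one in the safe version.
console.log([...new URLSearchParams(naiveQuery).keys()]); // [ 'filter', 'c' ]
console.log(new URLSearchParams(safeQuery).get("filter")); // a=b&c=d
```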

The Percent-Encoding Format

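The format is simple: %XX, where XX is the two-digit hexadecimal value of the byte. Uppercase hex digits are the convention, although decoders treat %2f and %2F as equivalent. Any character that is not an "unreserved character" must be percent-encoded when it appears as data rather than as a structural delimiter.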

RFC 3986 (the standard governing URIs) defines the unreserved characters as: A–Z, a–z, 0–9, hyphen (-), period (.), underscore (_), and tilde (~). These 66 characters can appear in any URL component without encoding. Everything else must be encoded when used as data.

Format: % followed by two hex digits
Example: space (ASCII 32, hex 20) → %20
Unreserved chars: A-Z a-z 0-9 - . _ ~
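
The rule can be sketched in a few lines of JavaScript. The helper name percentEncode is hypothetical, and this is an illustration of the format under the strict "encode everything except unreserved characters" reading, not a replacement for the built-in functions.

```javascript
// Characters RFC 3986 lists as unreserved.
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

// Hypothetical helper: percent-encode every byte that is not an unreserved character.
function percentEncode(text) {
  let out = "";
  // TextEncoder yields the UTF-8 bytes of the string.
  for (const byte of new TextEncoder().encode(text)) {
    const ch = String.fromCharCode(byte);
    if (UNRESERVED.test(ch)) {
      out += ch; // safe to pass through unchanged
    } else {
      // % followed by two uppercase hex digits, zero-padded
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("hello world")); // hello%20world
console.log(percentEncode("a=b&c=d"));     // a%3Db%26c%3Dd
console.log(percentEncode("A-z_0.~"));     // A-z_0.~
```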

Common Encoded Characters

Here are the most frequently encountered percent-encoded characters and what they represent:

space → %20
! → %21
# → %23
% → %25
& → %26
+ → %2B
/ → %2F
: → %3A
= → %3D
? → %3F
@ → %40

Query Strings vs Path Encoding

Encoding rules differ slightly between URL components. In a path segment, the slash is a delimiter, so any literal slash in data must be encoded as %2F. But in query strings, slash is not a delimiter and can appear unencoded (though encoding it is always safe).
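
As a small illustration of the path case, the sketch below places a value containing a literal slash into a single path segment. The /files/ prefix and the file name are hypothetical.

```javascript
// Hypothetical file name that contains a literal slash.
const fileName = "reports/2024.pdf";

// Encoded as one path segment: the slash inside the data becomes %2F.
const path = "/files/" + encodeURIComponent(fileName);
console.log(path); // /files/reports%2F2024.pdf

// Splitting on "/" first and decoding each segment afterwards recovers the name.
const segments = path
  .split("/")
  .filter((s) => s.length > 0)
  .map((s) => decodeURIComponent(s));
console.log(segments); // [ 'files', 'reports/2024.pdf' ]
```

The order matters on the way back in: decoding the whole path before splitting would turn %2F back into a delimiter.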

There is also a historical quirk: HTML forms submitting via GET encode spaces as + rather than %20 in the application/x-www-form-urlencoded format. This means %2B (the encoding of a literal plus) and + (representing a space) mean different things. When parsing form data, a server must decode + as a space — but when parsing a raw URL path, + is just a literal plus sign. This inconsistency trips up many developers.
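
The difference shows up directly when the same text is decoded two ways. This sketch relies on URLSearchParams, which follows the application/x-www-form-urlencoded rules, and on decodeURIComponent, which does plain percent-decoding; the parameter names q and sum are made up for the example.

```javascript
// Form-style parsing: + decodes to a space, %2B decodes to a literal plus.
const params = new URLSearchParams("q=hello+world&sum=1%2B1");
const q = params.get("q");
const sum = params.get("sum");
console.log(q);   // hello world
console.log(sum); // 1+1

// Plain percent-decoding: + is left untouched.
const rawDecoded = decodeURIComponent("hello+world");
console.log(rawDecoded); // hello+world
```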

encodeURIComponent vs encodeURI in JavaScript

JavaScript provides two built-in functions for URL encoding, and they behave differently in important ways:

encodeURIComponent()

encodeURIComponent() encodes all characters except: A–Z, a–z, 0–9, -, _, ., !, ~, *, ', (, and ). This makes it aggressive — it encodes slash, question mark, ampersand, equals sign, and everything else that has structural meaning in a URL.

Use encodeURIComponent() when encoding individual query parameter names or values. It ensures your data will never be misinterpreted as URL structure.

encodeURI()

encodeURI() is less aggressive. In addition to everything encodeURIComponent() leaves alone, it preserves the characters that act as URL structure: ; , / ? : @ & = + $ #. It is designed for a complete URL that already has valid structure, so it will not break the colon in https:// or the slashes in paths.

Use encodeURI() when you have a full URL and want to make it safe for embedding in another context, without destroying its structure. Never use encodeURI() to encode a parameter value — it will leave & and = unencoded, which will corrupt your query string.
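
A side-by-side comparison makes the difference easier to see. The sample value and the example.com URL below are hypothetical.

```javascript
// Hypothetical parameter value containing characters with structural meaning.
const value = "rock & roll/pop=1?";

const asComponent = encodeURIComponent(value);
const asUri = encodeURI(value);

console.log(asComponent); // rock%20%26%20roll%2Fpop%3D1%3F
console.log(asUri);       // rock%20&%20roll/pop=1?

// Hypothetical full URL with spaces but otherwise valid structure.
const fullUrl = "https://example.com/my files/list?name=a b";
const encodedUrl = encodeURI(fullUrl);
console.log(encodedUrl); // https://example.com/my%20files/list?name=a%20b
```

Note how encodeURI() leaves &, /, = and ? in the value untouched, which is exactly what would corrupt a query string if it were used for a parameter value.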

Double Encoding: A Common Bug

Double encoding happens when you encode a value that has already been encoded. A space becomes %20 after the first encoding. If encoded again, the % itself (ASCII 37, hex 25) becomes %25, turning %20 into %2520. A server that decodes this only once will receive %20 as a literal string instead of a space.

Double encoding is a frequent source of bugs in URL construction and can also be a security concern: some systems decode URLs multiple times in sequence, which attackers can exploit to bypass input validation (a technique known as a double-encoding attack).

The fix is straightforward: encode each value exactly once at construction time, and decode each component exactly once at parse time.
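
The %2520 effect is easy to reproduce. The following sketch encodes the same string twice and then decodes it once, as a server typically would.

```javascript
const original = "hello world";

const once = encodeURIComponent(original); // correct: encoded exactly once
const twice = encodeURIComponent(once);    // bug: the % is encoded again as %25

console.log(once);  // hello%20world
console.log(twice); // hello%2520world

// A single decode of the double-encoded value yields a literal "%20", not a space.
const decodedOnce = decodeURIComponent(twice);
console.log(decodedOnce); // hello%20world
```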

Non-ASCII Characters and UTF-8 Encoding

Characters outside the ASCII range — such as accented letters, Chinese characters, or emoji — require encoding as UTF-8 byte sequences where each byte is then percent-encoded. For example, the é character (U+00E9) encodes as the two-byte UTF-8 sequence 0xC3 0xA9, which becomes %C3%A9 in a URL. Modern browsers handle this automatically when you type non-ASCII characters into a URL bar, but server-side code must explicitly use UTF-8 when encoding and decoding.
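
The é example can be checked in JavaScript, where TextEncoder exposes the UTF-8 bytes and encodeURIComponent() percent-encodes each of them. This is only a quick verification sketch.

```javascript
// U+00E9, written as an escape so the source file's own encoding does not matter.
const eAcute = "\u00E9";

// The UTF-8 bytes of the character, shown as uppercase hex.
const utf8Hex = Array.from(new TextEncoder().encode(eAcute), (b) =>
  b.toString(16).toUpperCase()
);
console.log(utf8Hex); // [ 'C3', 'A9' ]

// Each byte becomes one %XX sequence.
const encoded = encodeURIComponent(eAcute);
console.log(encoded); // %C3%A9

// Decoding interprets the bytes as UTF-8 and restores the character.
const decoded = decodeURIComponent(encoded);
console.log(decoded === eAcute); // true
```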

Form Submission Encoding

When an HTML form uses the default application/x-www-form-urlencoded encoding and submits via GET, the browser encodes form field names and values using a modified form of URL encoding where spaces become + rather than %20. The resulting query string is appended to the URL. This is why you might see ?q=hello+world in a search URL — it means the user typed "hello world" in a form field.

Server frameworks decode this automatically, but if you are manually parsing query strings or building form data, remember that + means space in this context.
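
When building form-style query strings by hand, URLSearchParams applies the same convention. In this sketch the field names and the example.com search URL are hypothetical.

```javascript
// Hypothetical form fields, serialized the way a GET form submission would be.
const form = new URLSearchParams({ q: "hello world", lang: "en" });
const searchUrl = "https://example.com/search?" + form.toString();
console.log(searchUrl); // https://example.com/search?q=hello+world&lang=en

// A literal plus in the data is escaped so it is not mistaken for a space.
const plusQuery = new URLSearchParams({ sum: "1+1" }).toString();
console.log(plusQuery); // sum=1%2B1
```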

The Bottom Line

URL encoding is a compact system with a small set of rules: unreserved characters travel as-is, delimiters keep their structural role, and anything else used as data is written in %XX hex notation. The main practical points to remember are: always use encodeURIComponent() for query parameter values in JavaScript, be aware of the +-for-space convention in HTML forms, avoid double encoding, and handle UTF-8 correctly for non-ASCII data. Getting these right prevents a large class of URL-related bugs that are surprisingly common in production systems.
