How Browsers Parse a URL

Step 1: Breaking Down the URL String

When a user enters a URL into the browser’s address bar, such as:

https://example.com:443/path/page?query=1#hash

the browser parses this string into meaningful components. Each part plays a distinct role in how the browser handles the request:

Component	Example	Meaning
Scheme	`https`	Indicates the communication protocol (e.g., HTTP, HTTPS, FTP)
Host Name	`example.com`	The domain to be resolved via DNS
Port	`443`	The specific port on the server to connect to; defaulted if omitted
Path	`/path/page`	The location of the resource on the server
Query	`?query=1`	Additional parameters sent to the server
Fragment	`#hash`	Internal page reference; not sent to the server

Notes on Each Part

Scheme (https): Determines how the browser will communicate. For instance, https means HTTP over a TLS-encrypted connection, with a default port of 443.
Host Name (example.com): Will be sent to the DNS resolver to be translated into an IP address.
Port (:443): Specifies the endpoint on the server. If omitted, browsers infer the default from the scheme.
Path (/path/page): Used to identify the specific resource being requested on the server.
Query (?query=1): Key-value pairs often used to send data like form inputs or filters.
Fragment (#hash): A client-side reference for in-page navigation. This part is not sent in the HTTP request.

In short: the browser transforms a flat string into a structured object that can inform each subsequent stage — from DNS resolution to sending an HTTP request.
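As a rough illustration of that "flat string to structured object" step, the sketch below uses Python's standard urllib.parse module. This is an assumption made for demonstration: browsers implement the WHATWG URL Standard, which differs from urllib.parse in some edge cases, but for a simple URL like the example above the resulting components are the same.

```python
from urllib.parse import urlsplit

url = "https://example.com:443/path/page?query=1#hash"
parts = urlsplit(url)

# Each attribute corresponds to one row of the component table.
print(parts.scheme)    # https
print(parts.hostname)  # example.com
print(parts.port)      # 443 (an int, not a string)
print(parts.path)      # /path/page
print(parts.query)     # query=1  (the leading "?" is a delimiter, not part of the value)
print(parts.fragment)  # hash     (likewise, the "#" is dropped)
```

Note that the delimiters (`://`, `:`, `?`, `#`) are consumed by the parser; only the values between them end up in the structured result.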

Step 2: How Each Part Affects the Request Lifecycle

Once the URL is parsed, the browser begins to act on the parsed data:

🔐 Scheme

Determines whether the browser must establish a secure (TLS) connection or a regular connection.
Defines the default port (e.g., 443 for HTTPS, 80 for HTTP).
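Selects the handler for the URL: http and https trigger a network request, while a scheme such as mailto hands the URL off to another application (e.g., the mail client).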

🌐 Host Name

Used in the DNS resolution process to obtain an IP address.
The resolved IP address becomes the target of the TCP connection.
Also used in security checks: it is part of the origin that CORS compares, and it must match a name on the server's TLS certificate.

🔌 Port

Tells the operating system which port to connect to on the server.
If omitted, defaults to the port defined by the scheme.
Rarely specified explicitly in everyday browsing, but common in development (e.g., localhost:8080) and proxy setups.
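The scheme, host name, and port rules above together tell the browser where to connect and whether to use TLS. A minimal sketch of that decision follows, assuming only the two schemes and default ports named in this chapter; the names `DEFAULT_PORTS`, `Endpoint`, and `endpoint_for` are hypothetical helpers for illustration, not a browser API.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit

# Default ports for the two schemes discussed in this chapter.
DEFAULT_PORTS = {"http": 80, "https": 443}


class Endpoint(NamedTuple):
    host: Optional[str]
    port: Optional[int]
    use_tls: bool


def endpoint_for(url: str) -> Endpoint:
    """Work out where to connect: an explicit port wins, else the scheme default."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return Endpoint(parts.hostname, port, use_tls=(parts.scheme == "https"))


print(endpoint_for("https://example.com/path/page"))
print(endpoint_for("http://localhost:8080/"))
```

For a scheme that is not in the table, this sketch returns `None` as the port rather than guessing.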

📂 Path

Specifies the exact resource requested on the server.
Interpreted by the server’s routing logic (e.g., /about, /api/user/42).
May influence internal processing (e.g., returning different content based on path).
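How a server interprets the path is up to the application. The following is one possible sketch of routing logic, using a hypothetical route table (`ROUTES`) and handler names invented for illustration; real frameworks provide their own routing mechanisms.

```python
import re

# Hypothetical route table: (compiled pattern, handler name).
ROUTES = [
    (re.compile(r"^/about$"), "about_page"),
    (re.compile(r"^/api/user/(?P<user_id>\d+)$"), "user_detail"),
]


def match_route(path):
    """Return (handler_name, captured_params) for the first matching route, else None."""
    for pattern, handler in ROUTES:
        m = pattern.match(path)
        if m:
            return handler, m.groupdict()
    return None


print(match_route("/api/user/42"))  # ('user_detail', {'user_id': '42'})
print(match_route("/missing"))      # None
```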

❓ Query

Supplies parameters or filters to the server-side application.
Commonly used in search, form submission, pagination, etc.
Appears in server logs and analytics.
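On the server side, the query string is usually decoded into key-value pairs before the application sees it. A small sketch using Python's standard parse_qs is shown below; the search URL and its parameter names are made up for the example.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical search URL with a repeated "tag" parameter.
url = "https://example.com/search?q=url+parsing&page=2&tag=dns&tag=http"

query = urlsplit(url).query
params = parse_qs(query)

# Every value is a list, because a key may appear more than once.
print(params["q"])     # ['url parsing']  ("+" decodes to a space)
print(params["page"])  # ['2']            (values stay strings)
print(params["tag"])   # ['dns', 'http']
```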

🧭 Fragment

Handled entirely on the client side; never sent to the server.
Used for jumping to sections within a document (e.g., #faq, #top).
Also used in some SPA frameworks for client-side routing (/#/home).
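To make the "never sent to the server" point concrete, the sketch below builds the request target (the path-plus-query part that goes on the HTTP request line) from a URL. The helper name `request_target` is hypothetical, and the logic is a simplification of what a real HTTP client does.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path and query the server would see; the fragment is left out."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path is requested as "/"
    return f"{path}?{parts.query}" if parts.query else path


print(request_target("https://example.com:443/path/page?query=1#hash"))  # /path/page?query=1
print(request_target("https://example.com/#/home"))                      # /

# The fragment stays in the browser, where a hash-based SPA router can read it.
print(urlsplit("https://example.com/#/home").fragment)                   # /home
```

This is why, with hash-based routing, every client-side route looks like the same request to the server.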

Understanding how each component drives downstream behavior is key to debugging issues and designing robust applications.

Next: In the following chapter, we’ll look at how the browser uses the parsed host name to perform DNS resolution, and how that process determines the IP address needed to continue the request.
