A Simple Browser Request
A few years back a colleague mentioned an interview question he likes to ask:
"Explain what happens after you type a URL into the browser and press enter?"
The great thing about this question is the range and depth of the answer.
As a challenge I'm going to attempt to answer this question with my current knowledge.
- To start, the browser parses the input, splits the URL into segments and prepares an object representing a request.
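- The hostname of the URL needs to be resolved to an IP address, so the browser makes a DNS request, either through the OS resolver or, these days, via DNS over HTTPS. If the answer is already cached (in the browser or the OS), no request is needed.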
- Once it has the IP address, the browser opens a TCP connection to the server (port 443 for HTTPS), which is stateful and starts with a three-way handshake. A TLS handshake then runs on top of that connection. If both client and server support HTTP/3, the connection is made with QUIC over UDP instead.
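- For encryption, the client and server use the TLS handshake to agree on a temporary session key (symmetric encryption), which then encrypts the traffic back and forth. The public key (asymmetric encryption) in the domain's TLS certificate is what lets the client verify it is talking to the right server. In TLS 1.3 the session key itself comes from an ephemeral Diffie-Hellman exchange; the older RSA key exchange had the client encrypt a secret with the certificate's public key.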
- Once the connection is set up, the browser makes an HTTP GET request by sending a request line (the method and the path) followed by headers. These include Host, User-Agent, Accept, Accept-Encoding (for compression), and Cookie if there are any cookies for the site.
- The request isn't sent as one piece. The browser writes it to a socket, and the OS splits the byte stream into TCP segments and wraps each one in an IP packet. The OS then checks its routing table to decide whether the destination IP address is within the current local network; if not, it forwards the packets to the network router (the default gateway).
- The router does much the same: it looks at whether the destination IP is within its own network, and if not, forwards the packets to the next hop (typically into the ISP's network). A home router also performs NAT at this point, swapping the packet's private source IP for its own public one. This process continues hop by hop until the packets reach the destination network. Each router along the way decrements a time to live (TTL) counter on the packet, and once the counter reaches zero the packet gets dropped. This is to avoid packets looping around and clogging up the system.
- Assuming the destination IP address is found, the packets will go through the server's network firewall, which might drop them if it doesn't like the source IP.
- Assuming everything is OK, the packets will be routed within the server OS to the program listening on port 443. Usually this is the web server program.
- The web server program parses the headers in the request and decides what response it should give.
- The server could decide the request is 401 Unauthorized or 403 Forbidden, maybe because it parsed the Cookie header but the authentication data inside it wasn't valid or doesn't grant access. Maybe it decides there is nothing at the requested path (404 Not Found). On this occasion the server decides to respond with a 200 OK status, and in the body of the response a string of HTML data.
- The response packets are addressed to the source IP of the request and routed back hop by hop; the return path doesn't have to match the outbound one. Where NAT swapped the source IP on the way out, that router swaps the destination IP back to the original on the way in. To do this, NAT routers maintain a translation table that maps these connections back and forth.
- The packets make their way back through the client OS to the browser, which will start to scan the incoming HTML as each chunk arrives. Once it parses the head section of HTML it realises there are other resources it should get, usually CSS and JavaScript. The Request-Response cycle then repeats for those resources.
- When parsing the HTML (assuming UTF-8 encoding), the browser creates a Document Object Model (DOM), a tree-like structure representing the HTML page. A similar thing happens with CSS: it gets parsed by the CSS engine into the CSS Object Model (CSSOM). The two models are combined so the browser can decide what it should render, work out the layout, and ultimately which pixels it should paint onto the screen.
- If JavaScript is requested, it will be parsed by the browser's JS engine (e.g. V8 in Chrome). The source text is first compiled to V8 bytecode and run by an interpreter, and frequently executed code is later compiled into machine code. It all executes in a sandboxed environment. A script can block the browser from parsing the rest of the HTML, and so from initially painting, so it's important to keep this in mind (the defer and async attributes help), otherwise you sit around looking at a blank screen.
- An HTTP Response also contains headers, an important one being the Cache-Control header. This tells the browser how long it should keep hold of a resource and how it should go about checking for updates on that resource. Depending on the cache rules, future requests for the resource can skip the network and re-use the cached copy instead.
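
The first few steps above (splitting the URL, resolving the hostname, connecting over TCP and TLS, sending a GET) plus the Cache-Control parsing at the end can be sketched in Python. This is a rough illustration under my own assumptions, not how a real browser does it: the function names (split_url, build_get_request, parse_cache_control, fetch) and the User-Agent string are made up, it only speaks HTTP/1.1, and it ignores redirects, compression and chunked responses.

```python
import socket
import ssl
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into the pieces needed to build a request."""
    parts = urlsplit(url)
    # Fall back to the default port for the scheme when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.hostname, port, path


def build_get_request(host, path):
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",       # the request line carries the path
        f"Host: {host}",
        "User-Agent: sketch/0.1",     # hypothetical user agent string
        "Accept-Encoding: identity",  # no compression, to keep things simple
        "Connection: close",          # ask the server to close when done
        "",
        "",                           # blank line ends the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directives."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        # Directives without a value (e.g. no-store) are stored as True.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def fetch(url, timeout=5.0):
    """Resolve, connect, handshake, send a GET and return the raw response."""
    scheme, host, port, path = split_url(url)
    if scheme != "https":
        raise ValueError("this sketch only handles https URLs")
    # DNS lookup through the OS resolver; use the first address returned.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    context = ssl.create_default_context()  # verifies the server certificate
    with socket.socket(family, socktype, proto) as raw:
        raw.settimeout(timeout)
        raw.connect(addr)  # TCP three-way handshake
        # The TLS handshake happens inside wrap_socket.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_get_request(host, path))
            chunks = []
            while True:
                data = tls.recv(4096)  # the response arrives in pieces
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("https://example.com/") should return the raw response bytes, starting with the status line and headers, followed by the HTML body. Everything after that (routing, NAT, parsing, rendering) happens in layers this sketch never sees, which is rather the point of the question.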
This sums up the process of visiting a webpage. Diving further into this subject has significantly benefited my career, enabling me to bypass the complexities of various frameworks and libraries to grasp the underlying mechanics. As mentioned earlier, every phase of this process can be explored in greater depth, which is why it's such an excellent question.
I'll dive deeper into some of these steps in future articles.