Browser Under the Hood

Corey Lynch
6 min read · Feb 23, 2021
What exactly is their purpose?

A web browser receives a file from a server or your local disk and renders it. Behind the scenes, it uses a ‘browser engine’ to do that. Chrome’s engine is called ‘Blink’, which is a fork of ‘WebKit’, the engine Apple uses in Safari.

Man, Web Development is the BOM!

The browser engine receives raw bytes of data, typically over a network. So its first task is to convert that raw data into something it can work with: the Document Object Model (DOM).

The Document here is the same as the Document from the BOM

To do this, the browser engine begins by converting those bytes into characters. Remember the encoding you set at the top of your HTML file? That is the encoding your browser uses to decode the bytes into characters.

charset="UTF-8"
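To make the decoding step concrete, here is a minimal sketch using the standard TextDecoder API. The byte values are a made-up payload, and the second decode (one character per byte) is only there to show what happens when the wrong encoding is assumed.

```javascript
// Raw bytes as they might arrive from the network (hypothetical payload).
const bytes = new Uint8Array([
  0x3c, 0x70, 0x3e, 0x63, 0x61, 0x66, 0xc3, 0xa9, 0x3c, 0x2f, 0x70, 0x3e,
]);

// Decode with the encoding the document declares.
const asUtf8 = new TextDecoder("utf-8").decode(bytes);

// Treat every byte as its own character instead (Latin-1 style),
// which is roughly what a wrong charset declaration would cause.
const asLatin1 = String.fromCharCode(...bytes);

console.log(asUtf8);   // <p>café</p>
console.log(asLatin1); // <p>cafÃ©</p>
```

The two bytes 0xC3 0xA9 are a single "é" in UTF-8 but two separate characters when read byte by byte, which is why a mismatched charset shows up as garbled text.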

Next, the browser takes those characters and tokenizes them. Because the server labeled the file as HTML (or, for a local file, because of its .html extension), the browser assumes that certain groups of characters mean something special, such as <p> marking the start of a paragraph. Each of these special groupings is a token.
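The grouping idea can be sketched with a toy tokenizer. This is not how a real engine does it (the real HTML tokenizer is a large state machine that also handles attributes, comments, and entities); the tokenize function and its token shapes are made up for illustration.

```javascript
// Simplified tokenizer: splits markup into start tags, end tags, and text.
// It does not handle attributes, comments, or entities.
function tokenize(html) {
  const tokens = [];
  const pattern = /<\/([a-zA-Z][a-zA-Z0-9]*)>|<([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1]) {
      tokens.push({ type: "endTag", name: match[1].toLowerCase() });
    } else if (match[2]) {
      tokens.push({ type: "startTag", name: match[2].toLowerCase() });
    } else if (match[3].trim() !== "") {
      tokens.push({ type: "text", value: match[3].trim() });
    }
  }
  return tokens;
}

const tokens = tokenize("<body><p>Hello</p></body>");
console.log(tokens);
```

For this input the sketch yields five tokens: two start tags, one text token, and two end tags.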

Default: p {display: block; margin-top: 1em; margin-bottom: 1em; margin-left: 0; margin-right: 0;}

As tokens are produced, the browser creates nodes from them and links those nodes together into a tree data structure known as the DOM, which records how each node relates to the others (parent, child, sibling).
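Here is a rough sketch of how a list of tokens could be turned into a tree. The token list and the buildTree helper are hypothetical, and a real engine handles far more cases (implied tags, misnested markup, and so on), but the stack-of-open-elements idea is the core of it.

```javascript
// Hypothetical token list, in the shape a simplified tokenizer might emit.
const tokens = [
  { type: "startTag", name: "body" },
  { type: "startTag", name: "h1" },
  { type: "text", value: "Title" },
  { type: "endTag", name: "h1" },
  { type: "startTag", name: "p" },
  { type: "text", value: "Hello" },
  { type: "endTag", name: "p" },
  { type: "endTag", name: "body" },
];

// A start tag opens a node, text attaches to the currently open node,
// and an end tag closes the currently open node.
function buildTree(tokenList) {
  const root = { name: "#document", children: [] };
  const openElements = [root];
  for (const token of tokenList) {
    const parent = openElements[openElements.length - 1];
    if (token.type === "startTag") {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      openElements.push(node);
    } else if (token.type === "text") {
      parent.children.push({ name: "#text", value: token.value, children: [] });
    } else if (token.type === "endTag" && openElements.length > 1) {
      openElements.pop();
    }
  }
  return root;
}

const dom = buildTree(tokens);
console.log(JSON.stringify(dom, null, 2));
```

The result is a #document root with a body child, which in turn holds the h1 and p nodes, each with a text node beneath it.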

index.html was compiled, sent to the browser with my server, then interpreted by the browser like this.

Meanwhile, the browser makes a request for each stylesheet <link> it finds in your .html file, and that is how CSS styles enter the process.
Just like your HTML, the CSS arrives as raw bytes. Those bytes are converted into characters, tokenized according to the rules of CSS, and made into nodes, which are then linked into a tree structure. It is very similar to the DOM created from the HTML, but for CSS it is called the CSS Object Model (CSSOM).
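As a rough illustration of the CSS side, the sketch below turns a stylesheet string into rule objects. The parseCss function is made up for this example, and the flat list it returns is a simplification: the real CSSOM is a richer structure that also deals with at-rules, comments, specificity, and the cascade.

```javascript
// Simplified CSS parser: turns "selector { prop: value; }" rules into objects.
function parseCss(css) {
  const rules = [];
  const rulePattern = /([^{}]+)\{([^}]*)\}/g;
  let match;
  while ((match = rulePattern.exec(css)) !== null) {
    const declarations = {};
    for (const part of match[2].split(";")) {
      const colon = part.indexOf(":");
      if (colon === -1) continue; // skip empty fragments between semicolons
      const property = part.slice(0, colon).trim();
      const value = part.slice(colon + 1).trim();
      if (property) declarations[property] = value;
    }
    rules.push({ selector: match[1].trim(), declarations });
  }
  return rules;
}

const rules = parseCss("body { font-size: 16px; } p { margin-top: 1em; display: block; }");
console.log(rules);
```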

On the right are the interpreted styles for the body element. Can’t really see what it looks like before rendering.

From those two tree models, the DOM and the CSSOM, the browser creates the Render Tree. The Render Tree contains all the visible DOM content on the page along with the CSSOM information for each of those nodes. So if an element has been given the style display: none; it will not be added to the Render Tree, even though its HTML node exists in the DOM and its style rule exists in the CSSOM. It is not visible, so it has no place in a tree that contains only visible nodes.
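The filtering described here can be sketched in a few lines. The dom object below is a hypothetical stand-in for DOM nodes that already have their styles resolved, and buildRenderTree is an illustrative helper, not a browser API.

```javascript
// Hypothetical DOM nodes that already carry their resolved styles.
const dom = {
  name: "body",
  style: { display: "block" },
  children: [
    { name: "h1", style: { display: "block" }, children: [] },
    {
      name: "p",
      style: { display: "none" },
      children: [{ name: "span", style: { display: "inline" }, children: [] }],
    },
    { name: "p", style: { display: "block" }, children: [] },
  ],
};

// Copy only visible nodes; a display: none node drops out along with its subtree.
function buildRenderTree(node) {
  if (node.style.display === "none") return null;
  const children = node.children
    .map((child) => buildRenderTree(child))
    .filter((child) => child !== null);
  return { name: node.name, style: node.style, children };
}

const renderTree = buildRenderTree(dom);
console.log(renderTree.children.map((child) => child.name)); // [ 'h1', 'p' ]
```

The hidden p and the span inside it are still in the dom object, but neither makes it into renderTree.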
At the time the Render Tree is created, still nothing has been rendered to the user’s screen. Having a tree of visible elements and their styles doesn’t mean the browser knows where on the page each element lands. The layout of the page is the next thing the browser is responsible for calculating. In this layout or ‘Reflow’ step, the browser takes all the layout styles attached to the visible elements on the Render Tree and calculates their sizes and positions on the page. After the Reflow, the browser has everything it needs to render or ‘paint’ the elements on the Render Tree to the screen for the user to finally see.
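To give a feel for what calculating positions means, here is a toy layout pass that stacks block boxes from top to bottom. The boxes, their numbers, and the layout function are all invented for the example, and it ignores margin collapsing, inline flow, floats, and nearly everything else real layout has to handle.

```javascript
// Hypothetical render tree nodes: block boxes with a fixed height and vertical margin (px).
const boxes = [
  { name: "h1", height: 40, margin: 20 },
  { name: "p", height: 60, margin: 16 },
  { name: "p", height: 30, margin: 16 },
];

// Stack block boxes top to bottom; each box spans the full viewport width.
function layout(blockBoxes, viewportWidth) {
  let cursorY = 0;
  return blockBoxes.map((box) => {
    cursorY += box.margin; // top margin
    const placed = {
      name: box.name,
      x: 0,
      y: cursorY,
      width: viewportWidth,
      height: box.height,
    };
    cursorY += box.height + box.margin; // box itself plus bottom margin
    return placed;
  });
}

const positions = layout(boxes, 800);
console.log(positions);
```

Notice that each box’s position depends on everything above it, which is why changing one element’s size can force the browser to recalculate much of the page.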
So when does the actual functionality come into play? We most likely also have JavaScript, and that will be referenced by a <script> tag in our .html file.
Each time your browser sees a <script> tag (without async or defer) in your .html file, it stops DOM construction until the script has been downloaded and has finished executing. This is why it is common advice to put your <script> tags at the bottom of your .html file. If you put a <script> tag any higher, DOM construction stops at the tag, and if the script is slow to download or run, everything below it has to wait. That is, your page may stay blank or only partially loaded in the meantime. With the <script> tag below all of your HTML elements, your DOM is fully constructed before the parser pauses for script execution. This allows your page to be rendered even if something goes wrong during script execution.
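One consequence of this pausing is that a script can only see the part of the DOM that was built before it ran. The simulation below is a deliberately simplified model of that, not real parser code; the page array and the parse function are made up for illustration.

```javascript
// Hypothetical simplified page: markup chunks and an inline script, in source order.
// The script checks whether the footer exists at the moment it runs.
const page = [
  { type: "markup", name: "header" },
  { type: "script", run: (builtSoFar) => builtSoFar.includes("footer") },
  { type: "markup", name: "footer" },
];

// Walk the page in order. Markup is added to the tree; a script runs immediately,
// and nothing after it is processed until it returns.
function parse(items) {
  const built = [];
  const scriptResults = [];
  for (const item of items) {
    if (item.type === "markup") {
      built.push(item.name);
    } else {
      scriptResults.push(item.run([...built]));
    }
  }
  return { built, scriptResults };
}

const early = parse(page); // script in the middle
const late = parse([page[0], page[2], page[1]]); // script moved to the bottom

console.log(early.scriptResults); // [ false ]: footer did not exist yet
console.log(late.scriptResults);  // [ true ]: footer was already built
```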

At the bottom of the body because it halts DOM construction.

This whole process of taking in data and rendering it to the screen for a user is known as the Critical Rendering Path (CRP). A large part of website optimization is CRP optimization. So when we put our <script> tags at the bottom of our .html file, or use async in our <script> tags, we are optimizing the CRP.

async means keep going while I download this, so DOM construction is not halted while the script is being fetched.
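A back-of-the-envelope timing model can show why this matters. Everything here is an assumption made for the sketch: the domReadyTime function, the millisecond figures, and the simplification that an async script interrupts parsing only if it arrives before parsing is done.

```javascript
// Simplified timing model in milliseconds. All numbers are made-up inputs.
function domReadyTime({ parseBefore, parseAfter, download, execute }, mode) {
  if (mode === "blocking") {
    // The parser stops at the tag, waits for download and execution, then resumes.
    return parseBefore + download + execute + parseAfter;
  }
  // async: the download overlaps with parsing. The script still runs on the main
  // thread, so it interrupts parsing only if it arrives before parsing has finished.
  const arrivesDuringParsing = download < parseAfter;
  return parseBefore + parseAfter + (arrivesDuringParsing ? execute : 0);
}

const page = { parseBefore: 10, parseAfter: 40, download: 200, execute: 30 };

console.log(domReadyTime(page, "blocking")); // 280
console.log(domReadyTime(page, "async"));    // 50
```

In this model the blocking script makes the DOM wait for the whole download, while the async script lets parsing finish long before the script arrives.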

So to summarize, a user uses their browser to request data from a web server. The browser receives that data as raw bytes, which it can’t understand until its engine decodes those bytes into characters and tokenizes those characters according to the rules of HTML. It builds a node from each element’s tokens and then records the relationships between the nodes by linking them into a tree model known as the DOM. Meanwhile, when the browser reads a <link> tag in that .html file, it begins the same process on the linked file. If that file is a stylesheet, a similar model known as the CSSOM is constructed instead of a DOM. Once the HTML and CSS nodes and their relationships exist in their respective trees, the browser combines those trees into the Render Tree, an object model representing all the visible elements on the page, each node carrying its specified styles. Once the Render Tree is complete, the browser calculates the position of each element in relation to the others and then paints the elements to the screen.

At this point, the user is able to view the elements contained in the file received by their browser. Assuming the <script> tags come after all the elements in the .html file, the script is then executed and the functionality of the page is added. Otherwise, the user may be looking at a partially loaded page while the script runs.

I still prefer to keep scripts at the bottom whenever possible.

I hope you found this article useful, and thank you for your time. If you have any questions, comments, or corrections, all are welcome. You can leave them below or message me on LinkedIn.

Corey Lynch

Frontend Software Developer and Security Technician with experience in Ruby, Rails, JavaScript, and React. Flatiron Software Engineering Alumni.