Node.js Internals (How Node.js Works Under the Hood)

status: ongoing
Most people don't know that Node.js runs JavaScript on a single thread. This fact raises an important question, and that question motivated me to learn Node.js internals.
Question: If Node.js runs JavaScript on a single thread, how can it handle millions of requests without becoming blocked?
After spending hours reading documentation, watching videos, and studying various resources, I finally have a clear understanding of how Node.js works internally.
In this article, I will share what I learned and explain the concepts in a simple and practical way that you will never forget :)
What is Node.js
Node.js is a runtime: a program that takes JavaScript code as input, parses it, compiles it into machine code, and then executes it.
Things you should know before understanding the Node.js Architecture/Internals
Callbacks
V8 Engine
Libuv
Min Heap
Event Loop
V8 Engine
V8 is the JavaScript engine developed by Google for its Chrome browser. Later, the Node.js developers took V8 and combined it with additional components such as libuv and C++ APIs to allow JavaScript to run outside the browser.
Tasks of the V8 Engine
Parsing
Interpretation
Compilation
Optimization
Garbage Collection
Execution
Libuv Library
Libuv is one of the most important pieces of Node.js: it is the library that lets Node.js run asynchronous operations.
In simple terms, libuv is the library that interacts with the operating system to handle async operations such as I/O, timers, network calls, and thread management. We will talk later about the APIs it exposes, which Node.js calls while performing async operations.
Learn more at: libuv.org
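If you want to confirm that V8 and libuv really ship inside your Node.js binary, you can print the versions Node.js reports for its bundled components. This is only a small sketch: process.versions is a standard Node.js property, while the variable name bundled is mine.

```javascript
// process.versions lists the components bundled into the running Node.js binary.
const { node, v8, uv } = process.versions;

// "uv" is the key Node.js uses for libuv.
const bundled = { node, v8, libuv: uv };
console.log(bundled);
```

The exact version strings depend on the Node.js release you have installed.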
Event Loop
The event loop is a part of the libuv library that constantly looks for callbacks in its queues and executes them one by one. Later we will look at how and when the event loop starts, how it executes callbacks, and what the queues I am talking about are.
There are different phases in the event loop, and each phase has its own task to perform.
Event Loop Phases
Initial Phase
Timer Phase
Pending Callbacks Phase
Idle/Prepare Phase
Poll Phase
Check Phase
Close Callbacks Phase
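To make the idea of "phases visited in a fixed order" concrete, here is a toy model. It is not how libuv is written (libuv implements the loop in C). The helper createLoop and its enqueue/runOnce methods are hypothetical names I made up for illustration, and the "initial phase" is left out because it is not a real event loop phase.

```javascript
// Toy model only: real libuv implements these phases in C.
const phases = [
  "timers",
  "pending callbacks",
  "idle, prepare",
  "poll",
  "check",
  "close callbacks",
];

// Hypothetical helper: one callback queue per phase.
function createLoop() {
  const queues = new Map(phases.map((name) => [name, []]));
  return {
    enqueue(phase, callback) {
      queues.get(phase).push(callback);
    },
    // One iteration: visit the phases in order and drain each queue.
    runOnce() {
      for (const name of phases) {
        const queue = queues.get(name);
        while (queue.length > 0) queue.shift()();
      }
    },
  };
}

const order = [];
const loop = createLoop();
// Enqueue in a "wrong" order on purpose.
loop.enqueue("check", () => order.push("check"));
loop.enqueue("timers", () => order.push("timers"));
loop.enqueue("poll", () => order.push("poll"));
loop.runOnce();
console.log(order); // [ 'timers', 'poll', 'check' ]
```

No matter in which order the callbacks were queued, they run in phase order within one iteration.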
Initial Phase
First of all, there is no such phase in the event loop. I just named it the initial phase because it consists of everything that happens before the event loop kicks in.
In this phase, the JavaScript code we wrote is parsed and executed line by line by the V8 engine. But V8 does not know how to perform asynchronous operations such as setTimeout or I/O, so when it reaches such code it calls the APIs that Node.js provides for those operations. Under the hood, those APIs are implemented with the help of libuv.
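You can check for yourself that setTimeout comes from Node.js and not from V8. A minimal sketch, assuming Node's built-in vm module: vm.runInNewContext evaluates code in a fresh V8 context that has the JavaScript language built-ins but not the extra globals Node.js adds. The variable names inBareV8 and inNode are mine.

```javascript
const vm = require("vm");

// A fresh context contains the ECMAScript built-ins that V8 provides,
// but none of the globals that Node.js adds on top.
const inBareV8 = {
  Promise: vm.runInNewContext("typeof Promise"),
  setTimeout: vm.runInNewContext("typeof setTimeout"),
};

// The normal Node.js global scope includes the runtime's own APIs.
const inNode = {
  Promise: typeof Promise,
  setTimeout: typeof setTimeout,
};

console.log({ inBareV8, inNode });
```

Promise is part of the language, so it exists in both places. setTimeout only shows up once Node.js has installed it.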
Let's look at how the following code is executed in the initial phase:
console.log("hello world");
setTimeout(()=>console.log("This is callback"),1000);
console.log("end of js code");
Here, as you can see, we have both normal synchronous code and asynchronous code.
How Node.js runs the above code
Step 1: V8 sees console.log("hello world"). It is synchronous code, so it is executed right there.
Step 2: setTimeout(()=>console.log("This is callback"),1000)
When V8 encounters setTimeout(), it does not set the timer itself, because setTimeout is not a built-in JavaScript feature provided by V8. Instead, setTimeout is an API exposed by Node.js. When the function is called, Node.js handles the request and internally relies on libuv's timer API, conceptually a call such as uv_timer_start(timer_handle, callback, 1000, 0), to register and manage the timer.
Libuv receives the delay from Node.js and converts it into an absolute expiration timestamp. It then stores the timer in an internal min-heap ordered by expiration time. Each timer object contains the expiration timestamp and a reference to the callback function.
However, libuv does not execute the callback immediately. Instead, the callback is only executed later when the event loop enters the Timers phase.
Step 3: console.log("end of js code") is synchronous, so it is handled the same way as in Step 1.
In conclusion, this phase is responsible for executing all the synchronous code. One important thing to remember is that the event loop has not started yet.
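Here is a small variation of the three-line example that records the order of execution instead of only printing it. It is a sketch: the log array and the done promise are names I added, and I shortened the delay to 10 ms so it finishes quickly.

```javascript
const log = [];

log.push("hello world"); // synchronous: runs immediately

const done = new Promise((resolve) => {
  // The timer is only registered here; the callback runs later,
  // once the event loop has started and reached the timers phase.
  setTimeout(() => {
    log.push("This is callback");
    resolve(log);
  }, 10);
});

log.push("end of js code"); // still synchronous

console.log(log.slice()); // [ 'hello world', 'end of js code' ]

done.then((finalLog) => console.log(finalLog));
// [ 'hello world', 'end of js code', 'This is callback' ]
```

When the last synchronous line finishes, the timer callback has not run yet. It is added to the log only after the event loop starts.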
Timer Phase
This is actually the first phase of the event loop. It starts when the initial phase is completed and the event loop begins running.
So what does this phase do?
In the initial phase, I said that setTimeout() does not run the callback immediately; it just puts the timer and its callback in the heap. In this phase, the callbacks of expired timers are executed.
Internally, when the event loop enters the timers phase, libuv takes a snapshot of the current time at the beginning of that iteration. It then compares this value with the expiry time of the node at the root of the min-heap.
If the timer has expired, its callback is executed and the timer is removed from the heap. After removal, the heap is reorganized and the next root element is checked using the same snapshot.
This process continues until there are no more timers whose expiry time is less than or equal to the snapshot time.
Many people have the misconception that the current time is recalculated for every comparison during the same iteration. In reality, the current time is computed only once for each timers phase.
Logic Code:
// Simplified logic that the timers phase runs on every event loop iteration.
// heap and getCurrentTime() are placeholders for libuv internals.
function processTimers() {
  // take a snapshot of the current time at the start of this event loop iteration
  const currentTime = getCurrentTime(); // (internally: uv_now(loop))
  // keep checking the root of the heap while timers are expired; note that currentTime is never updated
  while (heap.isNotEmpty() && heap.peekRoot().expiryTime <= currentTime) {
    // remove the timer with the smallest expiry time
    const timer = heap.popRoot();
    // execute its callback
    timer.callback();
  }
}
At this point, you might have a question: what happens if the callback of an expired timer takes a long time to execute and, during that execution, another timer's expiry time is reached?
The answer lies in how the timers phase processes timers. As explained above, libuv takes a snapshot of the current time at the beginning of each timers phase, and all timer comparisons during that phase are made against this snapshot.
As a result, even if the execution of one callback takes so long that another timer's expiry time is reached, the newly expired timer will not be executed during the current iteration, because it is still being compared against the original time snapshot. Instead, it will be processed in the next event loop iteration, when a fresh time snapshot is taken.
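The scenario above can be played through in a small simulation. This is a sketch under stated assumptions, not libuv's source: it uses a fake clock instead of real time, and MinHeap, addTimer, and timersPhase are hypothetical names of my own. It only mirrors the behaviour described in this section.

```javascript
// Simulation with a fake clock; none of these names are libuv APIs.
class MinHeap {
  constructor() { this.items = []; }
  get size() { return this.items.length; }
  peek() { return this.items[0]; }
  push(timer) {
    const a = this.items;
    a.push(timer);
    let i = a.length - 1;
    while (i > 0) { // sift up until the parent expires no later
      const parent = (i - 1) >> 1;
      if (a[parent].expiry <= a[i].expiry) break;
      [a[parent], a[i]] = [a[i], a[parent]];
      i = parent;
    }
  }
  pop() {
    const a = this.items;
    const root = a[0];
    const last = a.pop();
    if (a.length > 0) {
      a[0] = last;
      let i = 0;
      for (;;) { // sift down to restore the heap order
        const left = 2 * i + 1;
        const right = left + 1;
        let smallest = i;
        if (left < a.length && a[left].expiry < a[smallest].expiry) smallest = left;
        if (right < a.length && a[right].expiry < a[smallest].expiry) smallest = right;
        if (smallest === i) break;
        [a[smallest], a[i]] = [a[i], a[smallest]];
        i = smallest;
      }
    }
    return root;
  }
}

let clock = 0; // fake "current time" in ms
const heap = new MinHeap();
const ran = [];

function addTimer(name, delay, cost) {
  heap.push({
    name,
    expiry: clock + delay, // delay converted to an absolute expiry time
    callback() {
      ran.push(name);
      clock += cost; // pretend the callback takes `cost` ms to run
    },
  });
}

function timersPhase() {
  const snapshot = clock; // taken once per timers phase
  while (heap.size > 0 && heap.peek().expiry <= snapshot) {
    heap.pop().callback();
  }
}

addTimer("A", 100, 50); // expires at 100, callback takes 50 ms
addTimer("B", 120, 0);  // expires at 120

clock = 100;
timersPhase();    // snapshot = 100: A runs, clock becomes 150
console.log(ran); // [ 'A' ]
timersPhase();    // snapshot = 150: now B runs
console.log(ran); // [ 'A', 'B' ]
```

After A finishes, the fake clock is already at 150, which is past B's expiry of 120. B still waits for the next call to timersPhase, because the first call keeps comparing against its snapshot of 100.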
Important to Understand
setTimeout() does not guarantee exact timing. If you set a delay of 1 second, the callback is only guaranteed to run no earlier than 1 second later, and it may run later than that.
The Node.js event loop does not start its phases until the initial phase is completed. Only after that does the event loop begin with the timers phase. So if the initial phase contains long-running synchronous work, the timer is automatically delayed, because Node.js runs JavaScript on a single thread.
However, timers may still be delayed even if the initial phase completes quickly, for example when the event loop is busy executing other higher-priority callbacks, which we will talk about in the upcoming phases.
An example of what I said earlier:
setTimeout(() => console.log("timer fired"), 1000);
while (true) {}
Because the while loop is synchronous code, it executes in the initial phase, and we know the event loop only starts after the initial phase is completed. Because the loop never ends, Node.js is never able to start the event loop, which makes it impossible to reach the timers phase and execute the setTimeout callback. So the timer here is delayed forever.
This is just one example showing how setTimeout can be delayed. There are many similar examples available online that you can explore, analyze, and experiment with to deepen your understanding.
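If you want to measure the delay instead of freezing the process, you can replace the infinite loop with a loop that blocks for a fixed time. A minimal sketch: the 10 ms and 50 ms values and the names start and elapsedPromise are arbitrary choices of mine.

```javascript
const start = Date.now();

// Ask for 10 ms, then report how long it really took.
const elapsedPromise = new Promise((resolve) => {
  setTimeout(() => resolve(Date.now() - start), 10);
});

// Block the single thread for about 50 ms.
while (Date.now() - start < 50) {
  // busy-wait: synchronous code, so the event loop cannot start yet
}

elapsedPromise.then((elapsed) => {
  console.log(`asked for 10 ms, callback ran after ${elapsed} ms`);
});
```

The printed value will be at least 50, because the callback cannot run until the synchronous loop has finished.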
Pending Callbacks Phase
This is one of the more interesting phases of the event loop. It took me quite some time to fully understand what actually happens here, but it is much simpler than you might think.
The Pending Callbacks Phase is the second phase of the event loop. One important thing to remember is that during the first iteration of the event loop, this phase usually has nothing to execute, so it does nothing.
Nothing to execute? Let's see.
Its main purpose is to check a queue called the "pending callbacks queue" and execute the callbacks available there. This queue holds certain callbacks that were postponed by the Poll Phase to run in the next event loop cycle.
Since the Poll Phase occurs later in the event loop cycle, there are no callbacks available in the queue when the Pending Callbacks Phase runs for the very first time. As a result, this phase has nothing to process or execute.
At this point, you might be wondering, just like I did, what kind of callbacks are actually stored in the pending callbacks queue.
The interesting thing is that this queue does not hold the callbacks you might expect, such as those from setTimeout(), setInterval(), or setImmediate(). Instead, it mainly contains certain system-level I/O callbacks, especially callbacks related to network and socket operations. For now, just remember that this queue mainly contains system-level I/O callbacks.
We will see exactly how such callbacks end up here when we reach the Poll Phase.
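Until then, a toy model can show why the queue is empty on the first iteration. Everything here is hypothetical and made up for illustration: pendingQueue, pendingCallbacksPhase, and pollPhase are not libuv's real data structures or functions, and the "socket error" is only pretended.

```javascript
// Toy model only: hypothetical names, not libuv's real data structures.
const pendingQueue = [];
const trace = [];

function pendingCallbacksPhase(iteration) {
  trace.push(`iteration ${iteration}: found ${pendingQueue.length} pending callback(s)`);
  while (pendingQueue.length > 0) pendingQueue.shift()();
}

function pollPhase(iteration) {
  // Pretend the poll phase hit a socket error it prefers to report later.
  if (iteration === 1) {
    pendingQueue.push(() => trace.push("deferred I/O callback executed"));
  }
}

for (let iteration = 1; iteration <= 2; iteration++) {
  pendingCallbacksPhase(iteration); // comes before poll in every iteration
  pollPhase(iteration);
}

console.log(trace);
```

In the first iteration the pending phase finds 0 callbacks. The callback that the poll phase deferred is only picked up when the pending phase runs again in the second iteration.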

