Memory Management in Node.js

If you have ever coded in C or C++, you probably used functions like malloc, calloc, and free to manage memory manually. You request a block of memory, use it, and then release it when you're finished. If you forget to free memory, it leaks. If you free it too early, or free it twice, you get crashes or undefined behavior. While manual memory management gives you control, it also makes it easy to create memory bugs.

JavaScript is quite different. You don’t allocate memory manually. You don’t free it either. You simply create variables, objects, arrays, and functions, and the memory they need is there. This doesn’t mean JavaScript uses no memory. It means the JavaScript engine takes responsibility for managing it.

In Node.js, the engine is V8.

In this post, we will explore how memory management and garbage collection work within the V8 engine, starting from the operating system level and gradually moving inward.

From previous discussions, we understand how threads operate in Node.js. Node.js primarily runs JavaScript on a main thread, and it can use worker threads for parallel tasks. Each thread requires memory to execute code, store variables, and track function calls.

The portion of a Node.js process's memory that is currently held in physical RAM is known as the Resident Set Size, or RSS for short. It counts real memory in RAM, not virtual address space that has been reserved but never touched, and not pages that have been swapped out.

RSS includes everything the process utilizes: stack memory for threads, heap memory for JavaScript objects, native memory used by C++ code in Node.js, buffers, runtime metadata, and loaded libraries. RSS is a broad category. When RSS increases, it means your process is using more actual RAM, not just JavaScript objects.
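To see how these categories relate on a real process, here is a minimal sketch using Node's built-in process.memoryUsage(). The helper name toMiB is ours, and the numbers will differ on every run and machine.

```javascript
// Convert a byte count into a short human-readable string.
function toMiB(bytes) {
  return (bytes / 1024 / 1024).toFixed(1) + ' MiB';
}

// Snapshot of the memory figures Node.js reports for this process.
const usage = process.memoryUsage();

console.log('rss          ', toMiB(usage.rss));          // physical RAM held by the whole process
console.log('heapTotal    ', toMiB(usage.heapTotal));    // memory V8 has set aside for its heap
console.log('heapUsed     ', toMiB(usage.heapUsed));     // part of that heap currently occupied
console.log('external     ', toMiB(usage.external));     // C++ objects bound to JavaScript objects
console.log('arrayBuffers ', toMiB(usage.arrayBuffers)); // ArrayBuffers and Buffers (counted in external)
```

If rss keeps growing while heapUsed stays flat, the growth is probably outside the JavaScript heap, for example in buffers or native code.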

For JavaScript code, the two parts of this memory that matter most are stack memory and heap memory.

Before we proceed, let’s clarify what we mean by data. In this context, data refers to values used by a JavaScript program. A number is data. A string is data. An object consists of smaller values and is considered structured data. At its core, all this data turns into bytes stored in memory.

Some of this data resides on the stack.

As a mental model, the stack holds data with a fixed size and a predictable lifespan. This includes primitive values like numbers, booleans, null, and undefined. It also covers function parameters, return addresses, and local variables. The stack also stores references, which are pointers to objects that exist elsewhere. This is a simplification: V8 can keep small integers inline, but it may store other numbers as small objects on the heap.

This stack is the same call stack seen in stack traces. Every time a function is called, a new stack frame is created. When the function returns, that frame is destroyed automatically. Stack memory is very fast and simple to manage, but it has limited size.
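The limited size of the stack is easy to observe. The sketch below recurses until V8 refuses to create another frame. The function name measureStackDepth is ours, and the depth you get depends on the Node.js version, the platform, and how much each frame stores.

```javascript
// Recurse until the call stack is full, then unwind and count the frames.
function measureStackDepth() {
  try {
    return 1 + measureStackDepth();
  } catch (err) {
    // V8 throws a RangeError ("Maximum call stack size exceeded")
    // when no more stack frames fit.
    return 1;
  }
}

const depth = measureStackDepth();
console.log(`Reached roughly ${depth} nested calls before the stack was full`);
```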

The heap is for everything that doesn’t fit those criteria.

Data with a dynamic size or unpredictable lifespan is stored in heap memory. Objects, arrays, functions, closures, and most strings are located here. When you assign an object to a variable, the variable itself resides on the stack, holding only a reference. The actual object is stored somewhere in the heap.
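A small example makes the difference between a value and a reference visible. The variable names are only illustrative.

```javascript
// Primitives are copied by value: changing the copy leaves the original alone.
let count = 1;
let copyOfCount = count;
copyOfCount = 2;

// Objects are shared by reference: both variables point at the same heap object.
const user = { name: 'Ada' };
const sameUser = user;
sameUser.name = 'Grace';

console.log(count);             // 1
console.log(user.name);         // Grace
console.log(user === sameUser); // true
```

Assigning user to sameUser did not copy the object. It copied the reference, so a change made through one variable is visible through the other.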

Now, let’s look deeper into the heap itself.

The heap is not just one large memory block. V8 divides the heap into multiple spaces, each designed for specific kinds of data and garbage collection strategies.
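You can list the spaces your own Node.js version uses with the built-in v8 module. This is a minimal sketch. The exact set of space names depends on the V8 version bundled with your Node.js release, so treat the output as informational.

```javascript
const v8 = require('node:v8');

// One entry per heap space, with its name and size figures in bytes.
const heapSpaces = v8.getHeapSpaceStatistics();

for (const space of heapSpaces) {
  const usedKiB = Math.round(space.space_used_size / 1024);
  const sizeKiB = Math.round(space.space_size / 1024);
  console.log(`${space.space_name.padEnd(28)} ${usedKiB} KiB used of ${sizeKiB} KiB`);
}
```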

When you create an object, array, or a complex string, it typically starts in a part of the heap called the New Space. The new space is optimized for short-lived objects since most objects created in JavaScript die quickly.

The new space is divided into two regions called from space and to space. When a new object is allocated, it goes into the from space. Allocation here is very fast and usually means just moving a pointer forward.

As your application runs, the from space fills up. When there is no more room to allocate new objects, V8 triggers a minor garbage collection, also known as a scavenge. The collector that performs it is called the Scavenger.

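During a minor garbage collection, V8 starts with a set of root references. These roots include the local variables of currently executing functions on the stack and global objects. For a minor collection, references from older objects into the new space are treated as roots too, so V8 does not have to scan the whole heap. From these roots, V8 follows references into the heap.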

V8 does not inspect every object in the from space. It only visits the objects it can reach. Each reachable object is copied into the to space. Anything that cannot be reached from a root is garbage, and it is simply left behind.

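Once all live objects have been copied, the entire from space is cleared. Then, the roles of from space and to space are swapped. The old to space becomes the new from space, and allocation continues. Because the work depends on how many objects are still alive rather than on how much garbage there is, and most young objects die quickly, this copying and swapping mechanism makes minor garbage collection fast.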
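The copy-and-swap idea can be sketched as a toy model. This is not how V8 is implemented: the class ToyNewSpace, its fields, and the refs array are all invented for illustration, and "copying" here just means moving an object from one Set to another.

```javascript
// Toy model of a semi-space (copying) collector, for illustration only.
class ToyNewSpace {
  constructor() {
    this.fromSpace = new Set(); // new objects are allocated here
    this.toSpace = new Set();   // stays empty between collections
  }

  // "Allocate" an object that can hold references to other objects.
  allocate(name) {
    const obj = { name, refs: [] };
    this.fromSpace.add(obj);
    return obj;
  }

  // Copy everything reachable from the roots, drop the rest, swap the spaces.
  scavenge(roots) {
    const worklist = [...roots];
    while (worklist.length > 0) {
      const obj = worklist.pop();
      if (this.toSpace.has(obj)) continue; // already copied
      this.toSpace.add(obj);               // live object survives
      worklist.push(...obj.refs);          // follow its references
    }
    const reclaimed = this.fromSpace.size - this.toSpace.size;
    this.fromSpace.clear(); // unreachable objects disappear in one step
    [this.fromSpace, this.toSpace] = [this.toSpace, this.fromSpace];
    return reclaimed;
  }
}

const space = new ToyNewSpace();
const a = space.allocate('a');
const b = space.allocate('b');
space.allocate('c'); // nothing refers to this object
a.refs.push(b);

const reclaimed = space.scavenge([a]); // only `a` is a root
console.log(reclaimed);                // 1
console.log(space.fromSpace.size);     // 2
```

Note that the loop never touches the unreachable object. It disappears when the old from space is cleared, which is why the cost depends on the survivors and not on the garbage.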

Objects that survive more than one minor garbage collection (typically two) are considered long-lived. V8 promotes these objects into another region called the Old Space.

There is a special case. Very large objects skip the regular new space and are allocated in a dedicated large object space, described below. This avoids repeatedly copying large objects and helps protect the limited size of the new space.

Old space is the resting place for long-lived objects. Application state, cached data, global objects, and long-lived closures tend to end up here. Garbage collection in old space is more resource-intensive and is managed through major garbage collection.

Major garbage collection uses techniques called Mark-Sweep and Mark-Compact. First, V8 marks every object reachable from the root references. Then it sweeps away the unmarked objects and frees their memory. When memory becomes fragmented, live objects are also moved closer together (compaction), which reduces fragmentation and improves memory locality.
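Node.js can report when minor and major collections happen through the perf_hooks module. The following is a minimal sketch: the helper gcKindName and its labels are ours, and whether any collection fires during such a short script depends on the machine and the Node.js version.

```javascript
const { PerformanceObserver, constants } = require('node:perf_hooks');

// Human-readable labels for the GC kinds Node.js reports (labels are ours).
const GC_KIND_NAMES = new Map([
  [constants.NODE_PERFORMANCE_GC_MINOR, 'minor (scavenge)'],
  [constants.NODE_PERFORMANCE_GC_MAJOR, 'major (mark-sweep / mark-compact)'],
  [constants.NODE_PERFORMANCE_GC_INCREMENTAL, 'incremental marking'],
  [constants.NODE_PERFORMANCE_GC_WEAKCB, 'weak callbacks'],
]);

function gcKindName(kind) {
  return GC_KIND_NAMES.get(kind) ?? `unknown (${kind})`;
}

// Log every garbage collection the runtime reports.
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    // Newer Node.js versions expose the kind on entry.detail, older ones on entry.kind.
    const kind = entry.detail ? entry.detail.kind : entry.kind;
    console.log(`${gcKindName(kind)} GC took ${entry.duration.toFixed(2)} ms`);
  }
});
observer.observe({ entryTypes: ['gc'] });

// Create a burst of short-lived objects to give the collector some work.
let junk = [];
for (let i = 0; i < 200000; i++) junk.push({ i });
junk = null;

// Stop observing shortly afterwards so the script can exit.
setTimeout(() => observer.disconnect(), 100);
```

In a long-running service, frequent or slow major collections in this log are often a sign that too many objects are reaching old space.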

In older versions of V8, old space was itself divided into two areas. One area held objects that mainly contain references to other objects. The other held objects that only store raw data like numbers or bytes, which the garbage collector never needed to scan for references. Current versions of V8 no longer make this split and report a single old space.

In addition to new space and old space, the heap also includes a Large Object Space. Objects larger than a certain size are allocated here. Each large object has its own memory chunk, often supported by a separate memory mapping. These objects are never moved by the garbage collector and are only freed when they are no longer reachable.
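A rough way to watch this happen is to compare the large object spaces before and after building a big array. This is a sketch under assumptions: the helper largeObjectBytes is ours, it sums every space whose name contains large_object_space because the exact names vary by V8 version, and the size threshold for "large" is an internal detail that can change.

```javascript
const v8 = require('node:v8');

// Total bytes in use across all spaces whose name mentions large objects.
function largeObjectBytes() {
  return v8.getHeapSpaceStatistics()
    .filter((space) => space.space_name.includes('large_object_space'))
    .reduce((total, space) => total + space.space_used_size, 0);
}

const before = largeObjectBytes();

// A million-element array needs a backing store of several megabytes.
const bigArray = [];
for (let i = 0; i < 1_000_000; i++) bigArray.push(i);

const after = largeObjectBytes();

console.log(`large object spaces grew by about ${Math.round((after - before) / 1024)} KiB`);
console.log(`array is still referenced, length ${bigArray.length}`);
```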

Another key part of the heap is the Code Space. This is where V8 keeps machine code generated from JavaScript. To understand this, we need to discuss JIT.

The CPU doesn’t execute JavaScript directly. V8 first parses JavaScript into an abstract syntax tree and compiles it to bytecode, which its interpreter (Ignition) runs. As the program runs, V8 monitors which parts of the code are executed frequently. These hot paths are then compiled into optimized machine code by an optimizing compiler (TurboFan). This approach is called Just-In-Time compilation, often shortened to JIT.

JIT compilation means the engine translates JavaScript into native machine instructions while the program is executing, rather than doing it beforehand. This compiled machine code is stored in code space and executed directly by the CPU, which is how JavaScript can be fast despite being a high-level language.

Code space is executable memory. It is distinct from object data and managed carefully for performance and security reasons.
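Node.js exposes a few figures about compiled code through the built-in v8 module. A minimal sketch follows. The values are in bytes and vary with the Node.js version and with how much code has run so far.

```javascript
const v8 = require('node:v8');

// Size of compiled code, bytecode, and script source tracked by V8.
const codeStats = v8.getHeapCodeStatistics();

console.log('machine code and metadata:', codeStats.code_and_metadata_size, 'bytes');
console.log('bytecode and metadata:    ', codeStats.bytecode_and_metadata_size, 'bytes');
console.log('external script source:   ', codeStats.external_script_source_size, 'bytes');
```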

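There are also smaller specialized spaces within the heap. Depending on the V8 version, these have included map space, cell space, and property cell space, and newer versions have folded some of them into other spaces. They store object shapes (which V8 calls maps, also known as hidden classes) and internal metadata. Object shapes describe how an object's properties are laid out in memory, which lets V8 optimize property access.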
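Shapes are internal to V8 and cannot be inspected from plain JavaScript, so the sketch below only shows the coding pattern. The assumption, which matches how V8's shape tracking is commonly described, is that objects created with the same properties in the same order can share a shape, while objects built up in a different order usually cannot. The function name createPoint is ours.

```javascript
// Objects built by the same factory get their properties in the same order.
function createPoint(x, y) {
  return { x, y };
}

const p1 = createPoint(1, 2);
const p2 = createPoint(3, 4);

// Same property names, but added in a different order and at different times.
const p3 = { y: 6 };
p3.x = 5;

// Property order is the only part of this that plain JavaScript can observe.
console.log(Object.keys(p1)); // [ 'x', 'y' ]
console.log(Object.keys(p2)); // [ 'x', 'y' ]
console.log(Object.keys(p3)); // [ 'y', 'x' ]
```

Initializing all properties up front, in a consistent order, is a simple habit that tends to keep property access on the fast path.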

This whole memory management system explains why JavaScript developers do not allocate or free memory manually. V8 automatically allocates memory as needed and reclaims it when objects are no longer reachable.

However, this doesn’t mean memory management is without responsibility. The garbage collector only reclaims objects that are no longer reachable. If you hold onto references longer than necessary, especially through global variables or closures, those objects stay reachable and can never be collected, which leads to memory leaks.
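A common version of this problem is a module-level cache that only ever grows. The sketch below contrasts it with a cache that evicts its oldest entry. Both leakyCache and BoundedCache are hypothetical names for illustration, and the eviction policy is just one simple option.

```javascript
// Leak pattern: a module-level Map that is never cleared keeps every value reachable.
const leakyCache = new Map();
function rememberForever(key, value) {
  leakyCache.set(key, value);
}

// One possible fix: cap the number of entries and evict the oldest one.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // a Map iterates in insertion order
  }

  set(key, value) {
    this.entries.delete(key); // re-inserting moves the key to the newest position
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey); // the evicted value can now be collected
    }
  }

  get(key) {
    return this.entries.get(key);
  }

  get size() {
    return this.entries.size;
  }
}

const bounded = new BoundedCache(100);
for (let i = 0; i < 1000; i++) {
  rememberForever(i, { payload: i });
  bounded.set(i, { payload: i });
}

console.log(leakyCache.size); // 1000
console.log(bounded.size);    // 100
```

Once an entry is evicted and nothing else refers to its value, the value becomes unreachable and the garbage collector is free to reclaim it.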