Process: A Tale of Code in Motion
Last updated: February 6th 2025
Introduction
In a previous article, we compared Node.js and PHP, focusing on how they handle blocking operations. The key difference? Node.js sidesteps the issue by ensuring its process keeps chugging along even when encountering blocking tasks, whereas PHP, by contrast, typically grinds to a halt, pausing the entire process until the task completes.
But what exactly is a process? And how does it differ from, or relate to, a program? Imagine a program as a set of instructions—a blueprint if you will—that tells your computer what needs to be done. A process, on the other hand, is the execution of that blueprint. It’s the active instance that runs in your system’s memory, handling tasks and interacting with the operating system. When you launch a program, you're essentially creating a process. This distinction is crucial, especially in the context of server-side environments like Node.js and PHP.
Act 1: The Static Program
Imagine you’re a chef. Your recipe is a program: a set of instructions written on paper. It’s static, lifeless, and waiting to be executed. Similarly, a program is a collection of code—whether written in C, Python, or PHP—stored on your hard drive as a file. It could be a simple script or a complex application, but until it runs, it’s just text or binary data.
But how does code become a program?
- Compilation: For languages like C or Go, the code is translated into machine code (binary) specific to a CPU architecture (e.g., x86, ARM). This binary is the CPU’s “native language.”
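- Interpretation: For languages like Python or PHP, the code is shipped as human-readable source. An interpreter (e.g., the Python runtime) reads and executes it at run time, so there is no separate ahead-of-time compilation step. (Internally, CPython and PHP first compile the source to bytecode, but that happens automatically inside the interpreter.)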
So, a program isn’t always compiled. It’s simply code prepared for execution, whether as machine code or an interpreted script.
Act 2: The Living Process
Now, imagine the chef actively cooking using the recipe. The recipe (program) springs to life as a process: an instance of the program running in memory, with its own resources (CPU time, RAM, file handles). When you launch a program, the operating system creates a process, allocates memory, and starts executing instructions.
This is why Node.js and PHP behave differently. When PHP encounters a blocking task (like reading a file), its process pauses, wasting resources. Node.js, however, uses non-blocking I/O, keeping its process busy with other tasks while waiting—like a chef multitasking instead of staring at a boiling pot.
Act 3: The Linking Dilemma
Let’s rewind. Before a program becomes a process, it often relies on external libraries—reusable code for common tasks (e.g., opening a file). Here’s where linking enters the story.
- Static Linking: Imagine a chef packing every ingredient and tool into a single suitcase. Similarly, static linking bundles all dependencies into the final executable. Pros? The program works anywhere (if the CPU/OS match). Cons? The executable becomes bloated, and updating a library means rebuilding (relinking) the entire program.
- Dynamic Linking: Here, the chef relies on a shared kitchen stocked with tools and ingredients. The program references external libraries (e.g., .dll files on Windows, .so files on Linux) at runtime. Pros? Smaller executables and easier updates (replace one library, fix all programs using it). Cons? “DLL Hell”: if a library is missing or incompatible, the program fails to start or crashes. Ever seen “msvcr100.dll not found”? That’s dynamic linking gone wrong.
(Fun fact: “DLL” stands for Dynamic-Link Library, a Windows term. Linux calls them Shared Objects, with .so extensions.)
Act 4: The Ghost in the Machine
Let’s dissect a real-world example. Suppose you write a C program:
#include <stdio.h>

int main(void) {
    /* printf lives in the C standard library, not in this file */
    printf("Hello, Process!\n");
    return 0;
}
When compiled, the code becomes machine-specific binary. The printf function isn’t part of your code; it comes from the C standard library (libc). If you statically link libc, your executable can grow by hundreds of kilobytes to a couple of megabytes, depending on the libc and toolchain. If you dynamically link it, your program stays small but depends on libc.so (Linux) or msvcrt.dll (Windows) existing on the machine.
Now imagine distributing this program:
- Static linking: “Here’s a 3MB file. It just works.”
- Dynamic linking: “Here’s a 1MB file. Also, install these 10 libraries… and pray they’re the right version.”
This explains why developers debate linking strategies. Games often use static linking to avoid dependency nightmares. Web servers? They lean on dynamic linking to save memory.
Act 5: The Program’s Journey
Let’s tie this back to processes. When you run a program:
- The OS creates a process, complete with a unique ID, its own memory space, and at least one thread.
- The loader maps the executable into that memory.
- Dynamically linked dependencies are resolved by loading the shared libraries the program needs (statically linked code is already inside the executable).
- The CPU executes the instructions, and the process lives until it exits or is killed.
A single program can spawn multiple processes. For example, open several Chrome tabs, and you’ll see multiple Chrome processes in Task Manager. Each one is isolated, so a crash in one usually doesn’t take down the others.
Epilogue: Why This Matters
Understanding processes and programs isn’t just academic. When we compared Node.js and PHP, the key takeaway was concurrency—how efficiently a runtime manages processes (or threads) under load. Node.js’s event loop keeps its process busy; PHP’s blocking model wastes resources.
Similarly, linking strategies affect software deployment. Docker containers, for instance, mitigate “DLL Hell” by packaging dependencies into isolated environments. Serverless platforms like AWS Lambda take this further, abstracting away the OS itself.
So next time you run a program, remember: it’s not just code. It’s a symphony of compilation, linking, and OS magic—transforming static instructions into a living, breathing process.
Code Execution
To execute C code, you must first compile it. Here's a simplified breakdown of the process:
- Compilation: The compiler translates your C code into assembly language (a low-level, human-readable representation specific to your CPU architecture).
- Assembly: The assembler converts the assembly code into machine code, producing an object file.
- Linking: The linker combines the object files with the libraries they need into an executable file in a format like ELF (Executable and Linkable Format).
- Execution: The resulting ELF file contains binary instructions that can be directly executed by the CPU.
When you click to execute a program (we will call it an ELF going forward), the OS loads the program into memory using the OS loader. But wait! What is a loader? You can imagine the loader as a forklift on a construction site: its job is to move the materials to the correct place so the workers can do their job. On Unix, this forklift is set in motion by the execve() system call; on Windows, the user-mode part of the loader starts in LdrInitializeThunk.
Inside an ELF
If we look inside an ELF file, what would we find?
ELF Header |
---|
Program Header Table |
Sections |
Section Header Table |
1. Header:
Contains metadata about the ELF file itself, including:
- Magic Number: 0x7F 'E' 'L' 'F' (identifies the file as ELF).
- Class: 32-bit (ELF32) or 64-bit (ELF64).
- Data Encoding: Little-endian or Big-endian.
- Target Architecture: e.g., x86, ARM.
// ELF header structure, 32-bit variant. The Elf32_* types come from <elf.h>;
// the 64-bit Elf64_Ehdr has the same fields with wider address and offset types.
typedef struct {
    unsigned char e_ident[16];  // Magic number + class, data encoding, version
    Elf32_Half    e_type;       // File type (executable, shared object, etc.)
    Elf32_Half    e_machine;    // Target architecture (e.g., x86, ARM)
    Elf32_Word    e_version;    // ELF version (usually 1)
    Elf32_Addr    e_entry;      // Entry point (memory address where execution starts)
    Elf32_Off     e_phoff;      // Offset to Program Header Table
    Elf32_Off     e_shoff;      // Offset to Section Header Table
    Elf32_Word    e_flags;      // Processor-specific flags
    Elf32_Half    e_ehsize;     // Size of this header
    Elf32_Half    e_phentsize;  // Size of a Program Header entry
    Elf32_Half    e_phnum;      // Number of Program Header entries
    Elf32_Half    e_shentsize;  // Size of a Section Header entry
    Elf32_Half    e_shnum;      // Number of Section Header entries
    Elf32_Half    e_shstrndx;   // Index of the section names string table
} Elf32_Ehdr;
2. Program Header Table
- LOAD Segments: Tell the OS which parts of the program (like the actual code and data) need to be loaded into memory. Think of a program like a book. The LOAD segments are like saying, "Put Chapter 1 (the code) and Chapter 2 (the data) into the reader’s hands (memory) so they can use it."
- DYNAMIC Segment: Holds the information the dynamic linker needs, including the list of shared libraries (.so files on Linux) that the program needs to run. If your program is a cake recipe, DYNAMIC is like saying, "You’ll need eggs and flour from the pantry (shared libraries) to bake this cake." The OS grabs these "ingredients" at runtime.
- INTERP Segment: Gives the path to the dynamic linker, a tool that helps the program find and use those external libraries.
- NOTE Segment: Holds auxiliary information, such as a build ID or ABI tag (not critical for running the program).
Elf file type: EXEC (Executable file)
Entry point: 0x4025a0
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x00400040 0x00400040 0x001f8 0x001f8 R 0x8
INTERP 0x000238 0x00400238 0x00400238 0x0001c 0x0001c R 0x1
LOAD 0x000000 0x00400000 0x00400000 0x01d5c8 0x01d5c8 R E 0x200000
LOAD 0x01d5d0 0x006dd5d0 0x006dd5d0 0x00a4c 0x00a4c RW 0x200000
DYNAMIC 0x01d5e8 0x006dd5e8 0x006dd5e8 0x00200 0x00200 RW 0x8
NOTE 0x000254 0x00400254 0x00400254 0x00044 0x00044 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x10
GNU_RELRO 0x01d5d0 0x006dd5d0 0x006dd5d0 0x00a30 0x02a30 R 0x1
3. Section Header Table
Contains metadata about the sections within the ELF file, which organize code, data, symbols, and debugging information. While the Program Header Table focuses on execution (loading into memory), the Section Header Table is used primarily by linkers and debuggers to understand the program’s structure. Key sections include:
- .text: The executable code (machine instructions). Like a recipe’s step-by-step instructions: the CPU follows these to run the program.
- .data: Initialized global/static variables (e.g., int x = 5;). A pre-filled grocery bag: items (data) are ready to use when you start cooking (execution).
- .bss: Uninitialized global/static variables (e.g., int buffer[100];). Reserves memory space but stores no actual data in the file. Empty parking spots: reserved for future use but take no space in the garage (file) until occupied.
- .rodata: Read-only data (e.g., string literals, constants). A road sign: you can read it, but you can’t modify it.
- .symtab: Symbol table linking function/variable names to their memory addresses. A phone directory: maps names (symbols) to numbers (addresses) so the linker can "call" them.
- .shstrtab: Stores section names (e.g., “.text”, “.data”) as strings. Labels on file folders: without them, you’d have to open every folder to know what’s inside.
Why is it important?
The Section Header Table is like a detailed index for developers and tools. It’s optional at runtime (often stripped from release builds) but critical for linking and debugging the program. Think of it as the "behind-the-scenes" metadata that explains how the program was built, not how it runs.
Segments ≠ Sections
- Segments (Program Header Table): Segments are entities defined within the Program Header Table, which governs the runtime execution environment. They specify how the operating system or loader should map portions of the binary into memory, including details such as memory permissions (read, write, execute), alignment requirements, and virtual address ranges. Segments are critical for defining the memory layout of a process during execution, ensuring that code, data, and other runtime-relevant components are positioned appropriately in memory.
- Sections (Section Header Table): Sections, conversely, are cataloged within the Section Header Table and serve as a static organizational framework for the binary. They categorize raw code, data, debug symbols, relocation entries, and metadata (e.g., .text, .data, .rodata, .bss) to facilitate compilation, linking, and developer analysis. Sections are primarily oriented toward software tooling (e.g., compilers, linkers, debuggers) and provide a logical partitioning of the binary’s contents for human readability and development purposes.
Process: How a Program Becomes Alive
When a program executes, the kernel transforms it into a process by allocating critical resources: RAM, CPU time, and a unique Process ID (PID). This PID ensures no two active processes share the same identifier, enabling precise tracking and control.
Memory Layout of a Process
- Stack: Manages function calls, local variables, and control flow (LIFO structure).
- Heap: Dynamically allocated memory (e.g.,
malloc()
in C), manually managed or garbage-collected. - Data/Static: Stores global and static variables (initialized and uninitialized segments).
- Text/Code: Contains executable instructions (read-only).
Namespaces: Isolation for Duplicate PIDs
PIDs are unique only within a PID namespace: no two processes in the same namespace can have the same PID. Processes in separate namespaces (such as those used by Docker containers) can share identical PIDs without conflict. For example, two Docker containers may each host a process with PID 1, as they reside in distinct namespaces. Think of namespaces as parallel "dimensions": each has its own isolated view of system resources, much like virtual machines with independent clocks.
Namespaces enable lightweight virtualization (containers), where processes operate in sandboxed environments without overhead from full OS emulation. The kernel manages resource allocation, security boundaries, and inter-process communication, ensuring stability and efficiency.
In summary, a running program becomes a process through kernel resource allocation, structured memory segments, and namespace isolation—key pillars of modern computing.
Bonus: The Program Counter (Instruction Pointer)
The program counter (PC) is a small but essential part of a computer’s processor that keeps track of the location of the next instruction to be carried out. In simple terms, it acts like a bookmark in a program, ensuring that the computer executes commands in the proper sequence. After each command is fetched, the PC updates itself to point to the next instruction.
However, not all programs run in a straight line. Sometimes, instructions tell the computer to jump to a different part of the program, such as when repeating actions in a loop or choosing between different actions based on a condition. In these cases, the PC is adjusted to direct the computer to the new location, allowing for flexible program behavior.
This article was written by Ahmad Adel. Ahmad is a freelance writer and also a backend developer.