Process: A Tale of Code in Motion

Last updated: February 6th 2025

Introduction

In a previous article, we compared Node.js and PHP, focusing on how they handle blocking operations. The key difference? Node.js keeps its process running while a slow task is pending, whereas PHP typically pauses the entire process until the task completes.

But what exactly is a process? And how does it differ from, or relate to, a program? Imagine a program as a set of instructions—a blueprint if you will—that tells your computer what needs to be done. A process, on the other hand, is the execution of that blueprint. It’s the active instance that runs in your system’s memory, handling tasks and interacting with the operating system. When you launch a program, you're essentially creating a process. This distinction is crucial, especially in the context of server-side environments like Node.js and PHP.

Act 1: The Static Program

Imagine you’re a chef. Your recipe is a program: a set of instructions written on paper. It’s static, lifeless, and waiting to be executed. Similarly, a program is a collection of code—whether written in C, Python, or PHP—stored on your hard drive as a file. It could be a simple script or a complex application, but until it runs, it’s just text or binary data.

But how does code become a program?

  • Compilation: For languages like C or Go, the code is translated into machine code (binary) specific to a CPU architecture (e.g., x86, ARM). This binary is the CPU’s “native language.”
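  • Interpretation: For languages like Python or PHP, the code is shipped in human-readable form. An interpreter (e.g., the Python runtime) executes it at run time, usually after translating it into an internal bytecode, so there is no separate ahead-of-time compilation to machine code.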

So, a program isn’t always compiled. It’s simply code prepared for execution, whether as machine code or an interpreted script.

Act 2: The Living Process

Now, imagine the chef actively cooking using the recipe. The recipe (program) springs to life as a process: an instance of the program running in memory, with its own resources (CPU time, RAM, file handles). When you launch a program, the operating system creates a process, allocates memory, and starts executing instructions.

This is why Node.js and PHP behave differently. When PHP encounters a blocking task (like reading a file), its process pauses, wasting resources. Node.js, however, uses non-blocking I/O, keeping its process busy with other tasks while waiting—like a chef multitasking instead of staring at a boiling pot.

Act 3: The Linking Dilemma

Let’s rewind. Before a program becomes a process, it often relies on external libraries—reusable code for common tasks (e.g., opening a file). Here’s where linking enters the story.

  • Static Linking:
    Imagine a chef packing every ingredient and tool into a single suitcase. Similarly, static linking bundles all dependencies into the final executable. Pros? The program works anywhere (if the CPU/OS match). Cons? The executable becomes bloated, and updating a library means rebuilding and redistributing the entire program.

  • Dynamic Linking:
    Here, the chef relies on a shared kitchen stocked with tools and ingredients. The program references external libraries (e.g., .dll files on Windows, .so on Linux) at runtime. Pros? Smaller executables and easier updates (replace one library, fix all programs using it). Cons? The “DLL Hell”: if a library is missing or incompatible, the program crashes. Ever seen “msvcr100.dll not found”? That’s dynamic linking gone wrong.

(Fun fact: “DLL” stands for Dynamic-Link Library, a Windows term. Linux calls them Shared Objects, with .so extensions.)

Act 4: The Ghost in the Machine

Let’s dissect a real-world example. Suppose you write a C program:

#include <stdio.h>

int main(void) {
  // printf is provided by the C standard library, not by this file
  printf("Hello, Process!\n");
  return 0;
}

When compiled, the code becomes machine-specific binary. The printf function isn’t part of your code; it comes from the C standard library (libc). If you statically link libc, your executable grows considerably, often by hundreds of kilobytes or more depending on the libc implementation. If you dynamically link it, your program stays small but depends on a shared libc (for example libc.so.6 on Linux or msvcrt.dll on Windows) existing on the machine.

Now imagine distributing this program:

  • Static linking: “Here’s a 3MB file. It just works.”
  • Dynamic linking: “Here’s a 1MB file. Also, install these 10 libraries… and pray they’re the right version.”

This explains why developers debate linking strategies. Games often use static linking to avoid dependency nightmares. Web servers? They lean on dynamic linking to save memory.

Act 5: The Program’s Journey

Let’s tie this back to processes. When you run a program:

  1. The OS loads the executable into memory.
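  2. It resolves dependencies: statically linked code is already inside the executable, while dynamically linked libraries are located and mapped into memory at this point.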
  3. It creates a process, complete with a unique ID, memory space, and threads.
  4. The CPU executes the instructions, and the process lives until it exits or is killed.

A single program can spawn multiple processes. For example, open a few tabs in Chrome and you’ll see several Chrome processes in Task Manager. Each one is isolated, so crashing one doesn’t kill the others.

Epilogue: Why This Matters

Understanding processes and programs isn’t just academic. When we compared Node.js and PHP, the key takeaway was concurrency—how efficiently a runtime manages processes (or threads) under load. Node.js’s event loop keeps its process busy; PHP’s blocking model wastes resources.

Similarly, linking strategies affect software deployment. Docker containers, for instance, mitigate “DLL Hell” by packaging dependencies into isolated environments. Serverless platforms like AWS Lambda take this further, abstracting away the OS itself.

So next time you run a program, remember: it’s not just code. It’s a symphony of compilation, linking, and OS magic—transforming static instructions into a living, breathing process.

Code Execution

To execute C code, you must first compile it. Here's a simplified breakdown of the process:

  1. Compilation:
    The compiler translates your C code into assembly language (a low-level human-readable representation specific to your CPU architecture).

  2. Assembly:
    The assembler then converts the assembly code into machine code, producing an object file. A linker then combines the object files and any libraries into an executable file in a format like ELF (Executable and Linkable Format).

  3. Execution:
    The resulting ELF file contains binary instructions that can be directly executed by the CPU.

When you launch a program (we will call it an ELF going forward), the OS loads it into memory using the loader. But wait, what is a loader? You can imagine the loader as a forklift on a construction site: its job is to move the materials to the correct place so the workers can do their job. On Unix-like systems, loading is triggered by the execve() system call; on Windows, process initialization goes through the loader routine LdrInitializeThunk in ntdll.

Inside an ELF

If we look inside an ELF file, what do we find?

ELF Header
Program Header Table
Sections
Section Header Table

1. ELF Header

Contains metadata about the ELF file itself, including:

  1. Magic Number: 0x7F 'E' 'L' 'F' (identifies the file as ELF).

  2. Class: 32-bit (ELF32) or 64-bit (ELF64).

  3. Data Encoding: Little-endian or Big-endian.

  4. Target Architecture: (e.g., x86, ARM).

// Simplified ELF header structure (32-bit variant; Elf64_Ehdr has the same fields with wider address and offset types):
typedef struct {
  unsigned char e_ident[16];  // Magic number + metadata
  Elf32_Half    e_type;       // File type (executable, shared object, etc.)
  Elf32_Half    e_machine;    // Target architecture (e.g., x86, ARM)
  Elf32_Word    e_version;    // ELF version (usually 1)
  Elf32_Addr    e_entry;      // Entry point (memory address where execution starts)
  Elf32_Off     e_phoff;      // Offset to Program Header Table
  Elf32_Off     e_shoff;      // Offset to Section Header Table
  Elf32_Word    e_flags;      // Processor-specific flags
  Elf32_Half    e_ehsize;     // Size of this header
  Elf32_Half    e_phentsize;  // Size of a Program Header entry
  Elf32_Half    e_phnum;      // Number of Program Header entries
  Elf32_Half    e_shentsize;  // Size of a Section Header entry
  Elf32_Half    e_shnum;      // Number of Section Header entries
  Elf32_Half    e_shstrndx;   // Index of the section names string table
} Elf32_Ehdr;

2. Program Header Table

  1. LOAD Segments: Tells the OS which parts of the program (like the actual code and data) need to be loaded into memory. Think of a program like a book. The LOAD segments are like saying, "Put Chapter 1 (the code) and Chapter 2 (the data) into the reader’s hands (memory) so they can use it."

  2. DYNAMIC Segment: Holds dynamic-linking information, including the list of external shared libraries (.so files on Linux) that the program needs to run. If your program is a cake recipe, DYNAMIC is like saying, "You’ll need eggs and flour from the pantry (shared libraries) to bake this cake." The OS grabs these "ingredients" at runtime.

  3. INTERP Segment: Gives the path to the dynamic linker, a tool that helps the program find and use those external libraries.

  4. NOTE Segment: Holds extra info for debugging or special cases (not critical for running the program).

Below is a sample program header listing, in the style of the output of readelf -l:

Elf file type: EXEC (Executable file)  
Entry point: 0x4025a0  
There are 9 program headers, starting at offset 64  

Program Headers:  
  Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align  
  PHDR           0x000040 0x00400040 0x00400040 0x001f8  0x001f8  R   0x8  
  INTERP         0x000238 0x00400238 0x00400238 0x0001c  0x0001c  R   0x1  
  LOAD           0x000000 0x00400000 0x00400000 0x01d5c8 0x01d5c8 R E 0x200000  
  LOAD           0x01d5d0 0x006dd5d0 0x006dd5d0 0x00a4c  0x00a4c  RW  0x200000  
  DYNAMIC        0x01d5e8 0x006dd5e8 0x006dd5e8 0x00200  0x00200  RW  0x8  
  NOTE           0x000254 0x00400254 0x00400254 0x00044  0x00044  R   0x4  
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000  0x00000  RW  0x10  
  GNU_RELRO      0x01d5d0 0x006dd5d0 0x006dd5d0 0x00a30  0x02a30  R   0x1

3. Section Header Table

Contains metadata about the sections within the ELF file, which organize code, data, symbols, and debugging information. While the Program Header Table focuses on execution (loading into memory), the Section Header Table is used primarily by linkers and debuggers to understand the program’s structure. Key sections include:

.text: The executable code (machine instructions).
Like a recipe’s step-by-step instructions—the CPU follows these to run the program.

.data: Initialized global/static variables (e.g., int x = 5;).
A pre-filled grocery bag—items (data) are ready to use when you start cooking (execution).

.bss: Uninitialized global/static variables (e.g., int buffer[100]; at file scope). Reserves memory space but stores no actual data in the file.
Empty parking spots—reserved for future use but take no space in the garage (file) until occupied.

.rodata: Read-only data (e.g., string literals, constants).
A road sign—you can read it, but you can’t modify it.

.symtab: Symbol table linking function/variable names to their memory addresses.
A phone directory—maps names (symbols) to numbers (addresses) so the linker can "call" them.

.shstrtab: Stores section names (e.g., “.text”, “.data”) as strings.
Labels on file folders—without them, you’d have to open every folder to know what’s inside.

Why is it important?
The Section Header Table is like a detailed index for developers and tools. It’s optional at runtime (often stripped from release builds) but critical for linking and debugging the program. Think of it as the "behind-the-scenes" metadata that explains how the program was built, not how it runs.

Segments vs. Sections:

Segments (Program Header Table): Segments are entities defined within the Program Header Table, which governs the runtime execution environment. They specify how the operating system or loader should map portions of the binary into memory, including details such as memory permissions (read, write, execute), alignment requirements, and virtual address ranges. Segments are critical for defining the memory layout of a process during execution, ensuring that code, data, and other runtime-relevant components are positioned appropriately in memory.

Sections (Section Header Table): Sections, conversely, are cataloged within the Section Header Table and serve as a static organizational framework for the binary. They categorize raw code, data, debug symbols, relocation entries, and metadata (e.g., .text, .data, .rodata, .bss) to facilitate compilation, linking, and developer analysis. Sections are primarily oriented toward software tooling (e.g., compilers, linkers, debuggers) and provide a logical partitioning of the binary’s contents for human readability and development purposes.

Process: How a Program Becomes Alive

When a program executes, the kernel transforms it into a process by allocating critical resources: RAM, CPU time, and a unique Process ID (PID). This PID ensures no two active processes share the same identifier, enabling precise tracking and control.

Memory Layout of a Process

  • Stack: Manages function calls, local variables, and control flow (LIFO structure).
  • Heap: Dynamically allocated memory (e.g., malloc() in C), manually managed or garbage-collected.
  • Data/Static: Stores global and static variables (initialized and uninitialized segments).
  • Text/Code: Contains executable instructions (read-only).

Namespaces: Isolation for Duplicate PIDs

PIDs are unique within a PID namespace: no two processes in the same namespace can have the same PID. Processes in separate namespaces (such as those used by Docker containers) can share identical PIDs without conflict. For example, two Docker containers may each host a process with PID 1, as they reside in distinct namespaces. Think of namespaces as parallel "dimensions": each has its own isolated view of system resources, much like virtual machines that each keep their own independent clock.

Namespaces enable lightweight virtualization (containers), where processes operate in sandboxed environments without overhead from full OS emulation. The kernel manages resource allocation, security boundaries, and inter-process communication, ensuring stability and efficiency.

In summary, a running program becomes a process through kernel resource allocation, structured memory segments, and namespace isolation—key pillars of modern computing.

Bonus: The Program Counter (Instruction Pointer)

The program counter (PC) is a small but essential part of a computer’s processor that keeps track of the location of the next instruction to be carried out. In simple terms, it acts like a bookmark in a program, ensuring that the computer executes commands in the proper sequence. After each command is fetched, the PC updates itself to point to the next instruction.

However, not all programs run in a straight line. Sometimes, instructions tell the computer to jump to a different part of the program, such as when repeating actions in a loop or choosing between different actions based on a condition. In these cases, the PC is adjusted to direct the computer to the new location, allowing for flexible program behavior.

This article was written by Ahmad Adel. Ahmad is a freelance writer and also a backend developer.
