What Is a Process?
A process is an instance of an executing program. When a program is run, the kernel: (1) loads the program’s code into virtual memory, (2) allocates memory for variables, (3) creates kernel data structures recording the PID, termination status, UIDs, GIDs, and more. From the kernel’s view, processes are the entities among which it shares hardware resources.
Process Memory Layout
A process’s virtual address space has four logical segments:
Text segment: Compiled machine instructions of the program. Read-only to prevent accidental overwrites. Shared between all processes running the same program, so only one physical copy sits in RAM regardless of how many instances run.
Data segment: Global and static variables that were explicitly initialised in source code (e.g., int x = 42; at file scope). Loaded from the program file at startup. Uninitialised globals and statics live in a companion zero-filled region (the BSS), which is not shown separately here.
Heap: Area for dynamic memory allocation via malloc(). Grows upward toward higher addresses. Memory persists until freed with free() or the process exits. Memory leaks happen when allocated heap memory is never freed.
Stack: Grows and shrinks as functions are called and return. Each call creates a stack frame holding local variables, function arguments, and the return address. Grows downward toward lower addresses. Stack memory is automatically reclaimed when functions return.
HIGH ADDRESS
┌─────────────────────┐
│ Stack ↓             │ local variables, return addresses
│                     │
│ (free space)        │
│                     │
│ Heap ↑              │ malloc() / free()
├─────────────────────┤
│ Data Segment        │ initialised global/static variables
├─────────────────────┤
│ Text Segment        │ program code (read-only)
└─────────────────────┘
LOW ADDRESS
Creating Processes: fork()
A process creates a child using fork(). The child is an almost exact copy: same code, same data, and copies of the parent’s open file descriptors. On success fork() returns twice: the child’s PID in the parent and 0 in the child. On failure it returns -1 in the parent and no child is created.
pid_t pid = fork();
if (pid == -1) {
    perror("fork failed");
} else if (pid == 0) {
    /* CHILD: runs here */
    printf("Child PID: %d\n", getpid());
} else {
    /* PARENT: runs here, pid = child's PID */
    printf("Parent, child is: %d\n", pid);
    waitpid(pid, NULL, 0); /* wait for child to finish */
}
Executing New Programs: exec()
execve() replaces the calling process’s program image with a new program. The PID stays the same; code, data, stack, and heap are all replaced. Open file descriptors persist across the exec unless they are marked close-on-exec (the FD_CLOEXEC flag, set for example by opening with O_CLOEXEC). The fork-exec pattern (fork a child, set up any redirections, then exec) is how the shell runs every external command; built-ins such as cd run inside the shell itself.
pid_t pid = fork();
if (pid == -1) {
    perror("fork");
} else if (pid == 0) {
    /* child: redirect stdout to a file, then exec */
    int fd = open("output.txt", O_WRONLY|O_CREAT|O_TRUNC, 0644);
    if (fd == -1) { perror("open"); _exit(1); }
    dup2(fd, STDOUT_FILENO); /* fd 1 now points to file */
    close(fd);
    execl("/bin/ls", "ls", "-l", (char *)NULL);
    perror("execl");
    _exit(127); /* only reached if exec fails */
} else {
    waitpid(pid, NULL, 0); /* parent: wait for the child */
}
Process Credentials
Real UID (RUID): Identifies the actual owner of the process. Set at login and inherited by all children. Represents “who started this process.”
Effective UID (EUID): What the kernel checks for permission on file access, signals, and other resources. Normally equals RUID but can differ via the SUID mechanism.
Saved set-user-ID: A copy of the EUID from the last exec(). Lets a process temporarily drop to its RUID and later restore the elevated EUID, which allows precise privilege management.
When a file has the SUID bit set and is owned by root, any process running it gets EUID=0. Example:
ls -l /usr/bin/passwd
-rwsr-xr-x 1 root root ... /usr/bin/passwd
   ↑ 's' = SUID bit: running this gives the process EUID=0
A regular user runs passwd and their process temporarily becomes EUID=0, allowing it to write to /etc/shadow. The passwd program validates the user’s identity first — the escalation is tightly controlled.
Linux Capabilities
Since kernel 2.2, Linux divides root’s privileges into about 40 independent units called capabilities, so a process can be granted only what it needs.
A web server needs only CAP_NET_BIND_SERVICE to use port 80 — granting it full root would be excessive and dangerous. Modern systems use capabilities to follow the principle of least privilege.
The init Process (PID 1)
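After booting, the kernel starts init as PID 1, typically from /sbin/init. Init is the ancestor of every user-space process on the system. It always has PID 1, runs with full root privileges, and terminates only on system shutdown. The kernel delivers to init only those signals for which init has installed a handler, so even root cannot kill it with SIGKILL. Init adopts orphaned processes and reaps them when they exit, so they do not linger as zombies, and it manages system services. Modern Linux systems typically use systemd as PID 1.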
Daemon Processes
A daemon is a long-lived background process with no controlling terminal. It is usually started at boot and runs until shutdown. Examples: sshd, httpd, crond, syslogd. Standard daemonisation steps:
pid_t pid = fork();
if (pid == -1) exit(1);      /* fork failed */
if (pid > 0) exit(0);        /* parent exits, shell gets control back */
if (setsid() == -1) exit(1); /* child: new session, no controlling terminal */
/* redirect stdin/stdout/stderr to /dev/null */
/* chdir("/"), umask(0), close inherited file descriptors */
Memory Mappings: mmap()
mmap() creates a new memory mapping in the process’s virtual address space. Two types:
File mapping: Maps a region of a file into virtual memory. Reading/writing those memory bytes is equivalent to I/O on the file. Pages are loaded from the file on demand. Used to load executable code and for memory-mapped I/O.
Anonymous mapping: No backing file; pages are initialised to zero. Used by malloc() for large allocations and, when created with MAP_SHARED before fork(), for memory shared between a parent and its child.
Static vs Shared Libraries
Static libraries: At link time the linker extracts needed object modules from the library and copies them into the executable. Every statically linked program contains its own private copy of all library code it uses. Disadvantages: wasted disk space and RAM from duplicate copies; a bug fix in a library requires relinking every program that uses it.
Shared libraries: The linker only records that the program needs the library (for example libfoo.so). At run time, the dynamic linker (ld.so) loads the shared library and resolves function references. Only one copy of the library code lives in RAM, shared by all programs. Advantages: saves disk space and RAM; bug fixes take effect for every program the next time it starts, without relinking.
ldd /bin/ls # show shared library dependencies
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1
Interview Questions
Answer: fork() creates a new process by duplicating the caller. Both parent and child continue running the same code; fork() returns the child’s PID in the parent and 0 in the child. exec() replaces the calling process’s program image with a new program — the PID stays the same but code, data, stack, and heap are entirely replaced. The shell uses both together: fork() to create a child, exec() in the child to run the command, and waitpid() in the parent to collect the exit status.
Answer: When fork() creates a child, physically copying all of the parent’s memory would be slow for large processes. Copy-on-write avoids this: the kernel lets parent and child share the same physical pages and marks the writable ones read-only. A page is copied only when either process writes to it, which triggers a page-protection fault that the kernel handles by giving the writer its own private copy. Because a child that calls exec() right after fork() touches very few pages before its address space is replaced, fork-exec incurs almost no memory copy cost.
Answer: After a process exits, the kernel retains a small record of its exit status until the parent collects it with wait(). A process in this state — dead but uncollected — is a zombie. It holds a PID and process table slot but no CPU or significant memory. Prevention: (1) call waitpid() for every child; (2) install a SIGCHLD handler that calls waitpid(-1, NULL, WNOHANG) in a loop; (3) set SIGCHLD to SIG_IGN for auto-reaping; (4) double-fork so the grandchild is adopted and reaped by init.
Answer: Capabilities divide root’s privileges into ~40 independent units. A process gets only the capabilities it actually needs. A web server binding to port 80 needs only CAP_NET_BIND_SERVICE; giving it full root is excessive and dangerous. If the process is compromised, the attacker gains only those specific capabilities rather than full root access. Modern service managers like systemd assign per-service capability sets, greatly reducing the attack surface compared to the traditional binary root/non-root model.
Answer: Shared libraries save disk space and RAM because only one physical copy of the library code is loaded regardless of how many programs use it. More critically for security: bug fixes to a shared library automatically benefit all programs that use it the next time they run — no recompilation or relinking needed. With static libraries every program containing vulnerable code must be rebuilt and redistributed individually. For a widely used library like libc, shared linking means a single patch fixes every program on the system.
