What You Will Learn
This section explains the fundamental difference between a program (a file on disk) and a process (a running instance of that program). Understanding this distinction is the foundation for all process management in Linux.
A program is a file stored on disk. It contains instructions and data that tell the kernel how to build a process in memory when the program is executed. The most common executable format on modern Linux is ELF (Executable and Linkable Format).
A program file holds several key pieces of information:
| Component | Description | Example |
|---|---|---|
| Binary Format ID | Magic bytes that identify the file as ELF | 0x7F ‘E’ ‘L’ ‘F’ |
| Entry Point | Memory address where execution starts | _start symbol |
| Machine Code | Compiled CPU instructions (text segment) | .text section |
| Data | Initialized global/static variable values | .data, .rodata |
| Symbol Table | Names and addresses of functions/variables | Used by debuggers |
| Shared Lib Info | Which .so libraries are needed at runtime | libc.so.6 |
A process is an instance of a program that the kernel has loaded into memory and is actively running. The kernel creates a process by reading the program file and setting up memory, file descriptors, signal handlers, and other resources.
A single program file can back many processes at the same time. For example, you can open three terminal windows and run bash in each: three processes, one program.
[Diagram: 📄 /bin/bash (program file on disk) → running processes]
| Aspect | Program | Process |
|---|---|---|
| Exists where? | On disk (filesystem) | In RAM (memory) |
| Static or Dynamic? | Static — doesn’t change | Dynamic — has state, runs, changes |
| Has PID? | No | Yes — unique ID assigned by kernel |
| Resources allocated? | No | Yes — CPU, memory, file descriptors |
| How many at once? | One copy on disk | Many instances can run simultaneously |
Code Examples
Every running program is a process. This example prints two facts that exist only for a running process: its own PID and its parent's PID.

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* A program file has no PID; only a process does.
           Printing our PID proves we are a running process. */
        printf("I am a process. My PID is: %d\n", (int)getpid());
        printf("My parent process PID is: %d\n", (int)getppid());
        return 0;
    }

    /* Compile: gcc -o proc_id proc_id.c
       Run:     ./proc_id
       Example output (PIDs will vary):
       I am a process. My PID is: 4821
       My parent process PID is: 4756 */

Using fork(), one program creates two processes. Both run the same code but have different PIDs.

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();   /* Create a child process */

        if (pid < 0) {
            /* fork() failed: no child was created */
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            /* This block runs in the CHILD process */
            printf("CHILD process, PID: %d\n", (int)getpid());
        } else {
            /* This block runs in the PARENT process */
            printf("PARENT process, PID: %d, child PID: %d\n",
                   (int)getpid(), (int)pid);
        }

        /* Both parent and child continue here */
        printf("PID %d: Same program, two processes!\n", (int)getpid());
        return 0;
    }

    /* Example output (PIDs will vary, and the order of the parent and
       child lines depends on scheduling):
       PARENT process, PID: 5100, child PID: 5101
       CHILD process, PID: 5101
       PID 5100: Same program, two processes!
       PID 5101: Same program, two processes! */

You can inspect the ELF header of any compiled program using readelf. This shows the binary format information stored in the program file.

    #include <stdio.h>

    int main(void)
    {
        printf("Hello from my ELF binary!\n");
        return 0;
    }

    /* Compile and inspect. -no-pie produces a fixed-address executable;
       without it, many distributions build a position-independent
       executable, for which readelf reports Type DYN and shows the
       entry point as an offset from the load address.
       $ gcc -no-pie -o hello hello.c
       $ readelf -h hello
       ELF Header:
         Magic:   7f 45 4c 46 02 01 01 00 ...   (ELF magic bytes)
         Class:                             ELF64
         Type:                              EXEC (Executable file)
         Entry point address:               0x401060
         ...
       $ file hello
       hello: ELF 64-bit LSB executable, x86-64 ...
       The 'file' command reads the magic bytes to identify the binary
       format. These bytes are the "Binary Format ID" listed in the
       table of program file components above. */

Interview Preparation Questions
Q: What is the difference between a program and a process?
A: A program is a static file on disk containing instructions and data. A process is a running instance of that program: it has a PID, allocated memory (stack, heap), open file descriptors, and other kernel-managed resources. A single program file can be used to create many processes.
Q: What is ELF?
A: ELF (Executable and Linkable Format) is the standard binary format for Linux executables. It replaced older formats like a.out and COFF. It stores the binary format ID, entry point, machine code (.text), initialized data, symbol tables, and shared library dependencies in a structured, extensible way.
Q: Can several processes running the same program share its code in memory?
A: Yes. The read-only text (code) segment is shared in physical memory across all processes running the same program. Each process gets its own stack, heap, and writable data segments. This is memory-efficient because the kernel does not duplicate the executable code for each process.
Q: What information does the kernel maintain about each process?
A: The kernel maintains a process descriptor (task_struct in Linux) containing: PID, parent PID, virtual memory tables, open file descriptor table, signal handlers, resource usage and limits, current working directory, credentials (UID/GID), and scheduling information.
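Q: What happens when you run ./myprogram from a shell?
A: The shell calls fork() to create a child process, then calls one of the exec() functions (ultimately the execve() system call) in the child. The kernel reads the ELF header of myprogram, maps the text and data segments into memory, sets up the stack, and transfers control to the program's entry point (_start, which then calls main()). For a dynamically linked program, the dynamic linker runs first to load the required shared libraries.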
Q: Is main() the first code that runs in a C program?
A: No. The actual entry point is usually _start, a small assembly stub provided by the C runtime startup code (crt1.o with glibc; historically crt0.o). _start sets up the environment, initializes the C library, and then calls main(). You can verify this with readelf -h ./binary | grep "Entry point".
