6.1 Processes and Programs

Linux System Programming — Chapter 6

What You Will Learn

This section explains the fundamental difference between a program (a file on disk) and a process (a running instance of that program). Understanding this distinction is the foundation for all process management in Linux.

What is a Program?

A program is a file stored on disk. It contains instructions and data that tell the kernel how to build a process in memory when the program is executed. The most common executable format on modern Linux is ELF (Executable and Linking Format).

A program file holds several key pieces of information:

Components Inside an ELF Program File

| Component | Description | Example |
| --- | --- | --- |
| Binary Format ID | Magic bytes that identify the file as ELF | 0x7F 'E' 'L' 'F' |
| Entry Point | Memory address where execution starts | _start symbol |
| Machine Code | Compiled CPU instructions (text segment) | .text section |
| Data | Initialized global/static variable values | .data, .rodata |
| Symbol Table | Names and addresses of functions/variables | Used by debuggers |
| Shared Lib Info | Which .so libraries are needed at runtime | libc.so.6 |

What is a Process?

A process is an instance of a program that the kernel has loaded into memory for execution. At any given moment it may be running on a CPU or waiting for CPU time or I/O. When a program is executed, the kernel reads the program file and sets up the process's memory, file descriptors, signal handling, and other resources.

One program file can spawn many processes simultaneously. For example, you can open three terminal windows and run bash in each — three processes, one program.

One Program → Multiple Processes

/bin/bash (program file on disk)
    Process PID 1201: Terminal 1, bash
    Process PID 1342: Terminal 2, bash
    Process PID 1489: Terminal 3, bash

Key Point: Each process has its own separate memory space (stack, heap, data). Changes in one process do not affect another, even if both run the same program.

Program vs Process: Side by Side

| Aspect | Program | Process |
| --- | --- | --- |
| Exists where? | On disk (filesystem) | In RAM (memory) |
| Static or Dynamic? | Static: doesn't change | Dynamic: has state, runs, changes |
| Has PID? | No | Yes: unique ID assigned by kernel |
| Resources allocated? | No | Yes: CPU, memory, file descriptors |
| How many at once? | One copy on disk | Many instances can run simultaneously |

Code Examples

Example 1: Verify You Are Running a Process, Not a Program

Every running program is a process. This example prints basic facts that only a running process can know — its own PID.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* A program file has no PID — only a process does.
       Printing our PID proves we are a running process. */
    /* pid_t is cast to int so the %d conversion is portable */
    printf("I am a process. My PID is: %d\n", (int)getpid());
    printf("My parent process PID is: %d\n", (int)getppid());
    return 0;
}

/* Compile: gcc -o proc_id proc_id.c
   Run:     ./proc_id
   Output:
     I am a process. My PID is: 4821
     My parent process PID is: 4756   */

Example 2: Same Program, Different Processes

Using fork(), one program creates two processes. Both run the same code but have different PIDs.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();   /* Create a child process */

    if (pid == -1) {
        /* fork() failed: no child process was created */
        perror("fork");
        return 1;
    } else if (pid == 0) {
        /* This block runs in the CHILD process */
        printf("CHILD process — PID: %d\n", (int)getpid());
    } else {
        /* This block runs in the PARENT process */
        printf("PARENT process — PID: %d, child PID: %d\n",
               (int)getpid(), (int)pid);
    }
    /* Both parent and child continue here */
    printf("PID %d: Same program, two processes!\n", (int)getpid());
    return 0;
}

/* Output (PIDs will vary, and so can the line order, because the
   parent and child are scheduled independently):
   PARENT process — PID: 5100, child PID: 5101
   CHILD process  — PID: 5101
   PID 5100: Same program, two processes!
   PID 5101: Same program, two processes!  */

Example 3: Inspect the ELF Binary Format of Your Program

You can inspect the ELF header of any compiled program using readelf. This shows the binary format information stored in the program file.

#include <stdio.h>

int main(void)
{
    printf("Hello from my ELF binary!\n");
    return 0;
}

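/* Compile and inspect:
   $ gcc -o hello hello.c

   $ readelf -h hello
   ELF Header:
     Magic:   7f 45 4c 46 02 01 01 00 ...   (ELF magic bytes)
     Class:   ELF64
     Type:    DYN (Position-Independent Executable file)
     Entry point address: 0x1060
     ...

   $ file hello
   hello: ELF 64-bit LSB pie executable, x86-64 ...

   Most current distributions build position-independent executables
   (PIE) by default, so Type is DYN and the entry point is a small
   offset. Older binutils print "DYN (Shared object file)" instead.
   Building with gcc -no-pie gives Type EXEC (Executable file) and an
   absolute entry address such as 0x401060. Exact values vary by
   toolchain.

   The 'file' command reads the magic bytes and identifies
   the binary format. This is the "Binary Format ID" component
   listed earlier in this section. */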

Interview Preparation Questions

Q1. What is the difference between a program and a process?

A program is a static file on disk containing instructions. A process is a running instance of that program — it has a PID, allocated memory (stack, heap), open file descriptors, and other kernel-managed resources. A single program file can create many processes.

Q2. What is the ELF format and why does Linux use it?

ELF (Executable and Linking Format) is the standard binary format for Linux executables. It replaced older formats like a.out and COFF. It stores the binary format ID, entry point, machine code (.text), initialized data, symbol tables, and shared library dependencies in a structured, extensible way.

Q3. Can multiple processes run the same program simultaneously?

Yes. The read-only text (code) segment is shared in physical memory across all processes running the same program. Each process gets its own stack, heap, and data segments. This is memory-efficient — the kernel doesn’t duplicate the executable code for each process.

Q4. What information does the kernel track for each process?

The kernel maintains a process descriptor (task_struct in Linux) containing: PID, parent PID, virtual memory tables, open file descriptor table, signal handlers, resource usage and limits, current working directory, credentials (UID/GID), and scheduling information.

Q5. What happens when you type ./myprogram in the shell?

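The shell calls fork() to create a child process, and the child calls one of the exec() functions (ultimately the execve() system call). The kernel then reads the ELF header of myprogram, maps the text and data segments into memory, sets up the stack with the arguments and environment, and transfers control to the program's entry point (_start, which eventually calls main()). For a dynamically linked program, the kernel first starts the dynamic linker named in the ELF file, which loads the shared libraries and then jumps to _start. Meanwhile, the shell normally waits for the child to finish.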

Q6. What is the entry point in an ELF binary? Is it always main()?

No. The actual entry point is usually _start, a small assembly stub supplied by the C runtime startup files (crt1.o with glibc; historically crt0.o). _start passes control to __libc_start_main(), which initializes the C library, calls main(), and passes main()'s return value to exit(). You can verify the entry point with readelf -h ./binary | grep "Entry point".
