Linux system programming – Process groups, sessions, pseudoterminals

 

Chapter 06 — Advanced Topics & Master Q&A
Process groups, sessions, pseudoterminals, time, client-server, realtime, /proc — plus a complete interview question bank covering all six chapters.

Keywords

Process Groups, Sessions, setsid(), Controlling Terminal, Pseudoterminal (PTY), CLOCK_MONOTONIC, Unix Epoch, Client-Server, epoll, POSIX Realtime, /proc Filesystem

Process Groups

Every process belongs to a process group identified by a numeric PGID. Process groups are the mechanism behind terminal job control. When you press Ctrl-C, the kernel sends SIGINT to every process in the foreground process group — not just one process. This ensures an entire pipeline (find | sort | head) is interrupted together.

A process group has a leader — the process whose PID equals the PGID. When a shell starts a pipeline, it creates a new process group for all pipeline processes.

setpgid(0, 0);                 /* move self into a new process group (PGID = own PID) */
pid_t pgid = getpgid(pid);     /* get PGID of another process */
kill(-pgid, SIGTERM);          /* negative PID: send SIGTERM to every process in the group */
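The three calls above can be combined into a small experiment. The following is a minimal sketch, assuming a Linux/POSIX system; the helper name signal_child_group() is hypothetical and chosen only for illustration. It puts a child into its own process group, signals the whole group with a negative PID, and reports which signal ended the child.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the signal that terminated the child,
 * or -1 on error. */
int signal_child_group(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        setpgid(0, 0);          /* child: become leader of a new group */
        for (;;)
            pause();            /* sleep until a signal arrives */
    }

    /* The parent makes the same call, so the group exists no matter
     * which process is scheduled first (the usual shell idiom). */
    if (setpgid(pid, pid) == -1 || kill(-pid, SIGTERM) == -1) {
        kill(pid, SIGKILL);     /* clean up the child on failure */
        waitpid(pid, NULL, 0);
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling signal_child_group() should return SIGTERM, because the group-wide signal reaches the only member of the new group.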

Sessions and Controlling Terminals

A session is a collection of process groups. The session ID (SID) equals the PID of the session leader. Sessions tie together all process groups from one terminal login.

Session Structure
SESSION (SID = shell PID)
│
├── Controlling Terminal (/dev/pts/0)
│
├── Foreground Process Group   ← receives Ctrl-C (SIGINT), Ctrl-Z (SIGTSTP)
│   └── current command (PID 101)
│
└── Background Process Groups
    ├── shell (PID 100), in its own group while the foreground job runs
    ├── job1 &   (PID 200)
    └── job2a | job2b &   (PID 300, 301)

Background processes that try to read from the terminal receive SIGTTIN, which stops them.
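To see where a process sits in this hierarchy, its identifiers can be queried directly. This is a small sketch, assuming a POSIX system; the struct job_ids and the function get_job_ids() are hypothetical names introduced for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the IDs discussed above. */
struct job_ids {
    pid_t pid;      /* process ID */
    pid_t pgid;     /* process group ID */
    pid_t sid;      /* session ID */
    pid_t fg_pgid;  /* terminal's foreground group, or -1 if stdin is not a tty */
};

/* Fills *out for the calling process; returns 0 on success, -1 on error. */
int get_job_ids(struct job_ids *out)
{
    out->pid = getpid();
    out->pgid = getpgrp();
    out->sid = getsid(0);
    out->fg_pgid = isatty(STDIN_FILENO) ? tcgetpgrp(STDIN_FILENO) : (pid_t)-1;
    return out->sid == (pid_t)-1 ? -1 : 0;
}
```

When run from an interactive shell, a foreground command would normally see pgid equal to fg_pgid, while the same command started with & would see them differ.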

Daemonisation with setsid()

A daemon must have no controlling terminal. setsid() creates a new session, places the caller in a new process group, and detaches from any terminal. It must be called after fork() because a process group leader cannot create a new session.

pid_t pid = fork();
if (pid < 0) exit(EXIT_FAILURE);        /* fork failed */
if (pid > 0) exit(0);                   /* parent exits */
if (setsid() < 0) exit(EXIT_FAILURE);   /* child: new session, no terminal */
/* redirect 0/1/2 to /dev/null, chdir("/"), umask(0) */
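The effect of setsid() can be checked without leaving a real daemon behind. The sketch below assumes Linux; the function name setsid_in_child() is hypothetical. A short-lived child calls setsid(), verifies that it is now session leader and group leader with no controlling terminal, and reports the result to the parent through a pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child ended up as leader of a
 * new session with no controlling terminal, 0 if not, -1 on error. */
int setsid_in_child(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                       /* child */
        close(fds[0]);
        char ok = 0;
        if (setsid() != (pid_t)-1) {
            /* With no controlling terminal, opening /dev/tty must fail. */
            int tty = open("/dev/tty", O_RDWR);
            ok = (getsid(0) == getpid() && getpgrp() == getpid() && tty == -1);
            if (tty != -1)
                close(tty);
        }
        if (write(fds[1], &ok, 1) != 1)
            _exit(1);
        _exit(0);
    }

    close(fds[1]);                        /* parent */
    char ok = 0;
    ssize_t n = read(fds[0], &ok, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? ok : -1;
}
```

A real daemon would continue after setsid() with the redirections, chdir("/"), and umask(0) steps listed in the comment above instead of exiting.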

Pseudoterminals (PTY)

A pseudoterminal is a software device that emulates a hardware terminal. It has two ends: the master (held by a terminal emulator or, for remote logins, the SSH server sshd) and the slave (appears as /dev/pts/N to the program inside). Everything you type in an SSH session travels through a PTY on the remote host. When you resize the terminal window, the program holding the master updates the PTY's window size and the kernel sends SIGWINCH to the foreground process group. PTYs power all interactive terminal applications: SSH, tmux, screen, gnome-terminal.
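The master/slave relationship can be illustrated in a few lines. This is a minimal sketch, assuming a Linux system where /dev/ptmx is available; pty_roundtrip() is a hypothetical helper name. It plays the role of a terminal emulator by "typing" a line into the master and then reads that line from the slave, as a program attached to the terminal would.

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a line written to the master is
 * read back unchanged from the slave, 0 if it differs, and -1 if no
 * PTY could be set up. */
int pty_roundtrip(void)
{
    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master == -1)
        return -1;
    if (grantpt(master) == -1 || unlockpt(master) == -1) {
        close(master);
        return -1;
    }

    const char *name = ptsname(master);   /* e.g. a /dev/pts/N path */
    int slave = name ? open(name, O_RDWR | O_NOCTTY) : -1;
    if (slave == -1) {
        close(master);
        return -1;
    }

    /* The trailing newline completes the line, so the slave's default
     * line-buffered (canonical) mode hands it to the reader. */
    const char line[] = "hello\n";
    const ssize_t len = (ssize_t)(sizeof line - 1);
    char buf[32];
    int ok = 0;
    if (write(master, line, (size_t)len) == len) {
        ssize_t n = read(slave, buf, sizeof buf);
        ok = (n == len && memcmp(buf, line, (size_t)len) == 0);
    }

    close(slave);
    close(master);
    return ok;
}
```

A terminal emulator does the same thing at larger scale: key presses are written to the master, and whatever the program writes to the slave is read from the master and drawn on screen.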

Date and Time

Calendar / Wall Clock Time

Seconds since the Unix Epoch — midnight, 1 January 1970, UTC. Stored as time_t. Obtained with time(). Can jump backwards if NTP corrects the clock. Never use it to measure elapsed intervals — you may get negative or wrong results.

CLOCK_MONOTONIC — The Right Clock for Timing

A clock that only ever moves forward. Its epoch is arbitrary (usually boot time). Use clock_gettime(CLOCK_MONOTONIC, &ts). Always use this for: measuring elapsed time, implementing timeouts, benchmarking, and scheduling periodic work.

struct timespec start, end;
clock_gettime(CLOCK_MONOTONIC, &start);
/* ... do work ... */
clock_gettime(CLOCK_MONOTONIC, &end);
double elapsed = (end.tv_sec - start.tv_sec)
               + (end.tv_nsec - start.tv_nsec) / 1e9;
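The subtraction above can also be done in integer nanoseconds, which avoids floating-point rounding when intervals are compared against a timeout. The sketch below is one way to write it; timespec_diff_ns() and elapsed_ns_since() are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* end - start in nanoseconds. tv_nsec may be smaller in 'end' than in
 * 'start'; the signed arithmetic handles that borrow automatically. */
int64_t timespec_diff_ns(struct timespec start, struct timespec end)
{
    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec);
}

/* Nanoseconds elapsed on CLOCK_MONOTONIC since *start was sampled. */
int64_t elapsed_ns_since(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return timespec_diff_ns(*start, now);
}
```

For example, start = {1 s, 900000000 ns} and end = {2 s, 100000000 ns} give 1000000000 - 800000000 = 200000000 ns. A timeout check then becomes a plain comparison such as elapsed_ns_since(&start) >= timeout_ns.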

Process CPU Time

Total CPU consumed by the process, split into user CPU time (running application code) and system CPU time (kernel executing system calls on behalf of the process). The shell’s time command reports three figures: real (wall clock), user, and sys.
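A process can read its own user and system CPU time with getrusage(). This is a short sketch, assuming a POSIX system; cpu_times() is a hypothetical wrapper name.

```c
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical wrapper: stores user and system CPU seconds consumed so
 * far by the calling process. Returns 0 on success, -1 on error. */
int cpu_times(double *user_s, double *sys_s)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) == -1)
        return -1;
    *user_s = (double)ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
    *sys_s  = (double)ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
    return 0;
}
```

Sampling cpu_times() before and after a piece of work and subtracting gives the same user/sys split that the shell's time command prints, while wall-clock time still has to come from CLOCK_MONOTONIC.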

Client-Server Architecture

A fundamental pattern: a server is a long-lived process providing a service; clients send requests and receive responses. The channel is typically a socket — UNIX domain for local, TCP for network.
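The request/response shape can be shown with a toy service. The sketch below is illustrative only: it uses socketpair() to get a connected pair of UNIX domain sockets instead of the bind/listen/accept sequence a real server would use, and the "service" (upper-casing the request) and the name upcase_request() are hypothetical.

```c
#include <ctype.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical toy service: a forked "server" upper-cases one request.
 * Returns the number of reply bytes, or -1 on error. */
ssize_t upcase_request(const char *req, size_t len, char *reply, size_t cap)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {                        /* server side */
        close(sv[0]);
        char buf[256];
        ssize_t n = read(sv[1], buf, sizeof buf);
        for (ssize_t i = 0; i < n; i++)
            buf[i] = (char)toupper((unsigned char)buf[i]);
        if (n > 0 && write(sv[1], buf, (size_t)n) != n)
            _exit(1);
        _exit(0);
    }

    close(sv[1]);                          /* client side */
    ssize_t n = -1;
    if (write(sv[0], req, len) == (ssize_t)len)
        n = read(sv[0], reply, cap);       /* short messages arrive in one read */
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

A production server would loop on read() until a full message has arrived, since stream sockets do not preserve message boundaries; the single read here relies on the request being only a few bytes long.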

Server Concurrency Strategies
Iterative: one client at a time (trivial services only)
Fork per client: one child process per connection (Apache's prefork MPM is a variant that forks a pool of workers in advance)
Thread per client: one pthread per accepted connection
epoll event loop: single thread, thousands of connections (nginx, Node.js)

epoll vs select/poll: select() and poll() pass and scan the entire set of monitored fds on every call, so each call costs O(N). With 10,000 mostly idle connections, that scanning dominates the server's CPU time. epoll keeps the registration in the kernel; epoll_wait() returns only the fds that actually have events, so the cost is proportional to the number of ready fds. This is why high-concurrency servers on Linux use epoll.
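The register-once, wait-many pattern looks like this in miniature. The sketch assumes Linux (epoll is Linux-specific) and uses a pipe in place of a client socket; epoll_reports_ready_pipe() is a hypothetical name.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if epoll reports the pipe's read end
 * only after data has been written, 0 otherwise, -1 on setup error. */
int epoll_reports_ready_pipe(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int ep = epoll_create1(0);
    if (ep == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    /* Register interest once; the kernel remembers it across waits. */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = p[0] };
    int ok = -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) == 0) {
        struct epoll_event out[4];
        int idle = epoll_wait(ep, out, 4, 0);       /* nothing readable yet */
        if (write(p[1], "x", 1) == 1) {
            int ready = epoll_wait(ep, out, 4, 1000);
            ok = (idle == 0 && ready == 1 &&
                  out[0].data.fd == p[0] && (out[0].events & EPOLLIN));
        }
    }

    close(ep);
    close(p[0]);
    close(p[1]);
    return ok;
}
```

In a real event loop the epoll_wait() call sits inside for (;;), the listening socket and every accepted connection are registered with epoll_ctl(), and the loop dispatches on out[i].data for each returned event.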

Realtime Linux

Realtime means predictably bounded response time, not just “fast.” A realtime system guarantees responses occur within a specified deadline. Missing a hard deadline is a system failure (medical devices, flight control, industrial automation). Standard Linux is not hard-realtime because the scheduler can delay processes unpredictably. The PREEMPT_RT patch reduces worst-case latency. POSIX realtime extensions supported by Linux:

SCHED_FIFO / SCHED_RR: fixed high priority scheduling
mlockall(): pin all pages in RAM, prevent page faults
clock_nanosleep(): nanosecond-precision absolute wakeups
POSIX message queues: bounded latency IPC
Asynchronous I/O (AIO)
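Absolute wakeups are what keep a periodic task from drifting: each deadline is computed from the previous deadline, not from "now". The following is a sketch of that loop, assuming Linux with CLOCK_MONOTONIC; run_periodic() is a hypothetical name, and a real realtime task would additionally request SCHED_FIFO and call mlockall(), both of which need privileges and are omitted here.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: runs 'ticks' periods of 'period_ns' each and
 * returns the total elapsed nanoseconds, or -1 on error. */
int64_t run_periodic(int ticks, long period_ns)
{
    struct timespec start, next, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    next = start;

    for (int i = 0; i < ticks; i++) {
        /* Advance the absolute deadline by exactly one period. */
        next.tv_nsec += period_ns;
        while (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec++;
        }

        /* clock_nanosleep() returns an error number, not -1/errno. */
        int rc;
        while ((rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                     &next, NULL)) == EINTR)
            ;                               /* interrupted: sleep again */
        if (rc != 0)
            return -1;

        /* periodic work would run here */
    }

    clock_gettime(CLOCK_MONOTONIC, &end);
    return (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec);
}
```

Because each wakeup targets an absolute time, time spent doing the periodic work does not accumulate as drift, which a relative sleep of period_ns per iteration would allow.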

The /proc Virtual Filesystem

/proc is a virtual filesystem with no disk storage. The kernel synthesises its contents on demand. Most content is human-readable text, with a few exceptions such as the NUL-separated /proc/PID/cmdline. It is a window into the kernel’s runtime state and every running process’s internals.

/proc Key Files Reference
Path                 Contents
/proc/PID/status     Process state, UIDs, GIDs, memory usage, signal masks, thread count
/proc/PID/cmdline    Original command line arguments (NUL-separated)
/proc/PID/maps       Complete virtual memory map with addresses, permissions, backing files
/proc/PID/fd/        Symlinks for each open file descriptor pointing to the actual resource
/proc/PID/exe        Symlink to the executable currently being run
/proc/cpuinfo        CPU model, feature flags (SSE/AES), core count, cache sizes
/proc/meminfo        Total/free/buffered/cached RAM and swap; this is what free(1) reads
/proc/net/tcp        All TCP connections with addresses, ports, state, UID
/proc/sys/           Kernel tunables, many of them writable; the same values sysctl reads and writes
/proc/self/          Magic symlink to /proc/PID/ for the calling process; convenient in code
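Because these files are plain text, ordinary stdio calls are enough to read them. The sketch below assumes Linux with /proc mounted and the "Key:<tab>value" layout of /proc/self/status; proc_status_field() and status_pid_matches() are hypothetical helper names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: copies the value of "key:" from
 * /proc/self/status into out (cap must be > 0).
 * Returns 0 if found, -1 otherwise. */
int proc_status_field(const char *key, char *out, size_t cap)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;

    char line[256];
    size_t klen = strlen(key);
    int found = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            const char *v = line + klen + 1;
            while (*v == ' ' || *v == '\t')
                v++;                         /* skip padding after the colon */
            size_t n = strcspn(v, "\n");
            if (n >= cap)
                n = cap - 1;
            memcpy(out, v, n);
            out[n] = '\0';
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}

/* Cross-check: the "Pid" field should equal getpid(). */
int status_pid_matches(void)
{
    char v[32];
    if (proc_status_field("Pid", v, sizeof v) != 0)
        return -1;
    return strtol(v, NULL, 10) == (long)getpid();
}
```

The same function can fetch fields such as "VmRSS" or "Threads" for quick self-monitoring without any extra library.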

Master Interview Question Bank — All Chapters

Kernel & OS Fundamentals
Q: What is the kernel and what are its main responsibilities?

Answer: The kernel is the central privileged software managing all hardware resources. Responsibilities: (1) process scheduling; (2) memory management with virtual memory; (3) providing a file system; (4) creating and terminating processes; (5) device access through drivers; (6) networking; (7) the system call API. The kernel runs in kernel mode (ring 0); all user programs run in user mode (ring 3) and cannot access kernel memory directly.

Q: What are the two CPU privilege levels and why do they exist?

Answer: Kernel mode (ring 0) allows any instruction and any memory access. User mode (ring 3) restricts access to user-space memory only. The separation prevents user programs from corrupting the kernel or other processes. The only way a user program can deliberately enter the kernel is through a system call, a controlled entry point where the kernel validates all arguments before acting (the CPU also switches to kernel mode for hardware interrupts and for exceptions such as page faults). Without this separation, any bug or malicious program could crash the entire OS.

Filesystem & I/O
Q: What is the difference between a hard link and a symbolic link?

Answer: A hard link is a directory entry pointing directly to an inode. All hard links share the same inode; data is freed only when all links are deleted. Hard links cannot cross filesystems and cannot point to directories. A symbolic link is a file containing a pathname string; the kernel follows it transparently. If the target is deleted, the symlink becomes dangling. Symlinks can cross filesystems and can point to directories. Hard links share the same inode number (visible with ls -li); symlinks have their own inode and display as linkname -> target.
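The inode behaviour described in this answer can be checked with stat() and lstat(). This is a sketch, assuming a writable /tmp on a filesystem that supports hard links and symlinks; compare_links() and the file names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the hard link shares the target's
 * inode and survives its deletion while the symlink dangles,
 * 0 otherwise, -1 on setup error. */
int compare_links(void)
{
    char dir[] = "/tmp/linkdemoXXXXXX";
    if (mkdtemp(dir) == NULL)
        return -1;

    char target[64], hard[64], soft[64];
    snprintf(target, sizeof target, "%s/target", dir);
    snprintf(hard, sizeof hard, "%s/hard", dir);
    snprintf(soft, sizeof soft, "%s/soft", dir);

    int ok = -1;
    int fd = open(target, O_CREAT | O_WRONLY, 0600);
    if (fd != -1) {
        close(fd);
        struct stat st_t, st_h, st_s;
        ok = 0;
        if (link(target, hard) == 0 && symlink(target, soft) == 0 &&
            stat(target, &st_t) == 0 && lstat(hard, &st_h) == 0 &&
            lstat(soft, &st_s) == 0) {
            int same_inode = st_h.st_ino == st_t.st_ino && st_t.st_nlink == 2;
            int own_inode = S_ISLNK(st_s.st_mode) && st_s.st_ino != st_t.st_ino;

            unlink(target);   /* remove the original name */
            int hard_alive = stat(hard, &st_h) == 0;   /* data still reachable */
            int soft_dangles = stat(soft, &st_s) == -1; /* target path is gone */
            ok = same_inode && own_inode && hard_alive && soft_dangles;
        }
    }

    unlink(target);
    unlink(hard);
    unlink(soft);
    rmdir(dir);
    return ok;
}
```

Note the use of lstat() for the symlink: stat() follows the link and would report the target's inode instead of the link's own.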

Q: What does “universality of I/O” mean?

Answer: The same four system calls — open(), read(), write(), close() — work on regular files, terminals, pipes, sockets, and device files. A program reading from stdin works identically whether stdin is a keyboard, a file, or a pipe. Programs do not need to know what type of resource they interact with. This enables powerful composition (shell pipelines), simplifies programming, and provides one consistent API for all I/O in the system.

Processes & Memory
Q: What is the difference between fork() and exec()?

Answer: fork() creates a new process by duplicating the caller. Both parent and child continue running the same program; fork() returns the child’s PID in the parent and 0 in the child. exec() replaces the calling process’s program image with a new program — the PID stays the same but all code, data, stack, and heap are replaced. The shell uses both: fork() to create a child, set up redirections in the child, then exec() to run the command. The parent calls waitpid() to collect the exit status.
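The fork/exec/wait sequence from this answer fits in one short function. The sketch assumes /bin/sh exists; run_shell() is a hypothetical name, and the exit code 127 for a failed exec follows the common shell convention.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs cmd via /bin/sh and returns its exit
 * status, or -1 if it could not be run or died from a signal. */
int run_shell(const char *cmd)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: same PID from here on, but a new program image. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);               /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, run_shell("exit 7") returns 7: the shell child exits with that status and waitpid() delivers it to the parent.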

Q: What is a zombie process and how do you prevent one?

Answer: After a process exits, the kernel retains a small record of its exit status until the parent calls wait(). This dead-but-uncollected state is a zombie. It holds a PID and process table slot but no CPU or significant memory. Prevention: call waitpid() for every child; install a SIGCHLD handler that calls waitpid(-1, NULL, WNOHANG) in a loop; set SIGCHLD to SIG_IGN for auto-reaping; or use double-fork so the grandchild is adopted by init which continuously reaps its children.
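The SIGCHLD-handler option can be sketched as follows, assuming a single-threaded POSIX program; spawn_and_reap() and the counter are hypothetical names. The handler loops because several children may exit while only one SIGCHLD is delivered.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped;

static void on_sigchld(int sig)
{
    (void)sig;
    int saved = errno;                 /* waitpid() may overwrite errno */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;                      /* one zombie collected */
    errno = saved;
}

/* Hypothetical demo: forks n children that exit at once and returns
 * how many the handler reaped, or -1 on error. */
int spawn_and_reap(int n)
{
    struct sigaction sa, old_sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    if (sigaction(SIGCHLD, &sa, &old_sa) == -1)
        return -1;

    /* Block SIGCHLD so none is missed between the check and the wait. */
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);

    reaped = 0;
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);
        if (pid > 0)
            started++;
    }

    while (reaped < started)
        sigsuspend(&old);              /* atomically unblock and wait */

    sigprocmask(SIG_SETMASK, &old, NULL);
    sigaction(SIGCHLD, &old_sa, NULL); /* restore the previous handler */
    return (int)reaped;
}
```

waitpid() is on the POSIX async-signal-safe list, which is why it may be called from the handler directly.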

Q: What are Linux capabilities and why are they better than running services as root?

Answer: Capabilities divide root’s privileges into ~40 independent units. A process gets only what it needs — a web server binding port 80 needs only CAP_NET_BIND_SERVICE, not full root. If the process is compromised, the attacker gets only those specific capabilities. With traditional root, a compromised service gives full system access. Modern service managers like systemd assign per-service capability sets, greatly limiting damage from security breaches. This is the principle of least privilege applied at the OS level.

IPC & Signals
Q: Name all seven IPC mechanisms with one-line use cases.

Answer: (1) Pipes — shell pipelines, streaming between parent and child. (2) FIFOs — IPC between unrelated processes via a filesystem path. (3) Sockets — client-server networking or fast local IPC with UNIX domain sockets. (4) File locking — prevent two processes updating the same database record. (5) Message queues — typed job dispatch with selective/priority retrieval. (6) Semaphores — limit concurrent access to N shared resources. (7) Shared memory — highest-throughput data sharing; requires separate synchronisation.

Q: What is async-signal safety and why does it matter?

Answer: A signal handler can fire at any moment — even inside malloc() or printf(). Most C library functions are not async-signal-safe because they use internal shared state. If a signal fires while malloc() holds an internal heap lock and your handler also calls malloc(), the inner call tries to acquire the same lock — deadlock. POSIX defines a list of safe functions including write(), read(), _exit(), sem_post(), getpid(). The safe pattern: set a volatile sig_atomic_t flag in the handler; do all real work in the main loop when the flag is set.
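The flag pattern from the last sentence can be sketched like this, assuming a single-threaded program; flag_set_by_handler() and the use of SIGUSR1 are illustrative choices, not part of the original text.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_signal;

/* The handler does the minimum: it records that the signal arrived. */
static void on_usr1(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Hypothetical demo: returns 1 if the main flow observed the flag,
 * 0 if not, -1 if the handler could not be installed. */
int flag_set_by_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    got_signal = 0;
    raise(SIGUSR1);         /* the handler runs before raise() returns */

    if (got_signal) {
        /* Unsafe calls such as printf() or malloc() belong here,
         * in normal context, not inside the handler. */
        return 1;
    }
    return 0;
}
```

In a long-running program the check sits at the top of the main loop, so the real work happens shortly after the signal but never inside the handler.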

Threads
Q: What is the difference between a data race and a deadlock?

Answer: A data race is a correctness problem — two threads access the same memory simultaneously with at least one write and no synchronisation. The result is undefined, non-deterministic behaviour. A deadlock is a liveness problem — two or more threads each hold a lock and wait for one held by another, forming a cycle so no thread can proceed. Data races are fixed with locks; deadlocks are prevented by consistent lock acquisition ordering across all threads.
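Both fixes can appear in the same code. The sketch below uses POSIX threads; the account structure and function names are hypothetical. Each balance is protected by a mutex (no data race), and lock_pair() always takes the two mutexes in address order (no deadlock), even though the two threads transfer in opposite directions.

```c
#include <pthread.h>
#include <stdint.h>

struct account {
    pthread_mutex_t lock;
    long balance;
};

static struct account acc_x = { PTHREAD_MUTEX_INITIALIZER, 1000 };
static struct account acc_y = { PTHREAD_MUTEX_INITIALIZER, 1000 };

/* Consistent ordering: always lock the lower address first. */
static void lock_pair(struct account *a, struct account *b)
{
    if ((uintptr_t)a < (uintptr_t)b) {
        pthread_mutex_lock(&a->lock);
        pthread_mutex_lock(&b->lock);
    } else {
        pthread_mutex_lock(&b->lock);
        pthread_mutex_lock(&a->lock);
    }
}

static void transfer(struct account *from, struct account *to, long amount)
{
    lock_pair(from, to);
    from->balance -= amount;
    to->balance += amount;
    pthread_mutex_unlock(&from->lock);
    pthread_mutex_unlock(&to->lock);
}

static void *x_to_y(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        transfer(&acc_x, &acc_y, 1);
    return NULL;
}

static void *y_to_x(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        transfer(&acc_y, &acc_x, 1);
    return NULL;
}

/* Runs both directions concurrently; returns the combined balance,
 * or -1 if a thread could not be created. */
long run_transfers(void)
{
    pthread_t t1, t2;
    if (pthread_create(&t1, NULL, x_to_y, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, y_to_x, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return acc_x.balance + acc_y.balance;
}
```

If transfer() instead locked from first and to second, the two threads would acquire the mutexes in opposite orders and could each end up holding one while waiting for the other.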

Q: Why must you use a while loop with pthread_cond_wait, not an if statement?

Answer: POSIX allows spurious wakeups: a thread can return from pthread_cond_wait() without being signalled. Even after a genuine signal, another thread may acquire the mutex first and make the condition false again before the woken thread runs. If you use if, the thread proceeds while the condition is still false, leading to incorrect behaviour. With while, the thread rechecks the condition after every wakeup and goes back to sleep if it is still false. The POSIX specification says the predicate should be re-evaluated on every return, so the loop is the only correct pattern.
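The canonical wait loop looks like this. It is a minimal sketch using POSIX threads; the variable names, the payload value, and wait_for_payload() are hypothetical.

```c
#include <pthread.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int ready;      /* the predicate, protected by mtx */
static int payload;

static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mtx);
    payload = 42;
    ready = 1;                      /* change the predicate under the lock */
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Hypothetical demo: waits for the producer and returns its payload,
 * or -1 if the thread could not be created. */
int wait_for_payload(void)
{
    pthread_t t;
    ready = 0;                      /* no other thread exists yet */
    if (pthread_create(&t, NULL, producer, NULL) != 0)
        return -1;

    pthread_mutex_lock(&mtx);
    while (!ready)                  /* while, not if: recheck after every wakeup */
        pthread_cond_wait(&cond, &mtx);
    int value = payload;
    pthread_mutex_unlock(&mtx);

    pthread_join(t, NULL);
    return value;
}
```

The loop also covers the case where the producer runs first: ready is already 1, the body is skipped, and the signal that was sent before anyone waited is not needed.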

Advanced Topics
Q: What does setsid() do and why must a daemon call it?

Answer: setsid() creates a new session, makes the caller the session leader and sole process group member, and detaches from any controlling terminal. Without this, a daemon receives SIGHUP when the user who started it logs out, killing it. It must be called after fork() because a process group leader cannot create a new session — the fork ensures the child has a PID different from its parent’s PGID. After setsid() the process has no terminal and survives user logout.

Q: Why should you use CLOCK_MONOTONIC instead of time() for measuring elapsed time?

Answer: time() returns wall-clock seconds since the Unix Epoch. This value can jump backwards if NTP corrects the system clock — giving a negative or wrong elapsed time. CLOCK_MONOTONIC is guaranteed to only ever move forward, making it safe for elapsed time measurement. Use time() when you need the actual current date/time (logging, displaying to users). Use clock_gettime(CLOCK_MONOTONIC, &ts) for measuring how long something takes, implementing timeouts, and scheduling periodic tasks.

Q: What is /proc/PID/maps and when would you use it?

Answer: /proc/PID/maps shows the complete virtual memory map of a process — one line per region with start/end address, permissions (rwxp), offset, and backing file or label ([stack], [heap]). It is useful for: debugging segfaults by comparing the faulting address to mapped regions; identifying which shared libraries are loaded; observing ASLR randomisation; memory usage analysis; and security research. Tools like gdb, valgrind, and perf read it internally to resolve memory addresses to function names.
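The "compare an address to the mapped regions" use case can be sketched for the current process. This assumes Linux and the "start-end perms ..." line layout of /proc/self/maps; find_mapping() is a hypothetical helper name.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: looks up the mapping that contains addr and
 * copies its permission string (e.g. "rw-p") into perms.
 * Returns 1 if found, 0 if not mapped, -1 if the file cannot be read. */
int find_mapping(const void *addr, char perms[5])
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    unsigned long a = (unsigned long)(uintptr_t)addr;
    char line[8192];                /* room for a long backing-file path */
    int found = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long lo, hi;
        char p[5];
        /* Each line starts with "start-end perms", addresses in hex. */
        if (sscanf(line, "%lx-%lx %4s", &lo, &hi, p) == 3 &&
            a >= lo && a < hi) {
            memcpy(perms, p, sizeof p);
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Passing the address of a local variable should find the [stack] region with rw permissions, while an address from a crash report that returns 0 was simply not mapped, which is the typical cause of a segfault.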

Q: Why does Linux use fork()+exec() rather than a single spawn() call?

Answer: The two-step model provides a critical window between fork and exec where the child can configure its environment before becoming a new program: close or redirect file descriptors (I/O redirection), change environment variables, modify the working directory, drop privileges with setuid(), set resource limits with setrlimit(), and reset signal handlers. All of this uses ordinary system calls with no special spawn() options. Shell pipelines, redirections, and background jobs all exploit this window. A single spawn() call would need a complex flags structure to replicate all of this. The simplicity of fork-exec is intentional and elegant.
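The window between fork() and exec() is where a shell implements redirection. The following sketch, assuming /bin/sh exists, redirects the child's stdout into a pipe with dup2() before the new program starts; capture_stdout() is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs cmd via /bin/sh and stores up to cap bytes
 * of its standard output in buf. Returns the byte count, or -1. */
ssize_t capture_stdout(const char *cmd, char *buf, size_t cap)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    if (pid == 0) {
        /* The window: ordinary system calls reshape the child's
         * environment before the new program ever runs. */
        close(p[0]);
        if (dup2(p[1], STDOUT_FILENO) == -1)
            _exit(126);
        close(p[1]);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                 /* only reached if exec failed */
    }

    close(p[1]);
    size_t total = 0;
    ssize_t n;
    while (total < cap && (n = read(p[0], buf + total, cap - total)) > 0)
        total += (size_t)n;
    close(p[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

The executed program needs no knowledge of the pipe: it writes to file descriptor 1 as usual, and the redirection was arranged entirely by the code that ran between fork() and exec().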

Study Guide Complete!

You have covered all sections §2.1 through §2.19 of Linux system programming fundamentals.
