Chapter 26 — Part 5: Orphans and Zombies

What Happens When Parents or Children Die First
Topic: Orphans & Zombies
Level: Intermediate
Examples: 3 Programs

The Two Scenarios

Parent and child processes have independent lifetimes. Two things can go wrong:

  • Parent dies first: the child becomes an orphan and is adopted by init (PID 1), or on Linux by the nearest subreaper process
  • Child dies first (before the parent waits): the child becomes a zombie and stays in the process table until the parent calls wait()

Orphan Process — How init Adopts
Normal: init (PID 1) → Parent → Child
  ↓ Parent exits
Child is now an ORPHAN (its original parent no longer exists)
  ↓ Kernel automatically reparents the child
init (PID 1) becomes the new parent, so getppid() == 1 instead of the original parent's PID

Zombie Process — Life Cycle
Child running
  ↓ Child calls exit() (or is killed by a signal)
Child becomes a ZOMBIE: resources released, only PID + status kept
  ↓ Then one of two things happens:
  • Parent calls wait(): zombie removed, status given to parent
  • Parent exits without wait(): init adopts the zombie, and init's wait() cleans it up

What a Zombie Process Keeps vs Releases
RELEASED (freed to system):
  • Memory (heap, stack, code)
  • Open file descriptors
  • Signal handlers
  • CPU time slices
  • Page table entries

KEPT (in kernel process table):
  • PID (entry in process table)
  • Exit/termination status
  • Resource usage statistics
  • User and group IDs

Example 1: Detecting Orphan Adoption by init
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    pid_t parent_pid = getpid();
    pid_t child;

    child = fork();
    if (child == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }

    if (child == 0) {
        /* Child process */
        printf("[Child] Started. My PID=%d, Parent PID=%d\n",
               getpid(), getppid());

        /* Sleep to let parent exit first */
        sleep(2);

        /* After parent exits, check new parent */
        printf("[Child] After parent exit: getppid()=%d\n", getppid());

        if (getppid() == 1)
            printf("[Child] Adopted by init (PID=1) — I am an ORPHAN\n");
        else
            printf("[Child] Adopted by subreaper PID=%d\n", getppid());

        exit(0);
    }

    /* Parent exits immediately, before child wakes up */
    printf("[Parent] PID=%d exiting. Child PID=%d will be orphaned.\n",
           getpid(), child);
    exit(0);
    /* No wait() — child becomes orphan */
}
/*
 * Expected output (PIDs will differ; the first two lines may appear in either order):
 * [Parent] PID=1001 exiting. Child PID=1002 will be orphaned.
 * [Child] Started. My PID=1002, Parent PID=1001
 * [Child] After parent exit: getppid()=1
 * [Child] Adopted by init (PID=1) — I am an ORPHAN
 */

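Note: On modern Linux the adopting process is not always PID 1. If an ancestor has marked itself as a child subreaper with prctl(PR_SET_CHILD_SUBREAPER), as a per-user systemd instance does, the orphan is reparented to that process instead, and getppid() returns its PID rather than 1. The concept is the same.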

Example 2: Creating a Zombie and Proving SIGKILL Doesn’t Work
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/wait.h>

int main(void)
{
    pid_t child;
    char cmd[256];

    child = fork();
    if (child == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }

    if (child == 0) {
        /* Child exits immediately without being waited on */
        printf("[Child] PID=%d exiting. Becoming zombie...\n", getpid());
        fflush(stdout);  /* _exit() does not flush stdio buffers */
        _exit(0);
    }

    /* Parent does NOT call wait() — let child become zombie */
    sleep(1); /* Give child time to exit */

    printf("[Parent] Child PID=%d is now a zombie. Check with ps:\n", child);
    snprintf(cmd, sizeof(cmd), "ps -p %d -o pid,stat,comm", child);
    system(cmd);  /* Will show 'Z' state */

    /* Attempt to kill zombie with SIGKILL — this does NOTHING */
    printf("[Parent] Sending SIGKILL to zombie...\n");
    if (kill(child, SIGKILL) == 0)
        printf("[Parent] kill() call succeeded (but zombie is immune)\n");

    sleep(1);
    printf("[Parent] Zombie still exists after SIGKILL:\n");
    system(cmd);  /* Zombie still there! */

    /* NOW call wait() to properly reap it */
    printf("[Parent] Calling wait() to reap zombie...\n");
    wait(NULL);

    printf("[Parent] Zombie reaped. Check ps again:\n");
    system(cmd);  /* Header line only: the zombie is gone */

    return 0;
}
/*
 * Key observation: SIGKILL cannot kill a zombie process.
 * The only ways to remove a zombie:
 *   1. Parent calls wait()
 *   2. Parent itself exits (init then cleans up)
 */

Example 3: Zombie Accumulation — The Danger

This shows what happens when a long-running parent never calls wait() — zombies pile up and can exhaust the process table.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

#define NUM_CHILDREN 5

int main(void)
{
    pid_t pids[NUM_CHILDREN];
    char cmd[128];

    printf("[Parent] PID=%d creating %d children without wait()\n",
           getpid(), NUM_CHILDREN);

    /* Create children — none will be waited on */
    for (int i = 0; i < NUM_CHILDREN; i++) {
        pids[i] = fork();
        if (pids[i] == -1) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
        if (pids[i] == 0) {
            /* Child exits immediately */
            _exit(i * 10); /* Different exit codes */
        }
    }

    sleep(1); /* Let all children exit and become zombies */

    printf("[Parent] Zombies accumulated. ps output:\n");
    snprintf(cmd, sizeof(cmd), "ps -o pid,stat,ppid,comm --ppid %d",
             getpid());
    system(cmd);  /* The children show Z state */

    /* The RIGHT way: reap them all */
    printf("\n[Parent] Now reaping all zombies...\n");
    int status;
    pid_t reaped;
    int count = 0;
    while ((reaped = waitpid(-1, &status, WNOHANG)) > 0) {
        count++;
        printf("[Parent] Reaped PID=%d, exit_code=%d\n",
               reaped, WIFEXITED(status) ? WEXITSTATUS(status) : -1);
    }
    printf("[Parent] Reaped %d zombies.\n", count);

    return 0;
}

Why Zombies are Dangerous in Long-Running Programs

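Each zombie holds one entry in the kernel process table, and with it one PID. The number of PIDs is bounded: on Linux the system-wide limit is kernel.pid_max (historically 32768 by default, though it can be configured higher), and a per-user process limit (RLIMIT_NPROC) may apply as well.

If a server creates thousands of child processes and never calls wait(), those limits can be reached. At that point fork() fails, returning -1 with errno set to EAGAIN, and no new processes can be created by that user, or on the entire system if the PID space itself is exhausted.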

Zombies cannot be killed by any signal, including SIGKILL. The only fix is to kill their parent (after which init reaps them) or to have the parent call wait().

Key Terms: zombie, orphan, init (PID 1), reparenting, process table, SIGKILL, getppid(), <defunct>

Interview Questions

Q1. What is a zombie process?

A process that has terminated but whose parent has not yet called wait() to retrieve its exit status. It retains a process table entry with PID, exit status, and resource stats, but all other resources are freed.

Q2. What is an orphan process?

A child process whose parent has exited while the child is still running. The orphan is automatically adopted by init (PID 1), which calls wait() to clean it up once the orphan itself terminates.

Q3. Can you kill a zombie with SIGKILL?

No. A zombie is already dead — it has no running code. Signals are not delivered to zombies. The only ways to remove a zombie are: its parent calls wait(), or the parent exits and init reaps it.

Q4. How does a child detect that it has been orphaned?

if (getppid() == 1)
    printf("My parent died — I was adopted by init\n");

Q5. What happens to a zombie if its parent exits without calling wait()?

The zombie becomes an orphan zombie. init (PID 1) inherits it and calls wait() to clean it up. So ultimately all zombies are reaped, but the timing depends on init.

Q6. Why are zombies dangerous in a network server?

Each zombie occupies a process table slot. If thousands of children are never waited on, the process table fills up, preventing any new processes from being created on the system.

Q7. ps shows a process with state “Z” and name “<defunct>”. What does this mean?

It’s a zombie process. “Z” is the zombie state. “<defunct>” is how ps labels processes in the zombie state.

Q8. How can a daemon server prevent zombie accumulation?

Three common techniques: (1) call waitpid() with WNOHANG periodically, (2) install a SIGCHLD handler that calls waitpid() in a loop, or (3) set SIGCHLD disposition to SIG_IGN to auto-reap children.

Q9. What does the prctl(PR_SET_PDEATHSIG, …) call do?

It lets a process specify a signal to receive when it becomes an orphan (when its parent dies). Useful for child processes that need to clean up when their parent exits unexpectedly.

Next: SIGCHLD Signal — Asynchronous Child Monitoring
