Joining with a Terminated Thread

Part 6 of 9  |  pthread_join() · Thread Zombies · Peers
Level: Beginner
Function: pthread_join()
Book: TLPI – Ch 29.6

When you create a thread and it finishes its work, you typically need to know that it has finished and optionally collect its result. The function pthread_join() does exactly this — it waits (blocks) until the specified thread terminates and then lets you read its return value. This part explains how joining works, the “zombie thread” problem, and how joining differs from the process-level waitpid().

Key Terms

pthread_join() · Joining · Thread Zombie · retval · waitpid() · Peer Threads · Blocking

1. pthread_join() — Function Signature

#include <pthread.h>

int pthread_join(pthread_t thread, void **retval);

/* Returns: 0 on success, positive error number on failure */

Parameter | Description
thread    | The ID of the thread to wait for (obtained from pthread_create()).
retval    | If non-NULL, the thread’s return value is copied here. Its type is void **, that is, a pointer to a void *. Pass NULL if you don’t care about the return value.

2. How pthread_join() Works

Main Thread:

  • Creates worker thread
  • Does some work…
  • Calls pthread_join() and BLOCKS here
  • Wakes up when worker finishes
  • Reads worker’s return value

Worker Thread:

  • Starts running start function
  • Does heavy computation…
  • More processing…
  • Returns / calls pthread_exit()
  • Thread exits, which wakes the joiner

Behaviour:

  • If the thread has already terminated by the time pthread_join() is called, the function returns immediately.
  • If the thread is still running, the calling thread blocks until it finishes.
  • The function cleans up the resources associated with the terminated thread (like the stack). Without calling pthread_join() or detaching the thread, these resources are not freed.

3. Thread Zombies — What Happens Without Joining

When a thread terminates, if no other thread calls pthread_join() for it, the system cannot fully release its resources. This is called a zombie thread — analogous to a zombie process (Section 26.2 in TLPI).

The zombie thread continues to occupy:

  • Its stack memory
  • Internal threading library structures
  • A slot in the thread table

If enough zombie threads accumulate, you will eventually be unable to create new threads (pthread_create() will fail with EAGAIN).

How to avoid zombie threads:

  • Call pthread_join() for every joinable thread
  • Or create the thread as detached from the start using pthread_attr_setdetachstate()
  • Or call pthread_detach(tid) after creation if you don’t need the return value

4. Threads Are Peers — No Hierarchy

This is an important conceptual difference from processes:

Processes — Hierarchical

Only the parent process can call wait() or waitpid() for a child. There is a strict parent-child relationship. A sibling process cannot wait for another sibling.

Threads — Peer-to-Peer

Any thread in a process can call pthread_join() for any other thread. There is no parent-child concept. Thread A can join thread C even if thread B created thread C.

However, pthread_join() can only join with a specific thread ID — there is no “join with any thread” equivalent of waitpid(-1, ...). This is intentional: joining with an arbitrary thread could cause a library’s private threads to be accidentally joined by application code, breaking modularity.

5. pthread_join() vs waitpid() — Key Differences

Feature           | pthread_join()                                                                        | waitpid()
Who can call it   | Any thread in the process                                                             | Only the parent process
“Wait for any”    | Not supported                                                                         | Yes: waitpid(-1, ...)
Non-blocking mode | Not in POSIX (use condition variables; glibc offers the nonstandard pthread_tryjoin_np()) | Yes: WNOHANG flag
Hierarchy         | Peer: any thread can join any other                                                   | Parent-child relationship
Double join       | Undefined behaviour (may join a new thread)                                           | Fails with ECHILD (no such child)

6. Code Example: pthread_join() — Collecting Return Values

/* compile: gcc -pthread -o thread_join thread_join.c */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>   /* intptr_t */
#include <pthread.h>

/* Compute sum of 1..n and return result */
static void *sum_worker(void *arg)
{
    long n = (long)(intptr_t) arg;
    long sum = 0;

    for (long i = 1; i <= n; i++)
        sum += i;

    printf("Worker: sum(1..%ld) = %ld\n", n, sum);

    /* Return sum as void * via cast */
    return (void *)(intptr_t) sum;
}

int main(void)
{
    pthread_t t1, t2;
    void     *res1, *res2;
    int       s;

    /* Create two threads with different arguments */
    s = pthread_create(&t1, NULL, sum_worker, (void *)(intptr_t) 100);
    if (s != 0) { fprintf(stderr, "create t1 failed\n"); exit(1); }

    s = pthread_create(&t2, NULL, sum_worker, (void *)(intptr_t) 200);
    if (s != 0) { fprintf(stderr, "create t2 failed\n"); exit(1); }

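    /* Wait for both threads and collect return values */
    s = pthread_join(t1, &res1);
    if (s != 0) { fprintf(stderr, "join t1 failed\n"); exit(1); }

    s = pthread_join(t2, &res2);
    if (s != 0) { fprintf(stderr, "join t2 failed\n"); exit(1); }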

    printf("Main: t1 returned %ld\n", (long)(intptr_t) res1);
    printf("Main: t2 returned %ld\n", (long)(intptr_t) res2);
    printf("Main: grand total = %ld\n",
           (long)(intptr_t)res1 + (long)(intptr_t)res2);

    return 0;
}

Expected output (the two Worker lines may appear in either order, because the threads run concurrently):

Worker: sum(1..100) = 5050
Worker: sum(1..200) = 20100
Main: t1 returned 5050
Main: t2 returned 20100
Main: grand total = 25150

7. Code Example: Thread Zombie Demonstration

This example creates threads in a loop without ever joining them, and demonstrates how failing to join eventually causes thread creation to fail.

/* compile: gcc -pthread -o zombie_demo zombie_demo.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>

static void *do_nothing(void *arg) { (void) arg; return NULL; }

int main(void)
{
    pthread_t tid;
    int s, created = 0;

    /*
     * Create threads WITHOUT joining — they become zombies.
     * Eventually pthread_create() will fail.
     */
    while (1) {
        s = pthread_create(&tid, NULL, do_nothing, NULL);
        if (s != 0) {
            printf("pthread_create failed after %d threads: %s\n",
                   created, strerror(s));
            break;
        }
        /* NOTE: not calling pthread_join() — threads become zombies */
        created++;

        if (created % 100 == 0)
            printf("Created %d zombie threads so far...\n", created);
    }

    /*
     * To fix this, either:
     *   pthread_join(tid, NULL);   after each create, OR
     *   use detached threads       (Part 7)
     */
    return 0;
}

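Run this to see pthread_create() eventually fail. How many threads it takes depends on system limits such as the default stack size and the available virtual memory. In practice you would fix this by either calling pthread_join() or creating detached threads (covered in Part 7).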

8. Interview Questions

Q1. What is pthread_join() and why is it important?
pthread_join() waits for a specified thread to terminate and optionally retrieves its return value. It is important because: (1) it synchronises the calling thread with the target thread’s completion, and (2) it releases the resources (stack, thread descriptor) of the terminated thread, preventing “zombie threads”.
Q2. What is a zombie thread?
A zombie thread is a thread that has terminated but whose resources have not been freed because no other thread has called pthread_join() on it (and it was not detached). Zombie threads waste system resources and can eventually prevent creation of new threads.
Q3. Can thread A join thread C even if thread B created thread C?
Yes. Unlike processes (where only the parent can wait() for a child), threads are peers. Any thread in a process can call pthread_join() for any other thread, regardless of which thread created it.
Q4. What happens if you call pthread_join() twice on the same thread ID?
It leads to undefined behaviour. After the first join, the thread’s resources are freed and the ID may be reused for a new thread. The second join might accidentally join a different thread that happened to receive the same ID.
Q5. Why doesn’t Pthreads support “join with any thread”?
Because threads are peers with no hierarchy, a “join any” operation could accidentally join a thread that was privately created by a library function. The library would then be unable to join its own thread, causing errors. This would break modular program design. The solution is to use condition variables when you need “wait for any of my known threads” behaviour.

Next: Part 7 — Detaching a Thread

Learn how to create fire-and-forget threads that clean up automatically without needing pthread_join().
