Thread Safety and Reentrancy

 

Chapter 31 · File 1 of 5
Understanding what makes functions safe (or unsafe) when used by multiple threads
Topics Covered:

  • Thread Safety
  • Global Variables & Threads
  • Mutex Serialization
  • Critical Sections
  • Reentrancy
  • Non-thread-safe Functions
  • _r Reentrant Variants

1. What is Thread Safety?

A function is called thread-safe if it can be safely called by multiple threads at the same time without causing any data corruption, unexpected behavior, or crashes.

To understand why this matters, think about what happens when two threads run the same function simultaneously. If the function only uses local variables (variables on its own stack), then each thread has its own copy — no problem. But if the function uses global or static variables, both threads share the same memory locations, and their operations can interfere with each other.

This interference is called a race condition; when it comes from unsynchronized access to the same memory location, it is also called a data race. The outcome of the program becomes unpredictable and depends on the exact timing of thread scheduling by the operating system.

Why Global and Static Variables Are the Problem

When a function uses a static or global variable, that variable exists in one memory location shared by the entire process. All threads in the process see and modify the same copy. If two threads modify it at the same time without synchronization, the stored value can end up corrupted or inconsistent. This is the most common cause of functions being non-thread-safe.

Problem: Two Threads Sharing a Static Buffer

Thread A: writes “Hello” → static char buf[256]
Thread B: writes “World” → static char buf[256] (the same memory)

⚠ Thread A reads buf back and gets “World”: it was overwritten by Thread B!
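
The overwrite pictured above can be reproduced even without threads, which makes it easy to observe. The following is a minimal sketch, assuming a hypothetical greet() helper (it is not from the book) that formats its result into a static buffer the way many older library functions do.

```c
#include <stdio.h>

/* Hypothetical helper: formats a greeting into a static buffer and
   returns a pointer to it, like many pre-thread library functions. */
static const char *greet(const char *name)
{
    static char buf[256];               /* One copy for the whole process */
    snprintf(buf, sizeof(buf), "Hello, %s", name);
    return buf;
}

/* Returns 1 if two successive calls hand back the very same buffer. */
static int same_buffer(void)
{
    const char *a = greet("Thread A");
    const char *b = greet("Thread B");  /* Overwrites what a points to */
    return a == b;
}
```

After the second call, the pointer saved from the first call refers to the second caller's text. With two threads the same thing happens, except that the overwrite can occur at any moment.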

2. A Classic Non-Thread-Safe Function Example

The book shows a simple incr() function that increments a global counter. It looks innocent, but it is broken when two threads call it at the same time. The problem is that reading a value, incrementing it, and writing it back is not atomic: the OS can switch threads between any of these steps, and on a multiprocessor both threads can execute them at the same time. This is called a read-modify-write race condition.

/* ===== EXAMPLE 1: Non-Thread-Safe Counter (The Problem) ===== */

#include <stdio.h>
#include <pthread.h>

static int glob = 0;  /* Global variable — shared by ALL threads */

/* This function is NOT thread-safe */
static void incr(int loops)
{
    int loc, j;
    for (j = 0; j < loops; j++) {
        loc = glob;   /* Step 1: Read glob into local variable */
        loc++;        /* Step 2: Increment local copy */
        glob = loc;   /* Step 3: Write back to glob */
        /* DANGER: Another thread can run between steps 1, 2, and 3!
           Both threads may read the same old value, both increment to
           the same new value, and the net effect is only +1 instead of +2 */
    }
}

void *threadA(void *arg) { incr(1000000); return NULL; }
void *threadB(void *arg) { incr(1000000); return NULL; }

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, threadA, NULL);
    pthread_create(&t2, NULL, threadB, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Expected: 2000000 — Actual: unpredictable, often much less! */
    printf("Final glob = %d (expected 2000000)\n", glob);
    return 0;
}
⚠ What Goes Wrong: Both threads read the same value of glob, both compute the same incremented value, and both write it back. The increment from one thread is “lost”. The final value of glob can be far less than expected — this is called a lost update.

3. Two Approaches to Thread Safety

Approach 1: Serialize Entire Function with a Mutex

The simplest way to make a function thread-safe is to put a mutex lock around the entire function. Only one thread can enter the function at a time. The other threads wait. This is called serialization — threads execute the function one after another, not in parallel.

Advantage: Simple to implement.
Disadvantage: Kills parallelism. If threads spend most of their time in this function, they effectively run single-threaded. The whole benefit of having multiple threads is lost.

/* ===== EXAMPLE 2: Thread-Safe Counter Using Mutex (Serialized) ===== */

#include <stdio.h>
#include <pthread.h>

static int glob = 0;
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;  /* Static init */

/* Thread-safe version: entire function is protected by a mutex */
static void incr_safe(int loops)
{
    int j;
    pthread_mutex_lock(&mtx);     /* Lock: only one thread enters */
    for (j = 0; j < loops; j++) {
        glob++;                   /* Safe: only one thread here at a time */
    }
    pthread_mutex_unlock(&mtx);   /* Unlock: let another thread in */
}

/* Problem: both threads run one-at-a-time — no parallel speedup */
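
To see the serialized version produce the expected total, the function can be driven from two threads. This is a sketch only: incr_safe(), glob and mtx follow Example 2, while worker() and run_two_threads() are hypothetical names introduced here for illustration.

```c
#include <pthread.h>

static long glob = 0;
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

/* Whole-function locking, as in Example 2 */
static void incr_safe(int loops)
{
    pthread_mutex_lock(&mtx);
    for (int j = 0; j < loops; j++)
        glob++;
    pthread_mutex_unlock(&mtx);
}

/* Hypothetical thread start function: arg points to the loop count */
static void *worker(void *arg)
{
    incr_safe(*(int *)arg);
    return NULL;
}

/* Hypothetical driver: runs two workers and returns the final count,
   or -1 if a thread could not be created. */
long run_two_threads(int loops)
{
    pthread_t t1, t2;

    glob = 0;
    if (pthread_create(&t1, NULL, worker, &loops) != 0)
        return -1;
    if (pthread_create(&t2, NULL, worker, &loops) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);     /* loops stays valid until both joins return */
    pthread_join(t2, NULL);
    return glob;                /* Safe to read: both threads have finished */
}
```

Calling run_two_threads(1000000) should return 2000000 on every run, because each thread performs all of its increments while holding the mutex. The price is that the two loops never overlap.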

Approach 2: Protect Only Critical Sections (Better Concurrency)

A smarter approach is to identify only the parts of the function that actually touch shared data (called critical sections) and protect only those parts with a mutex. The rest of the function can run in parallel across multiple threads.

This allows greater concurrency: threads can execute the non-critical parts of the function simultaneously, and only briefly block when they need to access the shared variable.

/* ===== EXAMPLE 3: Thread-Safe Counter — Critical Section Only ===== */

#include <stdio.h>
#include <pthread.h>

static long glob = 0;
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

/* Suppose this function also does expensive non-shared work */
static void incr_with_work(int loops)
{
    int j;
    for (j = 0; j < loops; j++) {

        /* ... lots of non-shared computation here (runs in parallel) ... */

        /* CRITICAL SECTION: protect only the shared variable access */
        pthread_mutex_lock(&mtx);
        glob++;
        pthread_mutex_unlock(&mtx);

        /* ... more non-shared computation here (runs in parallel) ... */
    }
}

/* Multiple threads can now run incr_with_work() in parallel.
   They only serialize briefly at the glob++ line. */
💡 Key Principle: Keep critical sections as short as possible. Only the minimum amount of code that accesses shared data should be inside lock/unlock. The shorter the critical section, the more parallelism you get.
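
One way to apply this principle is to do the expensive work on a local variable and take the lock only once, to publish the result. The sketch below assumes a made-up workload (summing squares); total, total_mtx, sum_squares() and run_sum() are hypothetical names, not part of the book's example.

```c
#include <pthread.h>

static long total = 0;
static pthread_mutex_t total_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Each thread sums the squares 1..n into a local variable, then adds
   its result to the shared total inside a short critical section. */
static void *sum_squares(void *arg)
{
    int n = *(int *)arg;
    long local = 0;

    for (int i = 1; i <= n; i++)
        local += (long)i * i;           /* Non-shared work: no lock held */

    pthread_mutex_lock(&total_mtx);     /* CRITICAL SECTION: one addition */
    total += local;
    pthread_mutex_unlock(&total_mtx);
    return NULL;
}

/* Runs two threads over the same n and returns the combined total,
   or -1 if a thread could not be created. */
long run_sum(int n)
{
    pthread_t t1, t2;

    total = 0;
    if (pthread_create(&t1, NULL, sum_squares, &n) != 0)
        return -1;
    if (pthread_create(&t2, NULL, sum_squares, &n) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return total;
}
```

Here each thread acquires the mutex once instead of once per loop iteration, so the loops can run in parallel and contention is limited to a single addition per thread.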

4. Reentrant Functions — Thread Safety Without Mutexes

Even critical-section mutexes have overhead — every lock and unlock call takes time. A reentrant function achieves thread safety with zero mutex cost by completely avoiding global and static variables.

A reentrant function only uses:

  • Local variables (on the stack — each thread has its own stack)
  • Caller-provided buffers (the caller allocates memory and passes a pointer)
  • No static/global state at all

The term “reentrant” means the function can be re-entered (called again) while a previous call is still executing, from a different thread or even from a signal handler, and both calls work correctly without interfering.
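
As an illustration, a function can be written in the reentrant style by having the caller supply the output buffer. This is a minimal sketch; label_r() is a hypothetical function whose name simply follows the _r naming convention.

```c
#include <stdio.h>
#include <stddef.h>

/* Reentrant: touches only its parameters and local variables.
   Writes "item-<id>" into the caller's buffer.
   Returns 0 on success, -1 if the buffer is too small. */
static int label_r(int id, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "item-%d", id);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

Two threads calling label_r() at the same time each write into their own buffer, so neither can overwrite the other's result and no mutex is needed.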

Reentrant vs Non-Reentrant
Property | Non-Reentrant | Reentrant
--- | --- | ---
Uses global/static variables | Yes ❌ | No ✅
Returns pointer to static buffer | Yes ❌ | No ✅
Needs mutex for thread safety | Yes (or TSD) | No (already safe)
Performance | Lower (mutex overhead) | Higher (no locking)
Interface change required | No | Often yes (caller provides buffer)

Why Not All Functions Can Be Made Reentrant

Not every function can be made reentrant. There are two main reasons:

1. They inherently need global data. The malloc() family of functions maintains a global linked list of free memory blocks. There is no way to move this to per-thread storage — all threads share the heap. These functions are made thread-safe using mutexes, not reentrancy.

2. Their interface was defined before threads existed. Some C library functions return a pointer to a static internal buffer. For example, strerror() may return a pointer to a static character array. This interface cannot be changed without breaking every program that uses it. This is where Thread-Specific Data (covered in Files 3 and 4) becomes useful.

5. SUSv3 Non-Thread-Safe Functions

SUSv3 (Single UNIX Specification version 3) requires almost all standard C library functions to be thread-safe. However, it explicitly lists a small set of functions that are not required to be thread-safe. These are typically functions that:

  • Return a pointer to a statically allocated buffer
  • Use a static variable to maintain state between calls
  • Were designed before threads were invented

Some well-known examples from this list:

asctime(), ctime(), strtok(), strerror(), getenv(), localtime(), gmtime(), readdir(), rand(), ttyname(), getpwnam(), getgrgid(), inet_ntoa(), ptsname()

⚠ Warning for New Programmers: Even if these functions happen to work in threads on your Linux machine today (because glibc provides a thread-safe version), you cannot rely on this for portable code. The standard does not require it. When writing multithreaded code, use the thread-safe _r variant wherever one exists.

The _r Reentrant Variants

For many of the unsafe functions, POSIX specifies a reentrant replacement with the suffix _r. These functions require the caller to provide a buffer for the result, which is stored in that buffer instead of a static internal one.

Original (Non-Reentrant) | Reentrant Version | What the Caller Provides
--- | --- | ---
asctime() | asctime_r() | char buf[26]
ctime() | ctime_r() | char buf[26]
strtok() | strtok_r() | char **saveptr
strerror() | strerror_r() | char *buf, size_t buflen
getpwnam() | getpwnam_r() | struct passwd *, char buf[], …
gmtime() | gmtime_r() | struct tm *result
rand() | rand_r() | unsigned int *seedp
readdir() | readdir_r() | struct dirent *entry, …
ttyname() | ttyname_r() | char *buf, size_t buflen

/* ===== EXAMPLE 4: Using _r (Reentrant) Variants in Threaded Code ===== */

#include <stdio.h>
#include <string.h>
#include <time.h>
#include <pthread.h>

/* BAD: Non-reentrant — shares static buffer between threads */
void bad_time_example(void)
{
    time_t t = time(NULL);
    char *s = ctime(&t);        /* Returns pointer to STATIC buffer! */
    printf("%s\n", s);          /* Another thread may have overwritten it */
}

/* GOOD: Reentrant — caller provides buffer, no static storage used */
void good_time_example(void)
{
    time_t t = time(NULL);
    char buf[64];               /* Caller-provided buffer (on stack) */
    ctime_r(&t, buf);           /* Result goes into our own buffer */
    printf("%s\n", buf);        /* Safe: buffer belongs to this thread */
}

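/* GOOD: Thread-safe strerror using strerror_r()
   Assumes the XSI-compliant strerror_r() (returns int, fills errbuf).
   With _GNU_SOURCE defined, glibc instead supplies a variant that
   returns char * and may not fill errbuf; print its return value then. */
void print_error(int errnum)
{
    char errbuf[256];
    if (strerror_r(errnum, errbuf, sizeof(errbuf)) != 0)
        snprintf(errbuf, sizeof(errbuf), "Unknown error %d", errnum);
    printf("Error: %s\n", errbuf);  /* Safe: per-call buffer */
}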

/* GOOD: Thread-safe strtok using strtok_r() */
void parse_csv(char *line)
{
    char *saveptr;              /* strtok_r state — per-thread, on stack */
    char *token = strtok_r(line, ",", &saveptr);
    while (token != NULL) {
        printf("Token: %s\n", token);
        token = strtok_r(NULL, ",", &saveptr);
    }
    /* Multiple threads can call this simultaneously — each has own saveptr */
}
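
💡 Best Practice: In new multithreaded code, prefer the _r variants of standard library functions wherever they exist. They are more verbose, but they keep each call's result in memory that the caller owns. Check your platform's documentation before relying on a particular variant: readdir_r(), for example, is deprecated in glibc 2.24 and later in favor of readdir(). Some modern alternatives like getaddrinfo() replace older functions (like gethostbyname()) and are already reentrant by design.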
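
The same pattern applies to the time-conversion functions. The sketch below uses gmtime_r() with a caller-owned struct tm; format_utc() is a hypothetical helper, and the feature-test macro is an assumption about how the file is compiled.

```c
#define _POSIX_C_SOURCE 200809L   /* Assumed: makes gmtime_r() visible */
#include <stdio.h>
#include <time.h>

/* Formats t as UTC "YYYY-MM-DD HH:MM:SS" into the caller's buffer.
   Returns 0 on success, -1 on failure. */
static int format_utc(time_t t, char *buf, size_t buflen)
{
    struct tm tm;                       /* Result lives on this thread's stack */

    if (gmtime_r(&t, &tm) == NULL)
        return -1;
    if (strftime(buf, buflen, "%Y-%m-%d %H:%M:%S", &tm) == 0)
        return -1;                      /* Buffer too small */
    return 0;
}
```

Because both the struct tm and the output buffer belong to the caller, concurrent calls from different threads do not share any storage.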

Interview Questions

Q1. What does it mean for a function to be thread-safe?
A function is thread-safe if it can be called simultaneously by multiple threads without causing incorrect behavior, data corruption, or crashes. The key requirement is that concurrent invocations must not interfere with each other’s execution or results.

Q2. What is the most common reason a function is NOT thread-safe?
The most common reason is the use of global or static variables. Because all threads in a process share the same address space, any global or static variable is shared by all threads. If multiple threads read and write such a variable simultaneously without synchronization, the result is a data race and unpredictable behavior.

Q3. What is the difference between making a function thread-safe using a per-function mutex vs. critical sections?
Using a per-function mutex (locking the entire function) is simpler but causes serialization: only one thread can execute the function at a time, eliminating all parallelism. Using critical sections (locking only the parts that access shared data) allows multiple threads to run the non-critical parts of the function in parallel, providing much better performance and concurrency.

Q4. What is a reentrant function? How does it differ from a thread-safe function?
A reentrant function avoids global and static variables entirely. It stores all state in local variables or caller-provided buffers, so multiple simultaneous calls cannot interfere with each other. All reentrant functions are thread-safe, but not all thread-safe functions are reentrant: a function protected by a mutex is thread-safe but not reentrant (it uses a shared mutex).

Q5. Can strtok() be used safely in a multithreaded program? What is the alternative?
strtok() is not thread-safe because it uses an internal static pointer to track its position between calls. If two threads call strtok() simultaneously, they corrupt each other’s state. The reentrant alternative is strtok_r(), which requires the caller to provide a saveptr variable to hold the parsing state. Since this is a local variable, it is per-thread and safe.

Q6. Why can’t all functions be made reentrant?
Two main reasons: (1) Some functions inherently need global data structures. For example, malloc() manages a global heap and cannot be made reentrant. (2) Some functions (designed before threads) return pointers to statically allocated buffers, and their interfaces cannot be changed without breaking all existing callers. Thread-Specific Data or Thread-Local Storage is used for these cases instead.

Q7. What does SUSv3 say about thread safety of standard library functions?
SUSv3 requires almost all standard C library functions to be thread-safe. However, it explicitly lists a set of exceptions: functions that are NOT required to be thread-safe, mostly those that use static storage to return results (like asctime(), strerror(), localtime()). Portable programs must not assume these functions are thread-safe even if a particular implementation makes them so.
