The Problem: Shared Variables & Critical Sections
Section 30.1 — Understanding why threads need synchronization
Why Do Threads Share Memory?
One of the biggest advantages of threads over processes is that all threads in a program share the same memory. This means any thread can read or write the same global variable. You don’t need pipes, sockets, or shared memory segments like you would between processes.
But this convenience comes with a serious problem: if two threads try to modify the same variable at the same time, the result can be completely wrong. This is called a race condition.
A Real Example of the Problem
Consider this simple task: two threads each increment a shared global variable glob one million times. At the end, glob should be 2,000,000. In practice it usually isn’t. Let’s understand why.
Here is what each thread does inside its loop:
/* Each thread does this many times: */
loc = glob; /* Step 1: Read glob into a local variable */
loc++; /* Step 2: Increment the local copy */
glob = loc; /* Step 3: Write it back to glob */
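These are THREE separate steps. The kernel scheduler can pause a thread after any one of these steps and give the CPU to the other thread, and on a multicore machine the two threads can also run these steps at the same time on different CPUs. Either way, one thread’s work can get overwritten by the other.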
Step-by-Step: What Goes Wrong
| Time | Thread 1 (CPU) | Thread 2 (CPU) | glob value |
|---|---|---|---|
| T1 | loc1 = glob → loc1 = 2000 | waiting | 2000 |
| T2 | TIME SLICE EXPIRES — paused | starts running | 2000 |
| T3 | waiting | runs 1000 loops: glob becomes 3000 | 3000 |
| T4 | resumes: loc1++ → loc1 = 2001 | paused | 3000 |
| T5 | glob = loc1 → glob = 2001 ❌ | paused | 2001 (lost 1000 increments!) |
Thread 1 read glob as 2000, then Thread 2 incremented it to 3000, then Thread 1 wrote back 2001 — erasing all of Thread 2’s work. This is the race condition.
Code Example 1: The Broken Program (No Synchronization)
This program creates two threads, each incrementing a shared glob variable many times. The final value is unpredictable.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
static int glob = 0; /* Shared global variable */
/* Thread function: increments glob 'loops' times */
static void *threadFunc(void *arg)
{
int loops = *((int *) arg);
int loc, j;
for (j = 0; j < loops; j++) {
loc = glob; /* Read */
loc++; /* Increment */
glob = loc; /* Write back */
}
return NULL;
}
int main(int argc, char *argv[])
{
pthread_t t1, t2;
int loops = (argc > 1) ? atoi(argv[1]) : 1000000;
int s;
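/* pthread functions return an error number; they do not set errno,
so perror() would print a misleading message here */
s = pthread_create(&t1, NULL, threadFunc, &loops);
if (s != 0) { fprintf(stderr, "pthread_create failed: error %d\n", s); exit(1); }
s = pthread_create(&t2, NULL, threadFunc, &loops);
if (s != 0) { fprintf(stderr, "pthread_create failed: error %d\n", s); exit(1); }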
pthread_join(t1, NULL);
pthread_join(t2, NULL);
/* Expected: 2 * loops. Actual: UNPREDICTABLE */
printf("glob = %d (expected = %d)\n", glob, 2 * loops);
return 0;
}
/*
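* Compile: gcc -pthread -o thread_incr thread_incr.c
* (Build without optimization; an optimizing compiler may
* collapse the loop into a single addition and hide the race.)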
*
* Sample output (varies every run!):
* glob = 1234567 (expected = 2000000)
* glob = 1876432 (expected = 2000000)
* glob = 2000000 (got lucky this time)
*/
glob will almost always be wrong.
What Is a “Critical Section”?
A critical section is a piece of code that accesses a shared resource (like a shared variable), and which must be executed atomically — meaning: no other thread should be allowed to access the same resource while this code is running.
In our example, the three lines loc = glob; loc++; glob = loc; together form the critical section. They must all complete as one uninterrupted unit.
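Atomic means “indivisible”: from the point of view of every other thread, the three steps appear to happen all at once, and no thread can read or modify glob partway through. Without a synchronization mechanism, the OS scheduler can interrupt between any two steps.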
Isn’t glob++ Atomic?
Common Beginner Mistake: You might think that replacing the three lines with a single glob++ would fix the problem. It does NOT.
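On most hardware, even glob++ performs the same three steps: load, add, store. RISC processors need three separate instructions. On x86 the compiler may emit a single increment instruction, but without the LOCK prefix that instruction is still not atomic with respect to other CPUs. The C language does not guarantee that incrementing a plain int is atomic. The race condition still exists.
To truly fix it, you need a synchronization mechanism such as a mutex.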
Why Is the Result Nondeterministic?
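The exact wrong value of glob changes every time you run the program. Why? Because the Linux kernel decides when to switch threads based on many factors: system load, hardware timer interrupts, and the CPU scheduling algorithm. On a multicore system the two threads may also run in parallel, so the interleaving depends on their relative timing as well. You have no control over any of this.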
In complex programs, race conditions may only manifest under specific timing conditions — making them very hard to detect and reproduce. This is why correct synchronization is critical from the start.
Code Example 2: Observing the Nondeterminism
This version prints the value of glob from inside each thread, so you can see threads interleaving:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
static int glob = 0;
/* Struct to pass both thread ID and loop count */
typedef struct {
int thread_num;
int loops;
} ThreadArg;
static void *threadFunc(void *arg)
{
ThreadArg *targ = (ThreadArg *) arg;
int j, loc;
for (j = 0; j < targ->loops; j++) {
loc = glob;
loc++;
glob = loc;
/* Print every 100 iterations to see interleaving */
if (j % 100 == 0)
printf("Thread %d: glob = %d\n", targ->thread_num, glob);
}
return NULL;
}
int main(void)
{
pthread_t t1, t2;
ThreadArg arg1 = {1, 500};
ThreadArg arg2 = {2, 500};
pthread_create(&t1, NULL, threadFunc, &arg1);
pthread_create(&t2, NULL, threadFunc, &arg2);
pthread_join(t1, NULL);
pthread_join(t2, NULL);
printf("Final glob = %d (expected 1000)\n", glob);
return 0;
}
/*
* Compile: gcc -pthread -o thread_observe thread_observe.c
*
* Redirect output to file and observe interleaving:
* ./thread_observe > output.txt
*
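* You may see Thread 1 and Thread 2 printing interleaved,
* and the final glob may be less than 1000. With only 500
* loops per thread, some runs complete without a lost update,
* so run the program several times.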
*/
The Solution: Mutex
The fix is to use a mutex (mutual exclusion lock). A mutex ensures that only one thread can execute the critical section at a time. The other thread must wait until the first releases the mutex.
We cover this in the next files: static mutex (30.1.1), locking/unlocking (30.1.2), and the corrected program.
Interview Questions — Critical Sections & Race Conditions
Is glob++ thread-safe? Why or why not?
No. On most architectures, glob++ is carried out as three steps: load, increment, store. The OS can preempt the thread between any two of these steps, and on a multicore machine another CPU can run the same steps concurrently, causing a race condition. Only truly atomic read-modify-write operations (such as x86 instructions with the LOCK prefix) are thread-safe without a mutex.