Chapter 33 · Section 33.4
Thread Implementation Models
M:1 (user-level), 1:1 (kernel-level), and M:N (two-level) thread models explained
Key Terms
What Is a Kernel Scheduling Entity (KSE)?
A Kernel Scheduling Entity (KSE) is the unit of work that the kernel’s scheduler manages. The kernel allocates CPU time to KSEs. In traditional single-threaded Unix, a KSE and a process were the same thing.
When threading was added to Unix, a new question arose: how should user-space threads (pthreads) map onto these kernel-visible KSEs? The answer defines the thread implementation model.
The Core Question: How do N user threads map to kernel KSEs?
4 user threads → 1 KSE (M:1) | 4 threads → 4 KSEs (1:1) | 4 threads → 2 KSEs (M:N)
M:1 Model — Many-to-One (User-Level Threads)
In the M:1 model, all threads map onto a single kernel scheduling entity. The kernel sees only one process. All thread management (creation, scheduling, synchronisation) happens entirely in user space via the threading library.
M:1 Model: All threads share one kernel view
✅ Advantages
- Very fast thread operations (no kernel mode switch)
- Mutex/lock operations happen entirely in user space
- Easy to port across different OS platforms
❌ Disadvantages
- One blocking syscall (e.g., read()) blocks ALL threads
- Cannot use multiple CPUs/cores in parallel
- Cannot assign different priorities to threads across processes
read() and it blocks, the kernel sees only one process and puts the entire process to sleep. All other user-level threads in that process are also frozen, even if they have work to do and the CPU is available. This is the fatal flaw of M:1.1:1 Model — One-to-One (Kernel-Level Threads)
In the 1:1 model, each user thread maps to its own KSE. The kernel knows about every thread and schedules them individually.
1:1 Model: Each thread gets its own KSE
✅ Advantages
- Blocking syscall only blocks the calling thread
- Kernel can schedule threads on different CPU cores in parallel
- True parallel execution on multicore systems
❌ Disadvantages
- Thread creation/switching requires kernel mode switch (slower than M:1)
- Applications with thousands of threads can overload the kernel scheduler
- More kernel resources per thread
M:N Model — Many-to-Many (Two-Level Model)
The M:N model tries to combine the best of M:1 and 1:1. Multiple user threads map onto fewer KSEs. The library multiplexes user threads onto a pool of KSEs.
M:N Model: Many threads, fewer KSEs
✅ Advantages (Theoretical)
- Fewer KSEs → less kernel overhead for many threads
- Blocking syscall can block just one KSE; library wakes another
- Parallel execution across multiple CPUs
❌ Disadvantages
- Extremely complex to implement correctly
- Library and kernel must cooperate and share scheduling info
- Signal management is complicated
- On Linux, the scheduler was rewritten to handle 1:1 at scale anyway
Comparison Table
| Feature | M:1 (User) | 1:1 (Kernel) | M:N (Two-level) |
|---|---|---|---|
| Kernel awareness of threads | ❌ None | ✅ Full | ⚠️ Partial |
| Thread create/switch speed | ⚡ Very fast | 🐢 Slower (syscall) | ⚡ Fast (user) |
| Blocking syscall effect | Blocks ALL threads | Only calling thread | One KSE blocked |
| True multicore parallelism | ❌ No | ✅ Yes | ✅ Yes |
| Implementation complexity | Low | Medium | Very High |
| Scalability (many threads) | Limited (1 CPU) | Good (NPTL optimised) | Good (theoretically) |
| Used by Linux? | ❌ No | ✅ Yes (NPTL + LinuxThreads) | ❌ Rejected for NPTL |
Code Example 1 — 1:1 Model: True Parallel Execution on Multiple Cores
On Linux (1:1 model), threads actually run on different CPU cores simultaneously. This example uses CPU affinity to pin threads to specific cores and verifies parallel execution.
/*
* ep_parallel_threads.c — Show true parallelism (1:1 model, multiple cores)
* Compile: gcc -o ep_parallel_threads ep_parallel_threads.c -lpthread
*
* On a multicore system, you'll see both threads running simultaneously.
* On M:1, only one would run at a time.
*/
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>
#include <time.h>
static volatile long counter_a = 0;
static volatile long counter_b = 0;
struct thread_arg {
int cpu_id; /* which CPU to run on */
volatile long *counter; /* which counter to increment */
const char *name;
};
void *cpu_bound_thread(void *arg)
{
struct thread_arg *ta = (struct thread_arg *)arg;
cpu_set_t cpuset;
/* Pin this thread to a specific CPU */
CPU_ZERO(&cpuset);
CPU_SET(ta->cpu_id, &cpuset);
int ret = pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
if (ret != 0)
printf("[%s] Warning: could not set CPU affinity\n", ta->name);
else
printf("[%s] Pinned to CPU %d\n", ta->name, ta->cpu_id);
/* Busy-work loop: count as fast as possible for 1 second */
struct timespec start, now;
clock_gettime(CLOCK_MONOTONIC, &start);
while (1) {
clock_gettime(CLOCK_MONOTONIC, &now);
double elapsed = (now.tv_sec - start.tv_sec)
+ (now.tv_nsec - start.tv_nsec) / 1e9;
if (elapsed >= 1.0) break;
(*ta->counter)++;
}
printf("[%s] Counted to: %ld\n", ta->name, *ta->counter);
return NULL;
}
int main(void)
{
pthread_t tid_a, tid_b;
struct thread_arg arg_a = { .cpu_id = 0, .counter = &counter_a, .name = "ThreadA" };
struct thread_arg arg_b = { .cpu_id = 1, .counter = &counter_b, .name = "ThreadB" };
int ncpus = (int)sysconf(_SC_NPROCESSORS_ONLN);
printf("[Main] CPUs available: %d\n", ncpus);
printf("[Main] Creating two CPU-bound threads (Linux 1:1 model)\n");
pthread_create(&tid_a, NULL, cpu_bound_thread, &arg_a);
pthread_create(&tid_b, NULL, cpu_bound_thread, &arg_b);
pthread_join(tid_a, NULL);
pthread_join(tid_b, NULL);
printf("[Main] Combined count: %ld\n", counter_a + counter_b);
printf("[Main] In M:1 model, both threads would share 1 CPU;\n");
printf("[Main] in 1:1 model they run truly in parallel.\n");
return 0;
}
Code Example 2 — Blocking Syscall: Why M:1 Fails
This example illustrates conceptually why a blocking syscall is fatal in M:1 but harmless in 1:1. We simulate it by having one thread block on read() (from a pipe with no writer) while another thread should continue working.
/*
* ep_blocking_syscall.c — Simulate blocking syscall impact
* In 1:1 (Linux): Thread B keeps running while Thread A blocks.
* In M:1: Thread B would also freeze (kernel sees 1 process sleeping).
*
* Compile: gcc -o ep_blocking_syscall ep_blocking_syscall.c -lpthread
*/
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
static int pipe_fds[2]; /* pipe: [0]=read end, [1]=write end */
/* Thread A: blocks on read() — simulates I/O wait */
void *blocking_thread(void *arg)
{
char buf[64];
printf("[ThreadA] About to call read() — will block until data arrives\n");
ssize_t n = read(pipe_fds[0], buf, sizeof(buf) - 1);
if (n > 0) {
buf[n] = '\0';
printf("[ThreadA] Read: '%s'\n", buf);
}
printf("[ThreadA] Done\n");
return NULL;
}
/* Thread B: CPU work — should NOT be affected by Thread A's blocking */
void *cpu_thread(void *arg)
{
printf("[ThreadB] Doing CPU work (should NOT block even though Thread A is blocked)\n");
long sum = 0;
for (long i = 0; i < 200000000L; i++) /* ~0.5s of work */
sum += i;
printf("[ThreadB] CPU work done. sum = %ld\n", sum);
return NULL;
}
int main(void)
{
pthread_t tid_a, tid_b;
if (pipe(pipe_fds) != 0) { perror("pipe"); exit(1); }
printf("[Main] Creating threads. Linux uses 1:1 model.\n");
printf("[Main] Thread A will block on read(); Thread B should run freely.\n\n");
pthread_create(&tid_a, NULL, blocking_thread, NULL);
pthread_create(&tid_b, NULL, cpu_thread, NULL);
/* Wait for Thread B to finish */
pthread_join(tid_b, NULL);
printf("\n[Main] Thread B finished — proves it was NOT blocked by Thread A.\n");
printf("[Main] Now writing to pipe to unblock Thread A...\n");
/* Unblock Thread A */
write(pipe_fds[1], "wake up!", 8);
pthread_join(tid_a, NULL);
close(pipe_fds[0]);
close(pipe_fds[1]);
printf("[Main] All done.\n");
return 0;
}
Interview Questions
read() or write()) blocks the entire process, including all user-level threads. The kernel has no knowledge of individual threads, so it cannot keep other threads running while one is blocked. Additionally, M:1 cannot exploit multiple CPU cores — all threads run on the same single kernel context.Section Summary
- A KSE is what the kernel schedules. Thread models define the user-thread-to-KSE mapping.
- M:1: All threads on one KSE. Fast user-level ops, but blocking syscall freezes all threads. No true parallelism.
- 1:1: Each thread is a KSE. True parallelism, blocking only affects calling thread. Used by Linux (NPTL, LinuxThreads).
- M:N: N threads on M KSEs. Best of both worlds in theory, but extremely complex. Rejected for Linux NPTL.
Keep Learning — It’s Free
EmbeddedPathashala — free embedded systems education.
