Linux Process Scheduling · Chapter 35 · Part 7

CPU Affinity

Control which CPUs a process or thread is allowed to run on. Learn the cpu_set_t API, the performance motivation behind affinity, per-thread affinity with gettid(), and the privilege rules enforced by the kernel.


What is CPU Affinity?

On a multiprocessor (SMP) system, the Linux scheduler is free to migrate a process from one CPU to another at any time. This keeps all CPUs busy, but it has a hidden cost: lost cache warmth. When a process moves, the data it had loaded into the old CPU's L1/L2 cache does not come with it, so the process starts on the new CPU with a cold cache and must reload its working set from a shared cache level or from RAM.

CPU affinity is a mechanism that lets you restrict which CPUs a process is eligible to run on. The kernel represents this as a bitmask — the CPU affinity mask — stored per thread in the kernel’s task_struct.

CPU Affinity Mask Concept (4-CPU system)

Default (no restriction): mask 1111 (binary) = 0xF, may run on any CPU
Pinned to CPU 0 only: mask 0001 (binary) = 0x1, runs on CPU 0 only
Restricted to CPUs 2 & 3: mask 1100 (binary) = 0xC, bounces only between CPUs 2 and 3

Soft Affinity vs Hard Affinity

Soft affinity (default): The scheduler tries to keep a process on the same CPU to exploit cache warmth, but will migrate it when load balancing demands it. No API call needed — this is always active.

Hard affinity: You explicitly set an affinity mask via sched_setaffinity(). The kernel guarantees the process will only ever run on the CPUs in that mask, regardless of load balancing.

The cpu_set_t Type and Macros

The affinity mask is stored in a cpu_set_t opaque type defined in <sched.h>. Never manipulate bits in it directly — always use the provided macros. The constant CPU_SETSIZE is defined as 1024, which means a single cpu_set_t can represent up to 1024 CPUs.

Macro | Arguments | Effect
CPU_ZERO(set) | cpu_set_t *set | Clears all bits, so no CPU is in the set
CPU_SET(cpu, set) | int cpu, cpu_set_t *set | Adds CPU cpu to the set
CPU_CLR(cpu, set) | int cpu, cpu_set_t *set | Removes CPU cpu from the set
CPU_ISSET(cpu, set) | int cpu, cpu_set_t *set | Returns nonzero if CPU cpu is in the set
CPU_SETSIZE | constant | 1024, the maximum number of CPUs representable in cpu_set_t
Header required: #define _GNU_SOURCE before #include <sched.h> — these are Linux-specific extensions not in POSIX.

sched_setaffinity() and sched_getaffinity()

/* Both declared in <sched.h>, require _GNU_SOURCE */
int sched_setaffinity(pid_t pid, size_t cpusetsize, const cpu_set_t *mask);
int sched_getaffinity(pid_t pid, size_t cpusetsize, cpu_set_t *mask);
/* Both return 0 on success, -1 on error */

Parameter | Meaning
pid | Process (or thread) ID to set/get. Use 0 for the calling thread. Use gettid() for a specific thread in a multithreaded process.
cpusetsize | Size of the cpu_set_t in bytes. Always pass sizeof(cpu_set_t).
mask | Pointer to the cpu_set_t. For set: the new affinity. For get: filled in by the kernel.

Privilege Rules for sched_setaffinity()
Unprivileged process

Can set the affinity of a process only if the effective UID of the calling process matches the real or effective UID of the target process. In other words: same user only.

CAP_SYS_NICE

A process with this capability can set the CPU affinity of any process in the system, regardless of UID.

Code Example 1 — Pin a Process to a Single CPU

This program takes a PID on the command line and pins it to CPU 0. It then reads back the affinity mask to verify the change, and prints all CPUs in the resulting mask.

/* pin_to_cpu0.c — pin a process to CPU 0 using sched_setaffinity()
   Usage: ./pin_to_cpu0 <pid>
   If pid is 0, pins the calling process itself.
   Requires _GNU_SOURCE for cpu_set_t macros.
*/
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

static void print_affinity_mask(pid_t pid)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);

    if (sched_getaffinity(pid, sizeof(mask), &mask) == -1) {
        perror("sched_getaffinity");
        return;
    }

    printf("Affinity mask (CPUs allowed): ");
    int found = 0;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &mask)) {
            printf("%d ", cpu);
            found++;
        }
    }
    if (!found)
        printf("(none)");
    printf("\n");
}

int main(int argc, char *argv[])
{
    pid_t pid = 0;

    if (argc > 1)
        pid = (pid_t) atoi(argv[1]);

    printf("Before change:\n");
    print_affinity_mask(pid);

    /* Build a mask with only CPU 0 */
    cpu_set_t newmask;
    CPU_ZERO(&newmask);
    CPU_SET(0, &newmask);       /* add CPU 0 */

    if (sched_setaffinity(pid, sizeof(newmask), &newmask) == -1) {
        fprintf(stderr, "sched_setaffinity: %s\n", strerror(errno));
        return EXIT_FAILURE;
    }

    printf("After pinning to CPU 0:\n");
    print_affinity_mask(pid);

    return EXIT_SUCCESS;
}
Sample output (4-CPU machine, targeting itself):
Before change:
Affinity mask (CPUs allowed): 0 1 2 3
After pinning to CPU 0:
Affinity mask (CPUs allowed): 0

Per-Thread Affinity with gettid()

In Linux, CPU affinity is a per-thread attribute, not per-process. Each thread in a multithreaded program has its own affinity mask stored in its task_struct. To set affinity for a specific thread, pass its kernel thread ID obtained via gettid().

gettid() vs getpid(): In a multithreaded process, getpid() returns the same value for all threads (the process ID). gettid() returns the calling thread's unique kernel thread ID, which is what sched_setaffinity() expects when targeting individual threads. glibc 2.30 and later provide a gettid() wrapper (declared in <unistd.h> when _GNU_SOURCE is defined); on older glibc, include <sys/syscall.h> and call syscall(SYS_gettid) instead.

Per-Thread Affinity in a 3-Thread Process

Thread 0 (main), tid = 1001: CPUs 0–1 only
Thread 1 (worker), tid = 1002: CPUs 2–3 only
Thread 2 (RT), tid = 1003: CPU 0 only (isolated RT)

Each thread inside the same process can have a completely independent CPU affinity mask.

Code Example 2 — Per-Thread Affinity in a Multithreaded Program

This example spawns two worker threads and pins each to a different CPU. The main thread pins itself to CPU 0. This pattern is common in real-time audio/video pipelines where the RT thread must be isolated from others.

/* thread_affinity.c — per-thread CPU pinning demo
   Spawns two pthreads, pins each to a dedicated CPU.
   Requires: -lpthread
   Compile:  gcc -Wall -o thread_affinity thread_affinity.c -lpthread
*/
#define _GNU_SOURCE
#include <sched.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>   /* SYS_gettid */

/* Helper: get kernel thread ID (tid) of calling thread */
static pid_t get_tid(void)
{
    return (pid_t) syscall(SYS_gettid);
}

/* Helper: pin calling thread to one CPU */
static int pin_thread_to_cpu(int cpu_num)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu_num, &mask);
    return sched_setaffinity(get_tid(), sizeof(mask), &mask);
}

static void *worker(void *arg)
{
    int target_cpu = *(int *)arg;

    if (pin_thread_to_cpu(target_cpu) == -1) {
        perror("sched_setaffinity (thread)");
        return NULL;
    }

    pid_t tid = get_tid();
    printf("Thread tid=%d pinned to CPU %d\n", tid, target_cpu);

    /* Simulate work — stays on target_cpu */
    volatile long long sum = 0;
    for (long long i = 0; i < 500000000LL; i++)
        sum += i;

    printf("Thread tid=%d done (sum=%lld)\n", tid, sum);
    return NULL;
}

int main(void)
{
    /* Pin main thread to CPU 0 */
    if (pin_thread_to_cpu(0) == -1) {
        perror("pin main");
        return EXIT_FAILURE;
    }
    printf("Main thread (tid=%d) pinned to CPU 0\n", get_tid());

    pthread_t t1, t2;
    int cpu1 = 1, cpu2 = 2;

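    /* pthread_create() returns an error number (not -1) on failure */
    int err;
    if ((err = pthread_create(&t1, NULL, worker, &cpu1)) != 0) {
        fprintf(stderr, "pthread_create t1 failed: error %d\n", err);
        return EXIT_FAILURE;
    }
    if ((err = pthread_create(&t2, NULL, worker, &cpu2)) != 0) {
        fprintf(stderr, "pthread_create t2 failed: error %d\n", err);
        return EXIT_FAILURE;
    }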

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    printf("All threads complete.\n");
    return EXIT_SUCCESS;
}
Key points in this code:

  • syscall(SYS_gettid) is used instead of getpid() because each pthread has a distinct kernel TID.
  • pin_thread_to_cpu() calls sched_setaffinity(get_tid(), ...) — targeting the calling thread’s TID, not the process PID.
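  • A new thread inherits the affinity mask of the thread that created it (fork() and exec() preserve affinity in the same way). Because main() pinned itself to CPU 0 before calling pthread_create(), each worker starts pinned to CPU 0 and then immediately overrides that mask.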

Affinity Inheritance and exec()

Affinity Propagation Rules

fork()
Child inherits parent’s CPU affinity mask. If parent was pinned to CPU 2, child starts pinned to CPU 2.

exec()
Affinity mask is preserved across exec(). The new program starts with the same CPU restriction that was set before the exec() call.

pthread_create()
New thread inherits the creating thread’s affinity mask. Override it inside the thread function using sched_setaffinity(gettid(), ...).

Use Cases & Kernel Options

Cache Performance

Pinning a hot process to one CPU avoids cache line invalidation on migration. Critical for memory-bandwidth-bound workloads or tight RT control loops.

Pipeline Throughput

Two processes communicating via pipe benefit from running on the same CPU. Shared L1 cache means the written data is immediately available to the reader without going to RAM.

RT Process Isolation

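Combine SCHED_FIFO with CPU pinning so that a real-time thread stays on one known CPU and preempts any non-RT task that runs there. Pinning alone does not keep other tasks off that CPU, so pair it with isolcpus or cpusets when the CPU must be dedicated. This combination is a common way to get deterministic latency.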

NUMA Optimization

On NUMA (Non-Uniform Memory Access) machines, pin processes to CPUs local to the memory they use. Cross-NUMA memory access has higher latency than local access.

isolcpus Boot Parameter

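For stronger isolation, use the isolcpus= kernel boot parameter. It removes the listed CPUs from the scheduler's general load balancing at boot time, so ordinary tasks are not placed on them. A task runs on an isolated CPU only if its affinity mask is explicitly set to include that CPU; per-CPU kernel threads and interrupts can still run there.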

# In /etc/default/grub, add to GRUB_CMDLINE_LINUX:
GRUB_CMDLINE_LINUX="... isolcpus=2,3"

# Then update grub and reboot:
sudo update-grub
sudo reboot

# After reboot, CPUs 2 and 3 are isolated.
# Only explicitly pinned processes run there.
# The general scheduler uses only CPUs 0 and 1.

# Verify:
cat /sys/devices/system/cpu/isolated   # shows: 2-3
cat /sys/devices/system/cpu/present    # shows: 0-3 (all still exist)
cpuset kernel option: For more flexible runtime CPU partitioning, the kernel cpuset feature (via cgroups) lets you assign groups of processes to CPU subsets without rebooting. Useful in containerized and cloud environments.

Interview Questions & Answers

Q1: What is the difference between soft and hard CPU affinity in Linux?
Soft affinity is the scheduler’s built-in tendency to keep a process on the same CPU for cache efficiency. It is advisory — the scheduler overrides it during load balancing. Hard affinity is enforced via sched_setaffinity(): the process is restricted to the specified CPU mask and the scheduler never migrates it outside that set, regardless of load.
Q2: Why can’t you manipulate the bits of cpu_set_t directly?
cpu_set_t is an opaque type: its internal representation is implementation-defined and may differ across C library versions or architectures. The provided macros (CPU_ZERO, CPU_SET, CPU_CLR, CPU_ISSET) are the stable interface. Direct bit manipulation would be non-portable and break on systems where the internal layout differs.
Q3: Why is gettid() needed for per-thread affinity instead of getpid()?
In Linux, CPU affinity is a per-thread attribute stored in the thread’s task_struct. All POSIX threads in a process share the same process ID (getpid()), so passing getpid() to sched_setaffinity() targets the main thread only. To target a specific thread, you must pass its kernel thread ID via gettid() (or syscall(SYS_gettid) on older glibc). Passing 0 targets the calling thread.
Q4: What privilege is required to set the CPU affinity of another process?
An unprivileged process may set the CPU affinity of a target process only if the caller's effective UID matches the target's real or effective UID. To set the affinity of any arbitrary process (e.g., a system daemon), the caller needs the CAP_SYS_NICE capability. This is the same capability required for raising scheduling priorities.
Q5: Is CPU affinity preserved across fork() and exec()?
Yes to both. A child created by fork() inherits the parent’s CPU affinity mask. When a process calls exec(), the new program image also inherits the affinity mask that was set before the exec. This means a shell script that pins itself before launching a child process will cause the child to start with the same CPU restriction.
Q6: What does the isolcpus boot parameter do, and how does it differ from sched_setaffinity()?
isolcpus=N on the kernel command line removes CPU N from the scheduler's general load balancing at boot time, so ordinary tasks are not placed there. A process must explicitly call sched_setaffinity() to run on an isolated CPU. This is stronger than calling sched_setaffinity() on the process alone: pinning restricts where one process may run but does not keep other processes off that CPU, whereas isolcpus keeps everything that is not explicitly pinned away from it. Per-CPU kernel threads can still run there, so it reduces interference rather than eliminating it, which suits deterministic RT workloads.
Q7: Why does pinning a producer and consumer to the same CPU improve pipe throughput?
When the producer writes data to the pipe, that data resides in the L1/L2 cache of its CPU. If the consumer runs on the same CPU, it reads that data directly from the warm cache with very low latency. If the consumer runs on a different CPU, the cache lines must be transferred via the shared L3 cache or even main memory (especially on NUMA), which is significantly slower. Context-switching between them on one CPU is often cheaper than cross-CPU cache coherence traffic, although the two processes can then no longer run in parallel, so the gain depends on the workload.

Chapter 35 Complete ✓

You have now covered all the key topics in TLPI Chapter 35 — from nice values and realtime scheduling policies through the full scheduling API, runaway protection, and CPU affinity.


EmbeddedPathashala · Free Linux & Embedded Systems Tutorials
