sched_yield(), RR Time Slice & Runaway Prevention

Chapter 35.3.3–35.3.4 — Voluntarily relinquishing CPU, querying the RR time slice, and protecting against runaway realtime processes

sched_yield(): Voluntarily Give Up the CPU

A realtime process can voluntarily relinquish the CPU in two ways:

1. Blocking system call

Call something like read(), sleep(), or any blocking I/O. The process is taken off the CPU and added back to its priority queue when it unblocks.

2. sched_yield()

Moves the calling process to the back of the queue for its priority level without blocking. Useful when you want to cooperatively share the CPU with other processes at the same priority level.


#include <sched.h>

/* Voluntarily relinquish the CPU.
 *
 * If there are other runnable processes at the same priority level:
 *   - This process is moved to the BACK of the queue.
 *   - The process at the HEAD of the queue gets to run.
 *
 * If NO other runnable processes exist at this priority:
 *   - Nothing happens; this process simply continues running.
 *
 * Returns 0 on success (always succeeds on Linux), -1 on error.
 * SUSv3 permits failure; portable code should check for -1.
 * sched_yield() is undefined for non-realtime (SCHED_OTHER) processes.
 */
int sched_yield(void);
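
Because SUSv3 permits sched_yield() to fail, a portable caller might wrap the call as in the sketch below. The helper name yield_checked() is hypothetical and only illustrates the return-value check described in the comment above.

```c
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper (not part of any API): yield the CPU and report
 * a failure instead of silently ignoring it. On Linux the call is
 * documented to always succeed, so the error branch matters mainly
 * for code that must also run on other systems. */
int yield_checked(void)
{
    if (sched_yield() == -1) {
        perror("sched_yield");
        return -1;
    }
    return 0;
}
```

A caller would use `if (yield_checked() == -1) { /* fall back or abort */ }` in place of a bare sched_yield().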

Behavior Visualization

Priority queue before sched_yield() (all processes at priority 50):

P1 (HEAD, running) → …

After P1 calls sched_yield():

P2 (HEAD, now running) → … → P1 (moved to back)

⚠️ Non-realtime processes: POSIX leaves the behavior of sched_yield() for SCHED_OTHER processes undefined. On Linux the call succeeds, but its effect under the time-sharing scheduler depends on the kernel's scheduler implementation and is not guaranteed. Rely on sched_yield() only in SCHED_FIFO or SCHED_RR processes.
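
One way to follow this advice is to check the caller's policy before yielding. The sketch below is a minimal illustration, assuming a Linux/glibc system; yield_if_realtime() is a hypothetical name. It masks out SCHED_RESET_ON_FORK because Linux can report that flag in the value returned by sched_getscheduler().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes SCHED_RESET_ON_FORK on glibc */
#endif
#include <sched.h>

/* Hypothetical helper: yield only when the caller runs under a
 * realtime policy. Returns 1 if it yielded, 0 if the call was
 * skipped, -1 on error. */
int yield_if_realtime(void)
{
    int policy = sched_getscheduler(0);   /* 0 = calling process */
    if (policy == -1)
        return -1;
#ifdef SCHED_RESET_ON_FORK
    policy &= ~SCHED_RESET_ON_FORK;       /* strip the flag bit, keep the policy */
#endif
    if (policy != SCHED_FIFO && policy != SCHED_RR)
        return 0;                         /* SCHED_OTHER etc.: do not rely on yield */
    return sched_yield() == -1 ? -1 : 1;
}
```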

sched_rr_get_interval(): Query the SCHED_RR Time Slice

SCHED_RR processes use a fixed-length time slice. You can query its length using sched_rr_get_interval(). This is useful when you need to know exactly how much CPU time you’ll get per turn in a multi-process system.


#include <sched.h>

/* Returns the time slice for SCHED_RR process 'pid'.
 * If pid = 0, returns the time slice of the calling process.
 * Result is stored in 'tp' as a timespec structure.
 * Returns 0 on success, -1 on error.
 */
int sched_rr_get_interval(pid_t pid, struct timespec *tp);

/* The timespec structure */
struct timespec {
    time_t tv_sec;   /* Seconds */
    long   tv_nsec;  /* Nanoseconds (0 to 999,999,999) */
};

/* Example usage */
struct timespec ts;
if (sched_rr_get_interval(0, &ts) == -1) {
    perror("sched_rr_get_interval");
} else {
    printf("SCHED_RR time slice: %ld.%09ld seconds\n",
           (long)ts.tv_sec, ts.tv_nsec);
    /* On recent 2.6 kernels: typically 0.100000000 seconds (100ms) */
}

📌 Typical value: On recent Linux 2.6+ kernels, the SCHED_RR time slice is 100 milliseconds (0.1 seconds). This value can vary between kernel versions and can be configured. Never hard-code it — always query at runtime.
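
Code that compares the time slice against its own deadlines usually wants a single number rather than a timespec. The following sketch shows one possible conversion to milliseconds; the helper names timespec_to_ms() and rr_interval_ms() are assumptions made for this illustration.

```c
#include <sched.h>
#include <sys/types.h>
#include <time.h>

/* Convert a timespec to milliseconds as a double. */
double timespec_to_ms(const struct timespec *ts)
{
    return ts->tv_sec * 1000.0 + ts->tv_nsec / 1e6;
}

/* Hypothetical helper: query the time slice of 'pid' (0 = caller) at
 * runtime and return it in milliseconds, or a negative value if
 * sched_rr_get_interval() fails. */
double rr_interval_ms(pid_t pid)
{
    struct timespec ts;
    if (sched_rr_get_interval(pid, &ts) == -1)
        return -1.0;
    return timespec_to_ms(&ts);
}
```

Querying through a helper like this keeps the 100 ms figure out of the source code, in line with the advice above.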

⚠️ Preventing Runaway Realtime Processes

A runaway realtime process is one that accidentally enters an infinite loop or takes far longer than expected. Since SCHED_FIFO/SCHED_RR processes preempt all lower-priority processes, a runaway can lock up your entire system — you may not even be able to log in to fix it.

Linux provides multiple safeguards. Use at least one of these in any realtime application:

RLIMIT_CPU

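CPU time resource limit. Set a CPU time limit (in seconds) using setrlimit(). When the process's CPU time exceeds the soft limit, it receives SIGXCPU, whose default action terminates the process. If the process catches the signal and keeps running until it reaches the hard limit, the kernel sends SIGKILL. Together these cap total CPU consumption.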

struct rlimit rl;
rl.rlim_cur = 3;   /* Soft limit: SIGXCPU after 3 seconds of CPU time */
rl.rlim_max = 5;   /* Hard limit: SIGKILL after 5 seconds of CPU time */
if (setrlimit(RLIMIT_CPU, &rl) == -1)
    perror("setrlimit(RLIMIT_CPU)");

alarm()

Wall-clock timer. Call alarm(N) to set an alarm for N seconds of wall-clock (real) time. If the process is still running after N seconds, it receives SIGALRM, which kills it by default. Good when you want a wall-clock deadline.

alarm(10);  /* Kill this process after 10 seconds of real time */
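
A deadline set with alarm() stays armed until it fires or is cancelled, so a program that finishes its realtime section early may want to disarm it. The wrapper below is a minimal sketch of that idea; run_with_deadline() and short_burst() are hypothetical names used only here.

```c
#include <unistd.h>

/* Stand-in for the realtime work; purely illustrative. */
static void short_burst(void)
{
    volatile long x = 0;
    for (long i = 0; i < 1000000L; i++)
        x++;
}

/* Run work() under a wall-clock deadline of 'seconds'. If work()
 * overruns, SIGALRM terminates the process (default action).
 * Otherwise the alarm is cancelled and the number of seconds that
 * were still left is returned. */
unsigned run_with_deadline(unsigned seconds, void (*work)(void))
{
    alarm(seconds);
    work();
    return alarm(0);            /* alarm(0) cancels any pending alarm */
}
```

For example, `run_with_deadline(10, short_burst)` gives the burst ten seconds of real time and leaves no alarm pending afterwards.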

Watchdog

Watchdog process. A separate process, running at a higher realtime priority than the processes it monitors, periodically checks their status. If a process is misbehaving (for example, consuming too much CPU or running at an unexpected priority), the watchdog can lower its priority, stop it with SIGSTOP, or kill it with SIGKILL. This is the most flexible approach.
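
The corrective actions a watchdog takes can be sketched with ordinary scheduling and signal calls. The example below covers only the reaction, not the detection: how the watchdog decides that a process is misbehaving is application-specific and left out. The function names demote_to_normal() and rein_in() are hypothetical.

```c
#include <sched.h>
#include <signal.h>
#include <sys/types.h>

/* Move 'pid' (0 = calling process) back to the default SCHED_OTHER
 * policy, which takes it out of the realtime priority queues. */
int demote_to_normal(pid_t pid)
{
    struct sched_param sp = { .sched_priority = 0 };
    return sched_setscheduler(pid, SCHED_OTHER, &sp);
}

/* Escalating reaction: try to demote first, and fall back to SIGKILL
 * only if the demotion fails. Returns 0 on success, -1 on error. */
int rein_in(pid_t pid)
{
    if (demote_to_normal(pid) == 0)
        return 0;
    return kill(pid, SIGKILL);
}
```

The watchdog needs sufficient privilege to change another process's policy, and it must itself block between checks so that it does not become the runaway process.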

RLIMIT_RTTIME

Realtime CPU burst limit (Linux 2.6.25+). Limits the amount of CPU time a realtime process may consume in a single burst without performing a blocking system call. Specified in microseconds. The count resets to 0 when the process makes a blocking call. When the soft limit is reached, SIGXCPU is sent; if the process catches or ignores the signal and reaches the hard limit, it is sent SIGKILL.

struct rlimit rl;
rl.rlim_cur = 500000;  /* 500ms burst limit */
rl.rlim_max = RLIM_INFINITY;
setrlimit(RLIMIT_RTTIME, &rl);

📌 Note on RLIMIT_RTTIME: The count is NOT reset if the process is preempted by a higher-priority process, if a SCHED_RR time slice expires, or if the process calls sched_yield(). It only resets on a blocking system call. This ensures the limit applies to the actual continuous CPU burst.
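
Since only a blocking call restarts the count, a long-running realtime loop has to block now and then to stay under the limit. The sketch below pairs a limit setter with a short nanosleep() "checkpoint"; both function names are hypothetical, and the 1 ms sleep length is an arbitrary choice for illustration.

```c
#include <sys/resource.h>
#include <time.h>

/* Lower the soft RLIMIT_RTTIME to 'usec' microseconds, keeping the
 * hard limit unchanged. Returns -1 where the limit is unavailable. */
int set_rt_burst_limit(rlim_t usec)
{
#ifdef RLIMIT_RTTIME
    struct rlimit rl;
    if (getrlimit(RLIMIT_RTTIME, &rl) == -1)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && usec > rl.rlim_max)
        usec = rl.rlim_max;     /* soft limit may not exceed the hard limit */
    rl.rlim_cur = usec;
    return setrlimit(RLIMIT_RTTIME, &rl);
#else
    (void)usec;
    return -1;                  /* headers predate RLIMIT_RTTIME */
#endif
}

/* Block briefly. Unlike sched_yield(), a sleep takes the process off
 * the CPU, so the burst count starts again from 0 afterwards. */
int rt_checkpoint(void)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };  /* 1 ms */
    return nanosleep(&ts, NULL);
}
```

A work loop would call rt_checkpoint() between bursts, with each burst kept comfortably shorter than the configured limit.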

💻 Code Example 1: sched_yield() with SCHED_FIFO Cooperation


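/* fifo_cooperative.c
 * Two SCHED_FIFO processes (parent + child) at the same priority.
 * Without sched_yield(), whichever runs first keeps the CPU until it
 * blocks or exits. With sched_yield(), they take turns.
 *
 * The turn-taking is only visible when both processes share one CPU;
 * on a multiprocessor, pin them to a single CPU (e.g. with taskset).
 *
 * Compile: gcc fifo_cooperative.c -o fifo_cooperative
 * Run:     sudo taskset -c 0 ./fifo_cooperative
 */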
#include <stdio.h>
#include <stdlib.h>
#include <sched.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/resource.h>
#include <signal.h>

#define NUM_ROUNDS  5
#define RT_PRIORITY 10

void set_realtime(int priority)
{
    struct sched_param sp;
    sp.sched_priority = priority;
    if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1) {
        perror("sched_setscheduler");
        exit(EXIT_FAILURE);
    }
}

void restore_normal(void)
{
    struct sched_param sp = { .sched_priority = 0 };
    sched_setscheduler(0, SCHED_OTHER, &sp);
}

int main(void)
{
    pid_t child;

    /* Safety: if we run for more than 10 seconds total, die */
    alarm(10);

    /* Safety: limit CPU time to 5 seconds (soft) / 10 seconds (hard) */
    struct rlimit rl = { .rlim_cur = 5, .rlim_max = 10 };
    if (setrlimit(RLIMIT_CPU, &rl) == -1)
        perror("setrlimit(RLIMIT_CPU)");

    child = fork();
    if (child == -1) { perror("fork"); exit(EXIT_FAILURE); }

    if (child == 0) {
        /* CHILD: set realtime after fork */
        set_realtime(RT_PRIORITY);

        for (int i = 0; i < NUM_ROUNDS; i++) {
            printf("[Child  PID=%d] Round %d\n", (int)getpid(), i + 1);
            fflush(stdout);

            /*
             * This is a SCHED_FIFO process with no time slice.
             * Without sched_yield(), we'd NEVER give the parent a turn!
             * sched_yield() moves us to the BACK of the priority queue.
             */
            sched_yield();
        }

        restore_normal();
        printf("[Child] Done.\n");
        exit(EXIT_SUCCESS);
    } else {
        /* PARENT: set realtime after fork */
        set_realtime(RT_PRIORITY);

        for (int i = 0; i < NUM_ROUNDS; i++) {
            printf("[Parent PID=%d] Round %d\n", (int)getpid(), i + 1);
            fflush(stdout);
            sched_yield();   /* Give child a turn */
        }

        restore_normal();
        printf("[Parent] Done.\n");
        wait(NULL);
    }

    return EXIT_SUCCESS;
}

Expected output when both processes share one CPU (interleaved turns; the exact ordering can vary):
[Parent PID=…] Round 1
[Child  PID=…] Round 1
[Parent PID=…] Round 2
[Child  PID=…] Round 2 …

💻 Code Example 2: Query RR Time Slice and Set RLIMIT_RTTIME


/* rr_timeslice.c
 * 1. Queries the SCHED_RR time slice.
 * 2. Sets RLIMIT_RTTIME as a safety net.
 * 3. Demonstrates switching to SCHED_RR with protection.
 *
 * Compile: gcc rr_timeslice.c -o rr_timeslice
 * Run:     sudo ./rr_timeslice
 */
#include <stdio.h>
#include <stdlib.h>
#include <sched.h>
#include <sys/resource.h>
#include <unistd.h>
#include <time.h>

void print_timeslice(pid_t pid)
{
    struct timespec ts;
    if (sched_rr_get_interval(pid, &ts) == -1) {
        perror("sched_rr_get_interval");
        return;
    }
    printf("SCHED_RR time slice for PID %d: %ld sec + %ld ns (%.3f ms)\n",
           (int)pid,
           (long)ts.tv_sec,
           ts.tv_nsec,
           ts.tv_nsec / 1000000.0 + ts.tv_sec * 1000.0);
}

int main(void)
{
    struct sched_param sp;
    struct rlimit rl;

    /* === STEP 1: Check time slice BEFORE switching policy === */
    printf("=== Before switching to SCHED_RR ===\n");
    /* Note: sched_rr_get_interval() is meant for SCHED_RR processes.
     * The value reported for a process under another policy is
     * kernel-dependent and should not be relied on. */
    print_timeslice(0);

    /* === STEP 2: Set RLIMIT_RTTIME safety net BEFORE going realtime ===
     * Limit: max 200ms CPU burst without a blocking syscall.
     * If exceeded → SIGXCPU → process dies.
     */
    rl.rlim_cur = 200000;    /* 200 milliseconds in microseconds */
    rl.rlim_max = 500000;    /* 500 milliseconds hard limit */
    if (setrlimit(RLIMIT_RTTIME, &rl) == -1) {
        /* RLIMIT_RTTIME added in 2.6.25; may not be available */
        perror("setrlimit(RLIMIT_RTTIME) [may not be supported]");
    } else {
        printf("RLIMIT_RTTIME set: soft=%llu us, hard=%llu us\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);
    }

    /* === STEP 3: Switch to SCHED_RR === */
    sp.sched_priority = sched_get_priority_min(SCHED_RR) + 2;
    if (sched_setscheduler(0, SCHED_RR, &sp) == -1) {
        perror("sched_setscheduler(SCHED_RR) — need root");
        exit(EXIT_FAILURE);
    }
    printf("\n=== After switching to SCHED_RR (priority=%d) ===\n",
           sp.sched_priority);

    /* === STEP 4: Now query time slice for THIS RR process === */
    print_timeslice(0);

    /* === STEP 5: Do some work, then yield === */
    printf("\nDoing realtime work in short bursts with sched_yield()...\n");
    for (int i = 0; i < 3; i++) {
        /* Small burst of computation */
        volatile long x = 0;
        for (long j = 0; j < 1000000L; j++) x++;
        printf("Burst %d complete (x=%ld). Yielding CPU.\n", i+1, x);
        sched_yield();  /* Move to back of RR queue */
    }

    /* Restore normal scheduling */
    sp.sched_priority = 0;
    sched_setscheduler(0, SCHED_OTHER, &sp);
    printf("\nRestored to SCHED_OTHER. Done.\n");

    return EXIT_SUCCESS;
}

🎯 Interview Questions

Q1. What does sched_yield() do if no other process is at the same priority level?

If there are no other runnable processes at the same priority level, sched_yield() does nothing — the calling process simply continues running. It does not voluntarily drop its priority or give CPU to lower-priority processes.

Q2. What is the typical SCHED_RR time slice on Linux?

On recent Linux 2.6+ kernels, the SCHED_RR time slice is typically 100 milliseconds (0.1 seconds). However, this can vary between kernel versions and configurations. Always use sched_rr_get_interval() to query the actual value at runtime rather than assuming 100ms.

Q3. What is the difference between RLIMIT_CPU and RLIMIT_RTTIME for protecting against runaway realtime processes?

RLIMIT_CPU limits the total accumulated CPU time a process can consume (measured in seconds). Once exceeded, SIGXCPU is sent. It applies to all scheduling policies.

RLIMIT_RTTIME (Linux 2.6.25+) limits the CPU time consumed in a single continuous burst without a blocking system call, measured in microseconds. The count resets to 0 whenever the process makes a blocking syscall. It is specifically designed for realtime processes, to prevent them from looping indefinitely without ever blocking.

Q4. Does calling sched_yield() from a SCHED_FIFO process reset the RLIMIT_RTTIME counter?

No. The RLIMIT_RTTIME counter is not reset by sched_yield(). It is also not reset by being preempted by a higher-priority process, or by a SCHED_RR time slice expiring. The counter only resets when the process performs a blocking system call. This is by design: sched_yield() is a system call, but it leaves the caller runnable, so it does not count as “blocking.”

Q5. What is the recommended safe pattern for writing a SCHED_FIFO realtime application?

A safe pattern includes: (1) Set RLIMIT_CPU before switching to realtime to cap total CPU. (2) Optionally set RLIMIT_RTTIME (Linux 2.6.25+) to limit burst CPU usage. (3) Use alarm() for a wall-clock deadline. (4) Call sched_yield() or a blocking syscall periodically so other processes get a chance to run. (5) Consider using SCHED_RESET_ON_FORK so child processes don’t inherit realtime policy.
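
Point (5) can be illustrated with a short sketch. It assumes a Linux kernel that supports SCHED_RESET_ON_FORK and a glibc that exposes the flag under _GNU_SOURCE; the fallback definition and the function name enter_fifo_no_inherit() are assumptions made for this example. The call needs the same privileges as any switch to SCHED_FIFO.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* exposes SCHED_RESET_ON_FORK on glibc */
#endif
#include <sched.h>

#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000  /* Linux value, in case the headers hide it */
#endif

/* Hypothetical helper: switch the caller to SCHED_FIFO at 'priority'
 * and ask the kernel to reset the policy in children created by
 * fork(), so they start under the default policy instead of
 * inheriting SCHED_FIFO. Returns 0 on success, -1 on error (for
 * example EPERM without sufficient privilege). */
int enter_fifo_no_inherit(int priority)
{
    struct sched_param sp = { .sched_priority = priority };
    return sched_setscheduler(0, SCHED_FIFO | SCHED_RESET_ON_FORK, &sp);
}
```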

Q6. Why is sched_yield() undefined for SCHED_OTHER processes?

POSIX leaves the behavior of sched_yield() for non-realtime processes undefined because the default time-sharing scheduler decides for itself when to preempt a process and does not rely on cooperative yielding. The SCHED_OTHER scheduler need not maintain a simple FIFO queue per priority, so “moving to the back” has no well-defined meaning. On Linux the call succeeds, but portable code should not rely on its effect for SCHED_OTHER processes.
