Linux Capabilities, chroot Jails, and Virtualization

 

Confine the Process
Chapter 38.5: Linux Capabilities, chroot Jails, and Virtualization

Defense in Depth: Confining What a Program Can Do

Even after applying least-privilege and other best practices, a program might still be exploited. Defense in depth means adding layers of containment so that even if an attacker does compromise the program, the damage they can do is limited. This section covers two major Linux confinement mechanisms: capabilities and chroot jails.

🔧 Linux Capabilities: Fine-Grained Privilege

Traditional UNIX privilege is binary: either you are root (UID=0) and can do everything, or you are unprivileged and restricted from sensitive operations. This “all-or-nothing” model is a problem: many daemons need just one root-level ability (like binding to port 80), but to get it they must run as full root, gaining every other root power along the way.

Linux capabilities solve this by splitting the all-or-nothing root privilege into about 40 independent units. A process can hold only the specific capabilities it needs, with all others disabled.

❌ Traditional Root
Run as root to bind port 80?
You also get:
✗ Full filesystem access
✗ Load kernel modules
✗ Override all file permissions
✗ Kill any process
✗ Change any user’s password
One exploit = full system compromise
✅ With Capabilities
Need to bind port 80?
Grant only:
✓ CAP_NET_BIND_SERVICE
✗ No filesystem override
✗ No module loading
✗ No process killing
One exploit = limited damage

Commonly Used Capabilities

Capability              What it allows
CAP_NET_BIND_SERVICE    Bind to privileged ports (below 1024)
CAP_NET_RAW             Use raw sockets (for ping, packet capture)
CAP_DAC_OVERRIDE        Bypass file read/write/execute permission checks
CAP_SETUID              Make arbitrary changes to process UIDs
CAP_SYS_ADMIN           Wide range of admin ops (mount, sethostname, etc.)
CAP_IPC_LOCK            Lock memory (mlock) without limits
CAP_KILL                Send signals to any process
CAP_SYS_CHROOT          Use chroot() system call

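A process has three main capability sets: permitted (the maximum it can have), effective (the ones currently active and checked by the kernel), and inheritable (capabilities that can be preserved across an exec). A fourth set, ambient, was added in Linux 4.3 so that programs that are not set-UID and have no file capabilities can still inherit capabilities across an exec.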

🏛️ The chroot Jail: Limiting File System Access

A chroot jail is a technique that changes the root directory (/) of a process to a specific subdirectory. From that point on, the process cannot access anything outside that subtree: it is “jailed” in its own mini filesystem.

This is useful for network daemons (FTP servers, DNS servers) that should only see their own data files and not the rest of the system. Even if the daemon is compromised, the attacker can’t access /etc/passwd or other system files because they’re outside the jail.

Filesystem View Comparison
Normal Process
/ (root)
├── etc/
│   └── passwd ← visible
├── home/
├── var/
└── srv/
    └── ftp/ ← jail root
Jailed Process (chroot /srv/ftp)
/ (jail root = /srv/ftp)
├── pub/
├── lib/ (needed libs)
└── bin/ (needed bins)
← /etc/passwd INVISIBLE
← /home/ INVISIBLE
← Entire real / INVISIBLE

Critical Rules for chroot Jails

⚠️ Rule 1: Always call chdir() after chroot()
After chroot(), call chdir("/") immediately. Without this, the process’s current working directory is still outside the jail and can be used to escape it!
⚠️ Rule 2: A chroot jail does NOT confine set-UID-root programs
A process with root privilege can escape a chroot jail using various techniques (e.g., creating a new root via mknod + mount, or using open FDs from before the chroot). chroot is useful for non-privileged daemons, not as a security layer for root programs.

💻 Code Example 1: Setting Up a chroot Jail for a Daemon
#define _GNU_SOURCE     /* exposes setresuid(), setresgid(), setgroups() in glibc */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <grp.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * setup_jail(): Establish a chroot jail for a daemon.
 *
 * Must be called while still root (before dropping privilege).
 * After this call, the process cannot access anything outside jail_path.
 *
 * jail_path: the directory to use as the new root
 * daemon_uid: UID to drop to after entering the jail
 * daemon_gid: GID to drop to after entering the jail
 */
void setup_jail(const char *jail_path, uid_t daemon_uid, gid_t daemon_gid)
{
    /* Step 1: Change root directory to jail (requires root privilege) */
    if (chroot(jail_path) == -1) {
        perror("chroot");
        exit(EXIT_FAILURE);
    }
    printf("[Jail] chroot to: %s\n", jail_path);

    /* Step 2: CRITICAL: chdir to new root to prevent jail escape */
    /* Without this, CWD is still outside the jail! */
    if (chdir("/") == -1) {
        perror("chdir /");
        exit(EXIT_FAILURE);
    }
    printf("[Jail] Changed CWD to jail root.\n");

    /* Step 3: Now drop privileges inside the jail, as low-privilege user. */
    /* Clear root's supplementary groups first (this still needs root), */
    /* then set the GID before the UID. */
    if (setgroups(0, NULL) == -1) {
        perror("setgroups");
        exit(EXIT_FAILURE);
    }
    if (setresgid(daemon_gid, daemon_gid, daemon_gid) == -1) {
        perror("setresgid");
        exit(EXIT_FAILURE);
    }
    if (setresuid(daemon_uid, daemon_uid, daemon_uid) == -1) {
        perror("setresuid");
        exit(EXIT_FAILURE);
    }

    /* Verify */
    printf("[Jail] Running as UID=%d GID=%d inside jail.\n",
           (int)getuid(), (int)getgid());
    printf("[Jail] Filesystem access confined to: %s\n", jail_path);
}

int main(void)
{
    /*
     * In a real daemon, jail_path would be the service directory,
     * populated with just the files the daemon needs.
     *
     * For this demo, using /tmp as the jail (always exists).
     * daemon_uid=65534 (nobody), daemon_gid=65534 (nogroup) are
     * common choices for unprivileged daemons on many systems.
     */
    const char *jail_path = "/tmp";
    uid_t daemon_uid = 65534;   /* nobody */
    gid_t daemon_gid = 65534;   /* nogroup */

    printf("Before jail: EUID=%d, CWD accessible\n", (int)geteuid());

    /* Only call if we are root */
    if (geteuid() == 0) {
        setup_jail(jail_path, daemon_uid, daemon_gid);
    } else {
        printf("Note: chroot requires root. EUID=%d, skipping jail setup.\n",
               (int)geteuid());
    }

    printf("Daemon running in confined environment.\n");
    return 0;
}

💻 Code Example 2: Reading Capabilities of the Current Process
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>             /* geteuid() */
#include <sys/capability.h>

/*
 * show_capabilities(): Print the current process's capability sets.
 *
 * Compile with: gcc -o show_caps show_caps.c -lcap
 *
 * Install libcap-dev: sudo apt-get install libcap-dev
 */
void show_capabilities(void)
{
    cap_t caps;
    char *text;

    /* Get all capabilities of the current process */
    caps = cap_get_proc();
    if (caps == NULL) {
        perror("cap_get_proc");
        return;
    }

    /* Convert to human-readable text */
    text = cap_to_text(caps, NULL);
    if (text == NULL) {
        perror("cap_to_text");
        cap_free(caps);
        return;
    }

    printf("Process capabilities: %s\n", text);

    cap_free(text);
    cap_free(caps);
}

/*
 * has_capability(): Check if a specific capability is in the effective set.
 * Returns 1 if it is set, 0 if it is not set or the lookup fails.
 */
int has_capability(cap_value_t cap)
{
    cap_t caps;
    cap_flag_value_t value;

    caps = cap_get_proc();
    if (caps == NULL) return 0;

    if (cap_get_flag(caps, cap, CAP_EFFECTIVE, &value) == -1) {
        cap_free(caps);
        return 0;
    }

    cap_free(caps);
    return (value == CAP_SET);
}

int main(void)
{
    printf("=== Capability Check ===\n");
    printf("EUID: %d\n", (int)geteuid());

    show_capabilities();

    /* Check specific capabilities */
    printf("Has CAP_NET_BIND_SERVICE: %s\n",
           has_capability(CAP_NET_BIND_SERVICE) ? "YES" : "NO");
    printf("Has CAP_DAC_OVERRIDE     : %s\n",
           has_capability(CAP_DAC_OVERRIDE) ? "YES" : "NO");
    printf("Has CAP_SYS_ADMIN        : %s\n",
           has_capability(CAP_SYS_ADMIN) ? "YES" : "NO");

    return 0;
}

📌 Key Terms

Linux Capabilities, CAP_NET_BIND_SERVICE, CAP_SYS_ADMIN, Permitted Set, Effective Set, Inheritable Set, chroot(), chdir(), Chroot Jail, Jail Escape, Virtual Server, Defense in Depth

🎯 Interview Questions: Process Confinement
Q1. What problem do Linux capabilities solve?
Traditional UNIX privilege is binary: root or not. A program that needs just one privileged operation (e.g., binding port 80) must run as full root, gaining all root powers. Capabilities split root privilege into ~40 independent units. A process can hold only what it needs, limiting damage if compromised.

Q2. What are the three capability sets a process has?
Permitted: the maximum set of capabilities the process can ever have. Effective: currently active capabilities used for permission checks. Inheritable: capabilities that can be inherited by exec’d programs.

Q3. What is a chroot jail and what are its limitations?
chroot() changes the apparent root directory to a subdirectory. The process can only see the filesystem subtree below that point. However, it does NOT confine root processes: a root process can escape a chroot jail through several techniques. It’s primarily useful for non-privileged daemons.

Q4. Why must chdir("/") be called immediately after chroot()?
chroot() changes where “/” resolves, but the process’s current working directory (CWD) is unchanged. If CWD is outside the jail, the process can use relative paths like "../../../../etc/passwd" to escape. chdir("/") inside the jail ensures the CWD is confined.

Q5. In what order should chroot() and privilege dropping be performed?
chroot() must be called first while still root (it requires root or CAP_SYS_CHROOT). Then drop privileges. Dropping privilege before chroot() means you won’t have permission to call chroot() at all.
