Memory Mappings Advanced Topics & Quick Reference

 

Chapter 49: Memory Mappings
Part 5 of 5: Advanced Topics & Quick Reference
Topic: Advanced mmap
Level: Advanced
Source: TLPI Ch49
Target: TI / ST / Qualcomm

Advanced mmap Topics

This final part of the Chapter 49 series covers the more advanced operations you can perform on memory mappings: changing protection after creation (mprotect()), locking pages in RAM (mlock()), controlling placement with MAP_FIXED, using madvise() for performance hints, inspecting mappings via /proc/PID/maps, and a bridge to POSIX shared memory (Chapter 54).

Key Terms in This Part:

mprotect(), mlock(), munlock(), mlockall(), madvise(), MAP_FIXED, MAP_FIXED_NOREPLACE, /proc/PID/maps, POSIX shm_open, Guard Pages, Huge Pages

mprotect(): Changing Protection After Mapping

mprotect() changes the memory protection of an existing mapping (or any pages in the virtual address space). You can tighten or loosen the PROT flags at any time without remapping.

#include <sys/mman.h>

int mprotect(void *addr, size_t length, int prot);
/* Returns 0 on success, -1 on error */
/* addr must be page-aligned */

Common patterns:

/*
 * mprotect_demo.c - Change protection of a mapping at runtime
 * Compile: gcc -o mprotect_demo mprotect_demo.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <setjmp.h>
#include <unistd.h>
#include <sys/mman.h>

static sigjmp_buf jmpbuf;

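void segv_handler(int sig)
{
    /* write() is async-signal-safe; printf() is not */
    static const char msg[] = "Caught SIGSEGV: write blocked by mprotect!\n";
    (void)sig;
    write(STDERR_FILENO, msg, sizeof(msg) - 1);
    siglongjmp(jmpbuf, 1);
}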

int main(void)
{
    char  *buf;
    size_t size = 4096;

    /* Allocate anonymous page, start with read+write */
    buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); exit(1); }

    /* Write works fine */
    strcpy(buf, "Hello mprotect");
    printf("Written: %s\n", buf);

    /* Make the page read-only */
    if (mprotect(buf, size, PROT_READ) == -1) {
        perror("mprotect");
        exit(1);
    }
    printf("Protection changed to PROT_READ only\n");

    /* Read still works */
    printf("Read: %s\n", buf);

    /* Attempt write: raises SIGSEGV, and the handler jumps back here */
    signal(SIGSEGV, segv_handler);
    if (sigsetjmp(jmpbuf, 1) == 0) {
        buf[0] = 'X';   /* This will trap */
    }

    /* Restore write permission */
    if (mprotect(buf, size, PROT_READ | PROT_WRITE) == -1) {
        perror("mprotect");
        exit(1);
    }
    buf[0] = 'Y';   /* Works again */
    printf("After restore write: %c%s\n", buf[0], buf + 1);

    munmap(buf, size);
    return 0;
}
Use case: JIT compiler pattern. Allocate the page as PROT_READ|PROT_WRITE, write the compiled machine code into it, then switch to PROT_READ|PROT_EXEC before executing. Avoid keeping a page PROT_WRITE|PROT_EXEC at the same time: this is the W^X policy, which security-hardened systems may enforce.
/* JIT pattern: Write then Execute */
void *code_page = mmap(NULL, 4096,
                       PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

/* ... emit machine code bytes into code_page ... */

/* Switch to executable and drop write permission (W^X) */
mprotect(code_page, 4096, PROT_READ | PROT_EXEC);

/* Cast to function pointer and call */
typedef int (*fn_t)(void);
fn_t fn = (fn_t)code_page;
int result = fn();

Guard Pages: Detecting Buffer Overflows with mprotect()

A guard page is a page with PROT_NONE placed adjacent to a buffer. Any overflow into the guard page causes an immediate SIGSEGV, which is far better than silent memory corruption.

Layout: [Guard page, PROT_NONE, 4 KB] [Buffer, PROT_READ|PROT_WRITE, N × 4 KB] [Guard page, PROT_NONE, 4 KB]
An overflow past either end of the buffer lands in a guard page and raises SIGSEGV.
/*
 * guard_page.c - Buffer sandwiched between two guard pages
 * Compile: gcc -o guard_page guard_page.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#define BUFFER_PAGES  2    /* 2 pages = 8 KB buffer */

int main(void)
{
    long   page_size;
    char  *region;
    char  *buffer;
    size_t total;

    page_size = sysconf(_SC_PAGESIZE);
    total     = (BUFFER_PAGES + 2) * page_size;  /* guard + buf + guard */

    /* Allocate region with no access initially */
    region = mmap(NULL, total, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) { perror("mmap"); exit(1); }

    /* Make only the buffer pages accessible */
    buffer = region + page_size;     /* Skip first guard page */
    if (mprotect(buffer, BUFFER_PAGES * page_size,
                 PROT_READ | PROT_WRITE) == -1) {
        perror("mprotect"); exit(1);
    }

    printf("Guard page at: %p (PROT_NONE)\n", (void*)region);
    printf("Buffer  at:    %p (PROT_READ|WRITE, %ld bytes)\n",
           (void*)buffer, (long)(BUFFER_PAGES * page_size));
    printf("Guard page at: %p (PROT_NONE)\n",
           (void*)(buffer + BUFFER_PAGES * page_size));

    /* Safe access within buffer */
    memset(buffer, 0xCC, BUFFER_PAGES * page_size);
    printf("Buffer filled safely\n");

    /* Overflow into guard page would cause SIGSEGV here */
    /* buffer[BUFFER_PAGES * page_size] = 0;   <-- would crash */

    munmap(region, total);
    return 0;
}

mlock(): Pinning Pages in RAM (Preventing Swap)

mlock() prevents the pages of a mapping from being swapped out to disk. This is critical for real-time processes (to avoid unpredictable latency from page faults) and for security (to prevent sensitive data from appearing in swap space).

#include <sys/mman.h>

int mlock(const void *addr, size_t length);    /* Lock specific range */
int munlock(const void *addr, size_t length);  /* Unlock specific range */
int mlockall(int flags);   /* Lock entire process address space */
int munlockall(void);      /* Unlock entire address space */
mlockall() flag          Meaning
MCL_CURRENT              Lock all pages currently mapped in the process
MCL_FUTURE               Lock all pages that will be mapped in the future
MCL_CURRENT|MCL_FUTURE   Lock both current and future mappings
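
As an illustration of the combined flags, a latency-sensitive program might lock its whole address space once at start-up. This is only a sketch: lock_address_space() is a hypothetical helper name, and the call can legitimately fail when the process is over its locked-memory limit, so the caller decides whether that is fatal.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/mman.h>

/* Hypothetical start-up helper for a latency-sensitive program.
 * Returns 0 if current and future mappings are now locked, -1 otherwise. */
static int lock_address_space(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
        fprintf(stderr, "mlockall: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

A matching munlockall() call undoes the lock if the program later leaves its time-critical phase.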
/*
 * mlock_demo.c - Lock sensitive data in RAM
 * Compile: gcc -o mlock_demo mlock_demo.c
 * If mlock() fails, raise RLIMIT_MEMLOCK (ulimit -l) or grant the
 * capability: sudo setcap cap_ipc_lock+ep mlock_demo
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#define KEY_SIZE  32   /* 256-bit key */

int main(void)
{
    char *secret;
    long  page_size = sysconf(_SC_PAGESIZE);

    /* Allocate one page for sensitive data */
    secret = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (secret == MAP_FAILED) { perror("mmap"); exit(1); }

    /* Lock this page in RAM so it is never swapped to disk */
    if (mlock(secret, page_size) == -1) {
        perror("mlock (check RLIMIT_MEMLOCK or CAP_IPC_LOCK)");
        /* Non-fatal: continue without lock in demo */
    } else {
        printf("Sensitive page locked in RAM\n");
    }

    /* Store the "secret" */
    memcpy(secret, "SUPER_SECRET_KEY_12345678901234", KEY_SIZE);
    printf("Secret stored (won't appear in swap)\n");

    /* When done, zero the memory before freeing */
    memset(secret, 0, page_size);   /* Scrub the data */
    munlock(secret, page_size);
    munmap(secret, page_size);
    printf("Secret zeroed and memory released\n");

    return 0;
}
Privilege and limits: since Linux 2.6.9, an unprivileged process may lock memory up to its RLIMIT_MEMLOCK soft limit. The CAP_IPC_LOCK capability (or root) is needed only to lock more than that limit allows.
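
To see how much an unprivileged process may lock, the limit can be read with getrlimit(). The sketch below is illustrative only: fits_memlock_limit() is a hypothetical helper name, and it compares against the soft limit without accounting for memory the process has already locked.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: returns 1 if `bytes` fits within the soft
 * RLIMIT_MEMLOCK limit, 0 if it does not, -1 if the limit cannot be read.
 * Memory that is already locked is not taken into account. */
static int fits_memlock_limit(size_t bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) == -1)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return bytes <= rl.rlim_cur;
}
```

The same value is shown by the shell command ulimit -l (in kilobytes).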

madvise(): Performance Hints to the Kernel

madvise() lets the application tell the kernel about its expected access pattern for a mapped region. The kernel is free to ignore the hints, but usually uses them to optimise page-in/page-out behaviour.

#include <sys/mman.h>

int madvise(void *addr, size_t length, int advice);
Advice flag       Meaning and effect
MADV_NORMAL       Default behaviour (moderate read-ahead)
MADV_SEQUENTIAL   Pages will be accessed sequentially: the kernel reads ahead aggressively and may free pages soon after use
MADV_RANDOM       Pages will be accessed in random order: read-ahead is reduced to avoid wasted I/O
MADV_WILLNEED     Pages will be needed soon: the kernel starts reading them ahead so that later accesses are less likely to block
MADV_DONTNEED     Pages are not needed for now: the kernel may discard them, and the next access page-faults (reloaded from the file, or zero-filled for private anonymous memory)
MADV_FREE         Pages can be freed when memory is needed (Linux 4.5+, private anonymous memory only). Like MADV_DONTNEED but lazy: contents are preserved until the kernel reclaims the pages
MADV_HUGEPAGE     Enable transparent huge pages for this region (reduces TLB pressure)
MADV_NOHUGEPAGE   Do not use transparent huge pages for this region
/*
 * madvise_demo.c - Usage examples of madvise
 * Compile: gcc -o madvise_demo madvise_demo.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    int    fd;
    char  *map;
    struct stat sb;

    if (argc != 2) { fprintf(stderr, "Usage: %s file\n", argv[0]); exit(1); }
    fd = open(argv[1], O_RDONLY);
    if (fd == -1) { perror("open"); exit(1); }
    if (fstat(fd, &sb) == -1) { perror("fstat"); exit(1); }

    map = mmap(NULL, sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); exit(1); }
    close(fd);

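    /* madvise() needs a page-aligned start address, so round the split
       points down to page boundaries */
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len  = (size_t)sb.st_size;
    size_t half = (len / 2) & ~(page - 1);
    size_t q3   = (len / 4 * 3) & ~(page - 1);

    /* CASE 1: We will scan the file sequentially */
    if (madvise(map, len, MADV_SEQUENTIAL) == -1) perror("madvise SEQUENTIAL");
    /* Now do sequential read... */

    /* CASE 2: After processing a region, hint we're done with it */
    if (madvise(map, half, MADV_DONTNEED) == -1) perror("madvise DONTNEED");
    /* Kernel may reclaim those pages, which saves RAM */

    /* CASE 3: We're about to do random lookups in the second half */
    if (madvise(map + half, len - half, MADV_RANDOM) == -1) perror("madvise RANDOM");

    /* CASE 4: Pre-load the last quarter before we need it */
    if (madvise(map + q3, len - q3, MADV_WILLNEED) == -1) perror("madvise WILLNEED");

    printf("madvise hints issued\n");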
    munmap(map, sb.st_size);
    return 0;
}

/proc/PID/maps: Inspecting All Process Mappings

The Linux-specific /proc/PID/maps file lists every VMA (Virtual Memory Area) of a process. Each line has the format:

start-end  perms  offset  dev  inode  pathname

Example:
7f9a3c000000-7f9a3c021000 r--p 00000000 08:01 12345  /usr/lib/libc.so.6
7f9a3c021000-7f9a3c176000 r-xp 00021000 08:01 12345  /usr/lib/libc.so.6
7f9a3c176000-7f9a3c1c4000 r--p 00176000 08:01 12345  /usr/lib/libc.so.6
7f9a3c1c4000-7f9a3c1c8000 rw-p 001c4000 08:01 12345  /usr/lib/libc.so.6
7ffd8b200000-7ffd8b221000 rw-p 00000000 00:00 0      [stack]
7ffd8b300000-7ffd8b304000 r--p 00000000 00:00 0      [vvar]
7ffd8b304000-7ffd8b306000 r-xp 00000000 00:00 0      [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0   [vsyscall]

Permission field breakdown:

Position  Char   Meaning
1st       r / -  Readable
2nd       w / -  Writable
3rd       x / -  Executable
4th       p / s  p = private (MAP_PRIVATE), s = shared (MAP_SHARED)
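
The four-character field maps directly onto the PROT_* constants used with mmap() and mprotect(). As a small illustration, the sketch below decodes it; perms_to_prot() is a hypothetical helper name, not part of any library.

```c
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: convert the perms field of a /proc/PID/maps line
 * (for example "r-xp") into PROT_* bits. *shared is set to 1 for 's' and
 * 0 for 'p'. Returns -1 if the field is malformed. */
static int perms_to_prot(const char *perms, int *shared)
{
    int prot = PROT_NONE;

    if (perms == NULL || strlen(perms) < 4)
        return -1;

    if (perms[0] == 'r')      prot |= PROT_READ;
    else if (perms[0] != '-') return -1;

    if (perms[1] == 'w')      prot |= PROT_WRITE;
    else if (perms[1] != '-') return -1;

    if (perms[2] == 'x')      prot |= PROT_EXEC;
    else if (perms[2] != '-') return -1;

    if (perms[3] == 's')      *shared = 1;
    else if (perms[3] == 'p') *shared = 0;
    else                      return -1;

    return prot;
}
```

For example, "r-xp" decodes to PROT_READ|PROT_EXEC with *shared set to 0, which is the typical entry for the text segment of a shared library.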
/*
 * parse_maps.c - Parse /proc/self/maps and print a summary
 * Compile: gcc -o parse_maps parse_maps.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    FILE  *fp;
    char   line[512];
    char   perms[8], pathname[256];
    unsigned long start, end, offset;
    unsigned int  dev_major, dev_minor;
    unsigned long inode;

    fp = fopen("/proc/self/maps", "r");
    if (!fp) { perror("fopen"); exit(1); }

    printf("%-20s %-6s %-12s %s\n",
           "Address Range", "Perms", "Size", "Mapping");
    printf("%-20s %-6s %-12s %s\n",
           "--------------------", "------", "------------", "-------");

    while (fgets(line, sizeof(line), fp)) {
        pathname[0] = '\0';
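        /* %7s bounds the write into perms[8]; pathname is optional,
           so at least 7 fields must match */
        if (sscanf(line, "%lx-%lx %7s %lx %x:%x %lu %255[^\n]",
                   &start, &end, perms, &offset,
                   &dev_major, &dev_minor, &inode, pathname) < 7)
            continue;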

        unsigned long size_kb = (end - start) / 1024;
        const char *label = (pathname[0] ? pathname : "[anonymous]");

        printf("%012lx-%012lx %-6s %-8lu KB %s\n",
               start, end, perms, size_kb, label);
    }

    fclose(fp);
    return 0;
}

MAP_FIXED and MAP_FIXED_NOREPLACE

MAP_FIXED places the mapping at exactly the requested address. Because it silently replaces any existing mappings that overlap the range, Linux 4.17 added MAP_FIXED_NOREPLACE as a safer alternative: it fails with EEXIST if any part of the range is already mapped.

#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <string.h>

int main(void)
{
    long   page_size = sysconf(_SC_PAGESIZE);
    void  *addr;
    void  *map1, *map2;

    /* Step 1: Get a page at a kernel-chosen address */
    map1 = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map1 == MAP_FAILED) { perror("mmap1"); exit(1); }
    printf("map1 at: %p\n", map1);

    /* Step 2: Try MAP_FIXED_NOREPLACE at same address (should fail) */
    map2 = mmap(map1, page_size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
                -1, 0);
    if (map2 == MAP_FAILED) {
        if (errno == EEXIST)
            printf("MAP_FIXED_NOREPLACE: safely rejected (EEXIST)\n");
        else
            perror("mmap2 unexpected error");
    } else {
        printf("Unexpected success with MAP_FIXED_NOREPLACE\n");
        munmap(map2, page_size);
    }

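    /* Step 3: MAP_FIXED at a DIFFERENT page-aligned address.
       Demo only: the address is a guess, and MAP_FIXED would silently
       replace anything already mapped there */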
    addr = (void *)(((unsigned long)map1 + page_size * 16) & ~(page_size-1));
    map2 = mmap(addr, page_size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
                -1, 0);
    if (map2 != MAP_FAILED) {
        printf("MAP_FIXED mapping created at: %p\n", map2);
        munmap(map2, page_size);
    }

    munmap(map1, page_size);
    return 0;
}

Bridge to POSIX Shared Memory (Chapter 54)

POSIX shared memory (shm_open()) is the preferred modern way to share memory between unrelated processes without creating a real disk file. Internally it is just mmap() applied to an object in a tmpfs-based virtual filesystem (usually /dev/shm on Linux).

Shared File Mapping (Ch49)
  • Requires a real file on disk
  • File persists after process exits
  • Works between unrelated processes
  • No special API: just open() + mmap()
POSIX Shared Memory (Ch54)
  • No real disk file (tmpfs / RAM)
  • Object lives in /dev/shm (by name)
  • Works between unrelated processes
  • Uses shm_open() + mmap() + shm_unlink()
Shared Anonymous Mapping (Ch49)
  • No file, no name
  • Only between related (forked) processes
  • Simplest: just mmap(MAP_SHARED|MAP_ANON)
  • Automatic cleanup on process exit
/*
 * posix_shm_preview.c - Quick preview of POSIX shared memory API
 * (Full coverage in Chapter 54)
 * Compile: gcc -o posix_shm posix_shm_preview.c -lrt
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <string.h>

#define SHM_NAME  "/ep_demo_shm"
#define SHM_SIZE  1024

int main(void)
{
    int   fd;
    char *ptr;

    /* Create/open POSIX shared memory object by name */
    fd = shm_open(SHM_NAME,
                  O_CREAT | O_RDWR,   /* Create if not exists, open R/W */
                  S_IRUSR | S_IWUSR); /* Permissions: owner rw */
    if (fd == -1) { perror("shm_open"); exit(1); }

    /* Set its size */
    if (ftruncate(fd, SHM_SIZE) == -1) { perror("ftruncate"); exit(1); }

    /* Map it: exactly the same mmap() call as a file mapping */
    ptr = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE,
               MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) { perror("mmap"); exit(1); }
    close(fd);

    /* Use it */
    snprintf(ptr, SHM_SIZE, "Hello from PID %d via POSIX shm!", getpid());
    printf("Wrote: %s\n", ptr);

    /* Cleanup */
    munmap(ptr, SHM_SIZE);
    shm_unlink(SHM_NAME);   /* Delete from /dev/shm */
    return 0;
}

Chapter 49: Complete Summary Table
Concept             API / Flag            Key point
Create mapping      mmap()                Returns start address or MAP_FAILED
Remove mapping      munmap()              Releases virtual address space; does not guarantee disk flush
Flush to disk       msync(MS_SYNC)        Required for durability; MS_ASYNC is non-blocking
Change protection   mprotect()            addr must be page-aligned; use for guard pages and JIT
Lock in RAM         mlock()               Limited by RLIMIT_MEMLOCK unless the process has CAP_IPC_LOCK; prevents swap for RT/security
Access hints        madvise()             SEQUENTIAL, RANDOM, WILLNEED, DONTNEED
Private file map    MAP_PRIVATE           COW; writes not visible to others; used for ELF loading
Shared file map     MAP_SHARED            Writes visible to all; flushed to file; mmap I/O and IPC
Anon alloc          MAP_ANONYMOUS         fd=-1, offset=0; zeroed; demand-paged
Force address       MAP_FIXED             Dangerous: silently unmaps existing mappings at that range
Safe force addr     MAP_FIXED_NOREPLACE   Linux 4.17+; fails with EEXIST if range occupied
Inspect mappings    /proc/PID/maps        Shows all VMAs: address, perms, offset, file

Interview Questions: Advanced Topics
Q1. What is the W^X policy and how does mprotect() help enforce it?
W^X (Write XOR Execute) is a security policy stating that a page should never be simultaneously writable and executable. It prevents attackers from writing shellcode into a writable page and then executing it. mprotect() enables this by allowing a JIT compiler to first make a page writable (PROT_WRITE) to emit code, then switch it to executable (PROT_EXEC) before running it, removing write permission in the process.
Q2. What is the difference between MADV_DONTNEED and MADV_FREE?
MADV_DONTNEED tells the kernel to release the pages immediately: the next access page-faults and returns zero-filled pages (private anonymous memory) or pages reloaded from the file (file mappings). For anonymous memory the old data is definitively gone. MADV_FREE (Linux 4.5+, private anonymous memory only) is lazy: the kernel marks the pages as available for reclamation but keeps the data in place. If the process accesses the pages before the kernel reclaims them, it still finds the old data. The kernel only discards the pages under memory pressure. MADV_FREE is used by allocators like jemalloc to cheaply "free" pages while retaining them if memory is plentiful.
Q3. Why do real-time processes use mlock()?
Real-time processes have strict latency requirements. If a page is swapped out to disk, the next access triggers a page fault which involves a disk read. This can take milliseconds, far exceeding real-time deadlines. mlock() pins pages in RAM, eliminating this source of unbounded latency. Real-time programs typically call mlockall(MCL_CURRENT|MCL_FUTURE) at startup to lock their entire address space.
Q4. Explain the /proc/PID/maps file and what its fields mean.
Each line represents one VMA (Virtual Memory Area). Fields: (1) start-end: virtual address range in hex; (2) perms: rwxp or rwxs (the last character is 'p' for private/COW or 's' for shared); (3) offset: offset within the backing file; (4) device: major:minor of the device holding the file; (5) inode: inode number of the file (0 for anonymous mappings); (6) pathname: file path, or a special name such as [stack], [heap], or [vdso]; the field is blank for ordinary anonymous mappings. It is invaluable for debugging memory layout, shared library loading, and memory leak analysis.
Q5. How does POSIX shared memory differ from a shared file mapping?
Both ultimately use mmap(MAP_SHARED) under the hood. The difference is the backing store: a shared file mapping uses a regular file on a persistent filesystem (data survives reboots if not deleted). POSIX shared memory (shm_open) uses an object in a tmpfs (/dev/shm): it is RAM-backed, vanishes on reboot, and is accessed by name (no directory path needed). POSIX shm is cleaner for IPC because it does not clutter a persistent filesystem. Note that a newly created object has length zero and must be sized explicitly with ftruncate() before it is mapped.
Q6. What happens to mappings when a process calls exec()?
All memory mappings (both anonymous and file) are destroyed when a process calls exec() because the entire virtual address space of the calling process is replaced with the new program's image. The kernel creates fresh mappings for the new program's text, data, BSS, and stack segments by loading the ELF. Only file descriptors (not marked FD_CLOEXEC) survive exec; the mappings themselves do not.
Q7. Describe a complete real-world scenario where you would use mmap() in embedded Linux development.
In embedded Linux, mmap() is commonly used to access hardware registers via /dev/mem or a UIO (Userspace I/O) device. For example, to control a GPIO controller: open /dev/mem, call mmap(NULL, BLOCK_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, gpio_base_address) to map the GPIO register block into user space, then use pointer arithmetic to read/write registers directly. This gives zero-copy, low-latency hardware access without custom kernel drivers. The same pattern is used for framebuffers (/dev/fb0), DMA buffers, and SPI/I2C controller registers.
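
The register-mapping pattern from Q7 can be sketched as a small helper. Everything specific here is an assumption for illustration: map_registers() is a hypothetical name, the device path and base offset are supplied by the caller, the offset must be page-aligned, and real register addresses come from the SoC reference manual.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical helper: map `len` bytes of the device at `path`, starting
 * at page-aligned offset `base`, as an array of 32-bit registers.
 * Returns NULL on failure. */
static volatile uint32_t *map_registers(const char *path, off_t base, size_t len)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd == -1)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, base);
    close(fd);                    /* the mapping stays valid after close() */

    return (p == MAP_FAILED) ? NULL : (volatile uint32_t *)p;
}
```

A caller would pass "/dev/mem" (or a UIO device node) with the physical base address of the block, then index the returned pointer in units of 32-bit registers. The volatile qualifier stops the compiler from caching or reordering away register accesses.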

Chapter 49 Complete!

You have covered all of Chapter 49: Memory Mappings.