Memory Mappings The mmap() System Call in Depth

 

Chapter 49: Memory Mappings
Part 2 of 5: The mmap() System Call in Depth
Topic: mmap() API
Level: Intermediate
Source: TLPI Ch49
Target: TI / ST / Qualcomm

The mmap() Signature

The entire mmap() API lives in <sys/mman.h>. Its signature is:

#include <sys/mman.h>

void *mmap(void   *addr,    /* Hint for placement (or NULL) */
           size_t  length,  /* Size of mapping in bytes    */
           int     prot,    /* Memory protection flags     */
           int     flags,   /* Mapping type flags          */
           int     fd,      /* File descriptor (-1 for anon) */
           off_t   offset); /* Offset within file          */

/* Returns: starting address of new mapping on success,
            MAP_FAILED ((void*)-1) on error */

Every parameter controls a different aspect of how the mapping is created. We cover each one in detail below.

Key Terms in This Part:

addr length prot flags fd offset PROT_READ PROT_WRITE PROT_EXEC PROT_NONE MAP_FIXED PAGE_SIZE MAP_FAILED

Parameter 1: addr - Where to Place the Mapping

The addr argument tells the kernel where in the process’s virtual address space to place the new mapping.

addr = NULL (Recommended)

The kernel chooses a free region of the address space automatically. This is the preferred approach: it avoids address conflicts and is portable.

void *addr_hint = NULL;   /* Let kernel decide */
void *map = mmap(addr_hint, length, prot, flags, fd, offset);

addr = non-NULL (Hint)

The kernel treats a non-NULL addr as a hint. It rounds the hint to a page boundary and may place the mapping at a different address if the hinted range is occupied. The address actually used is the return value of mmap().

void *hint = (void *)0x7f0000000000;
void *map = mmap(hint, length, prot, flags, fd, offset);
/* map may NOT equal hint */

addr with MAP_FIXED

Forces the mapping to be placed at exactly addr, which must be page-aligned. If the range overlaps an existing mapping, the overlapped part of that mapping is silently discarded. Dangerous and rarely needed.

/* Dangerous: overlapping mappings silently unmapped */
void *map = mmap(fixed_addr, length,
                 PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_FIXED,
                 fd, 0);
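
One common way to keep MAP_FIXED from clobbering unrelated mappings is to use it only inside a range the program has already reserved for itself. The sketch below illustrates that idea under stated assumptions: the helper name reserve_then_commit is hypothetical, and both lengths are assumed to be multiples of the page size.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve reserve_len bytes of address space with
 * PROT_NONE, then make the first commit_len bytes usable with MAP_FIXED.
 * Assumes both lengths are page-size multiples and
 * commit_len <= reserve_len. Returns the base address, or MAP_FAILED.
 */
void *reserve_then_commit(size_t reserve_len, size_t commit_len)
{
    /* Step 1: let the kernel pick a free range; nothing is accessible yet. */
    void *base = mmap(NULL, reserve_len, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return MAP_FAILED;

    /* Step 2: MAP_FIXED now replaces only pages inside our own reservation. */
    void *usable = mmap(base, commit_len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (usable == MAP_FAILED) {
        munmap(base, reserve_len);
        return MAP_FAILED;
    }
    return usable;   /* same address as base, because MAP_FIXED was honoured */
}
```

A caller would release the whole reservation with munmap(base, reserve_len) when finished. The remaining PROT_NONE pages cost address space but no physical memory.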

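Page alignment: The kernel always works in page-sized units. On most Linux systems the page size is 4096 bytes, but some architectures and kernel configurations (for example 16 KB or 64 KB pages on arm64) use larger pages, so query it at run time: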

#include <unistd.h>
long page_size = sysconf(_SC_PAGESIZE);   /* typically 4096 */
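
Because addresses, lengths and offsets all interact with the page size, small alignment helpers are often handy. The following is a minimal sketch; the names round_up_to_page and is_page_aligned are hypothetical, and it assumes len is small enough that adding one page does not overflow size_t.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: round len up to the next multiple of the page size. */
size_t round_up_to_page(size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (len + page - 1) / page * page;
}

/* Hypothetical helper: nonzero if value is a multiple of the page size. */
int is_page_aligned(uintptr_t value)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return value % page == 0;
}
```

For instance, is_page_aligned((uintptr_t)addr) can validate an address before it is passed with MAP_FIXED, and round_up_to_page(length) gives the amount of address space a mapping of length bytes actually occupies.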

Parameter 2: length - Size of the Mapping

length specifies how many bytes to map. It does not need to be a multiple of the page size, but the kernel always rounds it up to the next page boundary internally.

Important: Even if length = 1, at least one full page (typically 4096 bytes) is mapped. You can access memory up to the end of the rounded-up page. For a file mapping, bytes in that page that lie beyond the end of the file read as zero, and changes to them are never written to the file.

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int fd;
    void *map;
    struct stat sb;
    size_t length;

    fd = open("data.bin", O_RDONLY);
    if (fd == -1) { perror("open"); exit(1); }

    /* Get exact file size to use as mapping length */
    if (fstat(fd, &sb) == -1) { perror("fstat"); exit(1); }
    length = sb.st_size;   /* Map exactly the file size */

    map = mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); exit(1); }

    /* fd can be closed immediately after mmap: the mapping persists */
    close(fd);

    printf("First byte: 0x%02x\n", ((unsigned char *)map)[0]);

    /* Unmap when done */
    munmap(map, length);
    return 0;
}

Note: After mmap() succeeds, you can close(fd) immediately. The mapping keeps the file open internally until munmap() is called.
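
The rounding behaviour can be observed directly. The sketch below is an illustration only: the helper name byte_in_first_page is hypothetical, and /tmp is assumed to be a writable directory. It maps a short temporary file and reads a byte that may lie past the end of the file but still inside the first page.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

/*
 * Hypothetical demo: create a temporary file holding file_len bytes of 'A',
 * map it, and return the byte at position index within the first page.
 * Returns -1 on any failure. Assumes file_len > 0 and index < page size.
 */
int byte_in_first_page(size_t file_len, size_t index)
{
    char path[] = "/tmp/mmap_tail_XXXXXX";   /* assumed writable location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);   /* name removed now; storage is freed once fd and mapping are gone */

    for (size_t i = 0; i < file_len; i++) {
        if (write(fd, "A", 1) != 1) { close(fd); return -1; }
    }

    unsigned char *map = mmap(NULL, file_len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);      /* the mapping stays valid after close() */
    if (map == MAP_FAILED)
        return -1;

    int value = map[index];   /* index may be past file_len but inside the page */
    munmap(map, file_len);
    return value;
}
```

With a 10-byte file, indexes 0 to 9 yield 'A', while any later index inside the same page yields 0.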

Parameter 3: prot - Memory Protection

The prot argument controls what operations are permitted on the mapped region. It is either PROT_NONE or a bitwise OR of:

Constant     Value (typical)   Meaning                                        Common Use
PROT_NONE    0x0               No access at all; any access causes SIGSEGV    Guard pages, memory reservations
PROT_READ    0x1               Region can be read                             Read-only file mappings
PROT_WRITE   0x2               Region can be written                          Writable data areas
PROT_EXEC    0x4               Region can be executed                         JIT compilers, code loading

Rules: prot must be compatible with the file’s open flags. If the file was opened with O_RDONLY, using PROT_WRITE with MAP_SHARED will fail with EACCES. MAP_PRIVATE with PROT_WRITE is fine even for read-only files (because COW means writes never reach the file).
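
The rule about open flags can be checked with a small probe. This is a sketch under stated assumptions: the function name try_writable_map_on_rdonly is hypothetical, and /tmp is assumed to be writable. It requests PROT_READ | PROT_WRITE on a descriptor opened with O_RDONLY and reports what mmap() does.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

/*
 * Hypothetical probe: map one byte of a read-only descriptor with
 * PROT_READ | PROT_WRITE and the given sharing flag (MAP_SHARED or
 * MAP_PRIVATE). Returns 0 on success, the errno value from mmap() on
 * failure, or -1 if the temporary file could not be set up.
 */
int try_writable_map_on_rdonly(int sharing_flag)
{
    char path[] = "/tmp/mmap_prot_XXXXXX";   /* assumed writable location */
    int wfd = mkstemp(path);
    if (wfd == -1)
        return -1;
    if (write(wfd, "x", 1) != 1) { close(wfd); unlink(path); return -1; }
    close(wfd);

    int fd = open(path, O_RDONLY);           /* reopen without write access */
    unlink(path);
    if (fd == -1)
        return -1;

    void *map = mmap(NULL, 1, PROT_READ | PROT_WRITE, sharing_flag, fd, 0);
    int err = (map == MAP_FAILED) ? errno : 0;   /* capture errno right away */
    if (map != MAP_FAILED)
        munmap(map, 1);
    close(fd);
    return err;
}
```

On Linux, calling it with MAP_SHARED should report EACCES, while MAP_PRIVATE should succeed, matching the rule stated above.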

/* Typical prot combinations */

/* Read-only access to a configuration file */
mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);

/* Read-write shared IPC region */
mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

/* Guard page: catches buffer overflows */
mmap(guard_addr, page_size, PROT_NONE, MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);

/* Executable page for JIT code */
mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANON, -1, 0);

You can change the protection of an existing mapping using mprotect():

#include <sys/mman.h>

/* Change protection after the fact */
int ret = mprotect(map_addr, length, PROT_READ);  /* Make read-only */
if (ret == -1) perror("mprotect");
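
A typical use of mprotect() is to fill a buffer while it is writable and then seal it. The sketch below shows that sequence on an anonymous page; the helper name make_readonly_copy is hypothetical, and the message is assumed to fit in a single page.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy msg into a fresh anonymous page, then drop
 * write permission with mprotect(). Returns the read-only page, or NULL
 * on failure. Assumes strlen(msg) is smaller than the page size.
 */
const char *make_readonly_copy(const char *msg)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (strlen(msg) >= page)
        return NULL;

    char *buf = mmap(NULL, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    strcpy(buf, msg);                             /* writable phase */

    if (mprotect(buf, page, PROT_READ) == -1) {   /* seal the page */
        munmap(buf, page);
        return NULL;
    }
    return buf;   /* a later write through this address raises SIGSEGV */
}
```

The caller owns one page and would release it with munmap() using the page size as the length.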

Parameter 4: flags - Mapping Behaviour

The flags argument controls the type of mapping. It must include exactly one of MAP_PRIVATE or MAP_SHARED, plus any optional flags:

Flag                          Meaning
MAP_PRIVATE                   Copy-on-write mapping; changes stay private to the process. Exactly one of MAP_PRIVATE or MAP_SHARED is required.
MAP_SHARED                    Changes are visible to other processes that map the same region and, for file mappings, are written back to the file.
MAP_ANONYMOUS (or MAP_ANON)   No file backing (anonymous mapping). fd should be -1 and offset should be 0.
MAP_FIXED                     Force placement at exactly addr (must be page-aligned). Risky: overlapped parts of existing mappings are silently discarded.
MAP_LOCKED                    Lock pages in RAM to prevent swapping (similar to mlock()).
MAP_POPULATE                  Pre-fault the pages of the mapping into RAM (reduces later page faults).
MAP_NORESERVE                 Do not reserve swap space for this mapping. A later write may raise SIGSEGV if no physical memory is available.
MAP_HUGETLB                   Use huge pages (2 MB or 1 GB on x86_64) for the mapping, which reduces TLB pressure.
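
As an illustration of how these flags combine, the simplest anonymous mapping pairs MAP_PRIVATE with MAP_ANONYMOUS to obtain zero-filled memory. This is a minimal sketch; the wrapper name alloc_anonymous is hypothetical.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: obtain len bytes of zero-filled private memory
 * with no file behind it. Returns MAP_FAILED on error.
 */
void *alloc_anonymous(size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS,
                -1,   /* fd: no file */
                0);   /* offset: unused for anonymous mappings */
}
```

The caller compares the result against MAP_FAILED and later releases the memory with munmap(ptr, len). Optional flags such as MAP_POPULATE could be OR-ed into the flags argument in the same way.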

Parameters 5 & 6: fd and offset - File Backing

For file mappings, fd is the open file descriptor and offset is the byte offset within the file where the mapping starts.

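Constraint: offset must be a multiple of the system page size (sysconf(_SC_PAGESIZE)); otherwise mmap() fails with EINVAL.

For anonymous mappings, pass fd as -1 and offset as 0. Linux ignores fd when MAP_ANONYMOUS is set, but some other implementations require -1, so portable code should always pass it.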

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

/* Map the second 4096-byte block of a file */
int main(void)
{
    int fd;
    void *map;
    long page_size;

    page_size = sysconf(_SC_PAGESIZE);  /* e.g., 4096 */

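    fd = open("bigfile.dat", O_RDONLY);
    if (fd == -1) { perror("open"); exit(1); }

    /* Touching a mapped page that lies wholly beyond end-of-file raises
       SIGBUS, so confirm the file actually reaches into the second page */
    struct stat sb;
    if (fstat(fd, &sb) == -1) { perror("fstat"); exit(1); }
    if (sb.st_size <= (off_t)page_size) {
        fprintf(stderr, "bigfile.dat is too small\n");
        exit(1);
    }

    /* offset MUST be a multiple of page_size */
    off_t offset = 1 * page_size;   /* Skip first page */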

    map = mmap(NULL,
               page_size,       /* Map exactly one page */
               PROT_READ,
               MAP_PRIVATE,
               fd,
               offset);         /* Start at second page */

    if (map == MAP_FAILED) { perror("mmap"); exit(1); }
    close(fd);

    /* Access the first byte of the second page (file offset 4096 with 4 KB pages) */
    printf("Byte at offset %lld: 0x%02x\n", (long long)offset, ((unsigned char*)map)[0]);

    munmap(map, page_size);
    return 0;
}
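
When the data of interest starts at an offset that is not page-aligned, a common workaround is to round the offset down to a page boundary, map slightly more, and skip the extra leading bytes. The sketch below assumes that approach; struct file_view and view_file_range are hypothetical names.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for a view that starts at an unaligned offset. */
struct file_view {
    void          *base;     /* address returned by mmap(); pass to munmap() */
    size_t         map_len;  /* length given to mmap(); pass to munmap()     */
    unsigned char *data;     /* points at the requested file offset          */
};

/*
 * Map len bytes of the file at path, starting at an arbitrary byte offset,
 * read-only. Returns 0 on success and -1 on failure.
 */
int view_file_range(const char *path, off_t offset, size_t len,
                    struct file_view *out)
{
    off_t page = (off_t)sysconf(_SC_PAGESIZE);
    off_t aligned = offset - (offset % page);    /* round down to a page boundary */
    size_t slack = (size_t)(offset - aligned);   /* bytes mapped before the target */

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    out->map_len = len + slack;
    out->base = mmap(NULL, out->map_len, PROT_READ, MAP_PRIVATE, fd, aligned);
    close(fd);                                   /* the mapping outlives the fd */
    if (out->base == MAP_FAILED)
        return -1;

    out->data = (unsigned char *)out->base + slack;
    return 0;
}
```

The caller reads through view.data and releases the mapping with munmap(view.base, view.map_len), not with the adjusted data pointer.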

Return Value & Error Handling

On success, mmap() returns the starting virtual address of the new mapping as a void *. On error it returns MAP_FAILED, which is defined as ((void *) -1).

Warning: Never check the result of mmap() against NULL. NULL is a valid return value in theory (address 0 can be mapped with MAP_FIXED). Always compare against MAP_FAILED.

#include <sys/mman.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>

void *safe_mmap(size_t length, int prot, int flags, int fd, off_t offset)
{
    void *map;

    map = mmap(NULL, length, prot, flags, fd, offset);

    if (map == MAP_FAILED) {
        /* Common errors:
         * EACCES  - prot/flags incompatible with fd open mode
         * EINVAL  - bad addr, length=0, or offset not page-aligned
         * ENOMEM  - no memory or address space
         * ENODEV  - filesystem does not support mapping
         */
        fprintf(stderr, "mmap failed: %s (errno=%d)\n",
                strerror(errno), errno);
        return MAP_FAILED;   /* keep mmap()'s error sentinel; callers test against MAP_FAILED */
    }

    return map;
}

Common errno values from mmap():

errno     Cause
EACCES    File not opened with compatible flags (e.g., MAP_SHARED + PROT_WRITE on an O_RDONLY fd)
EINVAL    length is 0, offset is not page-aligned, or neither MAP_PRIVATE nor MAP_SHARED is set
ENOMEM    Insufficient memory, or the process hit its virtual address space limit
ENODEV    Filesystem does not support memory mapping
EBADF     fd is not a valid file descriptor (for non-anonymous mappings)
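
Several of these failures are easy to reproduce. The probe below is a sketch for experimentation, not part of any library; the name mmap_errno is hypothetical, and the results mentioned afterwards describe Linux behaviour.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Hypothetical probe: attempt a read-only mapping with the given arguments.
 * Returns 0 if mmap() succeeded (the mapping is released again), otherwise
 * the errno value it failed with.
 */
int mmap_errno(size_t length, int flags, int fd, off_t offset)
{
    void *map = mmap(NULL, length, PROT_READ, flags, fd, offset);
    if (map == MAP_FAILED)
        return errno;
    munmap(map, length);
    return 0;
}
```

On Linux, mmap_errno(0, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0) reports EINVAL for the zero length, mmap_errno(4096, MAP_PRIVATE, -1, 0) reports EBADF because a file mapping needs a valid descriptor, and mmap_errno(4096, MAP_ANONYMOUS, -1, 0) reports EINVAL because neither MAP_PRIVATE nor MAP_SHARED is set.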

Complete Working Example: Map, Read, Write, Unmap

/*
 * mmap_basic.c: Demonstrates mmap parameters end-to-end
 * Compile: gcc -o mmap_basic mmap_basic.c
 * Usage:   echo "Hello, mmap world!" > test.txt
 *          ./mmap_basic test.txt
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    int         fd;
    void       *map;
    struct stat sb;
    size_t      length;
    long        page_size;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <file>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    page_size = sysconf(_SC_PAGESIZE);
    printf("System page size: %ld bytes\n", page_size);

    /* Open file read-write */
    fd = open(argv[1], O_RDWR);
    if (fd == -1) { perror("open"); exit(EXIT_FAILURE); }

    /* Get file size */
    if (fstat(fd, &sb) == -1) { perror("fstat"); exit(EXIT_FAILURE); }
    length = (size_t)sb.st_size;
    printf("File size: %zu bytes\n", length);

    /* Create shared file mapping: reads/writes go to file */
    map = mmap(
        NULL,                   /* addr:   kernel chooses */
        length,                 /* length: whole file     */
        PROT_READ | PROT_WRITE, /* prot:   read + write   */
        MAP_SHARED,             /* flags:  shared mapping */
        fd,                     /* fd:     our file       */
        0                       /* offset: from start     */
    );
    if (map == MAP_FAILED) { perror("mmap"); exit(EXIT_FAILURE); }
    printf("Mapped at address: %p\n", map);

    /* fd no longer needed: mapping holds its own reference */
    close(fd);

    /* Read from the mapping */
    printf("Content: %.*s\n", (int)length, (char *)map);

    /* Write through the mapping: modifies the file on disk */
    if (length >= 5) {
        memcpy(map, "HELLO", 5);   /* Overwrite first 5 bytes */
    }

    /* Unmap: releases the virtual address space region */
    if (munmap(map, length) == -1) {
        perror("munmap");
        exit(EXIT_FAILURE);
    }

    printf("Mapping released. File now contains updated data.\n");
    return 0;
}

Interview Questions: mmap() System Call
Q1. What happens if you pass addr=NULL to mmap()?
The kernel selects a suitable free virtual address for the mapping. This is the recommended approach because it avoids conflicts with existing mappings and is portable across architectures. The actual starting address used is returned by the mmap() call.
Q2. Why must the offset parameter of mmap() be page-aligned?
The kernel manages virtual memory in page-sized units. The hardware MMU maps virtual pages to physical frames, and each frame corresponds to exactly one page of a file (as tracked in the page cache). Since the mapping granularity is one page, the starting offset in the file must align to a page boundary, otherwise there is no well-defined page for the kernel to map. Passing a non-page-aligned offset results in EINVAL.
Q3. Can you close the file descriptor after a successful mmap() call?
Yes. Once mmap() succeeds, the mapping exists independently of the file descriptor. The kernel increments an internal reference count on the file when creating the mapping, so closing the fd simply decrements that count. The mapping (and the file reference) persists until munmap() is called or the process exits. Closing the fd early is good practice to avoid fd leaks.
Q4. What is PROT_NONE used for?
PROT_NONE makes a mapped region completely inaccessible: any read, write, or execute access generates SIGSEGV. It is used to create guard pages that detect buffer overflows (placed around stack regions), to reserve virtual address space without consuming physical RAM, or as a placeholder that will later have its protection changed with mprotect().
Q5. What is wrong with checking the return of mmap() against NULL?
The error sentinel for mmap() is MAP_FAILED, defined as ((void *) -1), not NULL. Address 0 (which NULL represents) is technically a valid mapping address (with MAP_FIXED). Checking against NULL would miss all real errors and could falsely flag a valid mapping at address 0.
Q6. Why does mmap() with PROT_WRITE + MAP_SHARED fail on an O_RDONLY file descriptor?
With MAP_SHARED and PROT_WRITE, any write to the mapped region is reflected back to the file. This requires write permission on the file. An O_RDONLY file descriptor lacks write permission, so the kernel rejects this combination with EACCES. Note: MAP_PRIVATE + PROT_WRITE on an O_RDONLY fd is allowed, because copy-on-write ensures writes never reach the file.
Q7. What does MAP_FIXED do and why is it dangerous?
MAP_FIXED forces the kernel to place the mapping at exactly the address specified by addr (which must be page-aligned). If that range overlaps any existing mapping, those overlapping pages are silently unmapped first. This makes it dangerous: if a library or the stack happens to occupy that address range, it will be destroyed without any warning or error, leading to undefined behavior or crashes.
