Private File Mappings — mmap() Linux Memory Mappings Series

 

Chapter 49 · TLPI · Linux Memory Mappings Series
📖 Section 49.4.1
🔧 MAP_PRIVATE
🎯 Interview Ready

A private file mapping creates a mapping between a process’s virtual memory and a file on disk, but any changes made to the mapped memory are not written back to the file. The changes are also not visible to other processes. This is the MAP_PRIVATE mode of mmap(). Understanding private file mappings is essential for understanding how the Linux kernel loads programs and shared libraries.

🔑 Key Terms
MAP_PRIVATE, mmap(), Copy-on-Write (COW), Text Segment, Data Segment, Program Loader, Dynamic Linker, PROT_READ, PROT_EXEC, Page Fault, /proc/PID/maps, munmap()

1. What Is a Private File Mapping?

When you call mmap() with the MAP_PRIVATE flag and pass a file descriptor, the kernel sets up a mapping between a region of your process’s virtual address space and a region of the file. The mapped region is initialized with the file’s content.

The key behavior of MAP_PRIVATE is Copy-on-Write (COW). As long as the process only reads the memory, it shares the same physical page with the file’s page cache. The moment the process tries to write to that page, the kernel silently creates a private copy of that page for the process. After that, the process is writing to its own private copy — the original file and any other process mapping the same file are unaffected.

[Figure: How MAP_PRIVATE Copy-on-Write works. The mapped region in the process's virtual memory reads from a shared page in the page cache. On the first write, the kernel gives the process a private COW copy, and the original content of the file on disk is never modified.]
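The copy-on-write behavior described above can be observed inside a single process. The following is a minimal sketch, not a definitive test: the helper name `private_write_is_isolated()` and the scratch path under /tmp are assumptions made for illustration. It maps the same file twice with MAP_PRIVATE, writes through one mapping, and checks that the other mapping still shows the file's original bytes.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write through one MAP_PRIVATE mapping
 * is invisible to a second MAP_PRIVATE mapping of the same file,
 * 0 if the write leaked, -1 on a setup error. */
int private_write_is_isolated(void)
{
    char path[] = "/tmp/cow_sketch_XXXXXX";   /* assumed scratch location */
    const char text[] = "original";
    size_t len = sizeof(text) - 1;
    int result = -1;

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);               /* the file lives on until fd and mappings go away */

    if (write(fd, text, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* Two independent private mappings of the same file page */
    char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    char *b = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);

    if (a != MAP_FAILED && b != MAP_FAILED) {
        a[0] = 'X';             /* first write faults; the kernel copies the page for 'a' */
        result = (a[0] == 'X' && memcmp(b, text, len) == 0);
    }
    if (a != MAP_FAILED)
        munmap(a, len);
    if (b != MAP_FAILED)
        munmap(b, len);
    return result;
}
```

Calling `private_write_is_isolated()` from a small main() and printing the result should report 1 on a system where /tmp is writable. The second mapping keeps reading the page-cache page, while the first mapping now points at its own copy.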

2. mmap() System Call — Syntax

#include <sys/mman.h>

void *mmap(void *addr,      /* Hint for start address (usually NULL) */
           size_t length,   /* Number of bytes to map                */
           int prot,        /* Memory protection flags               */
           int flags,       /* MAP_PRIVATE | MAP_SHARED | etc.       */
           int fd,          /* File descriptor (open()ed file)       */
           off_t offset);   /* Offset into file (must be page-aligned)*/

/* Returns: start address of mapped region on success, MAP_FAILED on error */

For a private file mapping, the important parameters are:

| Parameter | Typical Value | Notes |
|-----------|---------------|-------|
| flags | MAP_PRIVATE | Changes are private (COW), never written to file |
| prot | PROT_READ | Read-only access; add PROT_WRITE if private writes are needed |
| offset | 0 (or another multiple of the page size) | Use sysconf(_SC_PAGE_SIZE) to get page size |
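Because the offset must be page-aligned, reading from an arbitrary byte position needs a small adjustment: round the offset down to a page boundary, map from there, and skip the extra bytes. The sketch below shows one way to do that. The names `map_at_offset()` and `demo_map_at_offset()` are hypothetical helpers introduced here, and the scratch file under /tmp is an assumption.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: maps `len` bytes of `fd` starting at any byte `offset`.
 * Returns a pointer to the requested byte, and stores the real mapping
 * base and length (needed later for munmap()). Returns NULL on failure. */
char *map_at_offset(int fd, off_t offset, size_t len,
                    void **base_out, size_t *maplen_out)
{
    long page = sysconf(_SC_PAGE_SIZE);
    if (page <= 0 || offset < 0 || len == 0)
        return NULL;

    off_t  aligned = offset - (offset % page);    /* round down to a page start */
    size_t delta   = (size_t)(offset - aligned);  /* bytes to skip in the mapping */
    size_t maplen  = len + delta;

    void *base = mmap(NULL, maplen, PROT_READ, MAP_PRIVATE, fd, aligned);
    if (base == MAP_FAILED)
        return NULL;

    *base_out = base;
    *maplen_out = maplen;
    return (char *)base + delta;
}

/* Usage sketch: builds a three-page scratch file whose byte i is
 * 'a' + i % 26, then reads one byte at an unaligned offset.
 * Returns 1 if everything behaved as described, -1 on a setup error. */
int demo_map_at_offset(void)
{
    char path[] = "/tmp/offset_sketch_XXXXXX";    /* assumed scratch location */
    long page = sysconf(_SC_PAGE_SIZE);
    if (page <= 0)
        return -1;
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);

    size_t total = 3 * (size_t)page;
    char *buf = malloc(total);
    if (buf == NULL) { close(fd); return -1; }
    for (size_t i = 0; i < total; i++)
        buf[i] = (char)('a' + i % 26);
    ssize_t written = write(fd, buf, total);
    free(buf);
    if (written != (ssize_t)total) { close(fd); return -1; }

    off_t want = (off_t)page + 123;               /* deliberately not page-aligned */

    /* Passing the raw offset straight to mmap() is rejected with EINVAL. */
    void *raw = mmap(NULL, 16, PROT_READ, MAP_PRIVATE, fd, want);
    int raw_rejected = (raw == MAP_FAILED && errno == EINVAL);
    if (raw != MAP_FAILED)
        munmap(raw, 16);

    void *base;
    size_t maplen;
    char *p = map_at_offset(fd, want, 16, &base, &maplen);
    close(fd);
    if (p == NULL)
        return -1;

    int ok = raw_rejected && p[0] == (char)('a' + want % 26);
    munmap(base, maplen);
    return ok;
}
```

The caller has to keep the base address and mapped length, not the adjusted pointer, because munmap() expects a page-aligned address.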

3. How the Kernel Uses Private File Mappings

The two most important and common uses of private file mappings happen automatically — you never call mmap() yourself. The program loader and dynamic linker do it for you whenever you run any executable.

Use 1: Mapping the Text Segment (Program Code)

The text segment contains the machine instructions of a program. When the kernel loads an ELF executable, it maps the loadable segment that holds the .text section using MAP_PRIVATE with PROT_READ | PROT_EXEC protection.

Why MAP_PRIVATE for text? Because a debugger (like gdb) or a self-modifying program might need to temporarily change the instructions (breakpoints, patches). These changes must not affect the file on disk or other processes running the same program. The COW mechanism ensures isolation.

Think of it this way: 10 users running the same binary (e.g., bash) all share the same physical pages for the text segment — no duplication in RAM. If any one process modifies the code (e.g., a debugger sets a breakpoint), only that process gets a private copy of the modified page.
Use 2: Mapping the Initialized Data Segment

The .data section of an ELF binary contains global and static variables that have explicit initial values. The kernel maps this using MAP_PRIVATE so that each process gets its own copy of the initialized data when it first writes to it. The on-disk values in the executable are never modified.
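The per-process isolation of writable private pages can be imitated with an ordinary file standing in for the .data segment. This is only an analogy under stated assumptions: the helper name `child_write_stays_private()` and the scratch path under /tmp are invented for the sketch, and fork() is used to get a second process that shares the mapping until one side writes.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the parent's view of a MAP_PRIVATE
 * mapping is unchanged after a child process writes to it, 0 if not,
 * -1 on a setup error. */
int child_write_stays_private(void)
{
    char path[] = "/tmp/fork_sketch_XXXXXX";      /* assumed scratch location */
    const char text[] = "initial value";
    size_t len = sizeof(text) - 1;

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);
    if (write(fd, text, len) != (ssize_t)len) { close(fd); return -1; }

    /* Writable private mapping, like an initialized data segment */
    char *data = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);
    if (data == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid == -1) { munmap(data, len); return -1; }
    if (pid == 0) {
        memcpy(data, "CHILD", 5);                 /* COW: only the child's page changes */
        _exit(memcmp(data, "CHILD", 5) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1) { munmap(data, len); return -1; }

    int child_ok  = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    int parent_ok = memcmp(data, text, len) == 0; /* parent still sees file content */
    munmap(data, len);
    return child_ok && parent_ok;
}
```

The child sees its own modification, the parent keeps reading the original bytes, and the file itself is never touched.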

[Figure: ELF Binary → Process Address Space via mmap(). The sections of the ELF binary on disk are mapped into the process's virtual address space as follows:]

  • .text (code): PROT_READ|PROT_EXEC, MAP_PRIVATE, COW if modified by a debugger
  • .data (initialized variables): PROT_READ|PROT_WRITE, MAP_PRIVATE, each process gets its own copy
  • .rodata (constants): PROT_READ only
  • .bss (zero-initialized): anonymous mapping (zeroed)

4. Verifying with /proc/PID/maps

You can directly see these private file mappings by reading /proc/PID/maps. The kernel creates this file automatically for every running process.

/* How to view your own process's memory mappings */
#include <stdio.h>
#include <stdlib.h>   /* system() */
#include <unistd.h>

int main(void)
{
    char cmd[64];
    /* Print PID so user knows which /proc entry to check */
    printf("My PID: %d\n", getpid());
    printf("--- /proc/%d/maps ---\n", getpid());

    /* Build the command to cat this process's maps file */
    snprintf(cmd, sizeof(cmd), "cat /proc/%d/maps", getpid());
    system(cmd);  /* for illustration only */
    return 0;
}

Sample output from /proc/PID/maps for a typical process:

/* Format: start-end   perms  offset  dev  inode   pathname
   perms: r=read, w=write, x=execute, p=private, s=shared  */

55a3b2000000-55a3b2001000 r--p 00000000 08:01 123456  /usr/bin/cat
55a3b2001000-55a3b2002000 r-xp 00001000 08:01 123456  /usr/bin/cat  <-- text (MAP_PRIVATE)
55a3b2002000-55a3b2003000 r--p 00002000 08:01 123456  /usr/bin/cat  <-- rodata
55a3b2003000-55a3b2004000 rw-p 00003000 08:01 123456  /usr/bin/cat  <-- data (MAP_PRIVATE)
7f8a12000000-7f8a12200000 r--p 00000000 08:01 999001  /lib/x86_64/libc.so.6
7f8a12200000-7f8a12380000 r-xp 00200000 08:01 999001  /lib/x86_64/libc.so.6
...
7ffd45678000-7ffd45699000 rw-p 00000000 00:00 0       [stack]

/* Notice 'p' in the perms column — that means MAP_PRIVATE */
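Instead of shelling out to cat, a program can parse /proc/self/maps directly and look up the mapping that contains a given address. The sketch below is one possible approach, Linux-specific, with hypothetical helper names (`perms_for_address()` and `rodata_is_private()`). It relies only on the line format shown above.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: finds the mapping containing `p` in /proc/self/maps
 * and copies its 4-character permission string (for example "r-xp") into
 * `perms`. Returns 0 on success, -1 if the address is not found or the
 * file cannot be read. */
int perms_for_address(const void *p, char perms[5])
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    uintptr_t target = (uintptr_t)p;
    char line[4096 + 128];          /* room for a long pathname */
    int found = -1;

    while (fgets(line, sizeof(line), f) != NULL) {
        unsigned long start, end;
        char field[5];

        /* Each line begins with: start-end perms ... */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, field) != 3)
            continue;
        if (target >= start && target < end) {
            memcpy(perms, field, sizeof(field));
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}

/* Usage sketch: a constant in the executable's read-only data should sit
 * in a readable, private, file-backed mapping. Returns 1 if so. */
int rodata_is_private(void)
{
    static const char probe[] = "probe";
    char perms[5];

    if (perms_for_address(probe, perms) != 0)
        return -1;
    return perms[0] == 'r' && perms[3] == 'p';
}
```

The fourth character of the permission string is the one that distinguishes MAP_PRIVATE ('p') from MAP_SHARED ('s').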

5. Code Example: Private File Mapping for File Input

A less common but valid use of MAP_PRIVATE is simplifying file input. You map a file privately, read from it like memory, and optionally modify the in-memory content without touching the file.

/* private_mmap_read.c — Read a file using MAP_PRIVATE mmap() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    int         fd;
    char       *addr;
    struct stat sb;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Step 1: Open the file read-only */
    fd = open(argv[1], O_RDONLY);
    if (fd == -1) {
        perror("open");
        exit(EXIT_FAILURE);
    }

    /* Step 2: Get file size using fstat() */
    if (fstat(fd, &sb) == -1) {
        perror("fstat");
        exit(EXIT_FAILURE);
    }

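    printf("File size: %ld bytes\n", (long)sb.st_size);
    if (sb.st_size == 0) {      /* mmap() rejects a zero length with EINVAL */
        fprintf(stderr, "%s is empty; nothing to map\n", argv[1]);
        exit(EXIT_FAILURE);
    }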

    /* Step 3: Create a PRIVATE mapping — changes won't touch the file */
    addr = mmap(NULL,           /* kernel chooses address */
                sb.st_size,     /* map the entire file */
                PROT_READ,      /* read-only access */
                MAP_PRIVATE,    /* private: changes not written back */
                fd,
                0);             /* offset 0 = start from beginning */

    if (addr == MAP_FAILED) {
        perror("mmap");
        exit(EXIT_FAILURE);
    }

    /* Step 4: File descriptor can be closed immediately after mmap() */
    /* The mapping keeps a reference to the underlying file internally */
    close(fd);

    /* Step 5: Access the file content like a normal memory buffer */
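    /* The mapping is not NUL-terminated, so bound the print by the file size */
    printf("First 80 characters of file:\n%.*s\n",
           (int)(sb.st_size < 80 ? sb.st_size : 80), addr);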

    /* Count newlines — just access memory, no read() calls needed */
    long newlines = 0;
    for (size_t i = 0; i < (size_t)sb.st_size; i++) {
        if (addr[i] == '\n')
            newlines++;
    }
    printf("Number of lines: %ld\n", newlines);

    /* Step 6: Unmap when done */
    if (munmap(addr, sb.st_size) == -1) {
        perror("munmap");
        exit(EXIT_FAILURE);
    }

    return 0;
}
/* Compile and run:
   gcc -o private_mmap_read private_mmap_read.c
   echo "Hello World" > test.txt
   ./private_mmap_read test.txt

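   Expected output (the blank line appears because the file's own
   trailing newline is printed before the newline in the format string):
   File size: 12 bytes
   First 80 characters of file:
   Hello World

   Number of lines: 1
*/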

6. Code Example: Verifying Copy-on-Write

This example proves that writing to a MAP_PRIVATE mapping does not change the original file.

/* cow_demo.c — Proves MAP_PRIVATE doesn't modify the underlying file */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void)
{
    const char *filename = "/tmp/cow_test.txt";
    const char *original = "Hello, World!\n";
    char       *addr;
    int         fd;
    struct stat sb;

    /* Create a test file */
    fd = open(filename, O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); exit(1); }
    if (write(fd, original, strlen(original)) == -1) { perror("write"); exit(1); }
    close(fd);

    /* Open for mapping */
    fd = open(filename, O_RDWR);
    if (fd == -1) { perror("open"); exit(1); }
    if (fstat(fd, &sb) == -1) { perror("fstat"); exit(1); }

    /* MAP_PRIVATE + PROT_WRITE — we can write, but changes stay private */
    addr = mmap(NULL, sb.st_size,
                PROT_READ | PROT_WRITE,
                MAP_PRIVATE,   /* <-- KEY: private, not shared */
                fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); exit(1); }
    close(fd);

    /* Print before modification */
    printf("Before write: in-memory = [%.*s]\n",
           (int)sb.st_size, addr);

    /* Modify the in-memory mapping */
    memcpy(addr, "MODIFIED!!!!!\n", 14);

    /* Print after — in-memory changed */
    printf("After  write: in-memory = [%.*s]\n",
           (int)sb.st_size, addr);

    munmap(addr, sb.st_size);

    /* Now read the actual file to verify it was NOT changed */
    fd = open(filename, O_RDONLY);
    if (fd == -1) { perror("open"); exit(1); }
    char buf[64] = {0};
    if (read(fd, buf, sizeof(buf) - 1) == -1) { perror("read"); exit(1); }
    close(fd);
    printf("File on disk: [%s]\n", buf);
    /* Output will still show "Hello, World!" — file unchanged! */

    return 0;
}

/* Expected output (each string ends in '\n', so the closing bracket
   lands on the following line):
   Before write: in-memory = [Hello, World!
   ]
   After  write: in-memory = [MODIFIED!!!!!
   ]
   File on disk: [Hello, World!
   ]                                  <-- file unchanged: COW worked!
*/

7. munmap() and mprotect()

#include <sys/mman.h>

/* Remove a mapping — addr must be page-aligned */
int munmap(void *addr, size_t length);

/* Change protection on a mapped region */
int mprotect(void *addr, size_t length, int prot);
/* prot can be: PROT_NONE, PROT_READ, PROT_WRITE, PROT_EXEC */

/* Example: make a read-only MAP_PRIVATE mapping temporarily writable.  */
/* (A process patching its own code would do this. A debugger such as  */
/* gdb writes through ptrace() instead, but relies on the same COW.)   */
mprotect(addr, page_size, PROT_READ | PROT_WRITE | PROT_EXEC);
/* ... modify the code / set breakpoint ... */
mprotect(addr, page_size, PROT_READ | PROT_EXEC); /* restore */
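The mprotect() pattern above can be tried on an ordinary data file. The following sketch is illustrative only: `patch_private_page()` is a hypothetical helper, the scratch path under /tmp is an assumption, and plain data is patched rather than executable code. It maps a file read-only with MAP_PRIVATE, adds write permission, modifies one byte, restores the protection, and confirms that the file still holds its original bytes.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a page mapped read-only with MAP_PRIVATE
 * can be patched after mprotect() while the file keeps its original bytes,
 * 0 if not, -1 on a setup error. */
int patch_private_page(void)
{
    char path[] = "/tmp/mprotect_sketch_XXXXXX";  /* assumed scratch location */
    const char text[] = "abcdef";
    size_t len = sizeof(text) - 1;
    long page = sysconf(_SC_PAGE_SIZE);
    if (page <= 0)
        return -1;

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);
    if (write(fd, text, len) != (ssize_t)len) { close(fd); return -1; }

    char *addr = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED) { close(fd); return -1; }

    int ok = -1;
    /* mprotect() works on whole pages; mmap() returned a page-aligned address */
    if (mprotect(addr, (size_t)page, PROT_READ | PROT_WRITE) == 0) {
        addr[0] = 'Z';                            /* triggers copy-on-write */
        if (mprotect(addr, (size_t)page, PROT_READ) == 0) {
            char on_disk[8] = {0};
            ssize_t n = pread(fd, on_disk, len, 0);
            ok = (n == (ssize_t)len && addr[0] == 'Z' &&
                  memcmp(on_disk, text, len) == 0);
        }
    }
    munmap(addr, len);
    close(fd);
    return ok;
}
```

The file descriptor is kept open here only so that pread() can compare the on-disk bytes with the patched in-memory copy.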

Key points about munmap():

  • addr must be page-aligned. You can pass the address and length used in mmap(), or any page-aligned sub-range to unmap only part of the mapping.
  • After munmap(), accessing the unmapped range normally raises SIGSEGV (a segmentation fault).
  • Mappings are automatically unmapped when a process exits.
  • For MAP_PRIVATE, unmapping simply discards any COW copies.

8. Interview Questions & Answers

Q1: What is the difference between MAP_PRIVATE and MAP_SHARED in mmap()?
Answer: With MAP_PRIVATE, any writes to the mapped region trigger Copy-on-Write — a private copy of the modified page is created for the process, and the original file and other processes’ views are unaffected. With MAP_SHARED, writes are immediately visible to all other processes mapping the same region, and the changes are eventually written back to the underlying file.
Q2: Why does the Linux kernel map the text segment of an ELF binary using MAP_PRIVATE even though the process never modifies it?
Answer: Because a debugger or self-modifying program may need to modify the text segment (e.g., inserting a breakpoint by overwriting an instruction byte with int3, opcode 0xCC, on x86). Using MAP_PRIVATE means such modifications remain local to that process and do not affect the binary file on disk or other processes running the same program. A process can patch its own code after calling mprotect() to temporarily add write permission; a debugger typically writes into the target through ptrace(), and the kernel applies the same copy-on-write.
Q3: After calling mmap(), can you close() the file descriptor? What happens?
Answer: Yes. Once mmap() returns successfully, the mapping is established and the kernel holds an internal reference to the underlying file. The file descriptor is no longer needed. Closing it does not affect the mapping. This is common practice to avoid fd leaks.
Q4: What happens if you call mmap() with offset that is not page-aligned?
Answer: mmap() returns MAP_FAILED and sets errno to EINVAL. The offset parameter must be a multiple of the system’s page size (typically 4096 bytes). You can get the page size with sysconf(_SC_PAGE_SIZE) or the getpagesize() call.
Q5: What are the protection flags for mmap() and what do they control?
Answer:

  • PROT_READ — the region can be read
  • PROT_WRITE — the region can be written
  • PROT_EXEC — the region can be executed as code
  • PROT_NONE — no access allowed (useful for guard pages)

They must be consistent with the file’s open mode. For example, you cannot use PROT_WRITE if the file was opened O_RDONLY, unless the flag is MAP_PRIVATE (where writes only go to COW copies, not the file).

Q6: How is mmap() MAP_PRIVATE used for loading shared libraries?
Answer: When the dynamic linker (ld-linux.so) loads a shared library (e.g., libc.so), it uses mmap() with MAP_PRIVATE to map the library's text and data. The text is mapped with PROT_READ|PROT_EXEC, so all processes share the same physical pages (no RAM wasted). The data is mapped with PROT_READ|PROT_WRITE and MAP_PRIVATE, so each process gets its own copy of a data page only when it writes to that page (COW). You can verify this in /proc/PID/maps by looking for shared library paths with the p permission flag.
Q7: Can you use mmap() MAP_PRIVATE with PROT_WRITE on a file opened with O_RDONLY?
Answer: Yes, this is one special case where it works. With MAP_PRIVATE, writes never go back to the file; they only go to private COW copies in memory. So the kernel allows PROT_READ|PROT_WRITE with MAP_PRIVATE on a read-only fd. With MAP_SHARED, writes must reach the file, so PROT_WRITE with MAP_SHARED requires the file to be opened with O_RDWR (O_WRONLY is not enough, because mmap() always needs read access to the file); otherwise mmap() fails with EACCES.
Q8: What does /proc/PID/maps tell you about a process’s memory mappings?
Answer: Each line shows: virtual address range, permissions (r/w/x + p for private or s for shared), offset into the mapped file, device, inode, and the file path (or [heap], [stack], [vdso] for special regions). By checking whether the permissions column shows p or s, you can immediately tell whether a region is a private or shared mapping.
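The rule from Q5 and Q7 (a writable mapping of a read-only descriptor is allowed for MAP_PRIVATE but refused for MAP_SHARED) can be checked with a short experiment. This is a sketch under stated assumptions: `try_writable_mapping()` and `demo_rdonly_rules()` are hypothetical helpers, and the scratch file lives under /tmp.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: opens `path` with O_RDONLY and asks for a
 * PROT_READ|PROT_WRITE mapping using `map_flag` (MAP_PRIVATE or MAP_SHARED).
 * Returns 0 if mmap() succeeded, otherwise the errno value. */
int try_writable_mapping(const char *path, int map_flag)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return errno;

    void *addr = mmap(NULL, 1, PROT_READ | PROT_WRITE, map_flag, fd, 0);
    int err = (addr == MAP_FAILED) ? errno : 0;

    if (addr != MAP_FAILED)
        munmap(addr, 1);
    close(fd);
    return err;
}

/* Usage sketch: returns 1 if the private mapping succeeds and the shared
 * mapping fails with EACCES, 0 otherwise, -1 on a setup error. */
int demo_rdonly_rules(void)
{
    char path[] = "/tmp/rdonly_sketch_XXXXXX";    /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    if (write(fd, "data", 4) != 4) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    int private_err = try_writable_mapping(path, MAP_PRIVATE);
    int shared_err  = try_writable_mapping(path, MAP_SHARED);
    unlink(path);

    return private_err == 0 && shared_err == EACCES;
}
```

The check depends on how the descriptor was opened, not on the file's permission bits, so the shared case is refused even for a user who could have opened the file for writing.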

Next: Shared File Mappings →

Learn how MAP_SHARED enables memory-mapped I/O and fast IPC between processes, with the classic t_mmap.c example.
