File Mappings
Intermediate
TLPI Ch49
TI / ST / Qualcomm
A file mapping maps a region of a file directly into a process’s virtual address space. Reading bytes from that memory region is the same as reading from the file, and writing bytes to that region (if the mapping is shared) is the same as writing to the file. Both happen through normal CPU load/store instructions, with no system call needed per access.
File mappings come in two flavours: private (copy-on-write, changes not reflected to file) and shared (changes go to file, visible to all mappers). Both are created with the same mmap() call; the difference is just the MAP_PRIVATE vs MAP_SHARED flag.
With a private mapping, multiple processes mapping the same file region initially share the same physical pages (from the page cache). When a process writes to a page, the kernel performs copy-on-write: it allocates a new page that is private to that process and applies the write there. The underlying file is never modified.
Primary use cases:
- ELF loading: The OS maps a program’s text segment (code) and initialized data segment from the executable file using MAP_PRIVATE. The text segment uses PROT_READ|PROT_EXEC, data uses PROT_READ|PROT_WRITE. When a process writes to its data, COW kicks in.
- Shared libraries: Multiple processes share the same physical pages for a library’s code via private mappings, so the code costs no extra RAM even with 100 processes using the same .so.
- Read-only configuration files: Load a large config file into memory once, parse it without modifying the file.
/*
 * private_file_map.c
 * Read a binary file using MAP_PRIVATE
 * Compile: gcc -o private_file_map private_file_map.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    int fd;
    struct stat sb;
    char *data;
    size_t length;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Open read-only is sufficient for MAP_PRIVATE */
    fd = open(argv[1], O_RDONLY);
    if (fd == -1) { perror("open"); exit(EXIT_FAILURE); }

    if (fstat(fd, &sb) == -1) { perror("fstat"); exit(EXIT_FAILURE); }
    length = sb.st_size;

    /* MAP_PRIVATE: changes never reach the file */
    data = mmap(NULL, length, PROT_READ | PROT_WRITE,
                MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); exit(EXIT_FAILURE); }

    close(fd); /* Safe to close fd now */

    /* We can "modify" the mapping without touching the file */
    data[0] = 'X'; /* COW: kernel creates private page for us */
    printf("First byte (in our private copy): %c\n", data[0]);
    printf("Original file is unchanged on disk.\n");

    munmap(data, length);
    return 0;
}
All processes that create a MAP_SHARED mapping of the same file region share the same physical pages in RAM. There is no copy-on-write: a write by one process is immediately visible to all others sharing the mapping, and is also reflected back to the underlying file (via the page cache).
[Figure: two processes, at virtual addresses 0x7f… and 0x7e…, both map the same physical page in the page cache.]
Primary use cases:
- Memory-mapped I/O: Use pointer operations instead of read()/write() to access file data. Efficient for random access patterns.
- IPC between unrelated processes: Two processes share data through a file without System V shared memory setup overhead.
Instead of lseek()+read() for random access, use a shared mapping and access bytes directly:
/*
 * mmap_io.c: Memory-mapped I/O vs traditional I/O comparison
 * Compile: gcc -O2 -o mmap_io mmap_io.c
 *
 * Creates a 1MB file then:
 * (a) writes using mmap
 * (b) reads using mmap
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

#define FILE_SIZE (1024 * 1024) /* 1 MB */
#define FILENAME "/tmp/mmap_test.dat"

/* Create a file of a given size (filled with zeros) */
static void create_file(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); exit(1); }

    /* Seek to last byte and write one byte to set file size */
    if (lseek(fd, (off_t)(size - 1), SEEK_SET) == -1) { perror("lseek"); exit(1); }
    if (write(fd, "\0", 1) != 1) { perror("write"); exit(1); }
    close(fd);
}

int main(void)
{
    int fd;
    char *map;
    size_t i;

    /* Step 1: Create the backing file */
    create_file(FILENAME, FILE_SIZE);

    /* Step 2: Open and map for read+write (shared) */
    fd = open(FILENAME, O_RDWR);
    if (fd == -1) { perror("open"); exit(1); }

    map = mmap(NULL, FILE_SIZE, PROT_READ | PROT_WRITE,
               MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); exit(1); }
    close(fd);

    /* Step 3: Write using pointer arithmetic, no write() calls */
    for (i = 0; i < FILE_SIZE; i++) {
        map[i] = (char)(i & 0xFF);
    }
    /* FILE_SIZE is an int expression, so cast it for %zu */
    printf("Written %zu bytes via mmap pointer operations\n", (size_t)FILE_SIZE);

    /* Step 4: Read back, again with no read() calls */
    unsigned long sum = 0;
    for (i = 0; i < FILE_SIZE; i++) {
        sum += (unsigned char)map[i];
    }
    printf("Checksum of mapped data: %lu\n", sum);

    /* Step 5: Explicitly flush dirty pages to disk */
    if (msync(map, FILE_SIZE, MS_SYNC) == -1) {
        perror("msync");
    }
    printf("Data flushed to disk via msync(MS_SYNC)\n");

    munmap(map, FILE_SIZE);

    /* Verify: read the file with traditional I/O */
    FILE *f = fopen(FILENAME, "rb");
    if (f == NULL) { perror("fopen"); exit(1); }
    int first_byte = fgetc(f);
    fclose(f);
    printf("First byte in file (should be 0): %d\n", first_byte);
    return 0;
}
With a shared file mapping, the kernel writes dirty pages back to the file lazily, at a time of its own choosing (for example during periodic writeback or when memory pressure increases). If you need data flushed to disk at a known point (e.g., for durability), use msync():
#include <sys/mman.h>
int msync(void *addr, size_t length, int flags);
/* Returns 0 on success, -1 on error */
| Flag | Behaviour |
|---|---|
| MS_SYNC | Blocks until all dirty pages in the range are written to disk. Similar in effect to fsync(). |
| MS_ASYNC | Schedules the write but returns immediately. Non-blocking. |
| MS_INVALIDATE | Invalidates cached copies of the mapping so subsequent reads come fresh from disk. Useful when another process modified the file through non-mmap means. |
/* Flush first 4096 bytes synchronously */
if (msync(map, 4096, MS_SYNC) == -1) {
    perror("msync");
}

/* Schedule async flush of entire mapping */
if (msync(map, file_size, MS_ASYNC) == -1) {
    perror("msync async");
}
munmap() does NOT guarantee that dirty pages are flushed to disk before the process exits. Always call msync(MS_SYNC) before munmap() if you need durability guarantees.

When you are done with a mapping, release it with munmap(). This removes the mapping from the process’s virtual address space and decrements the file reference count.
#include <sys/mman.h>
int munmap(void *addr, size_t length);
/* Returns 0 on success, -1 on error */
/*
 * Rules for munmap():
 * 1. addr must be page-aligned: the address returned by mmap(), or a
 *    page boundary within the mapping
 * 2. length must match what was passed to mmap() (or a subset)
 * 3. A mapping can be partially unmapped; the rest stays valid
 * 4. After munmap(), accessing the unmapped range causes SIGSEGV
 */

/* Full unmap */
munmap(map, file_size);

/* Partial unmap: unmap just the second half.
 * half must be a multiple of the page size, or munmap() fails with EINVAL. */
size_t half = file_size / 2;
munmap((char *)map + half, half); /* First half still accessible */
Compared with forgetting free(), failing to call munmap() on a file mapping mainly leaks virtual address space rather than physical RAM: the kernel can still reclaim file-backed pages under memory pressure, and it releases everything on process exit. However, in long-running processes with many mappings, virtual address space exhaustion is a real risk.

Two unrelated processes (no fork relationship) can communicate through a shared file mapping. Process A writes, Process B reads, both through the same physical pages backed by a file.
/*
 * ipc_writer.c: Write a message via shared file mapping
 * Compile: gcc -o writer ipc_writer.c
 * Run: ./writer; ./reader
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

#define SHM_FILE "/tmp/ipc_shared.dat"
#define SHM_SIZE 256

int main(void)
{
    int fd;
    char *map;

    /* Create/truncate backing file */
    fd = open(SHM_FILE, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (fd == -1) { perror("open"); exit(1); }

    /* Set file size; without this the mapping would have no backing store */
    if (ftruncate(fd, SHM_SIZE) == -1) { perror("ftruncate"); exit(1); }

    map = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE,
               MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); exit(1); }
    close(fd);

    /* Write message into shared region */
    snprintf(map, SHM_SIZE, "Hello from PID %d!", (int)getpid());
    if (msync(map, SHM_SIZE, MS_SYNC) == -1) /* Ensure it's on disk */
        perror("msync");
    printf("Writer: wrote '%s'\n", map);

    munmap(map, SHM_SIZE);
    return 0;
}
/*
 * ipc_reader.c: Read message via shared file mapping
 * Compile: gcc -o reader ipc_reader.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

#define SHM_FILE "/tmp/ipc_shared.dat"
#define SHM_SIZE 256

int main(void)
{
    int fd;
    char *map;

    fd = open(SHM_FILE, O_RDONLY);
    if (fd == -1) { perror("open"); exit(1); }

    map = mmap(NULL, SHM_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); exit(1); }
    close(fd);

    printf("Reader: received '%s'\n", map);

    munmap(map, SHM_SIZE);
    return 0;
}
Note: This example has no synchronization. In production code, use a semaphore or mutex placed inside the shared region to coordinate access.
If the mapping extends past the end of the file (e.g., you mapped more than the file size, or truncated the file after mapping it), accessing whole pages that lie beyond the end of the file causes SIGBUS, not SIGSEGV. The remainder of the page that contains the end of the file is still accessible and reads as zeros. This is different from accessing outside the mapped range (which gives SIGSEGV).
SIGSEGV: Access to an address completely outside any mapped region. E.g., accessing map+length+100 (assuming nothing else is mapped there).
SIGBUS: Access to a mapped address that has no corresponding backing store. E.g., touching a page of the mapping that lies wholly beyond the end of the file after the file was shortened.
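/* Example: SIGBUS scenario (assuming 4096-byte pages)
 *
 * File is 4000 bytes. We map 8192 bytes (two pages).
 * Bytes 0..3999    -> backed by file -> OK
 * Bytes 4000..4095 -> rest of the last file page: accessible, zero-filled,
 *                     writes here are not carried to the file
 * Bytes 4096..8191 -> in mapped range but wholly past file end
 *                     -> SIGBUS on access
 */
#include <signal.h>
#include <unistd.h>

void sigbus_handler(int sig)
{
    /* write() is async-signal-safe; printf() is not */
    static const char msg[] = "Caught SIGBUS: access beyond file end!\n";
    (void)sig;
    write(STDERR_FILENO, msg, sizeof msg - 1);
    /* In real code: use siglongjmp to recover, not return */
    _exit(1);
}

/* Install (inside main()) before mmap if you want graceful handling */
signal(SIGBUS, sigbus_handler);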
The kernel creates a private, read-only, executable mapping (MAP_PRIVATE with PROT_READ | PROT_EXEC) of the text segment from the ELF binary file. Since it is private, writes by one process do not affect others running the same binary. All processes running the same binary initially share the same physical code pages from the page cache; only when self-modifying code or debugger patches cause a write does COW produce per-process copies.

Traditional I/O requires a system call for each read()/write(), plus a copy between kernel buffer and user buffer. Memory-mapped I/O maps the kernel’s page cache pages directly into user space, so accesses are plain CPU load/store instructions with no system call overhead and no extra copy. Memory-mapped I/O is especially efficient for random access patterns; sequential access with large reads via read() can be equally fast due to kernel read-ahead.

msync() is needed when you require a guarantee that modified pages have been written to disk (e.g., for crash safety/durability). If you don’t call it, the kernel will eventually flush dirty pages at its own pace (e.g., after about 30 seconds by default on Linux, or when the page cache is under pressure). munmap() does not force a flush either. For durability-sensitive code (like a database), always call msync(MS_SYNC) before considering data committed.

SIGBUS is delivered when a process accesses a memory address that is within a valid mapped region but has no backing store. The typical case is a file that was truncated after the mapping was created, so that whole pages beyond the (now smaller) file end exist in the mapping but cannot be loaded from the file. This is distinct from SIGSEGV, which occurs when access is made to an address completely outside any mapped region.

Two unrelated processes each open the same file and create a MAP_SHARED mapping of the same file region. Because the kernel maps the same physical page cache pages into both virtual address spaces, a write by one process is immediately visible to the other (no explicit sync needed for in-memory visibility). The file acts as a rendezvous point, and both processes simply use pointer-based memory operations to communicate. Synchronization (mutual exclusion) must be provided separately, e.g., using a POSIX semaphore or a mutex with the PTHREAD_PROCESS_SHARED attribute placed inside the shared region.

munmap() removes the mapping from the process’s virtual address space but does not guarantee that dirty (modified) pages are flushed to physical disk. The kernel may defer the actual disk write. To guarantee disk persistence, call msync(addr, length, MS_SYNC) before munmap(). This is analogous to how close(fd) does not guarantee disk sync; you need an explicit fsync(fd).

A shared library’s code is mapped with MAP_PRIVATE and PROT_READ | PROT_EXEC. All processes mapping the same .so file initially point to the same physical pages in the page cache. Since the code is read-only and copy-on-write is triggered only on writes, the code pages are never copied, so all processes literally run from the same physical RAM pages. Only if the code section is patched (e.g., by a debugger or self-modifying code) would COW allocate a private copy.

Part 3 of 5: File Mappings
