mincore() — Memory Residency
Chapter 50 · File 3 of 3 — Find Out Which Pages Are Currently in RAM

What is mincore()?

We have seen how to lock pages into RAM with mlock() and mlockall(). But what if you just want to inspect which pages are currently in RAM without locking them?

mincore() (think: memory in-core) does exactly this. Given a virtual address range, it returns a byte array telling you, for each page in the range, whether it is currently resident in physical RAM or not.

This complements the memory-locking calls: instead of controlling residency, it reports it. Typical uses:

  • Check if a buffer is in RAM before a time-critical section
  • Diagnose excessive page faults in your application
  • Understand how the kernel’s page cache is loaded for a file mapping
  • Implement prefetch logic — load pages before they are needed

The mincore() API

#define _DEFAULT_SOURCE   /* glibc 2.19 and earlier: _BSD_SOURCE or _SVID_SOURCE */
#include <sys/mman.h>

int mincore(void *addr, size_t length, unsigned char *vec);

/* Returns 0 on success, or -1 on error */

addr — Start of the virtual address range to inspect. Must be page-aligned (unlike mlock(), which accepts any address).

length — Number of bytes to inspect. Internally rounded up to the next page boundary.

vec — Pointer to an output array of unsigned char. Must have at least (length + PAGE_SIZE - 1) / PAGE_SIZE bytes. Each byte in vec corresponds to one page. The least significant bit (bit 0) of each byte is 1 if that page is currently in RAM, 0 if it is swapped out or not yet loaded.

How the vec[] Array Works

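| vec element | Page | Value | Status |
| --- | --- | --- | --- |
| vec[0] | Page 0 | 0x01 | ✓ In RAM |
| vec[1] | Page 1 | 0x00 | ✗ Swapped |
| vec[2] | Page 2 | 0x01 | ✓ In RAM |
| vec[3] | Page 3 | 0x01 | ✓ In RAM |
| vec[4] | Page 4 | 0x00 | ✗ Swapped |
| vec[5] | Page 5 | 0x01 | ✓ In RAM |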

Only bit 0 (the least significant bit) is defined. Always test with vec[i] & 1. The upper 7 bits may have undefined values on some UNIX systems — never rely on them for portability.
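The bit-0 rule can be sketched as a small counting helper. The name count_resident() is hypothetical (it is not part of any library); the point is only that each byte is masked with 1 before it is used, so whatever the upper seven bits contain cannot affect the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the pages that a mincore() result vector
 * marks as resident. Only bit 0 of each byte is examined. */
size_t count_resident(const unsigned char *vec, size_t num_pages)
{
    assert(vec != NULL || num_pages == 0);

    size_t resident = 0;
    for (size_t i = 0; i < num_pages; i++)
        resident += vec[i] & 1;     /* adds 1 if resident, 0 otherwise */
    return resident;
}
```

For example, a vector holding 0x01, 0x00, 0xFF, 0xFE, 0x81 counts as three resident pages, because only the first, third, and fifth bytes have bit 0 set.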

Calculating the vec[] Array Size

#include <unistd.h>
#include <stdio.h>

int main(void)
{
    long page_size = sysconf(_SC_PAGESIZE);

    /* For a region of 'length' bytes, vec must have at least: */
    size_t length    = 100 * 1024;   /* 100 KB */
    size_t num_pages = (length + page_size - 1) / page_size;

    printf("Page size      : %ld bytes\n", page_size);
    printf("Region length  : %zu bytes\n", length);
    printf("vec[] size needed: %zu bytes (one per page)\n", num_pages);

    /* Example with 4096-byte pages:
     * 100 * 1024 = 102400 bytes
     * (102400 + 4095) / 4096 = 25 pages
     * vec needs 25 bytes
     */
    return 0;
}

Example 1: Basic mincore() Usage

#define _DEFAULT_SOURCE   /* older glibc (2.19 and earlier): _BSD_SOURCE */
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    long page_size  = sysconf(_SC_PAGESIZE);
    size_t num_pages = 8;
    size_t length    = num_pages * page_size;
    unsigned char *vec;
    char *region;

    /* Allocate page-aligned region (mincore requires page-aligned addr) */
    region = mmap(NULL, length, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* vec: one byte per page */
    vec = malloc(num_pages);
    if (!vec) { perror("malloc"); munmap(region, length); return 1; }

    /* Check residency BEFORE touching pages */
    if (mincore(region, length, vec) == -1) {
        perror("mincore");
        free(vec); munmap(region, length); return 1;
    }

    printf("Residency BEFORE touching pages:\n");
    for (size_t i = 0; i < num_pages; i++)
        printf("  Page %zu: %s\n", i, (vec[i] & 1) ? "IN RAM" : "not resident");

    /* Touch alternate pages — bring them into RAM */
    for (size_t i = 0; i < num_pages; i += 2)
        region[i * page_size] = 'X';

    /* Check residency AFTER touching alternate pages */
    if (mincore(region, length, vec) == -1) {
        perror("mincore");
        free(vec); munmap(region, length); return 1;
    }
    printf("\nResidency AFTER touching pages 0,2,4,6:\n");
    for (size_t i = 0; i < num_pages; i++)
        printf("  Page %zu: %s\n", i, (vec[i] & 1) ? "IN RAM" : "not resident");

    free(vec);
    munmap(region, length);
    return 0;
}

/* Sample output (approximate — kernel may prefetch):
 * Residency BEFORE touching pages:
 *   Page 0: not resident
 *   Page 1: not resident
 *   ...
 * Residency AFTER touching pages 0,2,4,6:
 *   Page 0: IN RAM
 *   Page 1: not resident
 *   Page 2: IN RAM
 *   Page 3: not resident
 *   ...
 */

⚠️ addr must be page-aligned for mincore()

Unlike mlock() which accepts any addr and rounds it down, mincore() requires addr to be page-aligned. If you pass an unaligned address, mincore() returns -1 with errno = EINVAL. Always align your address using the page size before calling mincore().
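One way to handle the alignment requirement is a small wrapper that rounds the address down and widens the length to match. This is a minimal sketch, not a library function: the name mincore_any() and its return convention are assumptions made for illustration. It assumes the page size is a power of two, which lets it align with a bit mask.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical wrapper: query residency for an arbitrary, possibly
 * unaligned, byte range. On success returns a malloc()ed vector with
 * one byte per page and stores the page count in *num_pages.
 * Returns NULL on error. The caller frees the vector. */
unsigned char *mincore_any(const void *addr, size_t length, size_t *num_pages)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0 && length > 0 && num_pages != NULL);
    uintptr_t page_size = (uintptr_t)ps;

    /* Round the start down to a page boundary ... */
    uintptr_t start = (uintptr_t)addr & ~(page_size - 1);
    /* ... and grow the length by the bytes we stepped back over */
    size_t span = length + ((uintptr_t)addr - start);
    /* Page count is computed from the widened span, not from length */
    size_t n = (span + page_size - 1) / page_size;

    unsigned char *vec = malloc(n);
    if (vec == NULL)
        return NULL;
    if (mincore((void *)start, span, vec) == -1) {
        free(vec);
        return NULL;
    }
    *num_pages = n;
    return vec;
}
```

Note that a range of one page in length that starts 100 bytes into a page touches two pages, so the vector must hold two bytes. Sizing the vector from the original length instead of the widened span is an easy way to overflow it.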

Example 2: Graphical Page Residency Display (TLPI Listing 50-2 Style)

This example is modeled on the displayMincore() function from the TLPI book (Listing 50-2). It prints a visual map of which pages are in RAM, which is very useful for debugging:

#define _DEFAULT_SOURCE   /* older glibc (2.19 and earlier): _BSD_SOURCE */
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print a graphical map: 'R' = in RAM, '.' = not resident */
static void display_mincore(char *addr, size_t length)
{
    long page_size = sysconf(_SC_PAGESIZE);
    long num_pages = (length + page_size - 1) / page_size;

    unsigned char *vec = malloc(num_pages);
    if (!vec) { perror("malloc"); return; }

    if (mincore(addr, length, vec) == -1) {
        perror("mincore");
        free(vec);
        return;
    }

    printf("Page residency map (%ld pages, page_size=%ld):\n[", num_pages, page_size);
    for (long j = 0; j < num_pages; j++) {
        printf("%c", (vec[j] & 1) ? 'R' : '.');
        if ((j + 1) % 64 == 0 && j + 1 < num_pages)
            printf("]\n[");
    }
    printf("]\n");
    printf("  'R' = In RAM (resident)   '.' = Not in RAM\n");

    free(vec);
}

int main(void)
{
    long page_size  = sysconf(_SC_PAGESIZE);
    int  num_pages  = 32;
    size_t length   = num_pages * page_size;

    char *region = mmap(NULL, length, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) { perror("mmap"); return 1; }

    printf("--- BEFORE mlock ---\n");
    display_mincore(region, length);

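    /* Lock every other pair of pages (pages 0-1, 4-5, 8-9, ...) */
    for (int i = 0; i < num_pages; i += 4) {
        if (mlock(region + i * page_size, 2 * page_size) == -1) {
            perror("mlock");   /* e.g. RLIMIT_MEMLOCK exceeded */
            break;
        }
    }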

    printf("\n--- AFTER mlock (every alternate 2-page group) ---\n");
    display_mincore(region, length);

    munmap(region, length);
    return 0;
}

/* Sample output:
 * --- BEFORE mlock ---
 * Page residency map (32 pages, page_size=4096):
 * [................................]
 *
 * --- AFTER mlock (every alternate 2-page group) ---
 * [RR..RR..RR..RR..RR..RR..RR..RR..]
 */

Example 3: Check Which Pages of a File Are in Page Cache

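When you mmap() a file, the pages are not immediately loaded. mincore() lets you see which parts of the file are currently in the kernel’s page cache. One caveat: on recent Linux kernels (a side-channel hardening change made in 2019), mincore() reveals page-cache state only for files the caller owns or could open for writing; for any other file it reports every page as resident. Run the example on a file you own to get meaningful results: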

#define _DEFAULT_SOURCE   /* older glibc (2.19 and earlier): _BSD_SOURCE */
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <file>\n", argv[0]);
        return 1;
    }

    long page_size = sysconf(_SC_PAGESIZE);

    /* Open and mmap the file */
    int fd = open(argv[1], O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }

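    struct stat st;
    if (fstat(fd, &st) == -1) { perror("fstat"); close(fd); return 1; }
    if (st.st_size == 0) {   /* mmap() rejects a zero-length mapping */
        fprintf(stderr, "%s is empty\n", argv[1]);
        close(fd);
        return 1;
    }
    size_t file_size = st.st_size;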

    char *mapped = mmap(NULL, file_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (mapped == MAP_FAILED) { perror("mmap"); return 1; }

    /* Calculate pages */
    size_t num_pages = (file_size + page_size - 1) / page_size;
    unsigned char *vec = calloc(num_pages, 1);
    if (!vec) { perror("calloc"); munmap(mapped, file_size); return 1; }

    /* Query page cache residency */
    if (mincore(mapped, file_size, vec) == -1) {
        perror("mincore");
        free(vec); munmap(mapped, file_size); return 1;
    }

    /* Count resident pages */
    size_t resident = 0;
    for (size_t i = 0; i < num_pages; i++)
        if (vec[i] & 1) resident++;

    printf("File: %s\n", argv[1]);
    printf("Size: %zu bytes (%zu pages)\n", file_size, num_pages);
    printf("Pages in page cache: %zu / %zu (%.1f%%)\n",
           resident, num_pages, 100.0 * resident / num_pages);

    /* Print map */
    printf("Cache map: [");
    for (size_t i = 0; i < num_pages && i < 64; i++)
        printf("%c", (vec[i] & 1) ? 'C' : '.');
    if (num_pages > 64) printf("...");
    printf("]\n  'C' = cached   '.' = not cached\n");

    free(vec);
    munmap(mapped, file_size);
    return 0;
}

/* Usage:
 *   ./check_cache /etc/passwd
 *   ./check_cache /usr/bin/ls
 *
 * After reading a file: most pages will be 'C' (cached)
 * After a cold boot:    most pages will be '.' (not cached)
 */

Example 4: Use mincore() to Decide When to Prefetch

You can use mincore() to check if a region is in RAM before a critical section, and use madvise(MADV_WILLNEED) to prefetch it if not:

#define _DEFAULT_SOURCE   /* older glibc (2.19 and earlier): _BSD_SOURCE */
#include <sys/mman.h>
#include <unistd.h>
#include <stdint.h>   /* uintptr_t */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if ALL pages in range are resident, 0 if any page is not,
 * or -1 on error */
int all_pages_resident(void *addr, size_t length)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);

    /* Align addr down to a page boundary and extend the length to match */
    uintptr_t aligned_addr = (uintptr_t)addr & ~(page_size - 1);
    size_t aligned_len = length + ((uintptr_t)addr - aligned_addr);

    /* Count pages from the aligned start, so that an unaligned range
     * straddling one extra page still gets a large enough vec */
    size_t num_pages = (aligned_len + page_size - 1) / page_size;

    unsigned char *vec = malloc(num_pages);
    if (!vec) return -1;

    int result = 1;
    if (mincore((void *)aligned_addr, aligned_len, vec) == 0) {
        for (size_t i = 0; i < num_pages; i++) {
            if (!(vec[i] & 1)) { result = 0; break; }
        }
    } else {
        result = -1;
    }

    free(vec);
    return result;
}

/* Prefetch pages if not already resident */
void ensure_resident(void *addr, size_t length)
{
    int status = all_pages_resident(addr, length);
    if (status == 0) {
        printf("Pages not resident — prefetching with MADV_WILLNEED\n");
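        /* madvise() requires a page-aligned addr; the callers here pass
         * the start of an mmap() region. MADV_WILLNEED is only a hint:
         * the kernel starts loading asynchronously and may not be done
         * by the time madvise() returns */
        if (madvise(addr, length, MADV_WILLNEED) == -1)
            perror("madvise");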
    } else if (status == 1) {
        printf("All pages already resident — no prefetch needed\n");
    }
}

int main(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    size_t size = 16 * page_size;

    char *buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    ensure_resident(buf, size);   /* Not resident yet */

    /* Touch first half */
    memset(buf, 0, size / 2);
    ensure_resident(buf, size);   /* First half resident, second half not */

    munmap(buf, size);
    return 0;
}

Example 5: Full Demo — mlock() + mincore() Together

This demonstration is adapted from the TLPI program in Listing 50-2. It maps a region and uses mincore() to show residency before and after mlock():

#define _DEFAULT_SOURCE   /* older glibc (2.19 and earlier): _BSD_SOURCE */
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void display_residency(char *addr, long num_pages, long page_size,
                               const char *label)
{
    unsigned char *vec = malloc(num_pages);
    size_t length = num_pages * page_size;

    if (!vec || mincore(addr, length, vec) == -1) {
        perror("mincore/malloc"); free(vec); return;
    }

    printf("\n%s:\n[", label);
    for (long j = 0; j < num_pages; j++)
        printf("%c", (vec[j] & 1) ? 'R' : '.');
    printf("]\n");
    free(vec);
}

int main(int argc, char *argv[])
{
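    /* Defaults: 32 total pages, lock every other page */
    int total_pages = (argc > 1) ? atoi(argv[1]) : 32;
    int lock_step   = (argc > 2) ? atoi(argv[2]) : 2;   /* lock 1, skip 1 */
    if (total_pages <= 0 || lock_step <= 0) {   /* a step of 0 would loop forever */
        fprintf(stderr, "Usage: %s [num-pages [lock-step]]\n", argv[0]);
        return 1;
    }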

    long page_size = sysconf(_SC_PAGESIZE);
    size_t total_len = total_pages * page_size;

    /* Map a large anonymous region */
    char *region = mmap(NULL, total_len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) { perror("mmap"); return 1; }

    printf("Mapped %d pages (%zu bytes) at %p\n",
           total_pages, total_len, (void *)region);
    printf("Page size: %ld bytes\n", page_size);

    /* Show residency before any locking */
    display_residency(region, total_pages, page_size,
                      "BEFORE mlock (no pages touched or locked)");

    /* Lock pages at regular intervals */
    int locked_count = 0;
    for (int i = 0; i < total_pages; i += lock_step) {
        if (mlock(region + i * page_size, page_size) == -1) {
            perror("mlock");
            break;
        }
        locked_count++;
    }
    printf("\nLocked %d pages (every %d pages)\n", locked_count, lock_step);

    /* Show residency after locking */
    display_residency(region, total_pages, page_size,
                      "AFTER mlock (locked pages should be R)");

    /* Unlock all and show final state */
    for (int i = 0; i < total_pages; i += lock_step)
        munlock(region + i * page_size, page_size);

    display_residency(region, total_pages, page_size,
                      "AFTER munlock (still in RAM but no longer locked)");

    munmap(region, total_len);
    return 0;
}

/* Compile and run:
 *   gcc -o memlock_demo memlock_demo.c
 *   ./memlock_demo 32 2      # 32 pages, lock every 2nd
 *
 * Expected output pattern:
 *   BEFORE: [................................]
 *   AFTER mlock: [R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.]
 *   AFTER munlock: [R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.R.]
 *   (pages remain in RAM after unlock until kernel needs the memory)
 */

Portability and Linux-Specific Notes

| Note | Detail |
| --- | --- |
| Not in POSIX/SUSv3 | mincore() is not standardized by SUSv3 but is available on Linux, FreeBSD, macOS, and several other Unix systems. It is not available on every platform. |
| Available since | Linux kernel 2.4 |
| Feature macro | Define _DEFAULT_SOURCE (glibc 2.19 and earlier: _BSD_SOURCE or _SVID_SOURCE) before including <sys/mman.h> |
| vec type | On Linux: unsigned char *. On some other UNIX systems: char *. Use unsigned char for Linux. |
| Only bit 0 is portable | Always test with vec[i] & 1. Other bits are undefined on some implementations. |
| Bug before 2.6.21 | Prior to Linux 2.6.21, mincore() gave incorrect results for MAP_PRIVATE mappings and nonlinear mappings created with remap_file_pages(). |
| Result is a snapshot | The residency status can change at any moment after mincore() returns. The kernel can evict pages at any time. Only locked pages (mlock/mlockall) are guaranteed to remain resident. |

⚠️ Important: mincore() gives a snapshot, not a guarantee

The information returned by mincore() is only valid at the instant of the call. By the time your code reads vec[i], the kernel may have already evicted that page. The ONLY pages guaranteed to stay resident are those locked with mlock() or mlockall(). Use mincore() for diagnostics and monitoring, not as a guarantee of future access latency.

When to Use mlock() + mincore() Together

| Goal | Use | How |
| --- | --- | --- |
| Keep pages in RAM always | mlock() | Call mlock() on the region; locked pages are guaranteed resident |
| Verify lock worked | mincore() after mlock() | Check that bit 0 of every vec byte is set after mlock() |
| Debug page fault causes | mincore() before access | Log which pages are not resident before a slow operation |
| Warm up page cache for file | mincore() + madvise() | Find uncached pages, call MADV_WILLNEED to prefetch |
| Monitor memory pressure | Periodic mincore() | Poll mincore() on critical buffers and alert if pages are evicted |
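The "verify lock worked" row can be sketched as follows. The helper name lock_and_verify() and its return convention are assumptions made for illustration, and the sketch expects addr to be page-aligned (for instance, the start of an mmap() region) because mincore() requires it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: lock a page-aligned region, then confirm with
 * mincore() that bit 0 is set for every page.
 * Returns 1 if the region is locked and fully resident, 0 if some page
 * was reported non-resident, -1 on error (errno from the failing call).
 * On 0 or -1 the region is left unlocked. */
int lock_and_verify(void *addr, size_t length)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0 && length > 0);
    size_t page_size = (size_t)ps;
    size_t num_pages = (length + page_size - 1) / page_size;

    unsigned char *vec = malloc(num_pages);
    if (vec == NULL)
        return -1;

    int result = 1;
    int locked = 0;
    if (mlock(addr, length) == -1) {
        result = -1;                     /* e.g. RLIMIT_MEMLOCK exceeded */
    } else {
        locked = 1;
        if (mincore(addr, length, vec) == -1) {
            result = -1;
        } else {
            for (size_t i = 0; i < num_pages; i++)
                if (!(vec[i] & 1)) { result = 0; break; }
        }
    }

    int saved = errno;                   /* keep errno across cleanup */
    if (locked && result != 1)
        munlock(addr, length);
    free(vec);
    errno = saved;
    return result;
}
```

A caller would typically treat -1 with errno set to ENOMEM or EPERM as "the lock limit is too low" and either raise RLIMIT_MEMLOCK or fall back to running unlocked.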

Interview Questions and Answers

Q1: What does mincore() do and what is its primary use?

mincore() reports the memory residency of pages in a virtual address range. For each page, it tells you whether it is currently in physical RAM or not. It is used for diagnostics, performance tuning (identify pages likely to cause page faults), monitoring page cache status for file mappings, and deciding when to prefetch data before a time-critical section.

Q2: How is the vec[] array in mincore() structured?

vec[] has one byte per page in the queried range. The least significant bit (bit 0) of each byte is 1 if that page is resident in RAM, 0 if it is not. The other 7 bits are implementation-defined — portable code must only test bit 0 using vec[i] & 1. The size of vec must be at least (length + PAGE_SIZE - 1) / PAGE_SIZE bytes.

Q3: What is the key difference between addr requirements for mlock() vs mincore()?

mlock() accepts any addr — the kernel rounds it down to the page boundary automatically. mincore() REQUIRES that addr be page-aligned. Passing an unaligned addr to mincore() returns -1 with errno EINVAL. This is one of the most common bugs when using mincore().

Q4: Can you rely on mincore() output to guarantee no page fault will occur?

No. mincore() returns a snapshot — the information was accurate at the moment of the call, but the kernel can evict any unlocked page at any time after that. By the time you read the vec[] array and access the memory, a page may already have been evicted. The ONLY guarantee against future page faults is using mlock() or mlockall() to lock the pages, not mincore() which is read-only observation.

Q5: Is mincore() standardized? Which kernels and systems support it?

mincore() is NOT specified by POSIX or SUSv3. It is a BSD-originated syscall available on Linux (since kernel 2.4), FreeBSD, macOS, and several other Unix systems. On Linux with current glibc, define _DEFAULT_SOURCE before including <sys/mman.h> to get the prototype; glibc 2.19 and earlier use _BSD_SOURCE or _SVID_SOURCE instead. It is not available on all Unix platforms, so portable code should guard its use with #ifdef checks.
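A guard of that kind might look like the sketch below. The macro HAVE_MINCORE, the wrapper name query_residency(), and the platform list are all assumptions for illustration; there is no standard feature macro that announces mincore(), so real projects usually detect it at configure time instead.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: an illustrative, non-exhaustive list of platforms that
 * are known to provide mincore(). */
#if defined(__linux__) || defined(__FreeBSD__) || defined(__APPLE__)
#define HAVE_MINCORE 1
#endif

/* Hypothetical wrapper: behaves like mincore() where it exists and
 * fails with ENOSYS elsewhere. */
int query_residency(void *addr, size_t length, unsigned char *vec)
{
    assert(vec != NULL);
#ifdef HAVE_MINCORE
    /* Pass vec through void * because some systems declare the
     * parameter as char * rather than unsigned char * */
    return mincore(addr, length, (void *)vec);
#else
    (void)addr;
    (void)length;
    errno = ENOSYS;
    return -1;
#endif
}
```

Callers can then treat ENOSYS as "residency unknown" and skip the diagnostic rather than failing to build on a platform without the call.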

Q6: What was the Linux bug with mincore() before kernel 2.6.21?

Prior to Linux 2.6.21, mincore() did not correctly report residency information for MAP_PRIVATE mappings or for nonlinear mappings created with remap_file_pages(). The reported bits could be incorrect in these cases. This was fixed in 2.6.21. Code relying on mincore() for MAP_PRIVATE regions should ideally target kernels 2.6.21 and later.

Q7: How would you use mincore() to detect if a real-time buffer might cause page faults?

Before entering a time-critical section, call mincore() on the buffer. Scan the vec[] array — if any bit 0 is 0, that page is not resident and will cause a page fault when accessed. You can then either prefetch the page using madvise(MADV_WILLNEED) to asynchronously bring it in, or lock it with mlock() to guarantee it stays resident for the duration of the critical section.

Q8: What is the relationship between mincore() and the page cache for file-backed mappings?

When you mmap() a file with MAP_SHARED, the pages correspond to the kernel’s page cache for that file. mincore() lets you check which parts of the file are currently cached. If a page is not resident (bit 0 = 0), accessing it will trigger a page fault and a disk read. You can use mincore() + madvise(MADV_WILLNEED) to warm up the page cache ahead of time, reducing latency when the data is actually needed.
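The warm-up idea can be sketched as a loop that finds runs of non-resident pages and issues one madvise() call per run. The function name prefetch_missing() is hypothetical, addr is assumed to be page-aligned, and MADV_WILLNEED remains only a hint, so the return value counts pages the kernel was asked to load, not pages that are now resident.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: advise the kernel to load every page in the
 * page-aligned range that mincore() reports as non-resident.
 * Returns the number of pages advised, or -1 on error. */
long prefetch_missing(void *addr, size_t length)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0 && length > 0);
    size_t page_size = (size_t)ps;
    size_t num_pages = (length + page_size - 1) / page_size;

    unsigned char *vec = malloc(num_pages);
    if (vec == NULL)
        return -1;
    if (mincore(addr, length, vec) == -1) {
        free(vec);
        return -1;
    }

    long advised = 0;
    size_t i = 0;
    while (i < num_pages) {
        if (vec[i] & 1) {                /* resident: nothing to do */
            i++;
            continue;
        }
        size_t run_start = i;            /* first page of a missing run */
        while (i < num_pages && !(vec[i] & 1))
            i++;
        size_t run_pages = i - run_start;
        if (madvise((char *)addr + run_start * page_size,
                    run_pages * page_size, MADV_WILLNEED) == 0)
            advised += (long)run_pages;
    }

    free(vec);
    return advised;
}
```

Batching by run keeps the number of system calls proportional to the number of gaps rather than the number of pages, which matters for large, mostly cached files.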

Chapter 50 Summary — All System Calls at a Glance

| System Call | Purpose | Scope | addr alignment? | Privilege? |
| --- | --- | --- | --- | --- |
| mlock() | Lock a specific region | addr..addr+length | Not required (recommended) | RLIMIT_MEMLOCK applies |
| munlock() | Unlock a specific region | addr..addr+length | Not required | No privilege needed |
| mlockall() | Lock all process memory | Entire address space | N/A (no addr param) | RLIMIT_MEMLOCK applies |
| munlockall() | Unlock all process memory + cancel MCL_FUTURE | Entire address space | N/A | No privilege since 2.6.9 |
| mincore() | Query page residency (read-only) | addr..addr+length | Required (EINVAL if not) | No privilege needed |
