How malloc() and free() Really Work in Linux

 

 

Go beyond the surface — understand the free-list, block metadata, memory recycling, debugging tools, and the bugs that bite every C programmer

⏱️ 15 min read
🎯 Intermediate
💻 Linux C / Systems Programming

The malloc() Myth

Most programmers think of malloc() as “give me N bytes” and free() as “take it back.” That mental model is correct for basic usage but dangerously incomplete when you need to debug memory corruption, leaks, or performance bottlenecks in a long-running system.

The truth: malloc() manages a private free list inside your process. It calls sbrk() (or mmap() for large blocks) only when necessary, and reuses previously freed blocks to minimize expensive system calls. free() almost never shrinks the heap — it just returns a block to the internal list. This article explains how, with examples that show what can go wrong and how to diagnose it.

🔑 Topics Covered

malloc() internals · free() free-list · Block metadata · Memory leaks · Double free bug · Use-after-free · Valgrind debugging · MALLOC_CHECK_ · mtrace() · mcheck()

1. What malloc() Actually Does

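When you call malloc(n), a classic free-list allocator does roughly the following (glibc follows the same idea, with size-segregated bins and per-thread caches layered on top):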

  1. It scans the internal free list for a previously freed block that is at least as large as n bytes.
  2. If a suitable block exists, it is removed from the free list and returned (possibly split if it is much larger than needed).
  3. If no suitable block exists, it calls sbrk() to grow the heap — but it requests a larger chunk than you asked for, placing the surplus on the free list.
  4. Before returning the pointer, malloc() writes hidden metadata (the block’s size) just before the returned address. This is how free() later knows the size of the block.
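The effect of step 4 can be observed without touching the header directly. The sketch below assumes glibc (or another libc that provides the non-standard malloc_usable_size() in <malloc.h>); the function name usable_bytes_for() is made up for this illustration.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size(): libc extension, not ISO C */
#include <stdlib.h>

/*
 * Allocate `request` bytes, ask the allocator how many bytes the block
 * can really hold, then release it. Returns 0 if the allocation failed.
 * usable_bytes_for() is a hypothetical helper for this sketch.
 */
size_t usable_bytes_for(size_t request)
{
    void *p = malloc(request);
    if (p == NULL)
        return 0;

    /* The allocator derives this value from its own bookkeeping for p. */
    size_t usable = malloc_usable_size(p);
    assert(usable >= request);   /* a block is never smaller than requested */

    free(p);
    return usable;
}
```

Calling usable_bytes_for(10) usually returns more than 10, because the allocator rounds requests up to its own block granularity. The exact figure depends on the libc version and architecture, so treat it as something to observe rather than rely on.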

Block Memory Layout (Conceptual)

[ Block length L (hidden metadata) ] [ Memory available to caller: L bytes ]
                                     ↑ malloc() returns a pointer to here

Free Block Layout (on the free list)

[ Length L ] [ Ptr → Prev Free ] [ Ptr → Next Free ] [ Remaining bytes ]

When a block is on the free list, the first bytes of its usable area are repurposed to hold the previous and next free-block pointers (doubly linked list). This is clever: no extra memory is needed for list management — the free space itself stores the pointers.
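A minimal sketch of that doubly linked arrangement follows. The FreeBlock layout and the freelist_push()/freelist_remove() names are assumptions made for illustration; glibc's real chunk header is laid out differently and carries extra flag bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout for this sketch, mirroring the diagram above. */
typedef struct FreeBlock {
    size_t            length;  /* size of the block's usable area        */
    struct FreeBlock *prev;    /* these two pointers occupy bytes that   */
    struct FreeBlock *next;    /* the caller owned while it was in use   */
} FreeBlock;

/* Push a block onto the front of the list and return the new head. */
FreeBlock *freelist_push(FreeBlock *head, FreeBlock *blk)
{
    assert(blk != NULL);
    blk->prev = NULL;
    blk->next = head;
    if (head != NULL)
        head->prev = blk;
    return blk;
}

/* Unlink a block from the list and return the (possibly new) head. */
FreeBlock *freelist_remove(FreeBlock *head, FreeBlock *blk)
{
    assert(blk != NULL);
    if (blk->prev != NULL)
        blk->prev->next = blk->next;
    else
        head = blk->next;          /* blk was the head */
    if (blk->next != NULL)
        blk->next->prev = blk->prev;
    blk->prev = blk->next = NULL;
    return head;
}
```

One consequence of this design is a minimum block size: every block must be large enough to hold the two pointers once it is freed, which is one reason very small requests are rounded up.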

Heap Snapshot: Allocated Blocks Mixed With Free List

[ ALLOCATED (64B)  ptr_A ]
[ FREE (128B)      ← free list node ]
[ ALLOCATED (32B)  ptr_B ]
[ FREE (256B)      ← free list node ]
[ ALLOCATED (96B)  ptr_C ]
↑ Program Break

This interleaved state is normal after several malloc()/free() cycles. The blocks at the very top of the heap must all be free before free() will even consider calling sbrk() to shrink the heap.
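One way to watch this happen is to sample the program break with sbrk(0) around a burst of allocations. The sketch below is only an observation aid: BreakSample and sample_break() are hypothetical names, and the numbers you see depend on the libc, its tuning, and whatever the heap already contained.

```c
#define _DEFAULT_SOURCE   /* expose sbrk() under strict -std= modes */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>       /* sbrk() */

/* Hypothetical record for this sketch: the break at three moments. */
typedef struct BreakSample {
    uintptr_t start;
    uintptr_t after_alloc;
    uintptr_t after_free;
} BreakSample;

/*
 * Allocate `count` blocks of `size` bytes, then free all but the most
 * recent one, which is likely to sit near the top of the heap.
 * Returns 0 if every allocation succeeded, -1 otherwise.
 */
int sample_break(size_t count, size_t size, BreakSample *out)
{
    assert(count > 0 && out != NULL);

    void **blocks = malloc(count * sizeof *blocks);
    if (blocks == NULL)
        return -1;

    out->start = (uintptr_t)sbrk(0);

    size_t made = 0;
    while (made < count && (blocks[made] = malloc(size)) != NULL)
        made++;
    out->after_alloc = (uintptr_t)sbrk(0);

    /* Release everything except the last block allocated. */
    for (size_t i = 0; i + 1 < made; i++)
        free(blocks[i]);
    out->after_free = (uintptr_t)sbrk(0);

    if (made > 0)
        free(blocks[made - 1]);
    free(blocks);
    return made == count ? 0 : -1;
}
```

With many small blocks, for example sample_break(1000, 100, &s), you would typically expect after_free to equal after_alloc: the freed blocks went back to the allocator's internal lists, not to the kernel.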

2. What free() Actually Does (It Probably Doesn’t Shrink the Heap)

When you call free(ptr), the allocator:

  1. Reads the block length from the hidden metadata just before ptr.
  2. Coalesces the block with any adjacent free blocks to prevent fragmentation.
  3. Inserts the resulting free block back onto the free list.
  4. Only calls sbrk() to shrink the heap if the free block at the very top of the heap is “sufficiently” large (glibc default: ≥128 KB).

Coalescing is critical. If block A (128 bytes) and block B (64 bytes) are both free and adjacent, the allocator merges them into a single 192-byte free block. Without this, the free list would fill up with tiny unusable fragments.
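The merge can be sketched on a simplified model in which the heap is just an array of block descriptors in address order. The Block type and coalesce() function are assumptions for illustration, and the model ignores the header bytes a real allocator would also reclaim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor: entries are listed in address order. */
typedef struct Block {
    size_t size;
    bool   is_free;
} Block;

/*
 * Merge every run of adjacent free blocks into a single block, in place.
 * Returns the number of blocks remaining after the merge.
 */
size_t coalesce(Block *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);
    if (n == 0)
        return 0;

    size_t last = 0;                       /* index of the last block kept */
    for (size_t i = 1; i < n; i++) {
        if (blocks[last].is_free && blocks[i].is_free)
            blocks[last].size += blocks[i].size;   /* absorb the neighbour */
        else
            blocks[++last] = blocks[i];            /* keep as its own block */
    }
    return last + 1;
}
```

Given the sequence allocated 64, free 128, free 64, allocated 32, the function leaves three blocks: allocated 64, free 192, allocated 32. That is the 128 + 64 = 192 case described above.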

💡 Example 1: Dynamic Linked List for Network Packet Buffering

Consider a network packet processing application. Packets arrive at variable rates and sizes. A linked list where each node is dynamically allocated is a natural fit. This example shows correct malloc/free usage and common pitfall avoidance.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

#define MAX_PAYLOAD  1500  /* Standard Ethernet MTU */

/* Packet node in our receive queue */
typedef struct Packet {
    uint32_t     src_ip;
    uint32_t     dst_ip;
    uint16_t     port;
    uint16_t     length;
    uint8_t     *payload;      /* separately allocated payload buffer */
    struct Packet *next;
} Packet;

/* Allocate a new packet — malloc() for struct AND payload separately */
Packet *packet_create(uint32_t src, uint32_t dst,
                      uint16_t port, const uint8_t *data, uint16_t len)
{
    if (data == NULL || len > MAX_PAYLOAD) return NULL;  /* reject bad input before allocating */

    Packet *pkt = malloc(sizeof(Packet));  /* struct allocation */
    if (pkt == NULL) {
        perror("malloc struct");
        return NULL;
    }

    pkt->payload = malloc(len);            /* payload allocation */
    if (pkt->payload == NULL) {
        free(pkt);                         /* free struct on payload failure */
        perror("malloc payload");
        return NULL;
    }

    memcpy(pkt->payload, data, len);
    pkt->src_ip = src;
    pkt->dst_ip = dst;
    pkt->port   = port;
    pkt->length = len;
    pkt->next   = NULL;
    return pkt;
}

/* Free a single packet — MUST free payload first, then struct */
void packet_destroy(Packet *pkt)
{
    if (pkt == NULL) return;       /* required: pkt is dereferenced below */
    free(pkt->payload);           /* free inner allocation first */
    pkt->payload = NULL;          /* null out to catch use-after-free early */
    free(pkt);                    /* then free the struct */
}

/* Free the entire packet queue */
void queue_destroy(Packet *head)
{
    Packet *current = head;
    while (current != NULL) {
        Packet *next = current->next;  /* save next BEFORE freeing current */
        packet_destroy(current);
        current = next;                /* never access current after free! */
    }
}

int main(void)
{
    Packet *head = NULL, *tail = NULL;
    uint8_t data[] = { 0x48, 0x65, 0x6C, 0x6C, 0x6F };  /* "Hello" */

    /* Build a 3-packet queue */
    for (int i = 0; i < 3; i++) {
        Packet *p = packet_create(0xC0A80001, 0xC0A80002, 8080, data, 5);
        if (p == NULL) { queue_destroy(head); return 1; }

        if (head == NULL) { head = tail = p; }
        else              { tail->next = p; tail = p; }
    }

    /* Process and display */
    int count = 0;
    for (Packet *p = head; p != NULL; p = p->next) {
        printf("Packet %d: src=0x%08X port=%d len=%d\n",
               ++count, p->src_ip, p->port, p->length);
    }

    /* Clean up EVERYTHING */
    queue_destroy(head);
    head = tail = NULL;  /* null out list head after freeing */
    printf("All %d packets freed cleanly.\n", count);
    return 0;
}

Best practices shown: Always check malloc() return value. When multiple allocations are nested (struct + payload), free in reverse order of allocation. Null out pointers after freeing to make use-after-free bugs immediately obvious. Save next pointer before freeing a node.
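The "null out after free" habit is easy to forget, so some codebases wrap it in a macro. The FREE_AND_NULL name below is a made-up example, not a standard facility, and release_twice_is_safe() exists only to show the effect.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper for this sketch: free the pointer, then reset it.
 * The argument must be a modifiable pointer variable and is evaluated
 * twice, so avoid side effects such as FREE_AND_NULL(arr[i++]).
 */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)

/* Returns 1 if releasing the same variable twice was harmless. */
int release_twice_is_safe(void)
{
    char *buf = malloc(32);
    if (buf == NULL)
        return 0;

    FREE_AND_NULL(buf);
    assert(buf == NULL);
    FREE_AND_NULL(buf);   /* second call is free(NULL), a defined no-op */
    return buf == NULL;
}
```

The macro only resets the one variable it is given. Any other copy of the pointer held elsewhere in the program is still dangling, so this reduces the risk of double free and use-after-free but does not remove it.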


💡 Example 2: Demonstrating Common malloc/free Bugs

The three most dangerous heap bugs are: buffer overflow (writing past allocation), double free (freeing the same pointer twice), and use-after-free (reading/writing after free). The following code intentionally demonstrates each bug — do NOT run this in production; it is for study purposes only.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

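/*
 * EDUCATIONAL ONLY: each function shows a bug pattern next to its fix.
 * The buggy statements are commented out or guarded, so the program runs
 * cleanly as written. To see a report, re-enable one bug at a time and
 * compile with: gcc -g -fsanitize=address bug_demo.c -o bug_demo
 * AddressSanitizer detects each of these bug categories at run time.
 */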

void demo_buffer_overflow(void)
{
    char *buf = malloc(8);        /* allocate exactly 8 bytes */
    if (!buf) return;

    /* BUG: writing 12 bytes into an 8-byte buffer */
    /* This can overwrite malloc's metadata for the NEXT block */
    /* strcpy(buf, "HelloWorld!");  <-- DANGEROUS — commented out */

    /* Safe version: */
    strncpy(buf, "Hi!", 8);
    buf[7] = '\0';
    printf("Safe write: %s\n", buf);

    /* Imagine if we wrote beyond buf[7]: we corrupt the length  */
    /* field of the next allocated block, causing chaos in       */
    /* subsequent malloc()/free() calls.                         */
    free(buf);
}

void demo_double_free(void)
{
    int *data = malloc(sizeof(int) * 4);
    if (!data) return;

    data[0] = 42;
    printf("data[0] = %d\n", data[0]);

    free(data);           /* correct first free */
    data = NULL;          /* THIS LINE PREVENTS double-free */

    /* Without the NULL assignment above, calling free(data) again */
    /* passes a dangling pointer to free(), which corrupts the      */
    /* free-list. On glibc, this often triggers SIGSEGV or a       */
    /* "double free or corruption" abort.                           */

    if (data != NULL) free(data);   /* safe: only frees if non-NULL */
    printf("No double-free thanks to NULL guard\n");
}

void demo_use_after_free(void)
{
    char *msg = malloc(64);
    if (!msg) return;

    snprintf(msg, 64, "BLE device: HM-10");
    printf("Before free: %s\n", msg);

    free(msg);
    /* msg is now a DANGLING POINTER — the memory has been returned  */
    /* to the free-list. DO NOT access it. The memory may be         */
    /* reallocated for a completely different purpose.                */

    /* BUG (commented out): printf("After free: %s\n", msg); */
    /* Safe practice: */
    msg = NULL;
    if (msg != NULL) printf("%s\n", msg);   /* would not execute */
    printf("Use-after-free avoided by NULL assignment\n");
}

int main(void)
{
    demo_buffer_overflow();
    demo_double_free();
    demo_use_after_free();
    return 0;
}

💡 Example 3: Detecting Memory Leaks with MALLOC_CHECK_ and mtrace()

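A memory leak occurs when allocated memory is never freed, typically because the program loses its last pointer to the block or keeps a reference it never uses again. In a long-running daemon the leaked blocks accumulate until memory is exhausted. glibc ships two low-overhead aids that need no external tooling: the mtrace()/muntrace() functions, which log every allocation and release so that unmatched allocations can be reported, and the MALLOC_CHECK_ environment variable, which catches heap misuse such as double frees rather than leaks. On glibc 2.34 and later both features live in libc_malloc_debug.so, so the program must be started with LD_PRELOAD=libc_malloc_debug.so for them to take effect.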

#include <stdio.h>
#include <stdlib.h>
#include <mcheck.h>    /* for mtrace() and muntrace() */

/* Simulating a daemon that processes configuration entries */
typedef struct ConfigEntry {
    int  key;
    char value[128];
} ConfigEntry;

ConfigEntry *load_config_entry(int key, const char *val)
{
    ConfigEntry *entry = malloc(sizeof(ConfigEntry));
    if (!entry) return NULL;
    entry->key = key;
    snprintf(entry->value, 128, "%s", val);
    return entry;
}

int main(void)
{
    /*
     * Enable mtrace BEFORE any allocations.
     * Set the trace output file via environment variable, then run
     * the program as usual:
     *   export MALLOC_TRACE=/tmp/mem_trace.log
     *   ./your_program
     * Afterwards, feed the log to the mtrace script:
     *   mtrace ./your_program /tmp/mem_trace.log
     * to get a human-readable leak report.
     */
    mtrace();

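    ConfigEntry *e1 = load_config_entry(1, "uart_baud=115200");
    ConfigEntry *e2 = load_config_entry(2, "i2c_speed=400000");
    ConfigEntry *e3 = load_config_entry(3, "spi_mode=0");

    /* Never dereference a failed allocation */
    if (!e1 || !e2 || !e3) {
        free(e1); free(e2); free(e3);   /* free(NULL) is a no-op */
        muntrace();
        return 1;
    }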

    printf("Loaded: key=%d val=%s\n", e1->key, e1->value);
    printf("Loaded: key=%d val=%s\n", e2->key, e2->value);
    printf("Loaded: key=%d val=%s\n", e3->key, e3->value);

    /* Intentionally NOT freeing e3 — this is a memory leak */
    free(e1);
    free(e2);
    /* free(e3);  <-- missing! mtrace will report this */

    muntrace();  /* stop tracing */

    /*
     * After running, analyse the trace:
     *   $ mtrace ./program $MALLOC_TRACE
     * Output will show something like:
     *   Memory not freed:
     *   Address     Size     Caller
     *   0x55a3b...  0x84     at /path/to/main.c:22
     *
     * Valgrind is more powerful:
     *   $ valgrind --leak-check=full ./program
     */

    printf("\nTo detect the leak, run with:\n");
    printf("  export MALLOC_TRACE=/tmp/trace.log\n");
    printf("  ./program\n");
    printf("  mtrace ./program /tmp/trace.log\n");

    return 0;
}
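In a running daemon, a leak usually shows up first as steadily growing resident memory. A rough sketch of how a process could sample that figure itself is shown below; it assumes a Linux /proc filesystem, and read_vm_rss_kb() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the resident set size in kB from the VmRSS line of a /proc
 * status file (for example "/proc/self/status"), or -1 if the file or
 * the field is unavailable.
 */
long read_vm_rss_kb(const char *status_path)
{
    assert(status_path != NULL);

    FILE *fp = fopen(status_path, "r");
    if (fp == NULL)
        return -1;

    char line[256];
    long rss_kb = -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (strncmp(line, "VmRSS:", 6) == 0) {
            /* %ld skips the whitespace between the label and the number */
            if (sscanf(line + 6, "%ld", &rss_kb) != 1)
                rss_kb = -1;
            break;
        }
    }
    fclose(fp);
    return rss_kb;
}
```

Logging read_vm_rss_kb("/proc/self/status") at a fixed interval and looking for a trend that never flattens is a cheap first signal. It cannot say where the leak is, which is the job of mtrace or Valgrind.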

MALLOC_CHECK_ Values (Quick Reference)

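Value | Behavior
0     | Ignore detected errors silently
1     | Print a diagnostic message to stderr and continue
2     | Call abort(), terminating the process (with a core dump if enabled)
3     | Print a message AND call abort()

If the variable is unset, these extra consistency checks are not enabled at all. The values above are the classic meanings; recent glibc releases may abort on detected heap corruption regardless of the value chosen.
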
# Usage at the shell — no recompilation needed:
MALLOC_CHECK_=2 ./your_daemon

# Example: double-free now calls abort() and produces a core dump
# which gdb can analyse:
# gdb ./your_daemon core

4. malloc Debugging Tools: Comparison

Tool | How to Use | Catches | Overhead
MALLOC_CHECK_=2 | Shell env variable, no recompile | Double free, invalid ptr, heap corruption | Very low
mtrace() | Add calls in code + MALLOC_TRACE=file | Memory leaks (allocations without matching free) | Low
mcheck() | Link with -lmcheck | Block consistency, writes past end | Low
Valgrind | valgrind --leak-check=full | Leaks, use-after-free, uninit reads, overflows | High (5-20x slower)
AddressSanitizer | gcc -fsanitize=address | Buffer overflow, use-after-free, double-free | Medium (2x slower)

🎯 Interview Questions: malloc() and free()

# | Question | Answer / Key Points
1 | Does free() immediately return memory to the OS? | Generally no. It returns the block to the internal free list. The program break only decreases if a sufficiently large contiguous free region exists at the very top of the heap (glibc threshold: ~128 KB).
2 | How does free() know the size of the block it is freeing? | malloc() stores the block size in a hidden size field (a size_t in glibc) just before the pointer returned to the caller. free() reads this field to determine the block size.
3 | What is heap fragmentation and how does glibc mitigate it? | Fragmentation occurs when free memory is split into blocks too small to satisfy requests. glibc mitigates it through coalescing: adjacent free blocks are merged into larger ones when blocks are freed.
4 | What happens when you call free(NULL)? | Nothing. The C standard defines it as a safe no-op. This is why setting a pointer to NULL after freeing it makes subsequent accidental free() calls on that pointer harmless.
5 | What is a memory leak and what are its symptoms in production? | A leak is allocated memory that is never freed, usually because the last pointer to it was lost or the program keeps a reference it never uses again. Symptoms: steadily growing RSS in /proc/PID/status, eventual malloc() failures, OOM-killer termination.
6 | What is the danger of a double-free bug? | The block gets inserted into the free list twice. A later malloc() may return the same address twice to two different callers, creating aliased pointers that corrupt each other's data.
7 | What alignment guarantees does malloc() provide? | malloc() returns memory aligned to a boundary suitable for any fundamental C data type. In glibc this is typically 8 bytes on 32-bit architectures and 16 bytes on 64-bit.
8 | What does malloc(0) return on Linux? | On Linux (glibc), it returns a unique non-NULL pointer to a zero-size allocation that can (and should) be freed. This is implementation-defined behavior, so portable code must handle either NULL or a freeable pointer.
9 | How do you detect memory leaks without modifying source code? | Use Valgrind: valgrind --leak-check=full ./program. mtrace is the lighter alternative, but it requires adding mtrace()/muntrace() calls to the code, setting the MALLOC_TRACE env variable, and analyzing the trace file with the mtrace script.
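Questions 4, 7, and 8 are easy to check empirically. The sketch below is one way to do so, assuming a C11 compiler; malloc_is_max_aligned() and exercise_edge_cases() are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>   /* max_align_t (C11) */
#include <stdint.h>
#include <stdlib.h>

/*
 * Returns 1 if a freshly allocated block of `size` bytes is aligned for
 * the most strictly aligned fundamental type, 0 otherwise or on failure.
 */
int malloc_is_max_aligned(size_t size)
{
    assert(size > 0);
    void *p = malloc(size);
    if (p == NULL)
        return 0;

    int ok = ((uintptr_t)p % _Alignof(max_align_t)) == 0;
    free(p);
    return ok;
}

/* Both calls below must be safe on any conforming C library. */
void exercise_edge_cases(void)
{
    free(NULL);             /* defined as a no-op */

    void *zero = malloc(0); /* NULL or a unique pointer: implementation-defined */
    free(zero);             /* valid in either case */
}
```

The alignment check uses a request at least as large as the alignment itself (for example 64 bytes), since that is the case the standard's guarantee most clearly covers.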

Summary

  • malloc() searches its internal free list first; calls sbrk() only when the list has no suitable block.
  • Block size is stored in hidden metadata just before the returned pointer — this is how free() knows the size.
  • free() coalesces adjacent free blocks and usually does NOT shrink the heap unless a large contiguous free region sits at the very top.
  • The three most dangerous heap bugs are: buffer overflow (corrupts metadata), double free (corrupts free list), and use-after-free (aliased access to recycled memory).
  • Always check malloc() return values, set pointers to NULL after freeing them, and use Valgrind or AddressSanitizer during development.

Next in the Series

Part 3 covers calloc(), realloc(), and aligned memory allocation — including DMA-aligned buffers with memalign() and posix_memalign().
