Linux File I/O: dup/dup2 and pread/pwrite

How shell redirection works under the hood, and how to do I/O without moving the file pointer

dup(): fd duplication
dup2(): exact fd number
dup3(): with O_CLOEXEC
pread()/pwrite(): offset I/O

What You Will Learn

This post covers two groups of system calls. The first is dup, dup2 and dup3: the calls that create additional file descriptor numbers pointing to the same underlying open file. We will see how the shell implements > and 2>&1 redirection using these calls. The second is pread and pwrite, which read and write at a specified offset without touching the shared file position, making them safe in multi-threaded code.

Topics Covered

dup(): lowest available fd · dup2(): specific fd number · dup3(): with O_CLOEXEC · Shell redirection internals · stdout → file redirection · pread(): read at offset · pwrite(): write at offset · Thread-safe file access · Seekable vs non-seekable fds

Section 1: dup() - Make a Copy of a File Descriptor

dup(oldfd) creates a new file descriptor number that points to the exact same open file description (OFD) as oldfd. The kernel picks the lowest available fd number for the new one. After the call, both fd numbers are interchangeable: they refer to the same file at the same position.

#include <unistd.h>

int dup(int oldfd);

/* Returns: new fd number (lowest available), or -1 on error */
/* The new fd shares the same OFD as oldfd: */
/* – same file offset */
/* – same status flags (O_APPEND, O_NONBLOCK, etc.) */
/* The new fd gets its own FD_CLOEXEC flag, which is always cleared (OFF) */

Before and After dup(1): fd 3 is created pointing to the same OFD as fd 1
BEFORE dup(1)
fd 0 → OFD #A (stdin)
fd 1 → OFD #B (stdout)
fd 2 → OFD #C (stderr)
fd 3 (unused)

AFTER dup(1) → returns 3
fd 0 → OFD #A (stdin)
fd 1 → OFD #B ◄─┐
fd 2 → OFD #C   │
fd 3 → OFD #B ◄─┘ (same!)

Example 1: Basic dup() Usage and Shared Offset Proof

/*
 * dup_basics.c: learn dup() step by step
 * Build: gcc dup_basics.c -o dup_basics
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int main(void)
{
    /* Write a test file */
    int wfd = open("dup_test.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (wfd == -1) { perror("open"); return 1; }
    if (write(wfd, "ABCDEFGHIJKLMNOP", 16) != 16) { perror("write"); return 1; }
    close(wfd);

    /* Open for reading: one OFD is created */
    int fd1 = open("dup_test.txt", O_RDONLY);
    if (fd1 == -1) { perror("open"); return 1; }

    /* dup() gives us a new fd number that shares fd1's OFD */
    int fd2 = dup(fd1);
    if (fd2 == -1) { perror("dup"); return 1; }

    printf("fd1 = %d, fd2 = %d\n", fd1, fd2);
    printf("fd2 (%d) was the lowest unused fd number\n\n", fd2);

    char buf[5] = {0};

    /* Read 4 bytes via fd1 → OFD offset advances from 0 to 4 */
    read(fd1, buf, 4);
    printf("Read via fd1: '%s'  (offset now 4 in shared OFD)\n", buf);

    /* Read 4 more via fd2 → picks up at offset 4, not 0 */
    memset(buf, 0, 5);
    read(fd2, buf, 4);
    printf("Read via fd2: '%s'  (continued from offset 4!)\n\n", buf);

    /*
     * Closing fd1 does NOT close the underlying file.
     * The OFD stays open because fd2 still references it.
     */
    close(fd1);
    memset(buf, 0, 5);
    read(fd2, buf, 4);
    printf("After close(fd1), read via fd2: '%s'  (still works!)\n", buf);
    printf("OFD is only closed when ALL fds pointing to it are closed.\n");

    close(fd2);
    return 0;
}
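
The notes on dup() above say that both descriptors share the status flags stored in the OFD, while FD_CLOEXEC belongs to each descriptor separately. A minimal sketch to check both claims at once follows. The function name is hypothetical, and it deliberately leaves O_APPEND and FD_CLOEXEC set on the fd it is given.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a dup() of fd sees a status flag that
 * was set through fd (shared OFD) but does NOT see FD_CLOEXEC set on fd
 * (per-descriptor flag). Returns 0 if either expectation fails, -1 on error.
 * Side effect: fd is left with O_APPEND and FD_CLOEXEC set.
 */
int dup_shares_status_flags_only(int fd)
{
    int copy = dup(fd);
    if (copy == -1)
        return -1;

    int result = -1;
    int fl = fcntl(fd, F_GETFL);

    if (fl != -1 &&
        fcntl(fd, F_SETFL, fl | O_APPEND) != -1 &&  /* status flag: stored in the OFD */
        fcntl(fd, F_SETFD, FD_CLOEXEC) != -1) {     /* fd flag: stored per descriptor */
        int copy_fl = fcntl(copy, F_GETFL);
        int copy_fdfl = fcntl(copy, F_GETFD);
        if (copy_fl != -1 && copy_fdfl != -1)
            result = (copy_fl & O_APPEND) && !(copy_fdfl & FD_CLOEXEC);
    }

    close(copy);
    return result;
}
```

Calling it on any freshly opened descriptor, for example one from open("/dev/null", O_WRONLY), should return 1 on Linux.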

Section 2: dup2() - Redirect to a Specific fd Number

dup() picks the lowest free fd. That is fine for some uses, but for I/O redirection you need to redirect a specific fd number: always fd 1 for stdout, always fd 2 for stderr. dup2(oldfd, newfd) does exactly that.

#include <unistd.h>

int dup2(int oldfd, int newfd);

/* Makes newfd point to the same OFD as oldfd. */
/* If newfd is already open, it is closed first (silently)*/
/* If oldfd == newfd: does nothing, returns newfd */
/* Returns: newfd on success, -1 on error */

How the Shell Implements > Redirection

When you type ./prog > out.txt 2>&1, the shell performs roughly these steps in the child process before calling exec():

Shell Steps for: ./prog > out.txt 2>&1
Step | Call | Result
1 | open("out.txt", O_WRONLY|O_CREAT|O_TRUNC, 0644) | out.txt is opened on the lowest free fd, 3 in this example
2 | dup2(3, 1) | fd 1 (stdout) now points to out.txt's OFD. The old fd 1 (terminal) is closed.
3 | dup2(1, 2) (for 2>&1) | fd 2 (stderr) now points to the same OFD as fd 1 → out.txt
4 | close(3) | fd 3 is no longer needed. The file stays open via fd 1 and fd 2.
5 | execvp("./prog", argv) | The program runs. Its printf → fd 1 → out.txt. Its perror → fd 2 → out.txt.
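
The five steps in the table can be sketched as one function that forks, redirects in the child, and then calls execvp(). This is an illustration under simple assumptions, not how any particular shell is written: run_redirected is a hypothetical name, and a real shell also handles signals, job control and many more redirection forms.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with stdout and stderr both redirected
 * to `path`, like `prog > path 2>&1`.
 * Returns the child's exit status, or -1 if it could not be run or waited for.
 */
int run_redirected(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: steps 1 to 5 from the table */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);   /* step 1 */
        if (fd == -1)
            _exit(127);
        if (dup2(fd, STDOUT_FILENO) == -1 ||                       /* step 2 */
            dup2(STDOUT_FILENO, STDERR_FILENO) == -1)              /* step 3 */
            _exit(127);
        if (fd > STDERR_FILENO)
            close(fd);                                             /* step 4 */
        execvp(argv[0], argv);                                     /* step 5 */
        _exit(127);                     /* only reached if execvp() failed */
    }

    /* Parent: wait for the child and report its exit status */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit() rather than exit() on failure so that it does not flush stdio buffers it inherited from the parent.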

Example 1: Redirect stdout to a File

/*
 * redirect_stdout.c: make printf write to a file instead of the terminal
 * This is exactly what the shell does for ./prog > file.txt
 * Build: gcc redirect_stdout.c -o redirect_stdout
 * Run: ./redirect_stdout   then: cat redir_output.txt
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    /* Open the file that will receive our output */
    int file_fd = open("redir_output.txt",
                       O_WRONLY | O_CREAT | O_TRUNC,
                       0644);
    if (file_fd == -1) { perror("open"); return 1; }

    printf("This line goes to the TERMINAL (before dup2)\n");
    fflush(stdout);

    /*
     * dup2(file_fd, STDOUT_FILENO):
     *   - closes fd 1 (the terminal)
     *   - makes fd 1 point to file_fd's OFD (our file)
     * After this, fd 1 = output to file
     */
    if (dup2(file_fd, STDOUT_FILENO) == -1) {
        perror("dup2");
        return 1;
    }

    /* file_fd (e.g. 3) is no longer needed separately.
     * The OFD stays alive through fd 1. */
    close(file_fd);

    /* All printf calls now go to redir_output.txt */
    printf("Line 1: this goes to the FILE\n");
    printf("Line 2: this also goes to the FILE\n");
    printf("File descriptor 1 now points to redir_output.txt\n");

    return 0;
}
/* After running:
   cat redir_output.txt
   You will see the three lines that used printf after dup2 */

Example 2: Redirect Both stdout and stderr to One File (2>&1)

/*
 * redirect_both.c: implement ./prog > out.txt 2>&1 in C
 * After this, printf (stdout) and perror (stderr) both go to the file.
 * Build: gcc redirect_both.c -o redirect_both
 * Run:   ./redirect_both   then: cat combined_output.txt
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

int main(void)
{
    int out_fd = open("combined_output.txt",
                      O_WRONLY | O_CREAT | O_TRUNC,
                      0644);
    if (out_fd == -1) { perror("open"); return 1; }

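    /* Step 1: redirect stdout (fd 1) → file */
    if (dup2(out_fd, STDOUT_FILENO) == -1) { perror("dup2"); return 1; }

    /*
     * Step 2: redirect stderr (fd 2) → same OFD as stdout.
     * fd 1 and fd 2 now share one OFD and therefore one file
     * offset, so their writes never overwrite each other.
     * Note: stdio buffers stdout when it is a file but does not
     * buffer stderr, so stderr lines may land in the file earlier
     * than printf output that was issued before them.
     *
     * Important: dup2(fd 1, fd 2) must come AFTER fd 1 points
     * to the file, not before. Order matters!
     */
    if (dup2(STDOUT_FILENO, STDERR_FILENO) == -1) { perror("dup2"); return 1; }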

    close(out_fd);  /* no longer needed separately */

    /* stdout and stderr both go to combined_output.txt */
    printf("stdout: Program started successfully\n");
    fprintf(stderr, "stderr: Warning: this also goes to the file\n");
    printf("stdout: Processing data...\n");

    /* Trigger a real errno error to show perror goes to file too */
    FILE *missing = fopen("/nonexistent/path.txt", "r");
    if (!missing) {
        perror("stderr via perror: fopen failed"); /* goes to file */
    }

    printf("stdout: Done.\n");
    return 0;
}
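
Why does 2>&1 duplicate fd 1 instead of simply opening the file a second time? The sketch below contrasts the two: with a dup()'d descriptor both writers share one offset, while two separate open() calls create two OFDs that both start at offset 0 and overwrite each other. The function name is hypothetical, and the second open() only approximates what a shell does for `> f 2> f`.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "AAAA" through one fd and "BB" through a
 * second fd to the same file, then return the resulting file size.
 *   share_ofd != 0: the second fd is a dup() of the first (like 2>&1)
 *   share_ofd == 0: the second fd comes from a second open() (like > f 2> f)
 * Returns -1 on error.
 */
off_t write_through_two_fds(const char *path, int share_ofd)
{
    int a = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (a == -1)
        return -1;

    int b = share_ofd ? dup(a) : open(path, O_WRONLY);
    if (b == -1) {
        close(a);
        return -1;
    }

    off_t size = -1;
    if (write(a, "AAAA", 4) == 4 && write(b, "BB", 2) == 2)
        size = lseek(a, 0, SEEK_END);   /* final size of the file */

    close(a);
    close(b);
    return size;
}
```

With a shared OFD the file ends up 6 bytes long ("AAAABB"). With two independent OFDs the second write lands on top of the first and the file is only 4 bytes long ("BBAA").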

Example 3: Capture printf Output into a Buffer

A useful testing technique: temporarily redirect stdout into a pipe, call a function that uses printf, then read what it printed back as a string.

/*
 * capture_output.c: redirect stdout to a pipe, capture printf output
 * Useful for testing functions that print to stdout.
 * Build: gcc capture_output.c -o capture_output
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

/* The function we want to test: it uses printf internally */
void compute_and_print(int x)
{
    printf("Square of %d = %d\n", x, x * x);
    printf("Cube   of %d = %d\n", x, x * x * x);
}

int main(void)
{
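    int pipefd[2];
    if (pipe(pipefd) == -1) { perror("pipe"); return 1; }  /* pipefd[0]=read, pipefd[1]=write */

    /* Save original stdout so we can restore it.
     * Flush first so earlier output cannot end up in the pipe. */
    fflush(stdout);
    int saved_stdout = dup(STDOUT_FILENO);

    /* Redirect stdout → pipe write end */
    dup2(pipefd[1], STDOUT_FILENO);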
    close(pipefd[1]);  /* not needed separately */

    /* Call the function โ€” its printf output goes into the pipe */
    compute_and_print(5);
    fflush(stdout);  /* flush stdio buffer into the pipe */

    /* Restore stdout back to the terminal */
    dup2(saved_stdout, STDOUT_FILENO);
    close(saved_stdout);

    /* Now read what was captured from the pipe read end */
    char captured[256] = {0};
    ssize_t n = read(pipefd[0], captured, sizeof(captured) - 1);
    close(pipefd[0]);

    /* Back on terminal โ€” can print normally again */
    printf("=== Captured output (%zd bytes) ===\n", n);
    printf("%s", captured);

    /* Verify the output programmatically */
    if (strstr(captured, "25") && strstr(captured, "125"))
        printf("TEST PASSED\n");
    else
        printf("TEST FAILED\n");

    return 0;
}
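
The same save, redirect, restore sequence can be wrapped into a reusable helper. This is a sketch under stated assumptions: capture_stdout and say_hello are hypothetical names, and the helper assumes the function under test prints less than the pipe capacity (64 KiB by default on Linux), because nothing reads the pipe while the function is running.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sample function under test. */
void say_hello(void)
{
    printf("hello\n");
}

/*
 * Hypothetical helper: run fn() with stdout redirected into a pipe, copy up
 * to len-1 bytes of what it printed into out (NUL-terminated), then restore
 * stdout. Returns the number of bytes captured, or -1 on error.
 */
ssize_t capture_stdout(void (*fn)(void), char *out, size_t len)
{
    int pipefd[2];
    if (len == 0 || pipe(pipefd) == -1)
        return -1;

    fflush(stdout);                      /* drain anything already pending */
    int saved = dup(STDOUT_FILENO);
    if (saved == -1 || dup2(pipefd[1], STDOUT_FILENO) == -1) {
        if (saved != -1)
            close(saved);
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    close(pipefd[1]);                    /* fd 1 is now the only write end */

    fn();
    fflush(stdout);                      /* push stdio's buffer into the pipe */

    dup2(saved, STDOUT_FILENO);          /* restore; closes the last write end */
    close(saved);

    ssize_t n = read(pipefd[0], out, len - 1);
    close(pipefd[0]);
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

A test would call capture_stdout(say_hello, buf, sizeof(buf)) and compare buf with the expected text.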

Section 3: dup3() - Duplicate with O_CLOEXEC Atomically

dup() and dup2() both clear the FD_CLOEXEC flag on the new fd. This means that if you duplicate an fd and then call exec(), the duplicate is inherited by the new program, which may be a security issue.

The naive fix is to call dup2() then fcntl(F_SETFD, FD_CLOEXEC) separately. But in a multi-threaded program, another thread could fork()+exec() in the tiny gap between those two calls, and the new fd would leak without the flag. dup3() sets the flag atomically inside the kernel call:

#define _GNU_SOURCE
#include <unistd.h>

int dup3(int oldfd, int newfd, int flags);

/* flags: 0 (same as dup2) or O_CLOEXEC */
/* Unlike dup2: if oldfd == newfd, it fails with EINVAL */
/* Linux 2.6.27+ only, Linux-specific */

Function | New fd number | FD_CLOEXEC on new fd | Notes
dup(old) | Lowest available | Always OFF | POSIX, simplest
dup2(old, new) | You specify | Always OFF | POSIX, for redirection
dup3(old, new, O_CLOEXEC) | You specify | SET atomically | Linux 2.6.27+, thread-safe cloexec
fcntl(F_DUPFD_CLOEXEC) | Lowest ≥ start | SET atomically | Linux 2.6.24+, range control
/*
 * dup3_demo.c: show why dup3 / FD_CLOEXEC matters
 * Build: gcc dup3_demo.c -o dup3_demo   (_GNU_SOURCE is defined below)
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd_src = open("/etc/hostname", O_RDONLY);
    if (fd_src == -1) { perror("open"); return 1; }

    printf("Source fd: %d\n\n", fd_src);

    /* ── dup2: new fd has NO cloexec ── */
    int fd_dup2 = dup2(fd_src, 10);
    int flags_dup2 = fcntl(fd_dup2, F_GETFD);
    printf("dup2(fd_src, 10):\n");
    printf("  fd = %d, FD_CLOEXEC = %s\n\n",
           fd_dup2, (flags_dup2 & FD_CLOEXEC) ? "SET" : "NOT SET");

    /* ── dup3 with O_CLOEXEC: new fd HAS cloexec, atomically ── */
    int fd_dup3 = dup3(fd_src, 11, O_CLOEXEC);
    int flags_dup3 = fcntl(fd_dup3, F_GETFD);
    printf("dup3(fd_src, 11, O_CLOEXEC):\n");
    printf("  fd = %d, FD_CLOEXEC = %s\n\n",
           fd_dup3, (flags_dup3 & FD_CLOEXEC) ? "SET" : "NOT SET");

    /*
     * Why atomic matters:
     *
     * Non-atomic (BAD in threaded code):
     *   int fd = dup2(old, new);         ← no cloexec yet
     *                                    ← race: fork()+exec() in another thread here!
     *   fcntl(fd, F_SETFD, FD_CLOEXEC);  ← too late if the race was lost
     *
     * Atomic (GOOD):
     *   int fd = dup3(old, new, O_CLOEXEC);  ← cloexec set inside the syscall
     */
    printf("With fd_dup2 (%d): if fork()+exec() is called from another\n"
           "thread right now, the new program inherits fd %d.\n\n",
           fd_dup2, fd_dup2);
    printf("With fd_dup3 (%d): the O_CLOEXEC was set atomically, so\n"
           "exec() always closes it, no race possible.\n",
           fd_dup3);

    close(fd_src); close(fd_dup2); close(fd_dup3);
    return 0;
}

Section 4: pread() and pwrite() - I/O at a Specific Offset

The Thread Problem with lseek + read

In a multi-threaded program, all threads share the same fd table and therefore the same OFD offsets. If Thread A calls lseek(fd, 100, SEEK_SET) and then Thread B calls lseek(fd, 200, SEEK_SET) before Thread A's read(), Thread A will read from position 200, which is completely wrong.

pread() and pwrite() solve this by taking the offset as a parameter and never changing the OFD’s stored offset. Each thread can read/write at its own position independently with no locking needed.

#include <unistd.h>

ssize_t pread (int fd, void *buf, size_t count, off_t offset);
ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset);

/* Identical to read/write except: */
/* – I/O happens at ‘offset’ in the file */
/* – The OFD’s stored offset is NEVER CHANGED */
/* – fd must be seekable (not a pipe or socket) */

Thread A and Thread B accessing the same fd

❌ lseek + read (threads race)
Thread A: lseek(fd, 100, SEEK_SET)
    ← Thread B runs here →
Thread B: lseek(fd, 200, SEEK_SET)
Thread A: read(fd, buf, 50)
→ reads from 200, not 100!

✅ pread (no race possible)
Thread A: pread(fd, buf, 50, 100)
Thread B: pread(fd, buf, 50, 200)
Both run independently.
OFD offset unchanged.
No locking needed.

Example 1: pread() Reads Without Moving the File Pointer

/*
 * pread_demo.c: read from specific offsets without moving the pointer
 * Build: gcc pread_demo.c -o pread_demo
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int main(void)
{
    /* Create a file with known data at known positions */
    int fd = open("pread_test.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);
    write(fd, "NAME:Alice  AGE:30  CITY:Mumbai ", 32);

    printf("File created. Offset after write: %ld\n\n",
           (long)lseek(fd, 0, SEEK_CUR));

    char buf[11] = {0};

    /*
     * pread() reads from the specified offset WITHOUT
     * changing where the OFD's current offset is.
     * We can read any field in any order, no lseek needed.
     */

    /* Read "Alice" at offset 5, length 5 */
    pread(fd, buf, 5, 5);
    buf[5] = '\0';
    printf("Name field (offset 5):  '%s'\n", buf);

    /* Read "30" at offset 16, length 2 ("AGE:" occupies bytes 12-15) */
    pread(fd, buf, 2, 16);
    buf[2] = '\0';
    printf("Age  field (offset 16): '%s'\n", buf);

    /* Read "Mumbai" at offset 25, length 6 */
    pread(fd, buf, 6, 25);
    buf[6] = '\0';
    printf("City field (offset 25): '%s'\n", buf);

    /* KEY POINT: the OFD offset has NOT moved from where write() left it */
    printf("\nOFD offset after all pread() calls: %ld (unchanged!)\n",
           (long)lseek(fd, 0, SEEK_CUR));

    close(fd);
    return 0;
}

Example 2: Update a Specific Field in a Binary File with pwrite()

/*
 * pwrite_demo.c: update a field in a binary file at a known offset
 * Common pattern: update a counter or record in a database-style file.
 * Build: gcc pwrite_demo.c -o pwrite_demo
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>

/*
 * Simple binary file layout:
 *   Offset 0 : uint32_t  version    (4 bytes)
 *   Offset 4 : uint32_t  record_count (4 bytes)
 *   Offset 8 : uint32_t  checksum   (4 bytes)
 */
#define OFF_VERSION  0
#define OFF_RECORDS  4
#define OFF_CHECKSUM 8

int main(void)
{
    int fd = open("records.bin", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); return 1; }

    /* Write initial values */
    uint32_t version = 1, records = 0, checksum = 0;
    write(fd, &version,  4);
    write(fd, &records,  4);
    write(fd, &checksum, 4);
    printf("Created file: version=%u  records=%u\n\n", version, records);

    /* Simulate adding 5 records one by one.
     * Each time, update only the record_count field at offset 4.
     * pwrite() updates just those 4 bytes without disturbing anything else
     * and without changing the OFD's current file offset. */
    for (int i = 1; i <= 5; i++) {
        records = i;

        /* Update only the record_count field: 4 bytes at offset 4 */
        if (pwrite(fd, &records, sizeof(records), OFF_RECORDS) == -1) {
            perror("pwrite");
            break;
        }
        printf("Added record %d. pwrite updated offset %d.\n", i, OFF_RECORDS);
    }

    /* Read it back to verify */
    uint32_t read_ver, read_rec;
    pread(fd, &read_ver, 4, OFF_VERSION);
    pread(fd, &read_rec, 4, OFF_RECORDS);
    printf("\nFinal state: version=%u  record_count=%u\n", read_ver, read_rec);

    /* OFD offset is wherever write() left it (byte 12): pread/pwrite never moved it */
    printf("OFD offset (never touched by pread/pwrite): %ld\n",
           (long)lseek(fd, 0, SEEK_CUR));

    close(fd);
    return 0;
}

Example 3: Three Threads Sharing One fd with pread()

/*
 * pread_threads.c: three threads read from the same fd at different offsets
 * pread() makes this safe without any mutex for the I/O itself.
 * Build: gcc -pthread pread_threads.c -o pread_threads
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <pthread.h>

/* Shared file descriptor: intentional, showing pread() is safe */
static int shared_fd;

typedef struct {
    int    thread_id;
    off_t  start_offset;
    int    read_bytes;
} ThreadArg;

void *reader_thread(void *arg)
{
    ThreadArg *a = (ThreadArg *)arg;
    char buf[64] = {0};

    /*
     * pread() reads at a->start_offset WITHOUT touching the shared OFD offset.
     * All three threads can call pread() at the same time safely.
     * No mutex needed for the I/O itself.
     */
    ssize_t n = pread(shared_fd, buf, a->read_bytes, a->start_offset);

    printf("Thread %d: pread at offset %ld → '%.*s' (%zd bytes)\n",
           a->thread_id, (long)a->start_offset, (int)n, buf, n);
    return NULL;
}

int main(void)
{
    /* Create test file with known content */
    shared_fd = open("thread_test.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);
    write(shared_fd, "SECTION-A::SECTION-B::SECTION-C::", 33);

    ThreadArg args[3] = {
        {1, 0,  9},   /* Thread 1 reads "SECTION-A" */
        {2, 11, 9},   /* Thread 2 reads "SECTION-B" */
        {3, 22, 9}    /* Thread 3 reads "SECTION-C" */
    };

    pthread_t threads[3];
    for (int i = 0; i < 3; i++) {
        pthread_create(&threads[i], NULL, reader_thread, &args[i]);
    }
    for (int i = 0; i < 3; i++) {
        pthread_join(threads[i], NULL);
    }

    printf("\nAll threads finished. OFD offset: %ld (never changed by pread)\n",
           (long)lseek(shared_fd, 0, SEEK_CUR));

    close(shared_fd);
    return 0;
}

⚠️ pread/pwrite need seekable files: they require the fd to support lseek(). They work on regular files and block devices, and fail with ESPIPE on pipes, sockets and FIFOs.
Quick Summary
Call | Use when | Key behaviour
dup(fd) | Need a spare fd alias | Picks lowest available fd. Shares OFD.
dup2(old, new) | I/O redirection (stdout, stderr) | Forces specific fd number. Closes new first.
dup3(old, new, O_CLOEXEC) | Thread-safe dup with cloexec | Atomically sets close-on-exec. Linux only.
pread(fd, buf, n, off) | Multi-thread random access reads | Read at offset. OFD position unchanged.
pwrite(fd, buf, n, off) | Update a field in a binary file | Write at offset. OFD position unchanged.

Next in This Series

Post 4: readv/writev scatter-gather I/O · O_NONBLOCK · Large File Support · /dev/fd · Safe Temporary Files
