read() and write() — Moving Data In and Out of Files

How the kernel reads data into your program and writes data from your program to files — in detail
Topic: read() & write()
Level: Beginner–Intermediate
Part: 3 of 7

Keywords: read(), write(), buffer, file descriptor, byte count, partial read, end of file, ssize_t, null terminator

The Core of File I/O — Reading and Writing

After opening a file with open(), you have a file descriptor. The two things you most often do with that descriptor are read data from the file and write data to the file. In Linux, these are done with the read() and write() system calls.

These calls work at the byte level — they deal with raw bytes, not lines, not strings, not records. It is your job as the programmer to interpret what those bytes mean. This gives you maximum flexibility, but also requires you to be careful about details like null terminators and partial reads.

The read() System Call

Function Signature
#include <unistd.h>

ssize_t read(int fd, void *buffer, size_t count);

/* Returns:
     number of bytes actually read (1 to count)  on success
     0   if end-of-file was reached
    -1   on error (errno is set) */

Three parameters:

  • fd — the file descriptor you got from open()
  • buffer — a pointer to memory in your program where the read data will be stored. You must allocate this memory yourself — the kernel does not allocate it for you.
  • count — the maximum number of bytes you want to read
What Happens Internally When You Call read()

When your program calls read(fd, buf, 100), here is exactly what happens step by step:

read() Internal Flow
Your Program                          Kernel
──────────────────────────────────────────────────────────
1. Call read(fd=3, buf, count=100)
   ──────────────────────────────►
                                      2. Find the file table entry for fd=3
                                      3. Check current file offset (say: 512)
                                      4. Go to disk, read up to 100 bytes
                                         starting at offset 512
                                      5. Copy bytes into your buf
                                      6. Update file offset
                                         (now: 512 + bytes_read)
   ◄──────────────────────────────
7. receive: bytes_actually_read (e.g. 100)
8. Your buffer now has the data!
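Step 6 of the flow can be observed from user space. The sketch below is a minimal illustration, not part of the original post: it assumes lseek(fd, 0, SEEK_CUR) as a way to query the current offset (lseek() is not covered in this post), and the helper name offset_after_read and the scratch file path are hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes 10 bytes to a scratch file, rewinds, reads
   up to `count` bytes, and returns the file offset the kernel reports
   afterwards. Returns -1 on any failure. */
off_t offset_after_read(size_t count) {
    char path[] = "/tmp/offset_demo_XXXXXX";
    int fd = mkstemp(path);             /* scratch file, opened read/write */
    if (fd == -1) return -1;
    unlink(path);                       /* file disappears once fd is closed */

    off_t result = -1;
    char buf[16];
    if (count <= sizeof(buf) &&
        write(fd, "0123456789", 10) == 10 &&
        lseek(fd, 0, SEEK_SET) == 0 &&       /* rewind to the start */
        read(fd, buf, count) != -1) {
        result = lseek(fd, 0, SEEK_CUR);     /* query the current offset */
    }
    close(fd);
    return result;
}
```

Asking for 4 bytes should leave the offset at 4. Asking for 16 bytes from this 10-byte file should leave it at 10, because the offset advances by the number of bytes actually read, not by the number requested.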
Return Values — What Can read() Return?

The return value of read() is one of the most important things to understand correctly:

Positive number (1 to count)

Normal success. This many bytes were read and put into your buffer. This is what you usually expect.

Zero (0)

End-of-file. You have read everything. There is nothing more to read. This is normal termination — not an error.

-1 (negative one)

An error occurred. Check errno for details. Do not use the contents of buffer — they are unreliable.
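The three outcomes map naturally onto a three-way branch. The following is a minimal sketch; the helper name count_bytes is hypothetical and the 512-byte buffer size is an arbitrary choice.

```c
#include <unistd.h>

/* Hypothetical helper: reads fd to end-of-file and returns how many
   bytes were seen, or -1 if read() reported an error. */
long count_bytes(int fd) {
    char buf[512];
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n == -1) return -1;     /* error: errno says why, buf is not trusted */
        if (n == 0)  return total;  /* end-of-file: normal termination */
        total += n;                 /* positive: n bytes landed in buf */
    }
}
```

Checking for -1 before anything else keeps the error case from being mistaken for data or for end-of-file.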

Partial Reads — When You Get Less Than You Asked For

This is one of the most confusing aspects for beginners: read() may return fewer bytes than you asked for, and with pipes, terminals, and sockets this can happen even though more data is still on its way. This is not an error. It is just how the kernel works in certain situations.

For regular disk files, this usually happens when:

  • You asked for more bytes than remain in the file (you are near the end)

For other file types (pipes, terminals, sockets), partial reads happen more often:

  • A terminal read stops at a newline character — even if your buffer is bigger
  • A pipe read returns whatever data is currently available, even if it is less than requested
  • A network socket may return data in chunks depending on network timing

This is why robust programs use a loop when reading:

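/* Read 'total' bytes unless EOF comes first; handles partial reads.
   Needs <errno.h> and <unistd.h>. Returns bytes read, or -1 on error. */
ssize_t read_all(int fd, char *buf, size_t total) {
    size_t bytes_read = 0;
    while (bytes_read < total) {
        ssize_t n = read(fd, buf + bytes_read, total - bytes_read);
        if (n == -1) {
            if (errno == EINTR) continue;  /* interrupted by a signal: retry */
            return -1;                     /* real error */
        }
        if (n == 0) break;                 /* end of file */
        bytes_read += (size_t)n;
    }
    return (ssize_t)bytes_read;
}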
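The pipe case from the list above is easy to reproduce. This is a rough sketch under stated assumptions: the helper name short_read_from_pipe is hypothetical, and the sizes are kept small (at most 64 bytes written) so that the write cannot block.

```c
#include <unistd.h>

/* Hypothetical demo: puts `available` bytes into a pipe, then asks read()
   for `requested` bytes. Returns what read() returned, or -1 on failure.
   Requires 1 <= available <= 64 so neither call can block. */
ssize_t short_read_from_pipe(size_t available, size_t requested) {
    char out[64] = {0}, in[256];
    int p[2];
    if (available == 0 || available > sizeof(out) || requested > sizeof(in))
        return -1;
    if (pipe(p) == -1) return -1;

    ssize_t n = -1;
    if (write(p[1], out, available) == (ssize_t)available)
        n = read(p[0], in, requested);  /* returns what is there right now */

    close(p[0]);
    close(p[1]);
    return n;
}
```

With 5 bytes in the pipe and a request for 100, read() should return 5 rather than wait for the remaining 95. A loop such as read_all() would be the way to keep collecting until the full amount arrives.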

The Null Terminator Trap — A Classic Beginner Mistake

In C, strings end with a null byte (\0). Functions like printf("%s") need that null byte to know where the string ends. But read() does not add a null byte. It just copies raw bytes into your buffer and returns the count.

If you do this, printf() keeps reading past the bytes that read() stored until it happens to find a zero byte, so it will likely print garbage or random memory content:

/* WRONG — buffer is not null-terminated! */
char buffer[20];
read(fd, buffer, 20);
printf("%s\n", buffer);  /* DANGER — reads past end of buffer! */

The correct approach — always add the null byte yourself:

/* CORRECT — manually null-terminate */
char buffer[21];   /* one extra byte for the null */
ssize_t n = read(fd, buffer, 20);
if (n == -1) { perror("read"); exit(1); }
buffer[n] = '\0';   /* add null byte at position n */
printf("%s\n", buffer);  /* safe! */

Rule: Your buffer must be at least count + 1 bytes if you plan to treat the result as a C string.
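If the data only needs to be printed or passed along, another option is to skip the null terminator and carry the length explicitly. The sketch below is one possible approach, not the post's own method; the helper name print_chunk is hypothetical, and it uses the standard fwrite() call, which takes a byte count instead of looking for '\0'.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: reads up to 64 bytes from fd and writes exactly the
   bytes received to `out`. Returns the byte count, 0 at EOF, -1 on error. */
ssize_t print_chunk(int fd, FILE *out) {
    char buf[64];                        /* no extra byte needed */
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n > 0)
        fwrite(buf, 1, (size_t)n, out);  /* length-based, no '\0' required */
    return n;
}
```

The same idea works with printf("%.*s", (int)n, buf), where the precision limits how many bytes printf() will read. Either form also copes with data that contains zero bytes in the middle, which the null-terminated approach would cut short.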

Complete read() Example — Reading a File
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    int fd = open("hello.txt", O_RDONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    char buf[256];
    ssize_t n;

    /* Read in a loop until end-of-file */
    while ((n = read(fd, buf, sizeof(buf) - 1)) > 0) {
        buf[n] = '\0';           /* null-terminate */
        printf("%s", buf);       /* print what we read */
    }

    if (n == -1) {
        perror("read");
    }

    close(fd);
    return 0;
}

The write() System Call

Function Signature
#include <unistd.h>

ssize_t write(int fd, const void *buffer, size_t count);

/* Returns:
     number of bytes actually written  on success (can be less than count!)
    -1   on error (errno is set) */

Parameters:

  • fd — file descriptor of the open file (must be open for writing)
  • buffer — pointer to the data you want to write
  • count — number of bytes from buffer to write
What Happens Internally When You Call write()
write() Internal Flow
Your Program                          Kernel
──────────────────────────────────────────────────────────
1. Call write(fd=3, buf, count=50)
   ──────────────────────────────►
                                      2. Find the file table entry for fd=3
                                      3. Check current file offset (say: 1024)
                                      4. Copy bytes from your buf into the
                                         kernel’s buffer (page cache)
                                      5. Mark those pages as “dirty” (modified)
                                      6. Update file offset (now: 1024 + 50)
   ◄──────────────────────────────
7. receive: 50 (bytes written to kernel buffer)

NOTE: Data is NOT on disk yet! The kernel will flush dirty pages
      to disk later. Use O_SYNC or fsync() if you need disk guarantee.
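"Not on disk yet" does not mean "not visible yet". Once write() returns, other readers of the same file see the new bytes, because reads are served from the same kernel buffer. The sketch below illustrates this under stated assumptions: the helper name visible_without_fsync and the scratch file path are hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical demo: writes 5 bytes through one descriptor and reads them
   back through a second descriptor on the same file, with no fsync().
   Returns the number of bytes the reader saw, or -1 on failure. */
ssize_t visible_without_fsync(void) {
    char path[] = "/tmp/cache_demo_XXXXXX";
    int wfd = mkstemp(path);             /* writer descriptor */
    if (wfd == -1) return -1;
    int rfd = open(path, O_RDONLY);      /* reader with its own offset (0) */
    unlink(path);                        /* clean up the name right away */

    ssize_t n = -1;
    char buf[8];
    if (rfd != -1 && write(wfd, "hello", 5) == 5)
        n = read(rfd, buf, sizeof(buf)); /* sees the bytes just written */

    if (rfd != -1) close(rfd);
    close(wfd);
    return n;
}
```

The reader should get all 5 bytes even though nothing has forced them to disk. Flushing only matters for surviving a crash or power loss, not for other processes seeing the data.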
Partial Writes — When Less Gets Written

Just like read(), write() can also write fewer bytes than you asked it to. This is rare for disk files but can happen if:

  • The disk is full (you hit the filesystem’s space limit)
  • Your process hit its file size limit (RLIMIT_FSIZE)
  • You are writing to a pipe or socket whose buffer is full, and the descriptor is in non-blocking mode or a signal interrupts the call partway through

For robust programs, especially those writing to non-disk targets, handle partial writes with a loop:

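/* Write all bytes; handles partial writes.
   Needs <errno.h> and <unistd.h>. Returns bytes written, or -1 on error. */
ssize_t write_all(int fd, const char *buf, size_t total) {
    size_t written = 0;
    while (written < total) {
        ssize_t n = write(fd, buf + written, total - written);
        if (n == -1) {
            if (errno == EINTR) continue;  /* interrupted by a signal: retry */
            return -1;                     /* real error */
        }
        written += (size_t)n;
    }
    return (ssize_t)written;
}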
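A partial write can be provoked on purpose with a pipe that nobody reads. The sketch below is an illustration under stated assumptions: it switches the pipe's write end to non-blocking mode with fcntl() and O_NONBLOCK (neither is covered in this post), and the helper name first_write_to_full_pipe is hypothetical. The exact count returned depends on the system's pipe capacity, so no specific number is claimed.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical demo: offers `total` bytes to a non-blocking pipe that nobody
   is reading. Returns what the first write() accepted, or -1 on failure. */
ssize_t first_write_to_full_pipe(size_t total) {
    int p[2];
    if (pipe(p) == -1) return -1;

    char *data = calloc(total, 1);       /* zero-filled payload */
    ssize_t n = -1;
    int flags = fcntl(p[1], F_GETFL);
    if (data != NULL && flags != -1 &&
        fcntl(p[1], F_SETFL, flags | O_NONBLOCK) != -1)
        n = write(p[1], data, total);    /* accepts only what fits */

    free(data);
    close(p[0]);
    close(p[1]);
    return n;
}
```

Offering several megabytes should return a positive count smaller than the amount offered, since the pipe fills up first. A caller that ignores the return value would silently lose the rest.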

Important: write() Does NOT Immediately Write to Disk

When write() returns successfully, the data has been copied into the kernel’s memory buffer (page cache). It has NOT necessarily been written to the physical disk yet. The kernel writes it to disk later in the background for performance reasons — this is called buffered I/O.

This is usually fine. But if your computer crashes before the kernel flushes the buffer, you could lose data. For critical data (databases, financial records), you need to force the kernel to flush to disk. The options are:

  • Use the O_SYNC flag in open() — every write waits for the disk to confirm
  • Call fsync(fd) after writing — forces all pending writes to disk
  • Call fdatasync(fd), which works like fsync() but skips metadata updates (such as timestamps) that are not needed to read the data back
/* Write to file */
write(fd, data, len);

/* Force kernel to flush this file to disk NOW */
if (fsync(fd) == -1) perror("fsync");
/* If fsync returned 0, the data has reached the disk */
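Both write() and fsync() can fail, so a durable write is only confirmed when every step succeeds. The following sketch combines them. The helper names write_durably and save_file are hypothetical, and this is one reasonable arrangement rather than a definitive recipe.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: writes len bytes (looping over partial writes), then
   asks the kernel to flush the file. Returns 0 only if both steps succeed. */
int write_durably(int fd, const void *data, size_t len) {
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) return -1;
        p += n;
        len -= (size_t)n;
    }
    return fsync(fd);   /* -1 means the data may NOT have reached the disk */
}

/* Hypothetical wrapper: creates or truncates `path` and stores the data. */
int save_file(const char *path, const void *data, size_t len) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;
    int rc = write_durably(fd, data, len);
    if (close(fd) == -1) rc = -1;
    return rc;
}
```

A call such as save_file("notes.txt", "abc", 3) returns 0 only after the flush has been confirmed. The price is latency, since fsync() waits for the storage device.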

Practical write() Example — Writing to a File
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int main() {
    /* Create (or overwrite) a file */
    int fd = open("greeting.txt",
                  O_WRONLY | O_CREAT | O_TRUNC,
                  0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    const char *msg = "Hello, Linux!\n";
    ssize_t n = write(fd, msg, strlen(msg));

    if (n == -1) {
        perror("write");
        close(fd);
        return 1;
    }

    printf("Wrote %zd bytes\n", n);  /* %zd is the format for ssize_t */
    close(fd);
    return 0;
}

Combining read() and write() — A File Copy Program

The Simplest cp Command You Can Write

The classic example that demonstrates both read() and write() together is a simple file copy program. The logic is straightforward: read a chunk from the source file, write that chunk to the destination file, repeat until no more data.

#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

#define BUF_SIZE 4096  /* Read/write 4KB at a time */

int main(int argc, char *argv[]) {
    if (argc != 3) {
        fprintf(stderr, "Usage: %s source dest\n", argv[0]);
        return 1;
    }

    /* Open source for reading */
    int src = open(argv[1], O_RDONLY);
    if (src == -1) { perror("open source"); return 1; }

    /* Create destination */
    int dst = open(argv[2],
                   O_WRONLY | O_CREAT | O_TRUNC,
                   0644);
    if (dst == -1) { perror("open dest"); close(src); return 1; }

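    char buf[BUF_SIZE];
    ssize_t n;
    int status = 0;

    /* Loop: read a chunk, write the same chunk */
    while ((n = read(src, buf, BUF_SIZE)) > 0) {
        ssize_t off = 0;
        /* write() may accept fewer bytes than offered, so keep going */
        while (off < n) {
            ssize_t w = write(dst, buf + off, n - off);
            if (w == -1) { perror("write"); status = 1; break; }
            off += w;
        }
        if (status != 0) break;
    }

    if (n == -1) { perror("read"); status = 1; }

    close(src);
    close(dst);
    return status;  /* non-zero if the copy failed */
}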

/* Run as: ./copy source.txt destination.txt */
Copy Loop — How Data Flows
source.txt                  Your Program                destination.txt
┌──────────┐  read(src,    ┌────────────┐  write(dst,   ┌──────────────┐
│ Hello Li │ ────────────► │ buf[4096]  │ ────────────► │ Hello Li     │
│ nux! Thi │  buf, 4096)   │ “Hello Li  │  buf, n)      │ nux! Thi     │
│ s is…    │               │  nux!…”    │               │ s is…        │
└──────────┘               └────────────┘               └──────────────┘

Loop repeats until read() returns 0 (EOF)

A Note on Data Types — size_t vs ssize_t

Why Two Different Types?

You will notice that read() and write() use two different integer types:

  • size_t — used for the count parameter (how many bytes you want). This is an unsigned integer because you cannot ask to read a negative number of bytes.
  • ssize_t — used for the return value. This is a signed integer because it needs to represent both a byte count (positive) and the error value -1. The ‘s’ stands for signed.

Always use ssize_t for your return value variable — if you use size_t or int you can get subtle bugs when comparing against -1.

/* CORRECT type for return value: */
ssize_t n = read(fd, buf, 100);
if (n == -1) { /* error */ }
if (n == 0)  { /* EOF */ }

/* WRONG — using size_t hides the error value -1 */
size_t n = read(fd, buf, 100);
if (n == (size_t)-1) { /* this works but is confusing — don't do this */ }
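The bug is easiest to see with a read() that is guaranteed to fail. In this sketch, descriptor -1 is used because it is never valid; the two function names are hypothetical. Both functions run the same "did we get data?" check and differ only in the variable's type.

```c
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a failed read() passes the "n > 0" check, 0 otherwise. */
int error_looks_like_data_signed(void) {
    char buf[4];
    ssize_t n = read(-1, buf, sizeof(buf));         /* fails: n == -1 */
    return n > 0;        /* 0: the error is not mistaken for data */
}

int error_looks_like_data_unsigned(void) {
    char buf[4];
    size_t n = (size_t)read(-1, buf, sizeof(buf));  /* -1 wraps to SIZE_MAX */
    return n > 0;        /* 1: the error passes as an enormous read */
}
```

With size_t, the failed call looks like a successful read of billions of bytes, and any code that then indexes the buffer by n goes far out of bounds.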

Key Takeaways from This Post

  • read() copies bytes from a file into your buffer — you must allocate the buffer yourself
  • read() returns: positive count (data read), 0 (end of file), or -1 (error)
  • Partial reads are normal — use a loop if you need to read exactly N bytes
  • read() does NOT add a null byte — add it manually if treating the result as a C string
  • write() copies bytes from your buffer into the kernel’s page cache — NOT necessarily to disk
  • Partial writes can happen — use a loop for non-disk I/O targets
  • Use O_SYNC or fsync() if you need a guarantee that data is on disk
  • Use ssize_t (signed) for read/write return values, not size_t

Up Next: Closing Files with close()

After you are done reading and writing, you must close the file. The next post covers close() — why it matters more than you might think, and what can go wrong if you skip it.
