Linux File I/O: Advanced Concepts – Part 1

Atomicity · fcntl() · File Status Flags · File Descriptor Internals

⚛️ Atomicity: race-free ops
🔧 fcntl(): file control
🗂️ fd / OFD / inode: kernel internals
🎯 Interview Ready: key Linux concepts

Why This Matters for Interviews

When interviewers ask “what happens when two processes write to the same file simultaneously?” or “how does shell redirection work internally?”, they expect answers rooted in kernel data structures. This post breaks down the exact kernel mechanisms behind file I/O in simple language, with real code examples. No jargon without explanation.

🔑 Key Terms Covered

Atomicity · Race Condition · O_EXCL · O_APPEND · fcntl() · F_GETFL / F_SETFL · File Descriptor · Open File Description · i-node · dup() / dup2()

⚛️ 1. Atomicity – The Cornerstone of Safe File Operations

What is Atomicity?

An atomic operation is one that either completes fully or does not happen at all: no other process or thread can observe an in-between state. The kernel carries out the steps of a single system call as one uninterrupted operation as far as other processes are concerned, so nothing can sneak in between those steps. The trouble starts when a task needs two or more system calls, because another process can run in the gap between them.

Think of it like a bank transfer: money is either moved from A to B completely, or the transfer never happened. A partial transfer (money leaves A but never reaches B) would be a disaster — and that is exactly what a race condition causes in files.

Race Conditions – A Concrete Problem

A race condition happens when two processes share a resource and the final result depends on whichever one gets CPU time first. It is unpredictable and dangerous.

Race Condition: Two Processes Creating the Same File
Process A: open("log.txt", O_WRONLY) → fails with ENOENT (file absent)
⚠ CPU switches to Process B here
Process B: open("log.txt", O_WRONLY) → fails with ENOENT (file absent)
Process B: creat("log.txt", 0644) → ✅ creates file
Process A: creat("log.txt", 0644) → ✅ also "creates" it (and truncates whatever B has written so far)!
❌ Both processes think they are the exclusive creator. WRONG!

🩹 Fix: O_CREAT | O_EXCL – Atomic File Creation

Using O_CREAT | O_EXCL together in open() makes the check-and-create a single unbreakable step. If the file exists, open() returns an error (EEXIST) immediately — no gap for another process to sneak in.

/*
 * safe_create.c
 * Demonstrates atomic file creation using O_CREAT | O_EXCL.
 * Only one process/run will succeed in creating "lock.txt".
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

int main(void)
{
    int fd;

    /* O_EXCL guarantees: check + create happen as ONE atomic step */
    fd = open("lock.txt", O_WRONLY | O_CREAT | O_EXCL, 0644);

    if (fd == -1) {
        if (errno == EEXIST)
            printf("File already exists — someone else got here first.\n");
        else
            perror("open");
        exit(EXIT_FAILURE);
    }

    printf("PID %ld: I am the exclusive creator!\n", (long)getpid());
    if (write(fd, "locked\n", 7) != 7)   /* report short or failed writes */
        perror("write");
    close(fd);
    return 0;
}
🧠 Interview Tip: “Why do we need O_EXCL? Can’t we just check with stat() first?” — No! Between stat() and open() there is a window where another process can create the file. O_EXCL eliminates that window.

🩹 Fix: O_APPEND – Atomic Appends to a Shared File

Multiple processes writing to a shared log file face the same race condition. Without O_APPEND, a process must seek to the end and then write — two separate steps that can interleave badly.

O_APPEND makes seek-to-end + write into a single atomic operation. The kernel moves to end-of-file and writes before any other process can interfere.

/*
 * shared_logger.c
 * Multiple processes safely appending to the same log file.
 * Compile: gcc shared_logger.c -o shared_logger
 * Run two instances simultaneously to see safe interleaving.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int main(void)
{
    int fd;
    char msg[64];

    /* O_APPEND: every write() atomically goes to end of file */
    fd = open("shared.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1) { perror("open"); exit(1); }

    for (int i = 0; i < 5; i++) {
        snprintf(msg, sizeof(msg), "PID %ld: entry %d\n", (long)getpid(), i);
        if (write(fd, msg, strlen(msg)) == -1)   /* don't ignore write errors */
            perror("write");
        usleep(10000); /* small delay to simulate work */
    }

    close(fd);
    return 0;
}
📌 Note: O_APPEND atomicity is not guaranteed on every file system. On NFS in particular, the client has to simulate the append, which can corrupt the file when several processes append at once. In that case, you need explicit file locking (fcntl() locks).

🔧 2. fcntl() – The Swiss Army Knife for Open Files

fcntl() (file control) is a multipurpose system call that performs various control operations on an already open file descriptor. Think of it as a remote control for your open file.

#include <fcntl.h>

int fcntl(int fd, int cmd, ... /* arg */);

/* Returns: value depends on cmd, -1 on error */

Common fcntl() Commands

| Command | Purpose | Returns |
| --- | --- | --- |
| F_GETFL | Get open file status flags | Flag bitmask |
| F_SETFL | Set open file status flags | 0 on success |
| F_DUPFD | Duplicate file descriptor | New fd number |
| F_GETLK / F_SETLK | Get / set file lock | Lock info (via struct flock) / 0 |

🚩 3. Reading and Changing Open File Status Flags

When you call open(), you pass flags like O_RDONLY, O_APPEND, O_NONBLOCK. These are stored by the kernel. You can read them back later using fcntl(F_GETFL) and even modify some of them using fcntl(F_SETFL) — without closing and reopening the file.

Flags You Can Modify After open()

Modifiable vs Read-Only Status Flags via F_SETFL

✅ Can modify with F_SETFL:
• O_APPEND – auto seek to end before each write
• O_NONBLOCK – non-blocking I/O
• O_ASYNC – signal-driven I/O
• O_DIRECT – bypass kernel buffer cache
• O_NOATIME – don’t update access time

❌ Cannot modify (ignored by F_SETFL):
• O_RDONLY – access mode (read-only)
• O_WRONLY – access mode (write-only)
• O_RDWR – access mode (read+write)
• O_CREAT – create if absent
• O_EXCL – exclusive creation
• O_TRUNC – truncate on open

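Practical Example: Enabling a Flag on an Already-Open fd

A common real-world use: you receive a file descriptor from another source (e.g., a pipe or socket) and need to enable O_NONBLOCK on it. You cannot re-open it, so you use fcntl(). The program below applies the same read-modify-write pattern to O_APPEND on a regular file, where the change is easy to observe.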

/*
 * toggle_flags.c
 * Shows how to read and modify file status flags using fcntl().
 * Use case: enabling O_APPEND on an fd obtained from elsewhere.
 * Compile: gcc toggle_flags.c -o toggle_flags
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

void print_flags(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) { perror("fcntl F_GETFL"); return; }

    printf("  O_APPEND   : %s\n", (flags & O_APPEND)   ? "ON" : "OFF");
    printf("  O_NONBLOCK : %s\n", (flags & O_NONBLOCK) ? "ON" : "OFF");

    /* Access mode needs a mask because it occupies 2 bits */
    int mode = flags & O_ACCMODE;
    if (mode == O_RDONLY) printf("  Access mode: READ-ONLY\n");
    else if (mode == O_WRONLY) printf("  Access mode: WRITE-ONLY\n");
    else printf("  Access mode: READ-WRITE\n");
}

int main(void)
{
    int fd = open("demo.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); exit(1); }

    printf("=== Flags BEFORE modification ===\n");
    print_flags(fd);

    /* Step 1: read current flags */
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) { perror("fcntl F_GETFL"); exit(1); }
    /* Step 2: turn ON O_APPEND */
    flags |= O_APPEND;
    /* Step 3: write flags back */
    if (fcntl(fd, F_SETFL, flags) == -1) { perror("fcntl F_SETFL"); exit(1); }

    printf("\n=== Flags AFTER enabling O_APPEND ===\n");
    print_flags(fd);

    close(fd);
    return 0;
}
⚠️ Gotcha: Always use the read-modify-write pattern (F_GETFL → modify bits → F_SETFL). Never just pass a hardcoded value to F_SETFL — you will wipe out other flags the kernel or the program had set.

🗂️ 4. Inside the Kernel: fd → Open File Description → i-node

This is one of the most important concepts for Linux interviews. Most developers think “file descriptor = file”. That is only part of the story. The kernel uses three separate layers between your program and the actual file data.

The Three Layers Explained Simply

Kernel Data Structures for File I/O

Layer 1: File Descriptor Table (per-process)
• One entry per open fd (0, 1, 2, 3 …)
• Stores: close-on-exec flag
• Stores: pointer → OFD entry
Think of it as your process’s “phone book” of open files.

Layer 2: Open File Description Table (system-wide)
• Shared across all processes
• Stores: current file offset
• Stores: access mode (r/w/rw)
• Stores: status flags (O_APPEND…)
• Stores: pointer → i-node
Created fresh each time open() is called.

Layer 3: i-node Table (system-wide, on-disk)
• One entry per physical file
• Stores: file type, permissions
• Stores: file size, timestamps
• Stores: list of locks
• Stores: disk block locations
Persists on disk; loaded into RAM when file is opened.

Visual: How It All Connects

fd → OFD → i-node Relationships (Two Processes)

Process A – fd table
fd 0 → OFD #10
fd 1 → OFD #20
fd 4 → OFD #20 ⬅ same!

Open File Description Table (kernel)
OFD #10 offset=0, r/w → inode #7
OFD #20 offset=512, r/w → inode #9
OFD #30 offset=0, r/o → inode #9

i-node Table
inode #7 /home/user/a.txt
inode #9 /var/log/app.log

Process B – fd table
fd 0 → OFD #30

Key Rules That Follow From This Architecture

What Is Shared vs What Is Private
| Property | Where Stored | Shared Between dup’d fds? |
| --- | --- | --- |
| File offset (position) | Open File Description | ✅ YES – both fds see same offset |
| Status flags (O_APPEND etc.) | Open File Description | ✅ YES – changing one affects both |
| Close-on-exec flag | fd table entry | ❌ NO – private to each fd |
| File size, permissions, timestamps | i-node | ✅ YES – same physical file |

Practical Demo: Shared File Offset Between Duplicated fds

/*
 * shared_offset.c
 * When two file descriptors point to the same OFD,
 * reading via one fd advances the offset for the other too.
 * Compile: gcc shared_offset.c -o shared_offset
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
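    /* Create a small test file */
    int wfd = open("test.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (wfd == -1) { perror("open"); exit(1); }
    if (write(wfd, "ABCDEFGHIJ", 10) != 10) { perror("write"); exit(1); }
    close(wfd);

    /* Open the file: one OFD is created */
    int fd1 = open("test.txt", O_RDONLY);
    if (fd1 == -1) { perror("open"); exit(1); }

    /* dup() creates fd2 pointing to the SAME OFD */
    int fd2 = dup(fd1);
    if (fd2 == -1) { perror("dup"); exit(1); }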

    char buf[4] = {0};

    /* Read 3 bytes via fd1 — file offset in OFD moves to 3 */
    read(fd1, buf, 3);
    printf("fd1 read: %s\n", buf);  /* prints: ABC */

    /* Now read via fd2 — it picks up at offset 3 (not 0!) */
    read(fd2, buf, 3);
    printf("fd2 read: %s\n", buf);  /* prints: DEF */

    /* Compare: open file AGAIN — this creates a NEW OFD */
    int fd3 = open("test.txt", O_RDONLY);
    read(fd3, buf, 3);
    printf("fd3 read: %s\n", buf);  /* prints: ABC — independent offset */

    close(fd1); close(fd2); close(fd3);
    return 0;
}
🧠 Interview Question: “If I dup() a file descriptor and write via the new fd, does the original fd’s position change?” — Yes! Because both fds share the same Open File Description, and the offset lives there.

📋 5. Quick Reference – System Calls Covered
System Call Summary: Part 1

| System Call / Flag | Purpose | Key Point |
| --- | --- | --- |
| open(O_CREAT\|O_EXCL) | Exclusive file creation | Atomic check+create; fails with EEXIST |
| open(O_APPEND) | Safe multi-process append | Atomic seek-to-end + write |
| fcntl(F_GETFL) | Read current file flags | Returns bitmask; mask with O_ACCMODE for access mode |
| fcntl(F_SETFL) | Modify open file flags | Only modifiable flags take effect; always read-modify-write |
| dup(fd) | Duplicate file descriptor | New fd shares same OFD (offset + flags) |

📌 Coming Up in Part 2

dup() / dup2() in depth · pread() & pwrite() · Scatter-Gather I/O (readv/writev) · Non-blocking I/O · Large File Support · /dev/fd · Temporary Files
