Why doing two steps is never the same as doing one – and how the kernel protects you
| Topic | Focus |
|---|---|
| Atomicity | What it means |
| Race Condition | How it happens |
| O_EXCL | Safe file creation |
| O_APPEND | Safe shared writes |
What You Will Learn in This Post
Before diving into advanced file I/O calls, you need to understand one foundational idea: atomicity. It is the reason certain open() flags exist, and ignoring it leads to bugs that are nearly impossible to reproduce. In this post we will start from a very simple story, build up the concept step by step, and then look at two classic problems in Linux file I/O — creating a file exclusively and appending to a shared file — and how the kernel solves both with a single flag.
The word atomic literally means “cannot be cut”. In computer science an atomic operation is one that runs to completion without any other process being able to observe or interrupt it midway. It either finishes completely, or it never started. There is no visible in-between state.
For the operations in this post (creating a file with open(), appending a record with write()), the kernel carries out a single system call as one indivisible step. But two consecutive system calls are NOT atomic together. Between any two system calls, the Linux scheduler is perfectly allowed to pause your process and run something else. That gap is where trouble lives.
A Simple Everyday Analogy
Imagine a shared whiteboard in an office. Two people want to write their name on it only if no one else has. The steps are: (1) look at the board, (2) write your name if it is blank.
If Person A looks, sees it is blank, then Person B also looks, sees it is blank, and both write simultaneously — both believe they are the only writer. That is a race condition born from a non-atomic check-then-act.
The solution is a lock on the whiteboard that lets only one person see-and-write as a single indivisible step. That is exactly what O_CREAT | O_EXCL does for files.
| ❌ Non-Atomic (two steps) | ✅ Atomic (one step) |
|---|---|
| Step 1: Does file exist? | Step 1+2 combined: check if absent AND create, together. |
| ← OS can pause here! | No pause possible between the check and the create. |
| Step 2: If no → create file | Safe: exactly what … |
| Problem: Another process can … | |
A race condition happens when two or more processes (or threads) access shared data at the same time, and the final outcome depends on which one gets there first. The behaviour becomes unpredictable — it may work correctly most of the time but fail occasionally, making it one of the hardest bugs to find and fix.
Why Processes Get Paused Between Calls
Linux uses preemptive scheduling. Every process is given a small time slice (a few milliseconds). When the slice runs out, the OS saves the process state and runs something else. Your process has no control over when this happens — it can be paused between any two instructions, including between two consecutive system calls in your code.
| Time | Process A | Process B |
|---|---|---|
| T1 | Checks: does app.log exist? → No | waiting… |
| T2 | ⚡ OS pauses Process A here | ⚡ OS runs Process B |
| T3 | waiting… | Checks: does app.log exist? → No |
| T4 | waiting… | Creates app.log → success ✅ |
| T5 | Creates app.log → also “succeeds” ✅ | waiting… |

❌ Both A and B believe they exclusively created the file. The lock is broken.
Notice that this bug is nearly impossible to reproduce reliably. It only happens if the OS pauses Process A at exactly that moment. In testing it may never appear. In production under load it may cause corruption. This is the defining trait of a race condition.
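Because the scheduler rarely pauses a process at exactly the wrong moment, it can help to force the pause. The following is a minimal sketch, not part of the original examples: `count_creators` is a hypothetical helper that forks two children and uses a pair of pipes to hold both of them in the gap between the check and the create, so the timeline in the table above plays out on every run. The pipe-based barrier is an assumption made for the demonstration; real code would never contain it.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: two children each run "check, then create" on `path`.
 * Both are held in the gap until both have finished the check.
 * Returns how many children believed they created the file, or -1 on error.
 */
int count_creators(const char *path, int open_flags)
{
    assert(path != NULL);

    int checked[2], go[2];
    if (pipe(checked) == -1)
        return -1;
    if (pipe(go) == -1) {
        close(checked[0]);
        close(checked[1]);
        return -1;
    }
    unlink(path);                               /* start from "file absent" */

    pid_t kids[2];
    int started = 0;
    for (int i = 0; i < 2; i++) {
        pid_t pid = fork();
        if (pid == -1)
            break;
        if (pid == 0) {
            struct stat st;
            char token = 'c';
            int absent = (stat(path, &st) == -1);       /* step 1: check */
            if (write(checked[1], &token, 1) != 1)      /* report: check done */
                _exit(2);
            if (read(go[0], &token, 1) != 1)            /* held inside the gap */
                _exit(2);
            if (!absent)
                _exit(1);
            int fd = open(path, open_flags, 0644);      /* step 2: act */
            _exit(fd == -1 ? 1 : 0);
        }
        kids[started++] = pid;
    }

    /* Wait until every child has done its check, then release them together. */
    char token;
    for (int i = 0; i < started; i++)
        if (read(checked[0], &token, 1) != 1)
            break;
    ssize_t released = write(go[1], "gg", (size_t)started);
    (void)released;
    close(checked[0]);
    close(checked[1]);
    close(go[0]);
    close(go[1]);                               /* EOF for any child still waiting */

    int winners = 0;
    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(kids[i], &status, 0) > 0 &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            winners++;
    }
    unlink(path);
    return started == 2 ? winners : -1;
}
```

Called with `O_WRONLY | O_CREAT`, both children report success. Adding `O_EXCL` to the flags leaves the forced pause in place but lets only one child through, because the kernel repeats the existence check inside open() itself.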
The open() system call accepts a combination of flags. Two of them, when used together, solve the exclusive creation problem atomically:
| Flag | What it does |
|---|---|
| O_CREAT | Create the file if it does not exist. If it already exists, open it normally (no error). |
| O_EXCL | Combined with O_CREAT: if the file already exists, fail with EEXIST. Without O_CREAT, the behaviour of O_EXCL is generally undefined (Linux makes one exception, for block devices), so always pair the two. |
| O_CREAT \| O_EXCL | Atomic check-and-create: the kernel checks existence AND creates in one unbreakable step. Exactly one caller succeeds even with 1000 processes racing simultaneously. On NFS this requires NFSv3 or later. |
Example 1 — Bad Way (Two Separate Steps)
First, here is the broken approach so you understand exactly why it is wrong:

    /*
     * bad_create.c: DO NOT USE this pattern
     * This has a race condition between the stat() and the open().
     * Build: gcc bad_create.c -o bad_create
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/stat.h>

    int main(void)
    {
        struct stat info;

        /* Step 1: check if file already exists */
        if (stat("mylock.txt", &info) == 0) {
            printf("File exists, someone else is using it.\n");
            return 1;
        }

        /*
         * ===================================================
         * THE GAP: OS can switch to another process right here.
         * The other process runs the same code, also passes
         * the stat() check above, and creates the file first.
         * When our process resumes, it creates the file too.
         * Both processes now believe they own the lock.
         * ===================================================
         */

        /* Step 2: create the file */
        int fd = open("mylock.txt", O_WRONLY | O_CREAT, 0644);
        if (fd == -1) {
            perror("open");
            return 1;
        }

        printf("PID %d: I think I created the lock file.\n", getpid());
        /* But another process may have the same belief! */

        close(fd);
        return 0;
    }

Example 2 — Correct Way (Atomic O_CREAT | O_EXCL)
Now the fixed version. The entire check-and-create is inside a single system call:

    /*
     * good_create.c: Correct exclusive file creation
     * Uses O_CREAT | O_EXCL for atomic check-and-create.
     * Build: gcc good_create.c -o good_create
     * Test : run two terminals simultaneously to see only one wins.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <errno.h>

    int main(void)
    {
        /*
         * O_CREAT | O_EXCL together:
         * - If the file does not exist → create it, return a valid fd.
         * - If the file already exists → return -1, errno = EEXIST.
         *
         * The kernel does the check and the create in one atomic step.
         * No other process can sneak in between them.
         */
        int fd = open("mylock.txt",
                      O_WRONLY | O_CREAT | O_EXCL,
                      0644);
        if (fd == -1) {
            if (errno == EEXIST) {
                /* Someone else created it first; that is expected */
                printf("PID %d: File already exists. Another process owns it.\n",
                       getpid());
            } else {
                /* Some unexpected error */
                perror("open");
            }
            return 1;
        }

        /* Only ONE process will ever reach here while the file exists */
        printf("PID %d: I created the lock file exclusively.\n", getpid());

        /* Write our PID so others know who holds it */
        char buf[32];
        int n = snprintf(buf, sizeof(buf), "owner_pid=%d\n", getpid());
        if (write(fd, buf, (size_t)n) != n) {
            perror("write");
        }

        /* Simulate doing critical work */
        sleep(3);

        /* Cleanup: close and remove the lock */
        close(fd);
        unlink("mylock.txt");
        printf("PID %d: Done. Lock removed.\n", getpid());
        return 0;
    }

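The failure path can also be observed from a single process, which is convenient for an automated check. The sketch below is an illustration only: `second_exclusive_open_errno` is a hypothetical helper that creates a file exclusively, repeats the same open() while the file still exists, and reports the errno of that second attempt.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the errno of a second O_CREAT | O_EXCL open
 * on `path` (EEXIST is the expected value), 0 if the second open
 * unexpectedly succeeded, or -1 if the first open failed.
 */
int second_exclusive_open_errno(const char *path)
{
    assert(path != NULL);
    unlink(path);                       /* start from "file absent" */

    int first = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (first == -1)
        return -1;

    int result = 0;
    int second = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (second == -1)
        result = errno;                 /* capture before any other call */
    else
        close(second);

    close(first);
    unlink(path);
    return result;
}
```

The second call fails even though it comes from the same process that created the file: O_EXCL looks only at whether the path exists, not at who created it.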
Run ./good_create in two terminals at the same time. You will see exactly one process win and the other immediately print “File already exists.” No matter how many times you try, only one process creates the file.
Example 3 — Lock File with Retry Loop
In real programs you usually want to wait if the lock is held rather than give up immediately. Here is a practical pattern with a timeout:

    /*
     * lock_retry.c: Lock file with timeout and retry
     * Practical pattern used by build systems, backup tools, etc.
     * Build: gcc lock_retry.c -o lock_retry
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <errno.h>

    #define LOCK_FILE   "/tmp/my_app.lock"
    #define MAX_RETRIES 10      /* try up to 10 times */
    #define RETRY_WAIT  500000  /* wait 500 ms between retries */

    /* Returns 0 on success (lock acquired), -1 on failure */
    int acquire_lock(void)
    {
        for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
            /* Atomic check-and-create: the only safe way */
            int fd = open(LOCK_FILE,
                          O_WRONLY | O_CREAT | O_EXCL,
                          0600);
            if (fd != -1) {
                /* We got the lock */
                char info[64];
                int n = snprintf(info, sizeof(info), "pid=%d\n", getpid());
                if (write(fd, info, (size_t)n) != n) {
                    perror("write");    /* lock is still held; only the note failed */
                }
                close(fd);
                printf("Lock acquired on attempt %d.\n", attempt);
                return 0;
            }

            if (errno == EEXIST) {
                /* Lock is held by someone else: wait and retry */
                printf("Attempt %d: lock busy, waiting 500ms...\n", attempt);
                usleep(RETRY_WAIT);
            } else {
                /* Unexpected error */
                perror("open");
                return -1;
            }
        }

        printf("Could not acquire lock after %d attempts.\n", MAX_RETRIES);
        return -1;
    }

    void release_lock(void)
    {
        unlink(LOCK_FILE);
        printf("Lock released.\n");
    }

    int main(void)
    {
        if (acquire_lock() != 0) {
            fprintf(stderr, "Cannot proceed without lock.\n");
            return 1;
        }

        printf("PID %d doing exclusive work...\n", getpid());
        sleep(2);

        release_lock();
        return 0;
    }

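Since the lock file records the holder's PID, a waiting process can report who it is waiting for. Here is a minimal sketch under stated assumptions: `read_lock_owner` is a hypothetical helper, and it expects the `pid=<number>` format that lock_retry.c writes (good_create.c uses `owner_pid=` instead and would need a different pattern).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: reads a lock file containing "pid=<number>\n"
 * and returns that PID. Returns -1 if the file cannot be read, is still
 * empty, or does not match the expected format.
 */
long read_lock_owner(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    char buf[64];
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    close(fd);
    if (n <= 0)
        return -1;                      /* unreadable, or nothing written yet */
    buf[n] = '\0';

    long pid;
    if (sscanf(buf, "pid=%ld", &pid) != 1)
        return -1;
    return pid;
}
```

Note that creating the lock file and writing the PID into it are two separate system calls, so a reader can briefly see an empty file. The helper treats that case as "owner unknown" rather than as an error in the lock itself.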
There is a second classic race condition that catches developers off guard: multiple processes writing to the same file. The most common scenario is a shared log file where each process appends lines.
Why lseek + write is Broken for Shared Files
Appending means: move to the end of the file, then write. If you do this with two separate calls — lseek(fd, 0, SEEK_END) then write() — another process can write between your seek and your write, leaving both writes at the same offset and one overwriting the other.
| Time | Logger A | Logger B | File state |
|---|---|---|---|
| T1 | lseek(SEEK_END) → offset 200 | waiting… | file size = 200 bytes |
| T2 | ⚡ A paused | ⚡ B runs | |
| T3 | waiting… | lseek(SEEK_END) → offset 200 | still 200 bytes |
| T4 | waiting… | write("B data") at offset 200 | bytes 200-215 = B’s data |
| T5 | write("A data") at offset 200 ← A’s saved offset! | waiting… | bytes 200-215 = A’s data; B’s data is OVERWRITTEN! |
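
The timeline above can be replayed without relying on the scheduler. The sketch below is an illustration, not one of the original examples: `replay_lseek_then_write` is a hypothetical helper that opens the same file twice in one process, so that each descriptor has its own offset just as two separate loggers would, and then issues the calls in the T1 to T5 order. The file starts empty here rather than at 200 bytes.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: replays "lseek, pause, lseek, write, write" with two
 * independent descriptors. Returns the final file size in bytes, -1 on error.
 */
long replay_lseek_then_write(const char *path)
{
    assert(path != NULL);
    const char *a_line = "A data\n";
    const char *b_line = "B data\n";
    struct stat st;
    long result = -1;

    unlink(path);
    int fd_a = open(path, O_WRONLY | O_CREAT, 0644);    /* logger A */
    int fd_b = open(path, O_WRONLY);                    /* logger B */
    if (fd_a == -1 || fd_b == -1)
        goto out;

    if (lseek(fd_a, 0, SEEK_END) == -1) goto out;               /* T1: A seeks */
    if (lseek(fd_b, 0, SEEK_END) == -1) goto out;               /* T3: B seeks */
    if (write(fd_b, b_line, strlen(b_line)) == -1) goto out;    /* T4: B writes */
    if (write(fd_a, a_line, strlen(a_line)) == -1) goto out;    /* T5: A overwrites */

    if (fstat(fd_a, &st) == 0)
        result = (long)st.st_size;
out:
    if (fd_a != -1) close(fd_a);
    if (fd_b != -1) close(fd_b);
    unlink(path);
    return result;
}
```

Two 7-byte records were written, yet the function returns 7: A's write landed on the offset it had saved before B wrote, so B's record is gone.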
The Fix: O_APPEND Makes Seek + Write Atomic
When you open a file with O_APPEND, every single write() call automatically seeks to the current end of file and writes there as one unbreakable operation. You never need to call lseek() manually. On a local filesystem, even with a thousand processes writing simultaneously, every write lands after the previous one with no overlap. The open(2) manual page warns that this does not hold on NFS, where concurrent appenders can still corrupt the file.
| ❌ Broken (lseek + write) | ✅ Correct (O_APPEND) |
|---|---|
| lseek(fd, 0, SEEK_END); | /* seek+write is atomic */ |
| /* ← gap here! */ | write(fd, line, len); /* no lseek needed */ |
| write(fd, line, len); | |
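
To see that O_APPEND really ignores a stale offset, the same two-descriptor trick can be used. This is a minimal sketch with a hypothetical helper name, `replay_with_o_append`: both descriptors are opened with O_APPEND, and descriptor A is deliberately moved back to offset 0 before it writes.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: two O_APPEND descriptors write one record each.
 * A's offset is deliberately stale when it writes.
 * Returns the final file size in bytes, or -1 on error.
 */
long replay_with_o_append(const char *path)
{
    assert(path != NULL);
    const char *a_line = "A data\n";
    const char *b_line = "B data\n";
    struct stat st;
    long result = -1;

    unlink(path);
    int fd_a = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    int fd_b = open(path, O_WRONLY | O_APPEND);
    if (fd_a == -1 || fd_b == -1)
        goto out;

    if (write(fd_b, b_line, strlen(b_line)) == -1) goto out;    /* B appends */
    if (lseek(fd_a, 0, SEEK_SET) == -1) goto out;               /* stale offset */
    if (write(fd_a, a_line, strlen(a_line)) == -1) goto out;    /* still appends */

    if (fstat(fd_a, &st) == 0)
        result = (long)st.st_size;
out:
    if (fd_a != -1) close(fd_a);
    if (fd_b != -1) close(fd_b);
    unlink(path);
    return result;
}
```

This time the function returns 14: both 7-byte records survive, because the kernel repositions to the end of the file as part of each write().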
Example 1 — Single Process Writing a Log (Understanding O_APPEND)

    /*
     * append_single.c: Understand O_APPEND with one process first
     * Build: gcc append_single.c -o append_single
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>

    int main(void)
    {
        /*
         * O_APPEND: every write() atomically moves to end-of-file
         * and then writes. Even if the file was grown by another
         * process since our last write, we always append correctly.
         */
        int fd = open("app.log",
                      O_WRONLY | O_CREAT | O_APPEND,
                      0644);
        if (fd == -1) {
            perror("open");
            return 1;
        }

        /* Notice: we never call lseek(); O_APPEND handles positioning */
        const char *lines[] = {
            "INFO : Server started\n",
            "DEBUG : Listening on port 8080\n",
            "INFO : Connection accepted\n",
            NULL
        };

        for (int i = 0; lines[i] != NULL; i++) {
            ssize_t len = strlen(lines[i]);
            ssize_t written = write(fd, lines[i], len);
            if (written != len) {
                fprintf(stderr, "Warning: partial write at line %d\n", i);
            }
        }

        /* Check where the file offset ended up */
        off_t pos = lseek(fd, 0, SEEK_CUR);
        printf("Current file offset after writes: %ld bytes\n", (long)pos);

        close(fd);
        printf("Done. Check app.log\n");
        return 0;
    }

Example 2 — Multiple Processes Appending Safely
This example forks several child processes that all write to the same log file. With O_APPEND, all entries are preserved. Run it and count the lines — they will always match.

    /*
     * multiproc_logger.c: 5 child processes all log to the same file safely
     * Build: gcc multiproc_logger.c -o multiproc_logger
     * Run:   ./multiproc_logger
     *        wc -l shared.log   (should always print 50 = 5 procs × 10 lines)
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <sys/wait.h>

    #define NUM_WORKERS    5
    #define LINES_PER_PROC 10
    #define LOG_FILE       "shared.log"

    /* Remove old log so we start fresh */
    void reset_log(void)
    {
        unlink(LOG_FILE);
    }

    /* Each worker writes LINES_PER_PROC entries then exits */
    void worker(int worker_id)
    {
        /*
         * Open with O_APPEND. Every write() is atomic w.r.t. end-of-file.
         * Safe even when 5 workers share this same file path.
         */
        int fd = open(LOG_FILE, O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd == -1) {
            perror("worker open");
            exit(1);
        }

        for (int line = 1; line <= LINES_PER_PROC; line++) {
            char entry[80];
            int n = snprintf(entry, sizeof(entry),
                             "worker=%d pid=%d line=%02d\n",
                             worker_id, getpid(), line);
            /* Single write() = atomic append. Safe. */
            if (write(fd, entry, n) != n) {
                fprintf(stderr, "worker %d: write error\n", worker_id);
            }
            usleep(5000); /* 5 ms: interleave workers intentionally */
        }

        close(fd);
        exit(0);
    }

    int main(void)
    {
        reset_log();
        printf("Launching %d workers, each writing %d lines...\n",
               NUM_WORKERS, LINES_PER_PROC);
        fflush(stdout); /* otherwise each child re-prints the buffered line on exit */

        /* Fork all workers */
        int launched = 0;
        for (int i = 1; i <= NUM_WORKERS; i++) {
            pid_t pid = fork();
            if (pid == -1) {
                perror("fork");
                continue;
            }
            if (pid == 0) {
                worker(i); /* child never returns from worker() */
            }
            launched++;
        }

        /* Parent waits for every child it actually started */
        for (int i = 0; i < launched; i++) {
            wait(NULL);
        }

        /* Count lines: must equal NUM_WORKERS * LINES_PER_PROC */
        FILE *f = fopen(LOG_FILE, "r");
        if (f == NULL) {
            perror("fopen");
            return 1;
        }
        int count = 0;
        char line[80];
        while (fgets(line, sizeof(line), f)) count++;
        fclose(f);

        int expected = NUM_WORKERS * LINES_PER_PROC;
        printf("Lines written: %d Expected: %d %s\n",
               count, expected,
               (count == expected) ? "✅ CORRECT" : "❌ DATA LOST");
        return 0;
    }

Each individual write() call is atomic. If your log entry requires two or more write() calls, they are not atomic as a group and could be interleaved. For multi-call log entries, use a single large write() or add file locking.
Example 3 — Proving the Race Without O_APPEND
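Run this example in both modes and compare line counts. Without O_APPEND, lines get overwritten, so the count usually drops below the expected value. Because the race depends on scheduling, the number of lost lines varies from run to run.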

    /*
     * race_vs_safe.c: Side-by-side: broken vs safe appender
     * Build: gcc race_vs_safe.c -o race_vs_safe
     * Usage: ./race_vs_safe broken   (produces a file with missing lines)
     *        ./race_vs_safe safe     (produces a file with all lines)
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <sys/wait.h>

    #define WORKERS 4
    #define WRITES  100

    void run_broken(void)
    {
        unlink("broken_out.txt");
        for (int w = 0; w < WORKERS; w++) {
            if (fork() == 0) {
                int fd = open("broken_out.txt",
                              O_WRONLY | O_CREAT, 0644); /* no O_APPEND */
                if (fd == -1) {
                    perror("open");
                    exit(1);
                }
                for (int i = 0; i < WRITES; i++) {
                    lseek(fd, 0, SEEK_END); /* race lives here */
                    char line[32];
                    int n = snprintf(line, sizeof(line), "w%d-i%d\n", w, i);
                    write(fd, line, n);
                }
                close(fd);
                exit(0);
            }
        }
        for (int i = 0; i < WORKERS; i++) wait(NULL);
    }

    void run_safe(void)
    {
        unlink("safe_out.txt");
        for (int w = 0; w < WORKERS; w++) {
            if (fork() == 0) {
                int fd = open("safe_out.txt",
                              O_WRONLY | O_CREAT | O_APPEND, 0644);
                if (fd == -1) {
                    perror("open");
                    exit(1);
                }
                for (int i = 0; i < WRITES; i++) {
                    char line[32];
                    int n = snprintf(line, sizeof(line), "w%d-i%d\n", w, i);
                    write(fd, line, n); /* atomic append, no lseek needed */
                }
                close(fd);
                exit(0);
            }
        }
        for (int i = 0; i < WORKERS; i++) wait(NULL);
    }

    /* Count the lines in a file; returns -1 if it cannot be opened */
    static int count_lines(const char *path)
    {
        FILE *f = fopen(path, "r");
        if (f == NULL) return -1;
        int n = 0;
        char buf[64];
        while (fgets(buf, sizeof(buf), f)) n++;
        fclose(f);
        return n;
    }

    int main(int argc, char *argv[])
    {
        if (argc < 2) {
            printf("Usage: %s [broken|safe]\n", argv[0]);
            return 1;
        }

        int expected = WORKERS * WRITES;
        int broken = (strcmp(argv[1], "broken") == 0);

        if (broken) run_broken(); else run_safe();

        int n = count_lines(broken ? "broken_out.txt" : "safe_out.txt");
        if (n < 0) {
            perror("fopen");
            return 1;
        }

        if (broken) {
            printf("BROKEN: got %d lines, expected %d (lost %d)\n",
                   n, expected, expected - n);
        } else {
            printf("SAFE: got %d lines, expected %d %s\n",
                   n, expected,
                   n == expected ? "✅ ALL LINES PRESENT" : "❌ LINES MISSING");
        }
        return 0;
    }

| Operation | Atomic? | When to use |
|---|---|---|
| open(O_CREAT \| O_EXCL) | ✅ Yes | Lock files, any “create if absent” pattern |
| stat() then open(O_CREAT) | ❌ No | Never use this two-step pattern for exclusive creation |
| write() with O_APPEND | ✅ Yes | Shared log files, multiple writers to one file |
| lseek(SEEK_END) + write() | ❌ No | Never use this two-step append with shared files |
| Single write() | ✅ Yes | Each individual write() call is atomic by itself |
| Two separate write() calls | ❌ No | Can be interleaved. Use a single write() or locking |
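
For the last row, one way to keep a multi-part log entry in a single system call is writev(), which hands several buffers to the kernel at once. The sketch below swaps in writev() as an alternative to building one large buffer by hand; `log_entry` and the `<tag>: <message>` format are hypothetical, and the descriptor is assumed to have been opened with O_APPEND.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: emits "<tag>: <message>\n" with ONE writev() call,
 * so another process's write cannot land between the pieces.
 * Returns 0 on success, -1 on error or short write.
 */
int log_entry(int fd, const char *tag, const char *message)
{
    assert(tag != NULL && message != NULL);

    struct iovec parts[4];
    parts[0].iov_base = (void *)tag;
    parts[0].iov_len  = strlen(tag);
    parts[1].iov_base = (void *)": ";
    parts[1].iov_len  = 2;
    parts[2].iov_base = (void *)message;
    parts[2].iov_len  = strlen(message);
    parts[3].iov_base = (void *)"\n";
    parts[3].iov_len  = 1;

    size_t total = 0;
    for (int i = 0; i < 4; i++)
        total += parts[i].iov_len;

    /* All four pieces go to the kernel in a single system call. */
    ssize_t written = writev(fd, parts, 4);
    return (written == (ssize_t)total) ? 0 : -1;
}
```

A call such as `log_entry(fd, "INFO", "started")` appends the line `INFO: started`. Issuing the same four pieces as four write() calls would reopen the gap described in the table.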
Next in This Series
Post 2 — fcntl() in depth: reading and changing file flags at runtime, and the three-layer kernel model every Linux developer must know.
