epoll in Action — Complete Working Program

Linux Chapter 63 · Alternative I/O Models · EmbeddedPathashala

Topic
Full epoll Program

Level
Intermediate

Part
3 of 3

Putting It All Together

In the previous two parts you learned how to use epoll_ctl() to register file descriptors and epoll_wait() to receive events. Now let’s build a real program that monitors multiple FIFOs (named pipes) for input simultaneously — something that would be painful to do with blocking reads but is clean and efficient with epoll.

This program demonstrates the complete epoll workflow: create → register → wait loop → handle events → cleanup.

Key Concepts in This Tutorial

complete epoll program · monitoring multiple FIFOs · EINTR handling · EPOLLHUP detection · numOpenFds counter · event loop pattern · SIGCONT

What the Program Does — Step by Step

Program Flow

Create epoll instance
Call epoll_create1(0) to get an epoll file descriptor

Open each FIFO and add to interest list
Open each FIFO path from command-line args, add with EPOLLIN

Enter the event loop — call epoll_wait()
Block until one or more FIFOs have data. Check for EINTR (signal interrupted) and retry.

For each ready fd: check EPOLLIN, EPOLLHUP, EPOLLERR
If EPOLLIN → read and print data. If HUP/ERR → close fd, decrement counter.

Exit when all FIFOs are closed
Track open count with numOpenFds. When it reaches 0, exit the loop.

Complete Program: Monitor Multiple FIFOs with epoll

/* epoll_multi_fifo.c
 * Monitors multiple FIFOs for input using epoll.
 * Usage: ./epoll_multi_fifo /tmp/fifo1 /tmp/fifo2 ...
 *
 * Before running, create FIFOs:
 *   mkfifo /tmp/fifo1 /tmp/fifo2
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>

#define MAX_EVENTS  10
#define BUF_SIZE   512

int main(int argc, char *argv[])
{
    int epfd, fd, nready, i, numOpenFds;
    struct epoll_event ev, evlist[MAX_EVENTS];
    char buf[BUF_SIZE];
    ssize_t n;

    if (argc < 2) {
        fprintf(stderr, "Usage: %s fifo...\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* ---- Step 1: Create epoll instance ---- */
    epfd = epoll_create1(0);
    if (epfd == -1) {
        perror("epoll_create1");
        exit(EXIT_FAILURE);
    }

    /* ---- Step 2: Open each FIFO and add to interest list ---- */
    numOpenFds = 0;
    for (i = 1; i < argc; i++) {
        /* O_RDONLY | O_NONBLOCK: don't block waiting for a writer */
        fd = open(argv[i], O_RDONLY | O_NONBLOCK);
        if (fd == -1) {
            perror(argv[i]);
            continue;
        }

        /* Register fd: watch for readable data */
        ev.events  = EPOLLIN;
        ev.data.fd = fd;     /* store fd so we know it when event fires */

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
            perror("epoll_ctl ADD");
            close(fd);
            continue;
        }

        printf("Opened and watching: %s (fd=%d)\n", argv[i], fd);
        numOpenFds++;
    }

    if (numOpenFds == 0) {
        fprintf(stderr, "No FIFOs could be opened\n");
        exit(EXIT_FAILURE);
    }

    /* ---- Step 3: Event loop ---- */
    while (numOpenFds > 0) {

        /* Block until at least one fd is ready (timeout=-1: wait forever) */
        nready = epoll_wait(epfd, evlist, MAX_EVENTS, -1);
        if (nready == -1) {
            if (errno == EINTR) {
                /*
                 * EINTR means a signal interrupted us (e.g., SIGSTOP + SIGCONT).
                 * This is not an error — just restart the call.
                 */
                printf("epoll_wait interrupted by signal, retrying...\n");
                continue;
            }
            perror("epoll_wait");
            exit(EXIT_FAILURE);
        }

        printf("\n--- epoll_wait() returned %d ready fds ---\n", nready);

        /* ---- Step 4: Process each ready fd ---- */
        for (i = 0; i < nready; i++) {

            fd = evlist[i].data.fd;  /* which fd fired? */

            printf("fd=%d  events: %s%s%s%s\n",
                fd,
                (evlist[i].events & EPOLLIN)  ? "EPOLLIN "  : "",
                (evlist[i].events & EPOLLHUP) ? "EPOLLHUP " : "",
                (evlist[i].events & EPOLLERR) ? "EPOLLERR " : "",
                (evlist[i].events & EPOLLRDHUP) ? "EPOLLRDHUP " : "");

            if (evlist[i].events & EPOLLIN) {
                /* Data is available: read and display it */
                n = read(fd, buf, BUF_SIZE - 1);
                if (n > 0) {
                    buf[n] = '\0';
                    printf("  Data: %s", buf);
                } else if (n == 0) {
                    /* EOF: writer closed the FIFO */
                    printf("  EOF on fd %d\n", fd);
                    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                    close(fd);
                    numOpenFds--;
                } else {
                    perror("  read");
                }
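            } else if (evlist[i].events & (EPOLLHUP | EPOLLERR)) {
                /*
                 * EPOLLHUP: other end of FIFO was closed (no writers left)
                 * EPOLLERR: some I/O error on this fd
                 * Handled only when EPOLLIN is not set: if EPOLLIN and
                 * EPOLLHUP arrive together, unread data may remain, and
                 * closing here would also close the fd twice after EOF.
                 * Once the data is drained, the next epoll_wait() reports
                 * EPOLLHUP alone and we remove and close the fd then.
                 */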
                printf("  HUP or ERR on fd %d — closing\n", fd);
                epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                close(fd);
                numOpenFds--;
            }
        }
    }

    /* ---- Step 5: All FIFOs closed, clean up ---- */
    printf("\nAll FIFOs closed. Exiting.\n");
    close(epfd);
    return 0;
}

Compile: gcc epoll_multi_fifo.c -o epoll_multi_fifo

How to Test This Program

You need two terminal windows. Here is the exact sequence:

Terminal 1 — Run the monitor

# First create the FIFOs
$ mkfifo /tmp/fifo1 /tmp/fifo2

# Run the program (will block waiting for writers)
$ ./epoll_multi_fifo /tmp/fifo1 /tmp/fifo2
Opened and watching: /tmp/fifo1 (fd=4)
Opened and watching: /tmp/fifo2 (fd=5)

Terminal 2 — Send data to FIFOs

# Write to fifo1 (Terminal 1 will wake up)
$ echo "hello from fifo1" > /tmp/fifo1

# Write to fifo2
$ echo "hello from fifo2" > /tmp/fifo2

# Both FIFOs closed, Terminal 1 exits

Expected output in Terminal 1 (fd 3 is the epoll instance itself, so the FIFOs get fd 4 and fd 5):

--- epoll_wait() returned 1 ready fds ---
fd=4  events: EPOLLIN
  Data: hello from fifo1
--- epoll_wait() returned 1 ready fds ---
fd=4  events: EPOLLHUP
  HUP or ERR on fd 4 — closing
--- epoll_wait() returned 1 ready fds ---
fd=5  events: EPOLLIN
  Data: hello from fifo2
...
All FIFOs closed. Exiting.

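Why does EPOLLHUP appear after EPOLLIN? When the writer closes the FIFO after writing, the data it wrote is still in the buffer, so the reader sees EPOLLIN (unread data) as well as EPOLLHUP (the write end is closed). Depending on timing, the two flags may arrive in separate events, as in the output above, or together in a single event (EPOLLIN EPOLLHUP).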
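To see these flags without a second terminal, here is a minimal sketch that uses an anonymous pipe as a stand-in for a FIFO. The helper name events_after_writer_close() is hypothetical, and the behaviour described in the comments is what Linux is expected to report for pipes. It records the event mask once while data is still buffered and once after the data has been read.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: report the epoll event mask for a pipe's read end
 * before and after the buffered data is read, once the writer has closed.
 * Returns 0 on success, -1 if any system call fails.
 */
int events_after_writer_close(uint32_t *before_read, uint32_t *after_read)
{
    int p[2], epfd, rc = -1;
    char buf[16];
    struct epoll_event ev, out;

    assert(before_read != NULL && after_read != NULL);

    epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;
    if (pipe(p) == -1) {
        close(epfd);
        return -1;
    }

    ev.events = EPOLLIN;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "hello", 5) == 5) {
        close(p[1]);            /* writer goes away, data is still buffered */
        p[1] = -1;

        if (epoll_wait(epfd, &out, 1, 0) == 1) {
            *before_read = out.events;
            /* Drain the buffer, then ask epoll again without blocking */
            if (read(p[0], buf, sizeof(buf)) == 5 &&
                epoll_wait(epfd, &out, 1, 0) == 1) {
                *after_read = out.events;
                rc = 0;
            }
        }
    }

    if (p[1] != -1)
        close(p[1]);
    close(p[0]);
    close(epfd);
    return rc;
}
```

Called from a main() of your own, the first mask should contain both EPOLLIN and EPOLLHUP, and the second should still contain EPOLLHUP, which is why the program keeps reading while EPOLLIN is set and closes only afterwards.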

Why EINTR Handling Matters

If a signal is delivered while your program is blocked inside epoll_wait(), the system call is interrupted and returns -1 with errno set to EINTR.

A common scenario: your program is stopped with Ctrl+Z (SIGTSTP) and then resumed with fg (SIGCONT). On Linux, when SIGCONT resumes the process, the sleeping epoll_wait() fails with EINTR even if no signal handler is installed.

❌ Wrong — exits on EINTR

nready = epoll_wait(...);
if (nready == -1) {
    perror("epoll_wait");
    exit(1);  /* BAD: exits for any error */
}

✅ Correct — retries on EINTR

nready = epoll_wait(...);
if (nready == -1) {
    if (errno == EINTR)
        continue; /* just retry */
    perror("epoll_wait");
    exit(1);
}

Why epoll is Better than select/poll

Feature	select()	poll()	epoll
Max fd limit	FD_SETSIZE (typically 1024)	No fixed limit	No fixed limit
Time complexity per call	O(n)	O(n)	O(ready)
Re-register fds each call?	Yes	Yes	No
Returns only ready fds?	No (scans all)	No (scans all)	Yes
Edge-triggered mode	No	No	Yes (EPOLLET)
Portable (POSIX)?	Yes	Yes	Linux only

The key advantage of epoll at large scale: select and poll scan through ALL monitored fds to find ready ones — O(n) cost. epoll only returns the fds that are actually ready — O(ready) cost. With 10,000 connections but only 5 active at any moment, epoll processes 5 events while select scans 10,000.

Level-Triggered vs Edge-Triggered (EPOLLET)

epoll supports two modes of notification. Understanding the difference is critical for correctness.

Level-Triggered (default)

epoll_wait() keeps telling you an fd is ready as long as data is available. If you don’t read all the data, the next epoll_wait() call still reports the same fd. This is the safe default — same behavior as select/poll.

Edge-Triggered (EPOLLET)

epoll_wait() notifies you only once when the state changes (new data arrives). If you don’t read ALL data in one go, you will not be notified again until more data arrives. You MUST use non-blocking fds and read in a loop until EAGAIN.

⚠️ Edge-triggered pitfall: With EPOLLET, if you read only part of the available data, you will miss the rest until new data arrives. Always drain the fd completely (loop until EAGAIN) when using edge-triggered mode.

/* Edge-triggered: must read until EAGAIN */
while (1) {
    n = read(fd, buf, sizeof(buf));
    if (n == -1) {
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;   /* no more data for now */
        perror("read");
        break;
    }
    if (n == 0) {
        /* EOF */
        break;
    }
    /* process buf[0..n-1] */
}
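The difference between the two modes can be observed directly. The following is a small sketch, with the hypothetical helper ready_after_partial_read(), that writes two bytes to a pipe, reads only one of them after the first notification, and then asks epoll again without blocking.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper. Pass 0 for level-triggered registration or
 * EPOLLET for edge-triggered. Returns the result of the second
 * epoll_wait() call, or -1 if a setup step fails.
 */
int ready_after_partial_read(uint32_t mode_flag)
{
    int p[2], epfd, second = -1;
    char byte;
    struct epoll_event ev, out;

    assert(mode_flag == 0 || mode_flag == EPOLLET);

    epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;
    if (pipe(p) == -1) {
        close(epfd);
        return -1;
    }

    ev.events = EPOLLIN | mode_flag;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "ab", 2) == 2 &&
        epoll_wait(epfd, &out, 1, 0) == 1 &&    /* first notification */
        read(p[0], &byte, 1) == 1)              /* leave one byte unread */
        second = epoll_wait(epfd, &out, 1, 0);  /* timeout 0: just poll */

    close(p[0]);
    close(p[1]);
    close(epfd);
    return second;
}
```

With mode_flag set to 0 the second call reports the fd again because a byte is still unread. With EPOLLET it reports nothing, which is exactly the situation the drain loop above is meant to avoid.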

Interview Questions — epoll Program Design

Q1. Explain the full workflow of an epoll-based server.

Create epoll instance → add listening socket with EPOLL_CTL_ADD+EPOLLIN → call epoll_wait() in a loop → when listening socket fires, accept() new connection, add the new client socket to epoll → when a client socket fires, read/write data → when EPOLLHUP or EPOLLERR fires, close the client socket and remove from epoll.

Q2. Why do you need EINTR handling in epoll_wait()?

When a signal handler runs while the process is blocked inside epoll_wait(), the kernel interrupts the system call and returns -1 with errno=EINTR. This is not a real error: it means the wait was interrupted and you should retry. Common triggers include handled signals such as SIGCHLD or SIGALRM, and resuming a stopped process with SIGCONT. epoll_wait() is never restarted automatically, even when the handler was installed with SA_RESTART, so without EINTR handling your server exits whenever such a signal arrives.

Q3. What is the difference between level-triggered and edge-triggered epoll?

Level-triggered (default): epoll_wait() keeps reporting the fd as ready as long as data is available. Safe and easy — same as select/poll behavior.
Edge-triggered (EPOLLET): epoll_wait() reports the fd only once when NEW data arrives. You must read all available data immediately (loop until EAGAIN). More efficient but harder to use correctly — easy to miss data if you don’t fully drain the fd.

Q4. Why must edge-triggered epoll use non-blocking fds?

With edge-triggered mode you must read in a loop until you get EAGAIN (no more data). If the fd is blocking, the final read() call when there is no data will block your process forever instead of returning EAGAIN. Non-blocking fds (O_NONBLOCK) make read() return EAGAIN immediately when no data is available, letting you exit the drain loop cleanly.
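As a sketch of what this looks like in code, the two helpers below switch an fd to non-blocking mode with fcntl() and then drain it. The names set_nonblocking() and drain_fd() are hypothetical, and the 256-byte buffer size is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch an already-open fd to non-blocking mode. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: read until EAGAIN or EOF.
 * Returns the total number of bytes consumed, or -1 on a real error.
 */
ssize_t drain_fd(int fd)
{
    char buf[256];
    ssize_t n, total = 0;

    assert(fd >= 0);
    for (;;) {
        n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            total += n;         /* a real program would process buf here */
            continue;
        }
        if (n == 0)
            break;              /* EOF: no writers left */
        if (errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;              /* drained: nothing more for now */
        return -1;
    }
    return total;
}
```

On an empty pipe whose write end is still open, drain_fd() returns 0 immediately instead of hanging, because the non-blocking read() fails with EAGAIN.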

Q5. How do you implement a multi-threaded server with epoll?

Common pattern: create one epoll instance and share it across worker threads. Each thread calls epoll_wait(). Register each client fd with EPOLLONESHOT: once an event for that fd has been delivered to one thread, the kernel disables the fd so no other thread is notified about it at the same time. After the thread finishes handling the event, it re-arms the fd with EPOLL_CTL_MOD. This prevents two threads from handling the same fd concurrently without needing per-fd locks.
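The disable and re-arm cycle can be shown in a single thread. This is only a sketch of the EPOLLONESHOT mechanics, not a threaded server, and oneshot_demo() is a hypothetical name. A pipe stands in for a client socket.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical demo of EPOLLONESHOT. On success returns 0 and fills:
 *   results[0]: first wait (the event is delivered)
 *   results[1]: second wait (fd is now disabled)
 *   results[2]: wait after re-arming with EPOLL_CTL_MOD
 */
int oneshot_demo(int results[3])
{
    int p[2], epfd, rc = -1;
    struct epoll_event ev, out;

    assert(results != NULL);

    epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;
    if (pipe(p) == -1) {
        close(epfd);
        return -1;
    }

    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        results[0] = epoll_wait(epfd, &out, 1, 0);
        results[1] = epoll_wait(epfd, &out, 1, 0);

        /* Re-arm with the same mask; the unread byte is still in the pipe */
        if (epoll_ctl(epfd, EPOLL_CTL_MOD, p[0], &ev) == 0) {
            results[2] = epoll_wait(epfd, &out, 1, 0);
            rc = 0;
        }
    }

    close(p[0]);
    close(p[1]);
    close(epfd);
    return rc;
}
```

In a real worker pool the re-arm call would come at the end of the handler, after the thread has finished reading from and writing to the client.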

Q6. What happens when you call epoll_wait() with maxevents=1 but 100 fds are ready?

Only 1 event is returned in the evlist array. The other 99 ready fds remain ready and will be returned in the next epoll_wait() call. No events are lost. This is why high-performance servers use a larger maxevents value (e.g., 64 or 128) to batch-process many events per call and reduce system call overhead.
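A quick way to convince yourself that nothing is lost: the sketch below makes several pipes readable at once and then collects them with maxevents=1, consuming the data from whichever fd each call returns. The function name waits_needed_one_at_a_time() and the MAX_PIPES limit are assumptions made for this illustration.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_PIPES 8

/*
 * Hypothetical demo: make npipes pipes readable, then count how many
 * epoll_wait() calls with maxevents=1 it takes to see all of them.
 * Returns that count, or -1 on failure.
 */
int waits_needed_one_at_a_time(int npipes)
{
    int rfd[MAX_PIPES], wfd[MAX_PIPES], p[2];
    int epfd, i, opened = 0, calls = 0, ok = 1;
    struct epoll_event ev, out;
    char byte;

    assert(npipes > 0 && npipes <= MAX_PIPES);

    epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;

    for (i = 0; i < npipes && ok; i++) {
        if (pipe(p) == -1) {
            ok = 0;
            break;
        }
        rfd[opened] = p[0];
        wfd[opened] = p[1];
        opened++;

        ev.events = EPOLLIN;
        ev.data.fd = p[0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == -1 ||
            write(p[1], "x", 1) != 1)
            ok = 0;
    }

    /* maxevents = 1: each call hands back a single ready fd */
    while (ok && epoll_wait(epfd, &out, 1, 0) == 1) {
        if (read(out.data.fd, &byte, 1) != 1)   /* consume its data */
            ok = 0;
        calls++;
    }

    for (i = 0; i < opened; i++) {
        close(rfd[i]);
        close(wfd[i]);
    }
    close(epfd);
    return ok ? calls : -1;
}
```

With three ready pipes the loop needs three calls, one per fd, which illustrates both that no event is dropped and why a larger maxevents saves system calls.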

Q7. How do you detect when a TCP client disconnects using epoll?

Three ways: (1) read() returns 0 (EOF: the client closed the connection), (2) EPOLLHUP fires: the connection is fully closed, (3) EPOLLRDHUP fires: the client shut down its write side (TCP half-close). EPOLLHUP and EPOLLERR are always reported, but EPOLLRDHUP is reported only if you request it in ev.events. In practice you check for EPOLLIN first (data to read), handle EOF in the read loop, and also check EPOLLHUP/EPOLLERR in the event flags.
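Here is a minimal sketch of the half-close case. To stay self-contained it uses a Unix-domain socketpair() as a stand-in for a TCP connection, which is an assumption of this example. The helper name peer_shutdown_events() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical demo: one end of a socket pair plays the server side,
 * the other plays a client that shuts down its write side.
 * On success returns 0 and stores the reported event mask and the
 * result of the following read() (0 means EOF).
 */
int peer_shutdown_events(uint32_t *events, ssize_t *read_result)
{
    int sv[2], epfd, rc = -1;
    char buf[8];
    struct epoll_event ev, out;

    assert(events != NULL && read_result != NULL);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    epfd = epoll_create1(0);
    if (epfd == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* EPOLLRDHUP is only reported when it is requested explicitly */
    ev.events = EPOLLIN | EPOLLRDHUP;
    ev.data.fd = sv[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0 &&
        shutdown(sv[1], SHUT_WR) == 0 &&        /* "client" half-closes */
        epoll_wait(epfd, &out, 1, 0) == 1) {
        *events = out.events;
        *read_result = read(sv[0], buf, sizeof(buf));
        rc = 0;
    }

    close(sv[0]);
    close(sv[1]);
    close(epfd);
    return rc;
}
```

After the shutdown, the watched end should report EPOLLRDHUP together with EPOLLIN, and the read() that follows returns 0, so both detection paths agree.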

Q8. Why does epoll scale better than select for 10,000 connections?

select() has two O(n) costs: the kernel scans all n fds to find ready ones, and you must scan the returned fd_set to find which ones are ready. epoll keeps a kernel-maintained ready list. When an fd becomes ready, the kernel adds it directly to the ready list. epoll_wait() just returns what is in that list — cost is O(ready events), not O(total monitored fds). With 10,000 connections and 10 active at once, epoll processes 10. select processes 10,000.

Q9. What is the numOpenFds pattern and why is it useful?

It is a counter that tracks how many file descriptors are still in the epoll interest list. You increment it when adding a fd and decrement it when closing one (on EPOLLHUP, EPOLLERR, or EOF). The event loop runs while (numOpenFds > 0). This gives you a clean exit condition — the program automatically stops when all monitored fds are gone, with no polling or idle spinning.

You’ve Completed the epoll Tutorial Series

epoll_ctl() · epoll_wait() · Event Bits · EPOLLONESHOT · Full Example Program

embeddedpathashala.com
