In Part 4 we covered what epoll does and why. Now we go deeper into the actual system calls: their exact signatures, the structures involved, the event flags, and how to use them correctly to build a working event loop. By the end of this tutorial you will have a complete echo server built on epoll.
#include <sys/epoll.h>
int epoll_create(int size);
/* Returns: file descriptor on success, -1 on error */
epoll_create() creates a new epoll instance and returns a file descriptor that is your handle to it. This fd is just like any other: you can pass it around, dup() it, and you must close() it when done.
The size argument used to be a hint about how many FDs you plan to monitor. Since Linux 2.6.8 the kernel ignores the value and allocates its structures dynamically, but the value must still be greater than zero: a zero or negative size fails with EINVAL, so pass 1.
If you fork() or dup() the epoll fd, all copies refer to the same epoll instance. The instance is destroyed and its resources freed only when the last fd referring to it has been closed.
int epfd = epoll_create(1); /* size ignored since 2.6.8, pass any positive value */
if (epfd == -1) {
    perror("epoll_create");
    exit(EXIT_FAILURE);
}
/* ... use epfd ... */
close(epfd); /* destroy the instance when done */
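The size rule can be checked directly. The sketch below probes epoll_create() with a given size and reports the errno it produces; try_epoll_create is a helper name made up for this illustration, not part of the epoll API.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if epoll_create(size) succeeds,
 * otherwise the errno value the call set. */
static int try_epoll_create(int size)
{
    int epfd = epoll_create(size);
    if (epfd == -1)
        return errno;
    close(epfd); /* we only wanted to know whether creation works */
    return 0;
}
```

Calling try_epoll_create(1) should return 0, while try_epoll_create(0) and negative sizes should return EINVAL.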
#include <sys/epoll.h>
int epoll_create1(int flags);
/* Returns: file descriptor on success, -1 on error */
epoll_create1() is the preferred modern replacement. It drops the useless size argument and adds a flags argument. Currently only one flag exists:
EPOLL_CLOEXEC: sets the close-on-exec flag on the new fd, so the kernel closes it automatically when the process calls exec() to run another program. Without this flag, the child process after exec() would unknowingly inherit the epoll fd, which is a resource leak and a security concern.
/* Preferred: automatically close epoll fd on exec() */
int epfd = epoll_create1(EPOLL_CLOEXEC);
if (epfd == -1) {
    perror("epoll_create1");
    exit(EXIT_FAILURE);
}
/* Pass 0 for flags if you don't want CLOEXEC */
int epfd2 = epoll_create1(0);
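One way to confirm what the flag does is to read the descriptor flags back with fcntl(F_GETFD). This is a minimal sketch; epoll_fd_has_cloexec is an illustrative name, not an existing function.

```c
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: creates an epoll fd with the given flags and
 * returns 1 if close-on-exec is set on it, 0 if not, -1 on error. */
static int epoll_fd_has_cloexec(int flags)
{
    int epfd = epoll_create1(flags);
    if (epfd == -1)
        return -1;
    int fdflags = fcntl(epfd, F_GETFD); /* per-descriptor flags */
    close(epfd);
    if (fdflags == -1)
        return -1;
    return (fdflags & FD_CLOEXEC) ? 1 : 0;
}
```

With EPOLL_CLOEXEC the helper should report 1, and with flags set to 0 it should report 0.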
| | epoll_create() | epoll_create1() |
|---|---|---|
| Flags | None; cannot set CLOEXEC atomically | EPOLL_CLOEXEC flag available |
| Availability | Since Linux 2.6 | Since Linux 2.6.27 |
Use epoll_create1() in all new code.
#include <sys/epoll.h>
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *ev);
/* Returns: 0 on success, -1 on error */
Arguments:
| Argument | Type | Meaning |
|---|---|---|
| epfd | int | epoll instance fd (from epoll_create1) |
| op | int | Operation: EPOLL_CTL_ADD, EPOLL_CTL_MOD, or EPOLL_CTL_DEL |
| fd | int | The file descriptor to add/modify/remove |
| ev | struct epoll_event * | Events to watch and user data (NULL for DEL) |
struct epoll_event {
    uint32_t events;   /* bitmask: which events to watch */
    epoll_data_t data; /* user data passed back when event fires */
};
typedef union epoll_data {
    void *ptr;    /* pointer to anything you want */
    int fd;       /* most common: store the fd itself */
    uint32_t u32;
    uint64_t u64;
} epoll_data_t;
The data union is extremely useful: when epoll_wait() returns an event, this data comes back with it. Storing the fd in data.fd means you always know which connection triggered the event without searching a lookup table.
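When a connection needs more state than a bare fd, data.ptr can carry a pointer to it instead. The sketch below assumes a made-up struct conn and two illustrative helpers (watch_conn, wait_one). Because data is a union, set either ptr or fd, never both.

```c
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-connection state; the fields are illustrative. */
struct conn {
    int fd;
    unsigned long bytes_seen;
};

/* Register c->fd for EPOLLIN and stash the struct pointer in data.ptr. */
static int watch_conn(int epfd, struct conn *c)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.ptr = c; /* union: this overwrites data.fd */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Wait up to timeout_ms for one event and return the struct conn that
 * was registered with it, or NULL on timeout or error. */
static struct conn *wait_one(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    if (n <= 0)
        return NULL;
    return ev.data.ptr;
}
```

The struct must stay alive for as long as its fd is registered, since the kernel only stores the pointer value and hands it back unchanged.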
| Flag | Meaning |
|---|---|
| EPOLLIN | Data available to read |
| EPOLLOUT | Write buffer has space; can write without blocking |
| EPOLLERR | Error condition on the fd (always monitored by default) |
| EPOLLHUP | Hangup: peer closed the connection (always monitored) |
| EPOLLPRI | High-priority data (e.g. TCP out-of-band data) |
| EPOLLET | Switch to edge-triggered mode (default is level-triggered) |
| EPOLLONESHOT | After one event fires, disable this fd. Re-arm with EPOLL_CTL_MOD. |
| EPOLLRDHUP | Peer half-closed the connection (sent FIN) |
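The difference between the default level-triggered mode and EPOLLET is easiest to see on a pipe that holds unread data. The following sketch (poll_twice is an illustrative helper, not a library function) writes one byte, then calls epoll_wait() twice without reading anything.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers the read end of a pipe with
 * EPOLLIN | extra_flags (pass 0 or EPOLLET), writes one byte, then polls
 * twice with a zero timeout and without reading the byte.
 * Returns 0 on success, -1 if setup failed. */
static int poll_twice(uint32_t extra_flags, int *first, int *second)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;
    int epfd = epoll_create1(0);
    if (epfd == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    struct epoll_event ev = {0}, out;
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = p[0];

    int rc = -1;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        *first = epoll_wait(epfd, &out, 1, 0);
        *second = epoll_wait(epfd, &out, 1, 0); /* data is still unread */
        rc = 0;
    }
    close(epfd);
    close(p[0]);
    close(p[1]);
    return rc;
}
```

In level-triggered mode both calls should report the fd, because it is still readable. With EPOLLET only the first call should report it; the second returns 0 until new data arrives. This is why edge-triggered handlers are usually written to read until EAGAIN.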
/* Example: add a socket fd to epoll and watch for read events */
struct epoll_event ev;
ev.events = EPOLLIN; /* notify when data arrives */
ev.data.fd = client_fd; /* pass fd back when event fires */
if (epoll_ctl(epfd, EPOLL_CTL_ADD, client_fd, &ev) == -1) {
    perror("epoll_ctl ADD");
}
/* Modify: now watch for write events too */
ev.events = EPOLLIN | EPOLLOUT;
ev.data.fd = client_fd;
epoll_ctl(epfd, EPOLL_CTL_MOD, client_fd, &ev);
/* Remove: stop monitoring (without closing) */
epoll_ctl(epfd, EPOLL_CTL_DEL, client_fd, NULL); /* ev is ignored for DEL */
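The three operations fail in predictable ways when used out of order, which helps when debugging. This sketch wraps epoll_ctl() in an illustrative helper (ctl_errno, a made-up name) that returns the resulting errno.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical helper: performs one epoll_ctl() operation and returns 0
 * on success or the errno value on failure. */
static int ctl_errno(int epfd, int op, int fd, uint32_t events)
{
    struct epoll_event ev = {0};
    ev.events = events;
    ev.data.fd = fd;
    if (epoll_ctl(epfd, op, fd, &ev) == -1)
        return errno;
    return 0;
}
```

Adding an fd that is already registered should yield EEXIST, while EPOLL_CTL_MOD or EPOLL_CTL_DEL on an fd that was never added (or was already removed) should yield ENOENT.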
#include <sys/epoll.h>
int epoll_wait(int epfd, struct epoll_event *events,
int maxevents, int timeout);
/* Returns: number of ready fds, 0 on timeout, -1 on error */
| Argument | Meaning |
|---|---|
| epfd | epoll instance fd |
| events | Array of epoll_event structs that the kernel fills with ready events |
| maxevents | Size of your events array, i.e. the maximum number of events returned per call; must be greater than zero |
| timeout | -1 = block forever, 0 = non-blocking, positive = milliseconds |
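The timeout values can be observed by timing epoll_wait() on an instance with nothing registered. The helper below, timed_wait_ms, is a made-up name for this sketch; it measures with CLOCK_MONOTONIC and reports whole milliseconds.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/epoll.h>
#include <time.h>

/* Hypothetical helper: calls epoll_wait() once with the given timeout,
 * stores its return value in *nready, and returns the elapsed time in
 * milliseconds (rounded down). */
static long timed_wait_ms(int epfd, int timeout_ms, int *nready)
{
    struct epoll_event ev;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    *nready = epoll_wait(epfd, &ev, 1, timeout_ms);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (t1.tv_sec - t0.tv_sec) * 1000L +
           (t1.tv_nsec - t0.tv_nsec) / 1000000L;
}
```

On an empty instance, a timeout of 0 should return 0 events almost immediately, and a timeout of 100 should return 0 events after roughly 100 ms. A timeout of -1 would never return here, since no event can arrive.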
#define MAX_EVENTS 64
struct epoll_event events[MAX_EVENTS];
/* handle_read(), handle_write() and handle_error() are placeholders for your own code */
for (;;) {
    int nready = epoll_wait(epfd, events, MAX_EVENTS, -1); /* block until events */
    if (nready == -1) {
        if (errno == EINTR)
            continue; /* interrupted by signal: retry */
        perror("epoll_wait");
        break;
    }
    for (int i = 0; i < nready; i++) {
        int fd = events[i].data.fd;
        uint32_t evmask = events[i].events;
        if (evmask & EPOLLIN)
            handle_read(fd);
        if (evmask & EPOLLOUT)
            handle_write(fd);
        if (evmask & (EPOLLERR | EPOLLHUP))
            handle_error(fd);
    }
}
This is a complete TCP echo server using epoll. It accepts multiple clients and echoes back what they send, all in a single thread that never blocks on any one client.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <fcntl.h>
#define PORT 8080
#define BACKLOG 50
#define MAX_EVENTS 64
#define BUF_SIZE 1024
/* Make a socket non-blocking */
static void set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        perror("fcntl");
        exit(EXIT_FAILURE);
    }
}
/* Add fd to epoll interest list, watching for EPOLLIN */
static void epoll_add(int epfd, int fd)
{
    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        perror("epoll_ctl ADD");
        exit(EXIT_FAILURE);
    }
}
int main(void)
{
    /* --- Create listening socket --- */
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (listen_fd == -1) { perror("socket"); exit(1); }
    int opt = 1;
    setsockopt(listen_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(PORT);
    if (bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        perror("bind"); exit(1);
    }
    if (listen(listen_fd, BACKLOG) == -1) {
        perror("listen"); exit(1);
    }
    set_nonblocking(listen_fd);
    printf("Echo server listening on port %d\n", PORT);
    /* --- Create epoll instance --- */
    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd == -1) { perror("epoll_create1"); exit(1); }
    /* Add listening socket to epoll */
    epoll_add(epfd, listen_fd);
    struct epoll_event events[MAX_EVENTS];
    /* --- Main event loop --- */
    for (;;) {
        int nready = epoll_wait(epfd, events, MAX_EVENTS, -1);
        if (nready == -1) {
            if (errno == EINTR) continue;
            perror("epoll_wait");
            break;
        }
        for (int i = 0; i < nready; i++) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {
                /* --- New connection arrived --- */
                struct sockaddr_in client_addr;
                socklen_t addr_len = sizeof(client_addr);
                int client_fd = accept(listen_fd,
                                       (struct sockaddr *)&client_addr,
                                       &addr_len);
                if (client_fd == -1) {
                    /* EAGAIN: the pending connection went away before we accepted it */
                    if (errno != EAGAIN && errno != EWOULDBLOCK)
                        perror("accept");
                    continue;
                }
                set_nonblocking(client_fd);
                epoll_add(epfd, client_fd); /* watch new client */
                printf("New client connected: fd=%d\n", client_fd);
            } else {
                /* --- Data from existing client --- */
                char buf[BUF_SIZE];
                ssize_t n = read(fd, buf, sizeof(buf));
                if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
                    continue; /* nothing to read right now: keep the connection */
                if (n <= 0) {
                    /* Client closed or error */
                    if (n == 0)
                        printf("Client fd=%d disconnected\n", fd);
                    else
                        perror("read");
                    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                    close(fd);
                } else {
                    /* Echo back to client. A short write or EAGAIN drops the
                       unsent bytes; a production server would buffer them
                       and wait for EPOLLOUT. */
                    ssize_t w = write(fd, buf, (size_t)n);
                    if (w == -1 && errno != EAGAIN && errno != EWOULDBLOCK) {
                        perror("write");
                        epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                        close(fd);
                    } else if (w > 0) {
                        printf("Echoed %zd bytes to fd=%d\n", w, fd);
                    }
                }
            }
        }
    }
    close(epfd);
    close(listen_fd);
    return 0;
}
Compile and test:
# Compile
gcc -o echo_server echo_server.c
# Run the server
./echo_server
# In another terminal: connect with netcat
nc localhost 8080
# Type anything and press Enter; the server echoes it back
# Test multiple clients simultaneously
nc localhost 8080 &
nc localhost 8080 &
nc localhost 8080 &
Flow per event: on the listening socket, accept() the new client and add it to epoll; on a client socket, read() the data and write() it back.
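On a non-blocking socket, write() may accept only part of the buffer, or nothing at all, when the kernel's send buffer is full. One possible way to handle that is sketched below; write_some and make_nonblocking are hypothetical helpers for illustration and are not used by the server above.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode.
 * Returns 0 on success, -1 on error. */
static int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: write as much of buf as the fd accepts right now.
 * Returns the number of bytes written (possibly less than len, possibly 0
 * if the fd would block), or -1 on a real error. */
static ssize_t write_some(int fd, const char *buf, size_t len)
{
    size_t off = 0;
    while (off < len) {
        ssize_t w = write(fd, buf + off, len - off);
        if (w > 0) {
            off += (size_t)w; /* partial progress: keep going */
        } else if (w == -1 && errno == EINTR) {
            continue;         /* interrupted by a signal: retry */
        } else if (w == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            break;            /* kernel buffer full: stop for now */
        } else if (w == -1) {
            return -1;        /* real error */
        } else {
            break;            /* w == 0: nothing accepted */
        }
    }
    return (ssize_t)off;
}
```

If write_some() returns fewer bytes than requested, a server could keep the unsent tail in a per-connection buffer, add EPOLLOUT to the fd with EPOLL_CTL_MOD, and retry when epoll_wait() reports the fd as writable.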
EPOLLONESHOT is useful in multithreaded servers. Once an event fires, the fd is automatically disarmed: no more events are delivered until you re-arm it with EPOLL_CTL_MOD. This prevents two threads from handling the same fd simultaneously.
/* Add fd with EPOLLONESHOT: fires once, then must be re-armed */
struct epoll_event ev;
ev.events = EPOLLIN | EPOLLONESHOT;
ev.data.fd = client_fd;
epoll_ctl(epfd, EPOLL_CTL_ADD, client_fd, &ev);
/* After handling the event in your thread, re-arm for next event */
ev.events = EPOLLIN | EPOLLONESHOT;
ev.data.fd = client_fd;
epoll_ctl(epfd, EPOLL_CTL_MOD, client_fd, &ev);
Without EPOLLONESHOT in a thread pool, two threads could both get woken up for the same fd and start reading from it simultaneously, causing data corruption. EPOLLONESHOT guarantees one event, one handler at a time.
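The disarm and re-arm cycle can be observed in a single thread with a pipe. This sketch uses two illustrative helpers (oneshot_ctl and poll_now, names invented here) and leaves the data unread to show that the fd stays silent until it is re-armed.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: add (EPOLL_CTL_ADD) or re-arm (EPOLL_CTL_MOD)
 * fd with EPOLLIN | EPOLLONESHOT. Returns epoll_ctl()'s result. */
static int oneshot_ctl(int epfd, int op, int fd)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = fd;
    return epoll_ctl(epfd, op, fd, &ev);
}

/* Hypothetical helper: poll once without blocking and return the number
 * of ready fds (0 or 1), or -1 on error. */
static int poll_now(int epfd)
{
    struct epoll_event out;
    return epoll_wait(epfd, &out, 1, 0);
}
```

After an ADD and one byte written to the pipe, the first poll_now() should return 1 and the second should return 0, even though the byte is still unread. Once oneshot_ctl() is called with EPOLL_CTL_MOD, the next poll_now() should return 1 again.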
When you close(fd), Linux automatically removes it from all epoll interest lists. You do NOT need to call epoll_ctl(EPOLL_CTL_DEL, ...) before closing.
Warning: if you dup() a fd and then close one of the duplicates, the fd is NOT removed from epoll, because the underlying open file description still has references (via the dup'd copy). The same applies to copies inherited across fork(). Either close ALL duplicates to ensure removal, or call EPOLL_CTL_DEL explicitly before the first close().
/* Safe pattern: always close directly, no DEL needed */
close(client_fd); /* automatically removed from epoll */
/* BUT if you dup'd it: */
int dup_fd = dup(client_fd);
close(client_fd); /* NOT removed yet: dup_fd still references it */
close(dup_fd); /* NOW it is removed from epoll */
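The dup() caveat can be reproduced with a pipe. The sketch below (close_vs_dup is an illustrative name) registers the read end, duplicates it, and polls after each close().

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers the read end of a pipe, duplicates it,
 * makes the pipe readable, then closes the two descriptors one at a
 * time and polls (zero timeout) after each close.
 * Returns 0 on success, -1 if setup failed. */
static int close_vs_dup(int *after_first_close, int *after_last_close)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int epfd = epoll_create1(0);
    int dup_fd = dup(p[0]);
    struct epoll_event ev = {0}, out;
    ev.events = EPOLLIN;
    ev.data.fd = p[0];

    int rc = -1;
    if (epfd != -1 && dup_fd != -1 &&
        epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        close(p[0]); /* the registered fd number is gone ... */
        p[0] = -1;
        *after_first_close = epoll_wait(epfd, &out, 1, 0);
        close(dup_fd); /* ... but only now is the file description released */
        dup_fd = -1;
        *after_last_close = epoll_wait(epfd, &out, 1, 0);
        rc = 0;
    }
    if (p[0] != -1) close(p[0]);
    if (dup_fd != -1) close(dup_fd);
    if (epfd != -1) close(epfd);
    close(p[1]);
    return rc;
}
```

The first poll should still report one event, and that event carries the fd number stored in data.fd, which by then refers to a closed descriptor. Only after the duplicate is closed should the poll return 0.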
You now understand signal-driven I/O with F_SETSIG, realtime signals, overflow handling, multithreaded signal routing, and the complete epoll API.
