select() and poll()
Chapter 63 — Part 2: Classic I/O Multiplexing System Calls

What This Part Covers

select() and poll() are the original POSIX-standard I/O multiplexing tools. They let your program watch multiple file descriptors at once and block until at least one becomes ready. Both do the same job but have different interfaces and slightly different limitations.

These calls are simpler than epoll and available on virtually every UNIX-like system, so they are worth knowing even though epoll is usually preferred on Linux for large-scale use.

select() — The Original

select() takes three sets of file descriptors (one for reading, one for writing, one for exceptional conditions) plus an optional timeout. It blocks until at least one fd in any of the sets becomes ready, then modifies the sets in place to show which fds are ready.

Function Signature

#include <sys/select.h>

int select(int nfds,
           fd_set *readfds,
           fd_set *writefds,
           fd_set *exceptfds,
           struct timeval *timeout);

/* Returns: number of ready fds (counted across all three sets) on success,
   0 on timeout, -1 on error */

Parameter | What It Means
nfds | Highest fd number in any set + 1. Kernel checks fds 0 to nfds-1.
readfds | Set of fds to watch for readability. NULL if not needed.
writefds | Set of fds to watch for writability. NULL if not needed.
exceptfds | Set of fds to watch for exceptions (e.g. OOB data on TCP). NULL usually.
timeout | How long to wait. NULL = wait forever. {0,0} = poll (return immediately).

fd_set Macros — How to Manage the Sets

/* Always use these four macros to work with fd_set */

FD_ZERO(&set);          /* Clear all bits — MUST do this before use */
FD_SET(fd, &set);       /* Add fd to the set */
FD_CLR(fd, &set);       /* Remove fd from the set */
FD_ISSET(fd, &set);     /* Check if fd is set (returns nonzero if set) */

/* fd_set is a fixed-size bitmask — typically 1024 bits (FD_SETSIZE = 1024) */
/* This means select() can only monitor fds 0..1023 */
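
Calling FD_SET() with a descriptor at or above FD_SETSIZE writes outside the bitmask, which is undefined behavior. A small guard makes that limit explicit. The sketch below is only an illustration: fd_set_checked is a hypothetical helper name, not part of any library.

```c
#include <sys/select.h>

/* Hypothetical helper (not a library function): add fd to *set only if it
 * fits inside the fd_set bitmask.
 * Returns 0 on success, -1 if fd is out of range. */
int fd_set_checked(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;      /* FD_SET() here would be undefined behavior */
    FD_SET(fd, set);
    return 0;
}
```

A caller would use it wherever it would otherwise call FD_SET() directly, and close or reject the descriptor when it returns -1.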

Critical Point: select() modifies the fd_set you pass in. After it returns, the sets only contain the ready fds. You must rebuild the sets before calling select() again in your loop. Forgetting this is a very common bug.
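
To see the in-place modification directly, the following sketch watches two pipes, only one of which has data, and reports which bits are still set after select() returns. The function name and its out-parameters are made up for this illustration.

```c
#include <sys/select.h>
#include <unistd.h>

/* Illustrative helper: put two pipe read ends in one fd_set, make only one
 * of them readable, and report which bits survive select().
 * Returns 0 on success, -1 on failure. */
int select_overwrites_set(int *ready_still_set, int *idle_still_set)
{
    int ready[2], idle[2];
    fd_set rfds;
    int rc = -1;

    if (pipe(ready) == -1)
        return -1;
    if (pipe(idle) == -1) {
        close(ready[0]); close(ready[1]);
        return -1;
    }

    if (ready[0] < FD_SETSIZE && idle[0] < FD_SETSIZE &&
        write(ready[1], "x", 1) == 1) {
        int maxfd = ready[0] > idle[0] ? ready[0] : idle[0];
        struct timeval tv = { 0, 0 };   /* check once, do not block */

        FD_ZERO(&rfds);
        FD_SET(ready[0], &rfds);
        FD_SET(idle[0], &rfds);

        if (select(maxfd + 1, &rfds, NULL, NULL, &tv) == 1) {
            /* The readable fd keeps its bit; the idle fd's bit is cleared. */
            *ready_still_set = FD_ISSET(ready[0], &rfds) != 0;
            *idle_still_set  = FD_ISSET(idle[0], &rfds) != 0;
            rc = 0;
        }
    }

    close(ready[0]); close(ready[1]);
    close(idle[0]);  close(idle[1]);
    return rc;
}
```

After a successful call, *ready_still_set is 1 and *idle_still_set is 0: the idle descriptor has silently dropped out of the set, which is exactly why the set has to be rebuilt or copied from a master set before the next call.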

struct timeval

struct timeval {
    time_t      tv_sec;    /* seconds */
    suseconds_t tv_usec;   /* microseconds */
};

/* Examples */
struct timeval tv;

tv.tv_sec = 5;  tv.tv_usec = 0;    /* Wait up to 5 seconds */
tv.tv_sec = 0;  tv.tv_usec = 500000; /* Wait up to 0.5 seconds */
tv.tv_sec = 0;  tv.tv_usec = 0;    /* Return immediately (poll) */
/* Pass NULL for timeout to wait forever */
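
Timeouts are often configured in milliseconds, so a conversion helper is handy. The sketch below assumes a non-negative millisecond count; ms_to_timeval is a hypothetical name chosen for this example. Note also that on Linux select() updates the timeval to the time remaining, so it is safest to build a fresh one before every call.

```c
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: convert a non-negative millisecond count into the
 * struct timeval that select() expects. */
struct timeval ms_to_timeval(long ms)
{
    struct timeval tv;
    tv.tv_sec  = ms / 1000;             /* whole seconds */
    tv.tv_usec = (ms % 1000) * 1000;    /* remainder, in microseconds */
    return tv;
}
```

Inside a loop this might be used as `struct timeval tv = ms_to_timeval(500);` immediately before each select() call, so that a value modified by the previous call is never reused.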

Complete select() Example — Monitor stdin and a Socket

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>

int main(void)
{
    fd_set read_fds, master_fds;
    int sockfd, new_fd, maxfd;
    struct sockaddr_in server_addr;
    char buf[1024];
    ssize_t n;
    int ret;

    /* Create a TCP socket */
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd == -1) { perror("socket"); exit(1); }

    /* Allow reuse of port */
    int opt = 1;
    setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));

    /* Bind to port 8080 */
    memset(&server_addr, 0, sizeof(server_addr));
    server_addr.sin_family      = AF_INET;
    server_addr.sin_addr.s_addr = INADDR_ANY;
    server_addr.sin_port        = htons(8080);

    if (bind(sockfd, (struct sockaddr *)&server_addr, sizeof(server_addr)) == -1) {
        perror("bind"); exit(1);
    }

    if (listen(sockfd, 5) == -1) { perror("listen"); exit(1); }

    printf("Listening on port 8080. Also watching stdin.\n");

    /* Build the master set — we keep this and copy it each iteration */
    FD_ZERO(&master_fds);
    FD_SET(STDIN_FILENO, &master_fds);  /* Watch stdin (fd 0) */
    FD_SET(sockfd, &master_fds);        /* Watch listening socket */
    maxfd = sockfd;  /* sockfd is the highest fd right now */

    while (1) {
        /* Copy master to read_fds because select() modifies it */
        read_fds = master_fds;

        struct timeval timeout = { .tv_sec = 5, .tv_usec = 0 };

        ret = select(maxfd + 1, &read_fds, NULL, NULL, &timeout);

        if (ret == -1) {
            if (errno == EINTR)
                continue;   /* Interrupted by a signal: retry */
            perror("select");
            break;
        }

        if (ret == 0) {
            printf("Timeout — no activity in 5 seconds\n");
            continue;
        }

        /* Check if stdin has input */
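        if (FD_ISSET(STDIN_FILENO, &read_fds)) {
            n = read(STDIN_FILENO, buf, sizeof(buf) - 1);
            if (n > 0) {
                buf[n] = '\0';
                printf("stdin: %s", buf);
            } else {
                /* EOF or error: stop watching stdin, otherwise select()
                   would report it readable on every iteration */
                FD_CLR(STDIN_FILENO, &master_fds);
            }
        }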

        /* Check if listening socket has a new connection */
        if (FD_ISSET(sockfd, &read_fds)) {
            new_fd = accept(sockfd, NULL, NULL);
            if (new_fd == -1) {
                perror("accept");
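            } else if (new_fd >= FD_SETSIZE) {
                /* FD_SET() on an fd >= FD_SETSIZE is undefined behavior */
                fprintf(stderr, "fd %d exceeds FD_SETSIZE; closing\n", new_fd);
                close(new_fd);
            } else {
                printf("New connection: fd %d\n", new_fd);
                FD_SET(new_fd, &master_fds);
                if (new_fd > maxfd)
                    maxfd = new_fd;
            }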
        }

        /* Check all connected client fds */
        for (int fd = STDIN_FILENO + 1; fd <= maxfd; fd++) {
            if (fd == sockfd) continue;
            if (!FD_ISSET(fd, &read_fds)) continue;

            n = read(fd, buf, sizeof(buf) - 1);
            if (n <= 0) {
                /* Client disconnected or error */
                if (n == 0)
                    printf("Client fd %d disconnected\n", fd);
                else
                    perror("read");
                close(fd);
                FD_CLR(fd, &master_fds);
            } else {
                buf[n] = '\0';
                printf("From fd %d: %s\n", fd, buf);
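                /* Echo it back (a short write is possible; a robust
                   server would loop until everything is sent) */
                if (write(fd, buf, n) == -1)
                    perror("write");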
            }
        }
    }

    close(sockfd);
    return 0;
}

How the select() loop works:

1. Copy master_fds → read_fds (because select will destroy read_fds)
2. Call select(), which blocks until at least one fd is ready (or timeout)
3. select() returns, and read_fds now only has the READY fds set
4. Check each fd with FD_ISSET() and do I/O on the ready ones
5. Go back to step 1

poll() — A Cleaner Interface

poll() does the same job as select() but with a cleaner interface. Instead of three separate fd_sets, you pass an array of struct pollfd. Each element says which fd to watch and what events you care about.

Key advantages over select(): no FD_SETSIZE limit, events and revents are separate (so you don’t need to rebuild the array each call), and the API is cleaner to read.

Function Signature

#include <poll.h>

int poll(struct pollfd fds[], nfds_t nfds, int timeout);

/* Returns: number of ready fds on success, 0 on timeout, -1 on error */

/* timeout is in milliseconds:
   -1  = wait forever
    0  = return immediately (nonblocking poll)
   >0  = wait up to timeout milliseconds */

struct pollfd

struct pollfd {
    int   fd;       /* File descriptor to watch */
    short events;   /* Events we are interested in (input) */
    short revents;  /* Events that occurred (output — filled by kernel) */
};

Event Flag | Can Set in events? | Meaning
POLLIN | Yes | Data available to read (including EOF)
POLLOUT | Yes | Writing is possible without blocking
POLLRDHUP | Yes (Linux) | Peer closed write end (half-close on TCP)
POLLERR | No (kernel sets only) | An error condition occurred
POLLHUP | No (kernel sets only) | Hangup: other end closed connection
POLLNVAL | No (kernel sets only) | fd is not open (invalid fd)

Key Difference from select(): In poll(), you set events once and the kernel fills revents after each call. You do NOT need to rebuild the array between calls. POLLERR, POLLHUP, and POLLNVAL are always checked regardless of what you put in events.
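
The split between events and revents can be checked with a single pipe: poll it while it is empty, write a byte, and poll again with the same, untouched pollfd entry. This is a minimal sketch; the function name and its out-parameters are invented for the illustration.

```c
#include <poll.h>
#include <unistd.h>

/* Illustrative helper: poll one pipe twice using the same pollfd entry.
 * *before and *after receive revents from the two calls; *events_after
 * receives the events field as it stands after the second call.
 * Returns 0 on success, -1 on failure. */
int poll_reuses_entry(short *before, short *after, short *events_after)
{
    int p[2];
    struct pollfd pfd;
    int rc = -1;

    if (pipe(p) == -1)
        return -1;

    pfd.fd = p[0];
    pfd.events = POLLIN;    /* set once, never touched again */
    pfd.revents = 0;

    if (poll(&pfd, 1, 0) == 0) {            /* empty pipe: nothing ready */
        *before = pfd.revents;
        if (write(p[1], "x", 1) == 1 && poll(&pfd, 1, 0) == 1) {
            *after = pfd.revents;           /* kernel rewrote revents */
            *events_after = pfd.events;     /* events is left as we set it */
            rc = 0;
        }
    }

    close(p[0]);
    close(p[1]);
    return rc;
}
```

The first call leaves revents at 0, the second sets POLLIN in it, and events still equals POLLIN afterwards, so the same array can be handed back to poll() on every iteration.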

Complete poll() Example — Monitor Multiple Pipes

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <poll.h>
#include <errno.h>

#define NUM_PIPES 3

int main(void)
{
    int pipe_fds[NUM_PIPES][2];
    struct pollfd pfds[NUM_PIPES + 1];  /* +1 for stdin */
    char buf[256];
    ssize_t n;
    int i, num_open;

    /* Create NUM_PIPES pipes */
    for (i = 0; i < NUM_PIPES; i++) {
        if (pipe(pipe_fds[i]) == -1) {
            perror("pipe"); exit(1);
        }
        /* Write some data into each pipe so they are readable */
        char msg[32];
        snprintf(msg, sizeof(msg), "Data from pipe %d\n", i);
        write(pipe_fds[i][1], msg, strlen(msg));
    }

    /* Set up pollfd array */

    /* Watch stdin */
    pfds[0].fd      = STDIN_FILENO;
    pfds[0].events  = POLLIN;
    pfds[0].revents = 0;

    /* Watch read end of each pipe */
    for (i = 0; i < NUM_PIPES; i++) {
        pfds[i + 1].fd      = pipe_fds[i][0];  /* read end */
        pfds[i + 1].events  = POLLIN;
        pfds[i + 1].revents = 0;
    }

    num_open = NUM_PIPES + 1;

    while (num_open > 0) {
        /* Block until something is ready; timeout = 3000ms */
        int ready = poll(pfds, NUM_PIPES + 1, 3000);

        if (ready == -1) {
            if (errno == EINTR)
                continue;   /* Interrupted by a signal: retry */
            perror("poll"); break;
        }

        if (ready == 0) {
            printf("poll() timeout — nothing ready\n");
            break;
        }

        /* Check each fd */
        for (i = 0; i < NUM_PIPES + 1; i++) {

            if (pfds[i].fd == -1) continue;  /* skip closed fds */

            if (pfds[i].revents & POLLIN) {
                n = read(pfds[i].fd, buf, sizeof(buf) - 1);
                if (n > 0) {
                    buf[n] = '\0';
                    printf("fd %d: %s", pfds[i].fd, buf);
                    continue;  /* A pending hangup is reported again next call */
                }
                if (n == 0)
                    printf("fd %d: EOF\n", pfds[i].fd);
                else
                    perror("read");
                close(pfds[i].fd);
                pfds[i].fd = -1;  /* Tell poll() to ignore this entry */
                num_open--;
                continue;         /* Already closed: skip the check below */
            }

            if (pfds[i].revents & (POLLERR | POLLHUP | POLLNVAL)) {
                printf("fd %d: error/hangup/invalid\n", pfds[i].fd);
                if (!(pfds[i].revents & POLLNVAL))
                    close(pfds[i].fd);  /* POLLNVAL means it is not open */
                pfds[i].fd = -1;
                num_open--;
            }
        }
    }

    /* Close write ends of pipes */
    for (i = 0; i < NUM_PIPES; i++)
        close(pipe_fds[i][1]);

    return 0;
}

select() vs poll() — Side by Side

Aspect | select() | poll()
Max fd limit | Only fds below 1024 (FD_SETSIZE) | No fixed limit (bounded by the process's open-file limit)
Rebuild before each call? | Yes: the fd_set is overwritten | No: revents is separate
How to check result | FD_ISSET() on each fd | Check revents in array
Timeout resolution | Microseconds (struct timeval) | Milliseconds (int)
POSIX standard | Yes | Yes
Performance at scale | Poor: O(N) scan of fd space | Medium: O(N) scan of array

Why Both Are Slow with Many File Descriptors

Every time you call select() or poll(), you pass the complete list of file descriptors to the kernel. The kernel has to scan through all of them, check each one, and rebuild the result. This is O(N) work every call.

If you have 10,000 connections but only 10 are active at any moment, you are still telling the kernel to check 10,000 fds every time. This is why epoll was invented: it keeps the list of monitored fds in the kernel and reports only the ones that are ready.

select()/poll() overhead per call:
1. Copy 10,000 fds to the kernel
2. Kernel scans all 10,000
3. Copy the result back to user space
4. Only 10 were ready
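
The same linear cost shows up on the user-space side, because poll() returns only a count and the caller has to walk the array to find the ready entries. The sketch below uses 64 pipes (an arbitrary size picked for this example, 128 descriptors in total) with only the last one readable, and counts how many entries must be examined. The function name is hypothetical.

```c
#include <poll.h>
#include <unistd.h>

#define N_PIPES 64   /* arbitrary size for this sketch */

/* Returns how many pollfd entries had to be examined to find every ready
 * one, or -1 on failure. *ready receives poll()'s return value. */
int scan_cost(int *ready)
{
    int pipes[N_PIPES][2];
    struct pollfd pfds[N_PIPES];
    int created = 0, examined = -1;

    for (; created < N_PIPES; created++) {
        if (pipe(pipes[created]) == -1)
            break;
        pfds[created].fd = pipes[created][0];   /* watch the read end */
        pfds[created].events = POLLIN;
        pfds[created].revents = 0;
    }

    /* Make only the last pipe readable, then poll without blocking. */
    if (created == N_PIPES && write(pipes[N_PIPES - 1][1], "x", 1) == 1) {
        *ready = poll(pfds, N_PIPES, 0);
        if (*ready >= 0) {
            int found = 0;
            examined = 0;
            /* Stop early once every ready fd reported by poll() is found. */
            for (int i = 0; i < N_PIPES && found < *ready; i++) {
                examined++;
                if (pfds[i].revents & POLLIN)
                    found++;
            }
        }
    }

    for (int i = 0; i < created; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    return examined;
}
```

Here poll() reports one ready descriptor, yet all 64 entries are examined before it is found. The kernel performed a comparable pass over the whole array to produce that answer.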

Interview Questions

Q1: Why must you copy the fd_set before each call to select()?
select() modifies the fd_set in place. After it returns, the set only contains the fds that are ready, not all the fds you originally added. If you call select() again without rebuilding the set, you will only be watching the fds that were ready last time — missing all the others. The common pattern is to keep a master_fds copy and do read_fds = master_fds before each call.
Q2: What is the significance of the nfds parameter in select()?
nfds must be set to one more than the highest numbered fd in any of the three sets. The kernel uses this to know how far into the fd_set bitmask to scan. If you set it too low, some fds will be missed. If you set it too high, performance suffers slightly due to unnecessary scanning. Best practice: track the highest fd you have added and pass that + 1.
Q3: What happens when you set fd = -1 in a struct pollfd?
poll() ignores any entry whose fd is negative and returns 0 in its revents, so -1 is a convenient marker. This is a clean way to remove a file descriptor from the monitored set without shrinking the array. You just set pfds[i].fd = -1 and poll() will skip that slot. This avoids compacting or reallocating the array every time a client disconnects.
Q4: What is the difference between POLLIN and POLLRDHUP?
POLLIN is set when there is data to read or when the peer has closed the connection (you will get EOF from read). POLLRDHUP is a Linux extension that specifically tells you the peer has closed their write end of the socket (TCP half-close), before you even try to read. It lets you detect disconnection without needing to attempt a read first.
Q5: Both select() and poll() have O(N) complexity. What does that mean practically?
It means performance degrades linearly with the number of monitored file descriptors. With 100 fds it is fine. With 10,000 fds, every single select()/poll() call forces the kernel to examine all 10,000 entries and copy data between kernel and user space, even if only a handful are ever active. This is why high-performance servers on Linux use epoll instead: the cost of an epoll_wait() call depends on the number of ready fds, not on the number being monitored.
Q6: How do you implement a nonblocking poll — one that returns immediately without waiting?
Pass timeout = 0 to poll(). With select(), pass a struct timeval with both tv_sec and tv_usec set to 0. This makes the call check which fds are ready right now and return immediately — either with the count of ready fds or 0 if nothing is ready. This is called a “poll” in the traditional UNIX sense (check without blocking).
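
The zero-timeout check from Q6 can be sketched for both calls as follows. The helper names are hypothetical and only illustrate the pattern; each reports whether a single descriptor is readable right now, without blocking.

```c
#include <poll.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: 1 if fd is readable now, 0 if not, -1 on error.
 * A negative fd is ignored by poll(), so it simply yields 0. */
int readable_now_poll(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int r = poll(&pfd, 1, 0);           /* timeout 0: return immediately */
    if (r == -1)
        return -1;
    return (r > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Same check with select(); fds outside the fd_set range are rejected. */
int readable_now_select(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };       /* {0,0}: return immediately */
    int r;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    r = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (r == -1)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```

On an empty pipe both helpers return 0; after one byte is written to the pipe, both return 1. Passing -1 to readable_now_poll() also returns 0, which is the behavior described in Q3.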
