Chapter 18 Part 6: Recursive Directory Traversal | EmbeddedPathashala

 

nftw(): File Tree Walking

What is nftw()?

nftw() (new file tree walk) recursively walks an entire directory tree and calls a programmer-defined callback function for every file and directory it finds. Think of it as the find command, but from inside your C program. You provide the action; nftw() handles all the recursion.


Function Signature
#define _XOPEN_SOURCE 500
#include <ftw.h>

int nftw(
    const char *dirpath,     /* Root of the tree to walk */
    int (*func)(             /* Your callback function */
        const char *pathname,       /* Full path to file */
        const struct stat *statbuf, /* File info */
        int typeflag,               /* What kind of file */
        struct FTW *ftwbuf          /* Extra info */
    ),
    int nopenfd,             /* Max open fds to use */
    int flags                /* Control flags */
);
/* Returns: 0 on full success, -1 on error,
   or the first nonzero value returned by func */

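Your callback (func) must return 0 to continue walking, or a nonzero value to stop early. nopenfd is the maximum number of file descriptors nftw() may hold open at once (one per directory level it is currently inside); use 10 or more on modern systems. If the tree is deeper than nopenfd, nftw() still works but has to close and reopen directories, which is slower.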
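The three possible outcomes of a walk (0, -1, or the callback's own nonzero value) can be told apart from the return value. A minimal sketch, assuming a hypothetical callback stop_at_first_file() and a hypothetical helper report_walk(); neither name comes from the text above, and the value 42 is arbitrary:

```c
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical callback: stops the walk at the first non-directory entry
   by returning 42 (any nonzero value other than -1 would do). */
static int stop_at_first_file(const char *pathname, const struct stat *sb,
                              int typeflag, struct FTW *ftwbuf) {
    (void)pathname; (void)sb; (void)ftwbuf;
    return (typeflag == FTW_F) ? 42 : 0;
}

/* Hypothetical helper: runs the walk and reports which outcome occurred. */
static int report_walk(const char *root) {
    int r = nftw(root, stop_at_first_file, 10, FTW_PHYS);

    if (r == -1)
        perror(root);                           /* nftw() itself failed */
    else if (r == 0)
        printf("%s: walked the whole tree\n", root);
    else
        printf("%s: callback stopped the walk with %d\n", root, r);
    return r;
}
```

Returning -1 from a callback makes the result indistinguishable from an nftw() failure, so a callback that wants to signal "stop" is easier to handle when it uses some other nonzero value.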

typeflag Values Passed to Your Callback
typeflag   Meaning
FTW_F      Regular file (or any type that’s not a directory or symlink)
FTW_D      Directory
FTW_DNR    Directory that cannot be read (no read permission); nftw() does not descend into it
FTW_DP     Directory in postorder mode (FTW_DEPTH): all children already processed
FTW_SL     Symbolic link (reported only when FTW_PHYS is set)
FTW_SLN    Dangling symbolic link (reported only when FTW_PHYS is not set)
FTW_NS     stat() failed on this file (typically permission denied); the contents of statbuf are undefined
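A callback usually dispatches on typeflag with a switch. The sketch below covers all seven values from the table; the function names typeflag_name() and print_typed() and the label strings are illustrative assumptions, not part of the nftw() API:

```c
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: maps a typeflag to a short label. */
static const char *typeflag_name(int typeflag) {
    switch (typeflag) {
    case FTW_F:   return "file";
    case FTW_D:   return "dir";
    case FTW_DNR: return "dir-unreadable";
    case FTW_DP:  return "dir-post";
    case FTW_SL:  return "symlink";
    case FTW_SLN: return "symlink-dangling";
    case FTW_NS:  return "stat-failed";
    default:      return "unknown";
    }
}

/* Callback that prints the label next to each path. */
static int print_typed(const char *pathname, const struct stat *sb,
                       int typeflag, struct FTW *ftwbuf) {
    (void)sb; (void)ftwbuf;
    printf("%-18s %s\n", typeflag_name(typeflag), pathname);
    return 0;   /* keep walking */
}
```

Passing print_typed to nftw() with FTW_PHYS lists symlinks as "symlink"; without FTW_PHYS, only dangling links show up as links.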

nftw() Flags
Flag        Effect
FTW_DEPTH   Postorder traversal: process directory contents BEFORE the directory itself
FTW_PHYS    Don’t follow symlinks. Pass symlinks to func as FTW_SL
FTW_MOUNT   Don’t cross into another filesystem (don’t follow mount points)
FTW_CHDIR   chdir() into each directory before processing its contents
Note: FTW_DEPTH is crucial for recursive deletion. When deleting a tree, you must delete children before parents, and that’s postorder. Without FTW_DEPTH you’d try to rmdir() a directory that still has files in it.
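The effect of FTW_PHYS is easiest to see by walking the same small tree twice. This is a sketch under stated assumptions: build_scratch_tree(), walk_with() and tally() are hypothetical helpers, and the scratch directory holds one regular file plus one symlink pointing at it.

```c
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static int links_seen;   /* entries reported as FTW_SL */
static int files_seen;   /* entries reported as FTW_F  */

static int tally(const char *pathname, const struct stat *sb,
                 int typeflag, struct FTW *ftwbuf) {
    (void)pathname; (void)sb; (void)ftwbuf;
    if (typeflag == FTW_SL) links_seen++;
    if (typeflag == FTW_F)  files_seen++;
    return 0;
}

/* Hypothetical helper: creates dir/, dir/real.txt and dir/link -> real.txt.
   Existing entries are tolerated so the sketch can be rerun. */
static int build_scratch_tree(const char *dir) {
    char path[512];
    FILE *f;

    mkdir(dir, 0700);                       /* may already exist */
    snprintf(path, sizeof path, "%s/real.txt", dir);
    if ((f = fopen(path, "w")) == NULL)
        return -1;
    fclose(f);

    snprintf(path, sizeof path, "%s/link", dir);
    unlink(path);                           /* drop a stale link, if any */
    return symlink("real.txt", path);
}

/* Walks dir with the given flags; counts end up in links_seen/files_seen. */
static int walk_with(const char *dir, int flags) {
    links_seen = files_seen = 0;
    return nftw(dir, tally, 10, flags);
}
```

With FTW_PHYS the link is reported once as FTW_SL. With flags set to 0 the link is followed, so the target is reported as a second FTW_F entry and no FTW_SL appears.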

Preorder vs Postorder โ€” Visualized

Given this tree: root/ contains a.txt and subdir/, and subdir/ contains b.txt

Default (Preorder)
1. root/ (FTW_D)
2. a.txt (FTW_F)
3. subdir/ (FTW_D)
4. b.txt (FTW_F)
Directory is visited BEFORE its contents

FTW_DEPTH (Postorder)
1. a.txt (FTW_F)
2. b.txt (FTW_F)
3. subdir/ (FTW_DP)
4. root/ (FTW_DP)
Directory is visited AFTER its contents
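The two orders can be checked by recording each visit. A minimal sketch, assuming the hypothetical names record_visit() and record_walk() and a fixed-size log of 64 entries:

```c
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

#define MAX_VISITS 64

static int visit_flags[MAX_VISITS];    /* typeflag of each visit, in order */
static int visit_levels[MAX_VISITS];   /* depth of each visit, in order    */
static int visit_count;

static int record_visit(const char *pathname, const struct stat *sb,
                        int typeflag, struct FTW *ftwbuf) {
    (void)sb;
    printf("%-30s typeflag=%d level=%d\n", pathname, typeflag, ftwbuf->level);
    if (visit_count < MAX_VISITS) {
        visit_flags[visit_count]  = typeflag;
        visit_levels[visit_count] = ftwbuf->level;
        visit_count++;
    }
    return 0;
}

/* Hypothetical helper: pass 0 for preorder or FTW_DEPTH for postorder. */
static int record_walk(const char *root, int order_flag) {
    visit_count = 0;
    return nftw(root, record_visit, 10, order_flag | FTW_PHYS);
}
```

In preorder the first recorded entry is the root as FTW_D; with FTW_DEPTH the root comes last as FTW_DP. The relative order of siblings (a.txt versus subdir/) depends on the order in which the filesystem returns directory entries, so it may differ from the listing above.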

The struct FTW Argument
struct FTW {
    int base;   /* Offset of filename within pathname string */
    int level;  /* Depth from starting directory (0 = root) */
};

/* Example: pathname = "/home/ravi/code/main.c", walk started at "/home/ravi" */
/* ftwbuf->base  = 16 (offset of "main.c" within pathname) */
/* ftwbuf->level = 2  (starting directory is 0, "code" is 1) */

/* Get just the filename: */
const char *filename = &pathname[ftwbuf->base];
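Both fields can be combined in a callback, for example to find how deep a tree goes and which entry sits at the bottom. A sketch, assuming the hypothetical names track_depth() and deepest_level() and a 256-byte name buffer:

```c
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

static int  max_level;            /* deepest level seen so far        */
static char deepest_name[256];    /* file name (not path) found there */

static int track_depth(const char *pathname, const struct stat *sb,
                       int typeflag, struct FTW *ftwbuf) {
    (void)sb; (void)typeflag;
    if (ftwbuf->level > max_level) {
        max_level = ftwbuf->level;
        /* pathname + base skips the directory part of the path */
        snprintf(deepest_name, sizeof deepest_name, "%s",
                 pathname + ftwbuf->base);
    }
    return 0;
}

/* Hypothetical helper: returns the deepest level under root, or -1 on error. */
static int deepest_level(const char *root) {
    max_level = -1;               /* so the root (level 0) is recorded too */
    deepest_name[0] = '\0';
    if (nftw(root, track_depth, 10, FTW_PHYS) == -1)
        return -1;
    return max_level;
}
```

If several entries share the deepest level, the first one visited is kept.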

Code Examples

Example 1: List all files in a tree with indent by level
#define _XOPEN_SOURCE 500
#include <stdio.h>
#include <ftw.h>

static int print_entry(const char *pathname, const struct stat *sb,
                        int typeflag, struct FTW *ftwbuf) {
    const char *type = "?";
    if (typeflag == FTW_F)   type = "FILE";
    if (typeflag == FTW_D)   type = "DIR ";
    if (typeflag == FTW_SL)  type = "LINK";
    if (typeflag == FTW_DNR) type = "NDIR";

    /* Indent by depth level */
    printf("%*s[%s] %s\n",
           ftwbuf->level * 4, "",
           type,
           &pathname[ftwbuf->base]);   /* Just the filename */

    return 0;  /* 0 = continue walking */
}

int main(int argc, char *argv[]) {
    const char *root = (argc > 1) ? argv[1] : ".";

    if (nftw(root, print_entry, 10, FTW_PHYS) == -1) {
        perror("nftw");
        return 1;
    }
    return 0;
}

Sample output for a small tree:

[DIR ] myproject
    [FILE] main.c
    [FILE] Makefile
    [DIR ] include
        [FILE] utils.h
    [DIR ] src
        [FILE] utils.c
Example 2: Count and sum sizes of all regular files
#define _XOPEN_SOURCE 500
#include <stdio.h>
#include <ftw.h>
#include <sys/stat.h>

static long long total_bytes = 0;   /* wide enough for large trees */
static int file_count = 0;

static int count_files(const char *path, const struct stat *sb,
                       int typeflag, struct FTW *ftwb) {
    /* FTW_F also covers devices, FIFOs and sockets, so check the mode too */
    if (typeflag == FTW_F && S_ISREG(sb->st_mode)) {
        total_bytes += sb->st_size;
        file_count++;
    }
    return 0;
}

int main(int argc, char *argv[]) {
    const char *root = (argc > 1) ? argv[1] : ".";

    if (nftw(root, count_files, 10, FTW_PHYS) == -1) {
        perror("nftw");
        return 1;
    }

    printf("Files found  : %d\n", file_count);
    printf("Total size   : %lld bytes (%.2f KB)\n",
           total_bytes, total_bytes / 1024.0);
    return 0;
}
Example 3: Find all files larger than N bytes
#define _XOPEN_SOURCE 500
#include <stdio.h>
#include <stdlib.h>   /* atol() */
#include <ftw.h>

static long size_limit = 0;

static int find_large(const char *path, const struct stat *sb,
                      int typeflag, struct FTW *ftwb) {
    if (typeflag == FTW_F && sb->st_size > size_limit) {
        printf("%10ld bytes  %s\n", (long)sb->st_size, path);
    }
    return 0;
}

int main(int argc, char *argv[]) {
    const char *root = (argc > 1) ? argv[1] : ".";
    size_limit = (argc > 2) ? atol(argv[2]) : 4096;

    printf("Files larger than %ld bytes in '%s':\n", size_limit, root);
    if (nftw(root, find_large, 10, FTW_PHYS) == -1) {
        perror("nftw");
        return 1;
    }
    return 0;
}

Usage: ./large_files /home/ravi 100000 (finds files larger than 100,000 bytes, about 100 KB)

Example 4: Recursive delete using FTW_DEPTH (like rm -rf)
#define _XOPEN_SOURCE 500
#include <stdio.h>
#include <ftw.h>
#include <unistd.h>

static int delete_entry(const char *path, const struct stat *sb,
                        int typeflag, struct FTW *ftwb) {
    int ret;
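    /* With FTW_DEPTH, directories arrive as FTW_DP (or FTW_DNR if unreadable) */
    if (typeflag == FTW_DP || typeflag == FTW_D || typeflag == FTW_DNR)
        ret = rmdir(path);   /* Must be empty by now (postorder!) */
    else
        ret = unlink(path);  /* File or symlink */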

    if (ret == -1)
        perror(path);
    else
        printf("Deleted: %s\n", path);

    return ret;  /* Stop on error */
}

int rmrf(const char *path) {
    /* FTW_DEPTH = postorder: files deleted before their directories */
    /* FTW_PHYS  = don't follow symlinks */
    return nftw(path, delete_entry, 10, FTW_DEPTH | FTW_PHYS);
}

int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <directory>\n", argv[0]);
        return 1;
    }
    printf("Recursively deleting: %s\n", argv[1]);
    return rmrf(argv[1]) == 0 ? 0 : 1;  /* map -1 to a conventional exit status */
}
Warning: This permanently deletes everything. Never run this on important directories!
FTW_DEPTH (postorder) is required here because rmdir() only works on empty directories. With postorder, all files in a directory are deleted before the directory itself is attempted.
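A shorter variant is possible with the standard remove() function, which behaves like unlink() for non-directories and like rmdir() for directories, so the callback no longer needs to inspect typeflag. This is an alternative sketch, not the method used in Example 4; remove_entry() and remove_tree() are hypothetical names:

```c
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* Deletes one entry; remove() picks unlink() or rmdir() as appropriate. */
static int remove_entry(const char *path, const struct stat *sb,
                        int typeflag, struct FTW *ftwbuf) {
    (void)sb; (void)typeflag; (void)ftwbuf;
    if (remove(path) == -1) {
        perror(path);
        return -1;          /* nonzero: stop the walk on the first failure */
    }
    return 0;
}

/* Hypothetical helper: postorder walk, symlinks are removed, not followed. */
static int remove_tree(const char *root) {
    return nftw(root, remove_entry, 10, FTW_DEPTH | FTW_PHYS);
}
```

The same warning applies: remove_tree() deletes everything under root, so it is best tried on a scratch directory first.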
Example 5: Stop walking early when a file is found
#define _XOPEN_SOURCE 500
#include <stdio.h>
#include <ftw.h>
#include <string.h>

static const char *target_name = NULL;

static int find_file(const char *path, const struct stat *sb,
                     int typeflag, struct FTW *ftwb) {
    if (typeflag == FTW_F) {
        if (strcmp(&path[ftwb->base], target_name) == 0) {
            printf("Found: %s\n", path);
            return 1;  /* Nonzero stops the walk */
        }
    }
    return 0;
}

int main(int argc, char *argv[]) {
    if (argc != 3) {
        printf("Usage: %s <directory> <filename>\n", argv[0]);
        return 1;
    }
    target_name = argv[2];
    int r = nftw(argv[1], find_file, 10, FTW_PHYS);
    if (r == 0)
        printf("'%s' not found in '%s'\n", argv[2], argv[1]);
    return 0;
}

Interview Questions

Q1: What is nftw() and how is it different from writing your own recursive opendir()/readdir() loop?
nftw() handles all the recursion, opendir/closedir bookkeeping, stat() calls, and error conditions internally. It is more reliable and less code. It also manages the number of open file descriptors to avoid hitting system limits.
Q2: Why is FTW_DEPTH (postorder) required for recursive deletion?
rmdir() only succeeds on empty directories. With preorder traversal, nftw() would visit the directory before its contents, so rmdir() would fail with ENOTEMPTY. Postorder ensures all children are deleted before the parent directory is attempted.
Q3: What does FTW_PHYS do and when would you use it?
FTW_PHYS prevents nftw() from following symbolic links; each link is passed to the callback as FTW_SL instead. Without it, a symlink to a directory is followed, so the walk can revisit directories through symlink loops or wander into directories outside the tree. Use FTW_PHYS whenever you are walking a tree you don’t fully trust, and always when deleting.
Q4: How do you stop nftw() early from inside the callback?
Return a nonzero value from the callback function. nftw() immediately stops and returns that nonzero value to the caller. Do NOT use longjmp() to escape from the callback: nftw() keeps dynamically allocated state that is only freed when it returns normally, so jumping out leaks memory.
Q5: What does the level field in struct FTW tell you?
It tells the depth of the current file relative to the starting directory. The root itself is level 0. Its immediate children are level 1, and so on. Useful for indented printing or depth-limited searches.
Q6: What happens when nftw() encounters a directory it cannot read?
It passes that directory to your callback with typeflag == FTW_DNR (directory not readable). Your callback can log the error or ignore it. nftw() does not descend into unreadable directories.
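The answers to Q4 and Q6 suggest a callback that logs problem entries and returns 0 so the walk continues. A sketch, assuming the hypothetical names tolerant_visit() and tolerant_walk():

```c
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

static int visited;   /* entries processed normally       */
static int skipped;   /* entries reported as FTW_DNR/NS   */

static int tolerant_visit(const char *pathname, const struct stat *sb,
                          int typeflag, struct FTW *ftwbuf) {
    (void)ftwbuf;
    switch (typeflag) {
    case FTW_DNR:
        fprintf(stderr, "cannot read directory: %s\n", pathname);
        skipped++;
        break;
    case FTW_NS:
        /* sb is not valid for FTW_NS, so it must not be inspected here */
        fprintf(stderr, "cannot stat: %s\n", pathname);
        skipped++;
        break;
    default:
        (void)sb;
        visited++;
        break;
    }
    return 0;   /* keep walking despite problem entries */
}

/* Hypothetical helper: returns nftw()'s result; counts are in the globals. */
static int tolerant_walk(const char *root) {
    visited = skipped = 0;
    return nftw(root, tolerant_visit, 10, FTW_PHYS);
}
```

After the walk, a nonzero skipped count tells the caller that the result is incomplete even though nftw() returned 0.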
