Linux File Attributes: stat() System Call

 

Chapter 15 Part 1 — Retrieving File Metadata in Linux
3 system calls | 13+ struct fields | Level: Beginner

What is this about?

Every file on Linux has two parts: the actual data (contents) and metadata (information about the file). Metadata includes things like file size, owner, permissions, and timestamps. This metadata lives in a data structure called the i-node.

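The stat() family of system calls lets your C program read this metadata without reading the file's contents. stat() and lstat() work from a pathname and never open the file; fstat() works on a file descriptor you have already opened. Think of it like asking “Tell me about this file” rather than “Show me what is in it.”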

Key Terms:

stat(), lstat(), fstat(), struct stat, i-node, st_mode, st_size, symbolic link, file descriptor, S_ISREG(), S_IFMT

1. The Three stat() Variants — What’s the Difference?

All three do the same job (fill a struct stat with file info), but they differ in how you identify the file:

Function | How to identify the file | Symbolic link behaviour | Typical use
---------|--------------------------|-------------------------|------------
stat() | File path (string) | Follows the link (gives info about the target) | Most common: just pass a filename
lstat() | File path (string) | Does NOT follow (gives info about the link itself) | When you want the symlink's own metadata
fstat() | Open file descriptor (int fd) | N/A (file already open) | When you already have the file open
Remember: stat() and lstat() don’t need read permission on the file. But they DO need execute (search) permission on every directory in the path.
/* Function signatures — include these headers */
#include <sys/stat.h>

int stat(const char *pathname, struct stat *statbuf);
int lstat(const char *pathname, struct stat *statbuf);
int fstat(int fd, struct stat *statbuf);

/* All return 0 on success, -1 on error (check errno) */
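
Because failure is reported only through the return value and errno, it helps to see which errno values actually come back. The sketch below is illustrative only: stat_errno() and report() are hypothetical helper names, not library functions.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 0 if stat() succeeds, otherwise the
   errno value that stat() set. */
int stat_errno(const char *path) {
    struct stat sb;

    if (stat(path, &sb) == -1)
        return errno;
    return 0;
}

/* Prints a one-line report for a path. */
void report(const char *path) {
    int err = stat_errno(path);

    if (err == 0)
        printf("%s: ok\n", path);
    else
        printf("%s: %s\n", path, strerror(err));
}
```

Typical values are ENOENT (the path does not exist), ENOTDIR (a non-final component of the path is not a directory) and EACCES (search permission is missing on a directory in the path).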

2. The struct stat — What Fields Does It Have?

When you call stat(), the kernel fills this structure for you:

struct stat fields

Type | Field | Meaning
-----|-------|--------
dev_t | st_dev | Device ID where file lives (major + minor)
ino_t | st_ino | i-node number (unique ID per filesystem)
mode_t | st_mode | IMPORTANT: File type + permissions packed together
nlink_t | st_nlink | Number of hard links pointing to this file
uid_t | st_uid | User ID of file owner
gid_t | st_gid | Group ID of file owner
dev_t | st_rdev | Device ID (only for device special files)
off_t | st_size | IMPORTANT: Total size in bytes (or symlink target length)
blksize_t | st_blksize | Best block size for I/O (usually 4096)
blkcnt_t | st_blocks | Actual disk blocks allocated (in 512-byte units)
time_t | st_atime | Last access time (read)
time_t | st_mtime | Last modification time (write)
time_t | st_ctime | Last status change time (i-node changed)
st_size vs st_blocks: A sparse file (a file with holes) can have a huge st_size (e.g. 1 GB) but tiny st_blocks (only the parts with actual data are allocated). du -k file shows blocks used, not st_size.
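
To make the st_size versus st_blocks distinction concrete, here is a minimal sketch. The helper names allocated_bytes() and looks_sparse() are made up for this example, and the sparseness test is only a heuristic.

```c
#include <sys/stat.h>

/* Bytes actually allocated on disk: st_blocks counts 512-byte units. */
long long allocated_bytes(const struct stat *sb) {
    return (long long)sb->st_blocks * 512;
}

/* Heuristic only: fewer allocated bytes than the logical size suggests
   holes (a filesystem that compresses data can look the same). */
int looks_sparse(const struct stat *sb) {
    return allocated_bytes(sb) < (long long)sb->st_size;
}
```

For an ordinary file, allocated_bytes() is usually a little larger than st_size because the last block is only partly used.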

3. Code Examples — Hands-On Practice

Example 1: Basic stat() — print file size and type

#include <stdio.h>
#include <sys/stat.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    struct stat sb;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        exit(1);
    }

    /* Call stat() — fills sb with file metadata */
    if (stat(argv[1], &sb) == -1) {
        perror("stat");
        exit(1);
    }

    printf("File: %s\n", argv[1]);
    printf("Size: %lld bytes\n", (long long)sb.st_size);
    /* ino_t and nlink_t are unsigned, so cast to unsigned long for printf */
    printf("i-node: %lu\n", (unsigned long)sb.st_ino);
    printf("Hard links: %lu\n", (unsigned long)sb.st_nlink);

    return 0;
}
/* Compile: gcc -o fileinfo fileinfo.c
   Run:     ./fileinfo /etc/passwd */

Example 2: Check file type using S_ISXXX() macros

#include <stdio.h>
#include <sys/stat.h>
#include <stdlib.h>

void print_file_type(const char *path) {
    struct stat sb;

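    /* lstat(), not stat(): stat() follows symlinks, so the S_ISLNK
       branch below could never be reached with it */
    if (lstat(path, &sb) == -1) {
        perror("lstat");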
        return;
    }

    printf("%s is a: ", path);

    /* Use convenience macros — cleaner than bit-masking manually */
    if (S_ISREG(sb.st_mode))   printf("regular file\n");
    else if (S_ISDIR(sb.st_mode))   printf("directory\n");
    else if (S_ISLNK(sb.st_mode))   printf("symbolic link\n");
    else if (S_ISFIFO(sb.st_mode))  printf("FIFO/pipe\n");
    else if (S_ISCHR(sb.st_mode))   printf("character device\n");
    else if (S_ISBLK(sb.st_mode))   printf("block device\n");
    else if (S_ISSOCK(sb.st_mode))  printf("socket\n");
    else printf("unknown type\n");
}

/* Equivalent manual check (without macros): */
void manual_check(struct stat *sb) {
    if ((sb->st_mode & S_IFMT) == S_IFREG)
        printf("regular file (manual check)\n");
}

int main(void) {
    print_file_type("/etc/passwd");   /* regular file */
    print_file_type("/etc");          /* directory    */
    print_file_type("/dev/tty");      /* char device  */
    return 0;
}

Example 3: lstat() vs stat() — see the difference with symlinks

#include <stdio.h>
#include <sys/stat.h>
#include <stdlib.h>

int main(void) {
    struct stat sb1, sb2;

    /* Create a symlink first:  ln -s /etc/passwd mylink */

    /* stat() follows the link: gives info about /etc/passwd */
    if (stat("mylink", &sb1) == -1) { perror("stat"); exit(1); }
    printf("stat()  size = %lld\n", (long long)sb1.st_size);

    /* lstat() stays at the link: gives info about the link itself */
    if (lstat("mylink", &sb2) == -1) { perror("lstat"); exit(1); }
    printf("lstat() size = %lld\n", (long long)sb2.st_size);
    /* lstat size = length of the string "/etc/passwd" = 11 */

    /* Check: is it a symlink? */
    if (S_ISLNK(sb2.st_mode))
        printf("lstat confirms: it IS a symbolic link\n");

    return 0;
}

Example 4: fstat() — stat on an open file descriptor

#include <stdio.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>

int main(void) {
    int fd;
    struct stat sb;

    fd = open("/etc/hostname", O_RDONLY);
    if (fd == -1) { perror("open"); exit(1); }

    /* fstat uses the file descriptor, not the path */
    if (fstat(fd, &sb) == -1) { perror("fstat"); exit(1); }

    printf("File size via fstat: %lld bytes\n", (long long)sb.st_size);
    /* uid_t and gid_t are unsigned, so %d is the wrong conversion */
    printf("Owner UID: %lu\n", (unsigned long)sb.st_uid);
    printf("Owner GID: %lu\n", (unsigned long)sb.st_gid);

    /* fstat is useful when the file might have been renamed/deleted
       after opening — fd still refers to the same i-node */
    close(fd);
    return 0;
}
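
One common use of fstat() is to check a file after opening it, so that the check and the later reads apply to the same i-node. The sketch below assumes a hypothetical helper named open_regular(); it is one possible shape for this pattern, not a standard function.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: opens path read-only and returns the fd only if
   it refers to a regular file; returns -1 otherwise. */
int open_regular(const char *path) {
    struct stat sb;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;

    /* Inspect the file we actually opened, not whatever the path
       happens to name by now. */
    if (fstat(fd, &sb) == -1 || !S_ISREG(sb.st_mode)) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling stat() first and open() second would leave a window in which the path could be replaced; checking the descriptor avoids that. Note that a plain open() on a FIFO can block until a writer appears, which this sketch does not handle.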

Example 5: Check if a path is a directory (common utility function)

#include <sys/stat.h>

/* Returns 1 if path is a directory, 0 otherwise */
int is_directory(const char *path) {
    struct stat sb;
    return (stat(path, &sb) == 0 && S_ISDIR(sb.st_mode));
}

/* Returns 1 if path is a regular file, 0 otherwise */
int is_regular_file(const char *path) {
    struct stat sb;
    return (stat(path, &sb) == 0 && S_ISREG(sb.st_mode));
}

/* Returns file size, or -1 on error */
long long get_file_size(const char *path) {
    struct stat sb;
    if (stat(path, &sb) == -1) return -1;
    return (long long)sb.st_size;
}

/* Check if two paths point to the same file (same i-node) */
int same_file(const char *p1, const char *p2) {
    struct stat s1, s2;
    if (stat(p1, &s1) == -1 || stat(p2, &s2) == -1) return 0;
    /* Same device AND same i-node = same file */
    return (s1.st_dev == s2.st_dev && s1.st_ino == s2.st_ino);
}

Example 6: Print file type using bit-mask directly (understand st_mode internals)

#include <stdio.h>
#include <sys/stat.h>

void show_mode_bits(mode_t mode) {
    /* st_mode bit layout:
       Bits 15-12: file type (4 bits)
       Bits 11-9:  special bits (setUID, setGID, sticky)
       Bits 8-6:   owner rwx
       Bits 5-3:   group rwx
       Bits 2-0:   other rwx
    */

    /* Extract file type by masking top bits */
    switch (mode & S_IFMT) {
        case S_IFREG:  printf("Type: regular file\n");   break;
        case S_IFDIR:  printf("Type: directory\n");       break;
        case S_IFCHR:  printf("Type: char device\n");     break;
        case S_IFBLK:  printf("Type: block device\n");    break;
        case S_IFLNK:  printf("Type: symlink\n");         break;
        case S_IFIFO:  printf("Type: FIFO\n");            break;
        case S_IFSOCK: printf("Type: socket\n");          break;
        default:       printf("Type: unknown\n");        break;
    }

    /* Print permissions in rwxrwxrwx format */
    printf("Perms: %c%c%c%c%c%c%c%c%c\n",
        (mode & S_IRUSR) ? 'r' : '-',
        (mode & S_IWUSR) ? 'w' : '-',
        (mode & S_IXUSR) ? 'x' : '-',
        (mode & S_IRGRP) ? 'r' : '-',
        (mode & S_IWGRP) ? 'w' : '-',
        (mode & S_IXGRP) ? 'x' : '-',
        (mode & S_IROTH) ? 'r' : '-',
        (mode & S_IWOTH) ? 'w' : '-',
        (mode & S_IXOTH) ? 'x' : '-');
}

4. st_mode Bit Layout — Visual Breakdown

The st_mode field packs two things into one integer. Here’s how the bits are arranged:

st_mode (16 bits)

Bits | Field | Contents
-----|-------|---------
15–12 | File type | 4 bits
11–9 | Special | setUID, setGID, sticky
8–6 | Owner | r w x
5–3 | Group | r w x
2–0 | Other | r w x

Constant | Value (octal) | Meaning
---------|---------------|--------
S_IFREG | 0100000 | Regular file
S_IFDIR | 0040000 | Directory
S_IFCHR | 0020000 | Character device
S_IFBLK | 0060000 | Block device
S_IFIFO | 0010000 | FIFO / named pipe
S_IFSOCK | 0140000 | Socket
S_IFLNK | 0120000 | Symbolic link
Tip: Always use the S_ISXXX(mode) macros rather than manual bit masking. They’re cleaner, readable, and portable across UNIX systems.

5. Common Mistakes to Avoid

Mistake 1: Using the wrong type when printing fields

/* WRONG: st_size is off_t (usually 64-bit), so %d does not match the argument */
printf("Size: %d\n", sb.st_size);

/* CORRECT: cast to long long */
printf("Size: %lld\n", (long long)sb.st_size);

/* WRONG: st_ino is ino_t, not int */
printf("Inode: %d\n", sb.st_ino);

/* CORRECT: cast to unsigned long */
printf("Inode: %lu\n", (unsigned long)sb.st_ino);

Mistake 2: Forgetting that stat() follows symlinks

/* If you want to check if a path IS a symlink, use lstat() not stat() */
struct stat sb;

/* This will NEVER report S_ISLNK because stat() follows the link! */
stat("mylink", &sb);
if (S_ISLNK(sb.st_mode))  /* Always false with stat() */
    printf("symlink\n");

/* Correct way: use lstat() */
lstat("mylink", &sb);
if (S_ISLNK(sb.st_mode))  /* This works */
    printf("it IS a symlink\n");
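
Using both calls together also lets you spot a dangling symlink, where the link exists but its target does not. This is a minimal sketch; the enum and the classify_link() helper are names invented for the example.

```c
#include <sys/stat.h>

enum link_kind { PATH_MISSING, NOT_A_LINK, LINK_OK, LINK_DANGLING };

/* Hypothetical helper: lstat() tells us whether the path itself is a
   symlink; stat() then tells us whether its target can be reached. */
enum link_kind classify_link(const char *path) {
    struct stat lsb, sb;

    if (lstat(path, &lsb) == -1)
        return PATH_MISSING;        /* or another error: check errno */
    if (!S_ISLNK(lsb.st_mode))
        return NOT_A_LINK;
    if (stat(path, &sb) == -1)
        return LINK_DANGLING;       /* target missing or unreachable */
    return LINK_OK;
}
```

A real program would probably inspect errno as well, since stat() on a link can also fail for reasons other than a missing target (for example EACCES or ELOOP).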

Mistake 3: Not checking return value

struct stat sb;
stat("/nonexistent/path", &sb);   /* BAD — no error check */
printf("%lld\n", (long long)sb.st_size); /* sb is garbage! */

/* CORRECT */
if (stat("/nonexistent/path", &sb) == -1) {
    perror("stat");  /* prints: stat: No such file or directory */
    exit(1);
}

6. Interview Questions

Q1 What is the difference between stat(), lstat(), and fstat()?

A All three retrieve file metadata into a struct stat. stat() takes a pathname and follows symlinks. lstat() takes a pathname but does NOT follow symlinks: it returns info about the link itself. fstat() takes an open file descriptor instead of a path, so it involves no path lookup and cannot fail with path errors such as ENOENT or EACCES. It still fails with EBADF if the descriptor is not valid.

Q2 What is the difference between st_size and st_blocks?

A st_size is the logical file size in bytes. st_blocks is the actual number of 512-byte disk blocks allocated. For sparse files (files with holes), st_blocks will be much smaller than what you’d expect from st_size.

Q3 How do you determine the file type from st_mode?

A Two ways: (1) Use the S_ISREG(), S_ISDIR(), etc. macros directly on st_mode. (2) Mask with S_IFMT to extract the type bits and compare with constants like S_IFREG, S_IFDIR. Macros are preferred for clarity.

Q4 Can two different pathnames refer to the same file? How do you check?

A Yes, most commonly via hard links. (Because stat() follows links, a symlink and its target also report the same file.) Two paths point to the same file if they have the same st_dev (device) AND the same st_ino (i-node number). Check both fields after calling stat() on each path.

Q5 What permissions are needed to call stat() on a file?

A You do NOT need read permission on the file itself. But you DO need execute (search) permission on every parent directory in the path. fstat() needs no path permissions at all: it works on any valid file descriptor, whatever mode it was opened with.

Q6 What does lstat() return for st_size when applied to a symbolic link?

A It returns the length in bytes of the pathname string that the symlink points to. For example, a symlink pointing to /etc/passwd (11 characters) would have st_size = 11.

Q7 What is the st_blksize field? Is it the filesystem block size?

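A Not necessarily. It is the preferred block size for efficient I/O on this file, as reported by the filesystem. Doing I/O in multiples of this size (typically 4096) gives the best performance. It often matches the filesystem block size, but that is not guaranteed, and it is unrelated to the 512-byte units in which st_blocks is counted.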
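
As an illustration of using st_blksize, here is a sketch that reads a file in chunks of that size. count_bytes() is a hypothetical helper, and the 4096 fallback is an assumption for the case where the reported value is not usable.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: reads the whole file in st_blksize-sized chunks
   and returns the number of bytes read, or -1 on error. */
long long count_bytes(const char *path) {
    struct stat sb;
    long long total = 0;
    ssize_t n;
    size_t bufsiz;
    char *buf;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;
    if (fstat(fd, &sb) == -1) {
        close(fd);
        return -1;
    }

    /* Use the kernel's hint; fall back to 4096 if it looks unusable. */
    bufsiz = (sb.st_blksize > 0) ? (size_t)sb.st_blksize : 4096;
    buf = malloc(bufsiz);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    while ((n = read(fd, buf, bufsiz)) > 0)
        total += n;

    free(buf);
    close(fd);
    return (n == -1) ? -1 : total;
}
```

For a regular file that nobody is modifying, the result should equal st_size.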

Q8 Write code to check if a file is a block device using stat().

A

struct stat sb;
if (stat("/dev/sda", &sb) == 0 && S_ISBLK(sb.st_mode))
    printf("It is a block device\n");
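
Once you know a path is a device node, st_rdev tells you which device it represents. The sketch below assumes glibc on Linux, where major() and minor() come from <sys/sysmacros.h>; print_device_numbers() is a hypothetical helper name.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major(), minor() on glibc */

/* Hypothetical helper: prints the device numbers for a device node.
   Returns 0 on success, -1 if path is missing or not a device. */
int print_device_numbers(const char *path) {
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    if (!S_ISBLK(sb.st_mode) && !S_ISCHR(sb.st_mode))
        return -1;

    /* st_rdev describes the device the node represents;
       st_dev is the device holding the node's own i-node. */
    printf("%s: %s device, major %u, minor %u\n", path,
           S_ISBLK(sb.st_mode) ? "block" : "character",
           major(sb.st_rdev), minor(sb.st_rdev));
    return 0;
}
```

Calling it on a regular file or directory returns -1, because st_rdev is only meaningful for device special files.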

Next: File Timestamps — atime, mtime, ctime and how to change them
