I-nodes (Index Nodes) in Linux: Complete File System Guide

 

The heart of every file — what an i-node stores and how ext2 locates data blocks
Topic 4 of 9: I-nodes
Key calls: stat(), lstat()
Concept: Direct/Indirect Pointers

Key Terms:

I-node, I-number, I-node Table, Direct Pointer, Indirect Pointer, Double Indirect, Triple Indirect, File Hole, Hard Link, stat()

What is an I-node?

An i-node (index node) is a data structure in the filesystem that stores all information about a file — except its name. The filename is stored in the directory; the directory maps names to i-node numbers.

Every file and directory has exactly one i-node, identified by its i-number. You can see i-numbers with ls -li.

$ ls -li /etc/passwd
263218 -rw-r--r-- 1 root root 2847 Jan 15 10:00 /etc/passwd
^
I-number (unique within this filesystem)
Key insight: A file’s name is not part of the file. It’s just a label in a directory pointing to an i-node. That’s why hard links work — multiple names, same i-node.

📋 What’s Stored in an I-node
Field | What it Stores | struct stat Member
File type | Regular file, directory, symlink, device, FIFO, socket | st_mode (type bits)
Owner (UID) | User ID of file owner | st_uid
Group (GID) | Group ID | st_gid
Permissions | rwxrwxrwx for owner/group/other + setuid/setgid/sticky bits | st_mode (perm bits)
Timestamps (3) | atime (last access), mtime (last content change), ctime (last inode change) | st_atime, st_mtime, st_ctime
Hard link count | Number of directory entries pointing to this i-node | st_nlink
File size | Size in bytes | st_size
Allocated blocks | Actual disk blocks allocated (in 512-byte units) | st_blocks
Data block pointers | 15 pointers to data blocks (12 direct + 3 indirect levels) | (internal, not in struct stat)
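📌 Note: creation time is NOT part of the traditional Unix i-node, and ext2/ext3 do not store it. ext4 added it as crtime, but it’s not exposed in the standard struct stat. On Linux 4.11 and later, the statx() system call can report it as stx_btime when the filesystem records it.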

📊 ext2 I-node Data Block Pointer System

An i-node has 15 pointers to locate file data. For small files, direct pointers are enough. For large files, the kernel follows chains of pointer blocks.

I-node: 15 block pointers (fixed size)

Pointer 0 (direct) → Data Block 0
Pointer 1 (direct) → Data Block 1
… (direct 2–10) → Data Blocks 2–10
Pointer 11 (direct) → Data Block 11
Pointer 12 (single indirect) → Indirect Pointer Block → Data Blocks
(the pointer block holds 256–1024 pointers, depending on block size)
Pointer 13 (double indirect) → Block of Pointers → Blocks of Pointers → Data Blocks
(supports much larger files)
Pointer 14 (triple indirect) → 3 levels of pointer blocks → Data Blocks
(addresses up to ~4 TB with 4096-byte blocks)

Capacity with 4 KB blocks and 4-byte pointers (1024 pointers per block):

12 Direct Pointers: 12 × 4096 = 48 KB max. Small files fit entirely here, with no extra lookup.
Single Indirect: 1024 pointers × 4 KB = 4 MB extra
Double Indirect: 1024² × 4 KB = 4 GB extra
Triple Indirect: 1024³ × 4 KB = 4 TB extra

Note: the pointer tree can address about 4 TB, but ext2 caps a single file at 2 TB with 4 KB blocks, because the i-node’s block count is a 32-bit number of 512-byte units.

💻 Code Example 1: Read I-node Info with stat()
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <pwd.h>
#include <grp.h>

void print_inode_info(const char *path) {
    struct stat sb;
    struct passwd *pw;
    struct tm *tm;
    char timebuf[64];

    /* lstat() reads i-node info into struct stat; unlike stat(), it does
       not follow symlinks, so a symlink reports its own i-node */
    if (lstat(path, &sb) == -1) {
        perror("lstat");
        return;
    }

    printf("=== I-node Info for: %s ===\n", path);
    printf("I-number:          %lu\n", (unsigned long)sb.st_ino);
    printf("File size:         %lld bytes\n", (long long)sb.st_size);
    printf("Allocated blocks:  %lld (512-byte units)\n",
           (long long)sb.st_blocks);
    printf("Block size (I/O):  %ld bytes\n", (long)sb.st_blksize);
    printf("Hard link count:   %lu\n", (unsigned long)sb.st_nlink);

    /* File type */
    printf("File type:         ");
    if (S_ISREG(sb.st_mode))       printf("Regular file\n");
    else if (S_ISDIR(sb.st_mode))  printf("Directory\n");
    else if (S_ISLNK(sb.st_mode))  printf("Symbolic link\n");
    else if (S_ISCHR(sb.st_mode))  printf("Character device\n");
    else if (S_ISBLK(sb.st_mode))  printf("Block device\n");
    else if (S_ISFIFO(sb.st_mode)) printf("FIFO/pipe\n");
    else if (S_ISSOCK(sb.st_mode)) printf("Socket\n");

    /* Permissions */
    printf("Permissions:       %c%c%c%c%c%c%c%c%c%c\n",
        (S_ISREG(sb.st_mode)) ? '-' :
        (S_ISDIR(sb.st_mode)) ? 'd' :
        (S_ISLNK(sb.st_mode)) ? 'l' : '?',
        (sb.st_mode & S_IRUSR) ? 'r' : '-',
        (sb.st_mode & S_IWUSR) ? 'w' : '-',
        (sb.st_mode & S_IXUSR) ? 'x' : '-',
        (sb.st_mode & S_IRGRP) ? 'r' : '-',
        (sb.st_mode & S_IWGRP) ? 'w' : '-',
        (sb.st_mode & S_IXGRP) ? 'x' : '-',
        (sb.st_mode & S_IROTH) ? 'r' : '-',
        (sb.st_mode & S_IWOTH) ? 'w' : '-',
        (sb.st_mode & S_IXOTH) ? 'x' : '-');

    /* Owner (getpwuid() returns NULL if the UID has no passwd entry) */
    pw = getpwuid(sb.st_uid);
    printf("Owner:             %s (UID %lu)\n",
           pw ? pw->pw_name : "unknown", (unsigned long)sb.st_uid);

    /* Timestamps */
    tm = localtime(&sb.st_mtime);
    strftime(timebuf, sizeof(timebuf), "%Y-%m-%d %H:%M:%S", tm);
    printf("Last modified:     %s\n", timebuf);

    tm = localtime(&sb.st_atime);
    strftime(timebuf, sizeof(timebuf), "%Y-%m-%d %H:%M:%S", tm);
    printf("Last accessed:     %s\n", timebuf);

    tm = localtime(&sb.st_ctime);
    strftime(timebuf, sizeof(timebuf), "%Y-%m-%d %H:%M:%S", tm);
    printf("I-node changed:    %s\n", timebuf);
}

int main(int argc, char *argv[]) {
    const char *path = (argc > 1) ? argv[1] : "/etc/passwd";
    print_inode_info(path);
    return 0;
}

💻 Code Example 2: Hard Links Share an I-node
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    struct stat sb1, sb2;

    /* Create a file and a hard link to it */
    FILE *f = fopen("/tmp/original.txt", "w");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    fprintf(f, "Hello\n");
    fclose(f);

    /* link() creates a hard link (another name for same i-node) */
    if (link("/tmp/original.txt", "/tmp/hardlink.txt") == -1) {
        perror("link");
        return 1;
    }

    /* Compare i-nodes of both names */
    if (stat("/tmp/original.txt", &sb1) == -1 ||
        stat("/tmp/hardlink.txt", &sb2) == -1) {
        perror("stat");
        return 1;
    }

    printf("original i-number:  %lu\n", (unsigned long)sb1.st_ino);
    printf("hardlink i-number:  %lu\n", (unsigned long)sb2.st_ino);

    if (sb1.st_ino == sb2.st_ino)
        printf("SAME i-node! Both names point to the same file.\n");

    printf("Link count on original: %lu\n",
           (unsigned long)sb1.st_nlink);
    /* Link count is now 2 */

    /* Deleting original doesn't remove the data —
       the i-node link count just decrements to 1 */
    unlink("/tmp/original.txt");
    stat("/tmp/hardlink.txt", &sb2);
    printf("After deleting original, link count: %lu\n",
           (unsigned long)sb2.st_nlink);
    /* File data still accessible via hardlink.txt */

    unlink("/tmp/hardlink.txt");  /* Now count=0, data deleted */
    return 0;
}

/* Example output (the i-number will differ on your system):
original i-number:  1234567
hardlink i-number:  1234567
SAME i-node! Both names point to the same file.
Link count on original: 2
After deleting original, link count: 1
*/

💻 Code Example 3: File Holes (Sparse Files)

If you seek past the end of a file and write, the gap becomes a “hole”: no disk blocks are allocated for it. In ext2, the pointer slots that cover the hole (in the i-node or in an indirect block) simply hold 0.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(void) {
    int fd;
    struct stat sb;
    const char *path = "/tmp/sparse_file.bin";

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); return 1; }

    /* Write 1 byte at position 0 */
    write(fd, "A", 1);

    /* Jump 100MB forward and write 1 byte */
    lseek(fd, 100 * 1024 * 1024, SEEK_SET);
    write(fd, "Z", 1);

    close(fd);

    /* Check how much disk space is actually used */
    stat(path, &sb);
    printf("File size (st_size):    %lld bytes (~100 MB)\n",
           (long long)sb.st_size);
    printf("Blocks allocated:        %lld × 512 = %lld bytes\n",
           (long long)sb.st_blocks,
           (long long)sb.st_blocks * 512);
    /* Only the filesystem blocks that actually hold data are allocated
       (two here, one per written byte); the hole takes no space */

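    /* Read from hole area: the kernel returns zero-filled bytes
       without reading any data block */
    fd = open(path, O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }
    lseek(fd, 50 * 1024 * 1024, SEEK_SET);
    char buf = 1;  /* non-zero, so a 0 result really came from read() */
    if (read(fd, &buf, 1) != 1) { perror("read"); return 1; }
    printf("Byte at middle of hole:  0x%02x (always 0)\n",
           (unsigned char)buf);
    close(fd);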

    unlink(path);
    return 0;
}

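/* Example output (typical for ext4 with 4 KB blocks: two data blocks,
   one per written byte; the block count varies by filesystem):
File size (st_size):    104857601 bytes (~100 MB)
Blocks allocated:        16 × 512 = 8192 bytes
Byte at middle of hole:  0x00 (always 0)
*/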

💻 Code Example 4: Using fstat() on an Open File Descriptor
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

/* fstat() works on an open fd — useful when you don't have
   the filename (e.g., anonymous file, file you've already opened) */

int main(void) {
    int fd;
    struct stat sb;

    /* Use stdin (fd 0): it is already open, and we don't have a filename */
    fd = STDIN_FILENO;  /* = 0 */

    if (fstat(fd, &sb) == -1) {
        perror("fstat");
        return 1;
    }

    printf("stdin i-node: %lu\n", (unsigned long)sb.st_ino);
    printf("stdin is: ");
    if (S_ISCHR(sb.st_mode))  printf("a character device (e.g., a terminal)\n");
    else if (S_ISREG(sb.st_mode)) printf("a regular file\n");
    else if (S_ISFIFO(sb.st_mode)) printf("a pipe\n");
    else printf("something else\n");

    /* Practical: check if two fds refer to the same file */
    int fd1 = open("/etc/hosts", O_RDONLY);
    int fd2 = open("/etc/hosts", O_RDONLY);
    if (fd1 == -1 || fd2 == -1) {
        perror("open");
        return 1;
    }

    struct stat s1, s2;
    if (fstat(fd1, &s1) == -1 || fstat(fd2, &s2) == -1) {
        perror("fstat");
        return 1;
    }

    /* Same file = same device + same inode number */
    if (s1.st_ino == s2.st_ino && s1.st_dev == s2.st_dev)
        printf("fd1 and fd2 point to the same file\n");

    close(fd1); close(fd2);
    return 0;
}

🎯 Interview Questions — I-nodes

Q1. What is an i-node and what information does it store?

An i-node (index node) stores all metadata about a file: type, permissions, UID, GID, three timestamps (atime/mtime/ctime), hard link count, file size, allocated blocks count, and pointers to data blocks. It does NOT store the filename — that’s in the directory.

Q2. Why is the filename not stored in the i-node?

Decoupling the name from the data enables hard links: multiple directory entries (names) can point to the same i-node. The file’s data is reference-counted via st_nlink. When the link count drops to 0 and no process has the file open, the i-node and its data blocks are freed.

Q3. Explain the ext2 pointer system for locating data blocks.

Each i-node has 15 pointers. The first 12 (pointers 0–11) are direct pointers to data blocks (fast; they cover 48 KB with 4 KB blocks). Pointer 12 is single indirect: it points to a block of pointers. Pointer 13 is double indirect (two levels of pointer blocks). Pointer 14 is triple indirect (three levels), which lets the pointers address about 4 TB with 4 KB blocks, although ext2 itself caps a file at 2 TB at that block size.

Q4. What is a file hole? How does it save disk space?

A file hole is a gap in a file created by seeking past the end and writing. The kernel stores null (0) in the block pointer for that region instead of allocating disk blocks. Reading from a hole returns bytes with the value 0, without reading any data block from disk. Useful for large sparse files (e.g., VM disk images).

Q5. What is the difference between atime, mtime, and ctime?

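atime (access time) updates when the file is read, subject to mount options such as relatime (the Linux default, which skips many updates) and noatime. mtime (modification time) updates when file content changes. ctime (change time) updates when the i-node itself changes (permissions, ownership, hard link count, content). Note: ctime is not “creation time”. The traditional i-node has no creation time, and where a filesystem does record one, the standard struct stat does not expose it.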

Q6. What is the difference between stat(), lstat(), and fstat()?

stat() follows symbolic links and returns info about the linked file. lstat() does NOT follow symlinks — it returns info about the symlink itself. fstat() takes an open file descriptor instead of a path, useful when you only have an fd and not the filename.

Q7. How do you check if two file descriptors point to the same underlying file?

Call fstat() on both. If st_dev (device ID) and st_ino (i-node number) are both equal, they refer to the same file. Two different paths can point to the same i-node if one is a hard link of the other.
