I-nodes
Key Terms: stat(), lstat(), Direct/Indirect Pointers
What is an I-node?
An i-node (index node) is a data structure in the filesystem that stores all information about a file — except its name. The filename is stored in the directory; the directory maps names to i-node numbers.
Every file and directory has exactly one i-node, identified by its i-number. You can see i-numbers with ls -li.
$ ls -li /etc/passwd
263218 -rw-r--r-- 1 root root 2847 Jan 15 10:00 /etc/passwd
^
I-number (unique within this filesystem)
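The name-to-i-number mapping can also be read straight from a directory in C. The sketch below is illustrative only: `lookup_ino` is a hypothetical helper name, and it assumes the platform's `struct dirent` provides `d_ino` (as Linux and other POSIX systems with the XSI option do).

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: scan directory `dir` for an entry called `name`
   and return the i-number stored in that directory entry.
   Returns 0 if the directory cannot be opened or the name is absent. */
ino_t lookup_ino(const char *dir, const char *name) {
    assert(dir != NULL && name != NULL);

    DIR *d = opendir(dir);
    if (d == NULL)
        return 0;

    ino_t found = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        if (strcmp(entry->d_name, name) == 0) {
            found = entry->d_ino;   /* the name -> i-number mapping */
            break;
        }
    }
    closedir(d);
    return found;
}
```

Calling `lookup_ino("/etc", "passwd")` should normally give the same number that `ls -li /etc/passwd` prints. One exception is a mount point, where the directory entry may carry the i-number of the underlying directory rather than the mounted root.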
| Field | What it Stores | struct stat Member |
|---|---|---|
| File type | Regular file, directory, symlink, device, FIFO, socket | st_mode (type bits) |
| Owner (UID) | User ID of file owner | st_uid |
| Group (GID) | Group ID | st_gid |
| Permissions | rwxrwxrwx for owner/group/other + setuid/setgid/sticky bits | st_mode (perm bits) |
| Timestamps (3) | atime (last access), mtime (last content change), ctime (last inode change) | st_atime, st_mtime, st_ctime |
| Hard link count | Number of directory entries pointing to this i-node | st_nlink |
| File size | Size in bytes | st_size |
| Allocated blocks | Actual disk blocks allocated (in 512-byte units) | st_blocks |
| Data block pointers | 15 pointers to data blocks (12 direct + 3 indirect levels) | (internal, not in struct stat) |
Note: creation time is NOT part of the traditional Unix i-node and is not exposed in the standard struct stat. ext4 stores it as crtime, and on Linux it can be read with statx() where the filesystem supports it.

In the classic ext2/ext3 layout, an i-node has 15 pointers to locate file data. For small files, the direct pointers are enough. For large files, the kernel follows chains of pointer blocks.
| I-node (15 pointers) | | Points to |
|---|---|---|
| Pointer 0 (direct) | → | Data Block 0 |
| Pointer 1 (direct) | → | Data Block 1 |
| … (direct 2–10) | → | Data Blocks 2–10 |
| Pointer 11 (direct) | → | Data Block 11 |
| Pointer 12 (single indirect) | → | Indirect pointer block (contains 256–1024 pointers → data blocks) |
| Pointer 13 (double indirect) | → | Block of pointers → block of pointers → data blocks (supports much larger files) |
| Pointer 14 (triple indirect) | → | 3 levels of pointer blocks → data blocks (supports files up to ~4 TB with 4096-byte blocks) |
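With 4 KB blocks and 4-byte block pointers (1024 pointers per block):
Direct (pointers 0–11): 12 × 4096 = 48 KB max. ~95% of files fit here!
Single indirect (pointer 12): 1024 pointers × 4 KB = 4 MB extra
Double indirect (pointer 13): 1024² × 4 KB = 4 GB extra
Triple indirect (pointer 14): 1024³ × 4 KB ≈ 4 TB extra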
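The arithmetic above can be checked with a few lines of C. The sketch assumes the layout described here (12 direct pointers, then one single, one double, and one triple indirect pointer); the function names are hypothetical and do not come from any kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of file data reachable through ONE pointer at the given level
   (0 = direct, 1 = single, 2 = double, 3 = triple indirect). */
uint64_t bytes_per_pointer(int level, uint64_t block_size, uint64_t ptr_size) {
    assert(level >= 0 && level <= 3 && ptr_size > 0);
    uint64_t ptrs_per_block = block_size / ptr_size;
    uint64_t bytes = block_size;
    for (int i = 0; i < level; i++)
        bytes *= ptrs_per_block;   /* each level multiplies the fan-out */
    return bytes;
}

/* Which pointer level holds logical block number `n` of a file?
   Returns 0..3, or -1 if `n` lies beyond the triple-indirect range. */
int level_for_block(uint64_t n, uint64_t ptrs_per_block) {
    if (n < 12)
        return 0;                  /* one of the 12 direct pointers */
    n -= 12;

    uint64_t span = ptrs_per_block;   /* blocks covered by the single indirect */
    for (int level = 1; level <= 3; level++) {
        if (n < span)
            return level;
        n -= span;
        span *= ptrs_per_block;    /* next level covers ptrs_per_block times more */
    }
    return -1;
}
```

With 4096-byte blocks and 4-byte pointers, `12 * bytes_per_pointer(0, 4096, 4)` is 49152 (48 KB), and `level_for_block(12, 1024)` is 1, meaning the 13th block of a file is the first one that needs the single indirect block.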
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <pwd.h>
#include <grp.h>

void print_inode_info(const char *path) {
    struct stat sb;
    struct passwd *pw;
    struct group *gr;
    struct tm *tm;
    char timebuf[64];

    /* lstat() reads i-node info into struct stat without following
       symlinks, so a symlink is reported as itself, not as its target */
    if (lstat(path, &sb) == -1) {
        perror("lstat");
        return;
    }

    printf("=== I-node Info for: %s ===\n", path);
    printf("I-number: %lu\n", (unsigned long)sb.st_ino);
    printf("File size: %lld bytes\n", (long long)sb.st_size);
    printf("Allocated blocks: %lld (512-byte units)\n",
           (long long)sb.st_blocks);
    printf("Block size (I/O): %ld bytes\n", (long)sb.st_blksize);
    printf("Hard link count: %lu\n", (unsigned long)sb.st_nlink);

    /* File type */
    printf("File type: ");
    if (S_ISREG(sb.st_mode)) printf("Regular file\n");
    else if (S_ISDIR(sb.st_mode)) printf("Directory\n");
    else if (S_ISLNK(sb.st_mode)) printf("Symbolic link\n");
    else if (S_ISCHR(sb.st_mode)) printf("Character device\n");
    else if (S_ISBLK(sb.st_mode)) printf("Block device\n");
    else if (S_ISFIFO(sb.st_mode)) printf("FIFO/pipe\n");
    else if (S_ISSOCK(sb.st_mode)) printf("Socket\n");
    else printf("Unknown\n");

    /* Permissions (the first character is the file type) */
    printf("Permissions: %c%c%c%c%c%c%c%c%c%c\n",
           (S_ISREG(sb.st_mode)) ? '-' :
           (S_ISDIR(sb.st_mode)) ? 'd' :
           (S_ISLNK(sb.st_mode)) ? 'l' :
           (S_ISCHR(sb.st_mode)) ? 'c' :
           (S_ISBLK(sb.st_mode)) ? 'b' :
           (S_ISFIFO(sb.st_mode)) ? 'p' :
           (S_ISSOCK(sb.st_mode)) ? 's' : '?',
           (sb.st_mode & S_IRUSR) ? 'r' : '-',
           (sb.st_mode & S_IWUSR) ? 'w' : '-',
           (sb.st_mode & S_IXUSR) ? 'x' : '-',
           (sb.st_mode & S_IRGRP) ? 'r' : '-',
           (sb.st_mode & S_IWGRP) ? 'w' : '-',
           (sb.st_mode & S_IXGRP) ? 'x' : '-',
           (sb.st_mode & S_IROTH) ? 'r' : '-',
           (sb.st_mode & S_IWOTH) ? 'w' : '-',
           (sb.st_mode & S_IXOTH) ? 'x' : '-');

    /* Owner and group (uid_t/gid_t are cast for a portable printf) */
    pw = getpwuid(sb.st_uid);
    printf("Owner: %s (UID %lu)\n",
           pw ? pw->pw_name : "unknown", (unsigned long)sb.st_uid);
    gr = getgrgid(sb.st_gid);
    printf("Group: %s (GID %lu)\n",
           gr ? gr->gr_name : "unknown", (unsigned long)sb.st_gid);

    /* Timestamps */
    tm = localtime(&sb.st_mtime);
    strftime(timebuf, sizeof(timebuf), "%Y-%m-%d %H:%M:%S", tm);
    printf("Last modified: %s\n", timebuf);
    tm = localtime(&sb.st_atime);
    strftime(timebuf, sizeof(timebuf), "%Y-%m-%d %H:%M:%S", tm);
    printf("Last accessed: %s\n", timebuf);
    tm = localtime(&sb.st_ctime);
    strftime(timebuf, sizeof(timebuf), "%Y-%m-%d %H:%M:%S", tm);
    printf("I-node changed: %s\n", timebuf);
}

int main(int argc, char *argv[]) {
    const char *path = (argc > 1) ? argv[1] : "/etc/passwd";
    print_inode_info(path);
    return 0;
}
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    struct stat sb1, sb2;

    /* Create a file and a hard link to it */
    FILE *f = fopen("/tmp/original.txt", "w");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    fprintf(f, "Hello\n");
    fclose(f);

    /* Remove any link left over from an earlier run;
       otherwise link() would fail with EEXIST */
    unlink("/tmp/hardlink.txt");

    /* link() creates a hard link (another name for the same i-node) */
    if (link("/tmp/original.txt", "/tmp/hardlink.txt") == -1) {
        perror("link");
        return 1;
    }

    /* Compare i-nodes of both names */
    if (stat("/tmp/original.txt", &sb1) == -1 ||
        stat("/tmp/hardlink.txt", &sb2) == -1) {
        perror("stat");
        return 1;
    }
    printf("original i-number: %lu\n", (unsigned long)sb1.st_ino);
    printf("hardlink i-number: %lu\n", (unsigned long)sb2.st_ino);
    if (sb1.st_ino == sb2.st_ino)
        printf("SAME i-node! Both names point to the same file.\n");
    printf("Link count on original: %lu\n",
           (unsigned long)sb1.st_nlink);
    /* Link count is now 2 */

    /* Deleting the original name doesn't remove the data;
       the i-node link count just decrements to 1 */
    unlink("/tmp/original.txt");
    if (stat("/tmp/hardlink.txt", &sb2) == -1) {
        perror("stat");
        return 1;
    }
    printf("After deleting original, link count: %lu\n",
           (unsigned long)sb2.st_nlink);

    /* File data is still accessible via hardlink.txt */
    unlink("/tmp/hardlink.txt"); /* Now count=0, data deleted */
    return 0;
}
/* Output:
original i-number: 1234567
hardlink i-number: 1234567
SAME i-node! Both names point to the same file.
Link count on original: 2
After deleting original, link count: 1
*/
If you seek past the end of a file and write, the gap is a “hole” — no disk blocks allocated. The i-node stores 0 in those pointer slots.
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(void) {
    int fd;
    struct stat sb;
    char buf = 0;
    const char *path = "/tmp/sparse_file.bin";

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); return 1; }

    /* Write 1 byte at position 0 */
    if (write(fd, "A", 1) != 1) { perror("write"); return 1; }

    /* Jump 100 MB forward and write 1 byte */
    if (lseek(fd, (off_t)100 * 1024 * 1024, SEEK_SET) == (off_t)-1 ||
        write(fd, "Z", 1) != 1) {
        perror("lseek/write");
        return 1;
    }
    close(fd);

    /* Check how much disk space is actually used */
    if (stat(path, &sb) == -1) { perror("stat"); return 1; }
    printf("File size (st_size): %lld bytes (~100 MB)\n",
           (long long)sb.st_size);
    printf("Blocks allocated: %lld × 512 = %lld bytes\n",
           (long long)sb.st_blocks,
           (long long)sb.st_blocks * 512);
    /* Only the few blocks that were written are allocated;
       the hole takes no space */

    /* Read from the hole area: the kernel returns null bytes
       without reading any data blocks */
    fd = open(path, O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }
    if (lseek(fd, (off_t)50 * 1024 * 1024, SEEK_SET) == (off_t)-1 ||
        read(fd, &buf, 1) != 1) {
        perror("lseek/read");
        return 1;
    }
    printf("Byte at middle of hole: 0x%02x (always 0)\n",
           (unsigned char)buf);
    close(fd);
    unlink(path);
    return 0;
}
/* Example output (the block count depends on the filesystem and its block size):
File size (st_size): 104857601 bytes (~100 MB)
Blocks allocated: 16 × 512 = 8192 bytes
Byte at middle of hole: 0x00 (always 0)
*/
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

/* fstat() works on an open fd. It is useful when you don't have
   the filename (e.g., anonymous file, file you've already opened) */
int main(void) {
    int fd;
    struct stat sb;

    /* Use stdin (fd 0): we don't have a filename for it */
    fd = STDIN_FILENO; /* = 0 */
    if (fstat(fd, &sb) == -1) {
        perror("fstat");
        return 1;
    }
    printf("stdin i-node: %lu\n", (unsigned long)sb.st_ino);
    printf("stdin is: ");
    if (S_ISCHR(sb.st_mode)) printf("a terminal device\n");
    else if (S_ISREG(sb.st_mode)) printf("a regular file\n");
    else if (S_ISFIFO(sb.st_mode)) printf("a pipe\n");
    else printf("something else\n");

    /* Practical: check if two fds refer to the same file */
    int fd1 = open("/etc/hosts", O_RDONLY);
    int fd2 = open("/etc/hosts", O_RDONLY);
    struct stat s1, s2;
    if (fd1 == -1 || fd2 == -1) {
        perror("open");
        return 1;
    }
    if (fstat(fd1, &s1) == -1 || fstat(fd2, &s2) == -1) {
        perror("fstat");
        return 1;
    }
    /* Same file = same device + same inode number */
    if (s1.st_ino == s2.st_ino && s1.st_dev == s2.st_dev)
        printf("fd1 and fd2 point to the same file\n");
    close(fd1);
    close(fd2);
    return 0;
}
🎯 Interview Questions — I-nodes
Q1. What is an i-node and what information does it store?
An i-node (index node) stores all metadata about a file: type, permissions, UID, GID, three timestamps (atime/mtime/ctime), hard link count, file size, allocated blocks count, and pointers to data blocks. It does NOT store the filename — that’s in the directory.
Q2. Why is the filename not stored in the i-node?
Decoupling name from data enables hard links: multiple directory entries (names) can point to the same i-node. The file’s data is reference-counted via st_nlink. When the link count drops to 0 and no process has the file open, the data blocks are freed.
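The "no process has the file open" condition can be observed directly. The sketch below is a hypothetical demo (function name and the /tmp scratch path are assumptions): it unlinks a file while holding it open and then reads the data back through the descriptor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: create a temp file, unlink it while it is still open,
   and check that its data can still be read through the descriptor.
   On success stores the post-unlink link count in *nlink_out and returns 1;
   returns 0 on any failure. */
int readable_after_unlink(long *nlink_out) {
    assert(nlink_out != NULL);

    char path[] = "/tmp/inode_demo_XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd == -1)
        return 0;

    int ok = 0;
    struct stat sb;
    char buf[6] = {0};

    if (write(fd, "hello", 5) == 5 &&
        unlink(path) == 0 &&             /* last name removed */
        fstat(fd, &sb) == 0 &&           /* i-node still exists */
        pread(fd, buf, 5, 0) == 5 &&     /* data blocks are still there */
        strcmp(buf, "hello") == 0) {
        *nlink_out = (long)sb.st_nlink;
        ok = 1;
    }

    close(fd);        /* last open reference gone: blocks can now be freed */
    if (!ok)
        unlink(path); /* clean up if we failed before the unlink above */
    return ok;
}
```

On typical local Linux filesystems the reported link count is 0 at the point where the data is still readable, which is the state between "last name removed" and "last descriptor closed".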
Q3. Explain the ext2 pointer system for locating data blocks.
Each i-node has 15 pointers. The first 12 are direct pointers to data blocks (fast, covers ~48KB with 4KB blocks). Pointer 12 is a single indirect — it points to a block of pointers. Pointer 13 is double indirect (two levels of pointer blocks). Pointer 14 is triple indirect (three levels), supporting files up to ~4TB.
Q4. What is a file hole? How does it save disk space?
A file hole is a gap in a file created by seeking past the end and writing. The kernel stores null (0) in the i-node pointer for that region instead of allocating disk blocks. Reading from a hole returns bytes with value 0 without reading any data blocks from disk. This is useful for large sparse files (e.g., VM disk images).
Q5. What is the difference between atime, mtime, and ctime?
atime (access time) updates when the file is read (on Linux, the default relatime mount option updates it only some of the time to reduce writes). mtime (modification time) updates when file content changes. ctime (change time) updates when the i-node itself changes (permissions, ownership, hard link count, content). Note: ctime is not “creation time”; creation time is not part of the traditional i-node and is not exposed in struct stat.
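A short experiment can make the mtime/ctime distinction visible. The sketch below is a hypothetical demo under stated assumptions (a writable /tmp and a sane system clock): it pushes mtime an hour into the past, changes the permissions, and checks which timestamps moved.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>
#include <utime.h>

/* Hypothetical demo: returns 1 if chmod() left mtime untouched while
   ctime reflects the recent metadata change; 0 on failure or mismatch. */
int chmod_leaves_mtime_alone(void) {
    char path[] = "/tmp/ctime_demo_XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    assert(fd != -1 || path[0] == '/');       /* path buffer is always valid */
    if (fd == -1)
        return 0;
    close(fd);

    int ok = 0;
    struct stat before, after;
    struct utimbuf past;

    /* Set atime/mtime one hour back so they stand apart from ctime.
       utime() itself is an i-node change, so ctime becomes "now". */
    past.actime = past.modtime = time(NULL) - 3600;

    if (utime(path, &past) == 0 &&
        stat(path, &before) == 0 &&
        chmod(path, 0644) == 0 &&             /* metadata-only change */
        stat(path, &after) == 0) {
        ok = after.st_mtime == before.st_mtime   /* content time unchanged */
          && after.st_ctime >= before.st_ctime   /* never moves backwards */
          && after.st_ctime > after.st_mtime;    /* ctime recent, mtime old */
    }
    unlink(path);
    return ok;
}
```

The `>=` comparison on ctime is deliberate: both changes can land within the same second, so the demo only checks that ctime did not go backwards and that it is later than the back-dated mtime.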
Q6. What is the difference between stat(), lstat(), and fstat()?
stat() follows symbolic links and returns info about the linked file. lstat() does NOT follow symlinks — it returns info about the symlink itself. fstat() takes an open file descriptor instead of a path, useful when you only have an fd and not the filename.
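The stat()/lstat() difference shows up as soon as the path is a symlink. The sketch below is a hypothetical demo (function name and /tmp paths are assumptions) that creates a regular file, points a symlink at it, and compares what the two calls report for the symlink path.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: make a temp file plus a symlink to it, then ask
   stat() and lstat() about the symlink path.
   Returns 1 if stat() saw a regular file and lstat() saw a symlink. */
int stat_follows_lstat_does_not(void) {
    char target[] = "/tmp/stat_demo_XXXXXX";   /* assumed scratch location */
    char linkpath[64];

    int fd = mkstemp(target);
    if (fd == -1)
        return 0;
    close(fd);

    int len = snprintf(linkpath, sizeof linkpath, "%s.lnk", target);
    assert(len > 0 && (size_t)len < sizeof linkpath);

    int ok = 0;
    struct stat via_stat, via_lstat;
    if (symlink(target, linkpath) == 0) {
        if (stat(linkpath, &via_stat) == 0 &&      /* follows the link */
            lstat(linkpath, &via_lstat) == 0)      /* looks at the link itself */
            ok = S_ISREG(via_stat.st_mode) && S_ISLNK(via_lstat.st_mode);
        unlink(linkpath);
    }
    unlink(target);
    return ok;
}
```

In the same situation the two structs also carry different i-numbers, because the symlink has its own i-node separate from the file it points to.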
Q7. How do you check if two file descriptors point to the same underlying file?
Call fstat() on both. If st_dev (device ID) and st_ino (i-node number) are both equal, they refer to the same file. Two different paths can point to the same i-node if one is a hard link of the other.
