41.3 — Shared Libraries: Why They Exist
Advantages, Naming Conventions & Versioning | EmbeddedPathashala

Key Terms

Shared Library (.so), soname, Real Name, Linker Name, Dynamic Linker, Run-time Loading, Shared Memory Pages, ABI Compatibility

Why Do We Need Shared Libraries?

Consider a typical Linux system running 100 programs. Almost all of them use C standard library functions such as printf(), malloc(), and strlen(). With static linking, every one of those 100 programs would carry its own private copy of the printf code embedded in its binary. That is an enormous waste of disk space and RAM.

Shared libraries solve this by putting the library code in a separate file (a .so file, short for “shared object”). All programs that need it share a single copy on disk and in memory. When a program starts, the dynamic linker asks the operating system to map the library into the process’s virtual address space. If 100 programs use the same .so, the kernel backs the library’s read-only code pages with the same physical RAM pages in all 100 processes, so the code is held in memory once rather than 100 times.

An additional huge benefit: if a security bug is found in a shared library (like OpenSSL), you just replace the .so file once. All programs automatically use the fixed version next time they run — without relinking or recompiling anything.

Static vs Shared: Memory Layout

Process Memory Comparison

Static Libraries
prog_a: code + libc copy + libm copy
prog_b: code + libc copy + libm copy
prog_c: code + libc copy + libm copy
3 copies of libc in RAM 😱

Shared Libraries
prog_a code
prog_b code
prog_c code
libc.so (ONE shared copy)
↑ All 3 programs point here ✅

Shared Library Naming: Three Names, One Library

Every shared library on Linux has three different names, each serving a different purpose. Understanding this is essential for working with shared libraries.

① Real Name — the actual file on disk

Format: lib<name>.so.<major>.<minor>.<patch>

Example: libembedutils.so.1.2.3

This is the actual file that contains the library code. The version numbers track changes: major = incompatible API break, minor = backward-compatible new features, patch = bug fixes.
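
To make the real-name format concrete, here is a minimal sketch in C that splits a name such as libembedutils.so.1.2.3 into its parts. The struct lib_version type and the parse_real_name() function are hypothetical helpers written for this illustration; they are not part of any system library, and the parser is deliberately simple (for example, it does not reject signed numbers).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: the pieces of a real name such as
 * "libembedutils.so.1.2.3". */
struct lib_version {
    char name[64];          /* e.g. "embedutils" */
    int major, minor, patch;
};

/* Returns 1 if real_name matches lib<name>.so.<major>.<minor>.<patch>,
 * 0 otherwise. On success the parts are stored in *out. */
int parse_real_name(const char *real_name, struct lib_version *out)
{
    const char *so;
    size_t len;
    int consumed = 0;

    assert(real_name != NULL && out != NULL);

    if (strncmp(real_name, "lib", 3) != 0)
        return 0;
    so = strstr(real_name, ".so.");
    if (so == NULL)
        return 0;

    /* Everything between "lib" and ".so." is the library name. */
    len = (size_t)(so - (real_name + 3));
    if (len == 0 || len >= sizeof(out->name))
        return 0;
    memcpy(out->name, real_name + 3, len);
    out->name[len] = '\0';

    /* %n records how many characters the three numbers consumed,
     * so trailing junk after the patch number can be rejected. */
    if (sscanf(so + 4, "%d.%d.%d%n",
               &out->major, &out->minor, &out->patch, &consumed) != 3)
        return 0;
    return so[4 + consumed] == '\0';
}
```

Calling parse_real_name("libembedutils.so.1.2.3", &v) fills v with the name embedutils and the version 1, 2, 3. A soname such as libembedutils.so.1 is rejected because it carries only the major number.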

② soname — the ABI version identifier

Format: lib<name>.so.<major>

Example: libembedutils.so.1

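The soname is recorded inside the .so file itself, in the DT_SONAME entry of the ELF dynamic section (set with -Wl,-soname at link time). By convention it contains only the major version number. When a program is linked against the library, the soname is copied into the program’s list of needed libraries, and the dynamic linker searches for a file with that name at run time. On disk, the file named after the soname is typically a symbolic link pointing to the real name file. When the major version changes, the soname changes, so programs built against v1 cannot accidentally load v2.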

③ Linker Name — used at compile/link time

Format: lib<name>.so

Example: libembedutils.so

This is a symbolic link that the linker finds when you pass -lembedutils at build time. It points to the soname symlink (or directly to the real name file) of the version you want to build against. It is typically installed only by developer packages (e.g., libssl-dev); end-user packages ship just the real file and the soname symlink.
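
The relationship between the three names can be sketched as plain string truncation. The functions soname_from_real() and linker_name_from_real() below are hypothetical helpers for this illustration, and they only follow the naming convention described above. The authoritative soname is whatever was recorded in the library when it was linked, which a build is free to set differently.

```c
#include <assert.h>
#include <string.h>

/* Copy the first len bytes of src into dst (capacity cap).
 * Returns 1 on success, 0 if dst is too small. */
static int copy_prefix(char *dst, size_t cap, const char *src, size_t len)
{
    if (len + 1 > cap)
        return 0;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 1;
}

/* "libfoo.so.1.2.3" -> "libfoo.so.1" (name plus major version). */
int soname_from_real(const char *real, char *out, size_t cap)
{
    const char *so, *end;

    assert(real != NULL && out != NULL);
    so = strstr(real, ".so.");
    if (so == NULL)
        return 0;
    end = so + 4;                       /* first digit of the major number */
    if (*end < '0' || *end > '9')
        return 0;
    while (*end >= '0' && *end <= '9')  /* skip the whole major number */
        end++;
    return copy_prefix(out, cap, real, (size_t)(end - real));
}

/* "libfoo.so.1.2.3" -> "libfoo.so" (no version at all). */
int linker_name_from_real(const char *real, char *out, size_t cap)
{
    const char *so;

    assert(real != NULL && out != NULL);
    so = strstr(real, ".so.");
    if (so == NULL)
        return 0;
    return copy_prefix(out, cap, real, (size_t)(so - real) + 3);
}
```

For libembedutils.so.1.2.3 the two functions produce libembedutils.so.1 and libembedutils.so, which are exactly the two symlink names in the chain.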

Symlink Chain for libembedutils
libembedutils.so (linker name, used by the linker at compile time)
  symlink → libembedutils.so.1 (soname, used at run time)
  symlink → libembedutils.so.1.2.3 (real name, the actual file on disk)

## View the symlink chain for a real system library (OpenSSL)
$ ls -la /usr/lib/x86_64-linux-gnu/libssl*

## Output (illustrative; exact file names and versions vary by distribution):
## lrwxrwxrwx  libssl.so -> libssl.so.3      (linker name → soname)
## lrwxrwxrwx  libssl.so.3 -> libssl.so.3.0.2 (soname → real name)
## -rw-r--r--  libssl.so.3.0.2               (real name: actual file)

## The ldd command shows which shared libraries a program needs
$ ldd /usr/bin/curl
## Output (example):
## linux-vdso.so.1 (0x00007ffd...)
## libssl.so.3 => /usr/lib/x86_64-linux-gnu/libssl.so.3 (0x...)
## libcrypto.so.3 => /usr/lib/x86_64-linux-gnu/libcrypto.so.3 (0x...)
## libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x...)

## Each entry shows: soname => actual file path (load address)
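
Since each ldd entry has the shape soname => path (address), it is easy to pick apart in code. The sketch below is a hypothetical parser written for this illustration (struct ldd_entry and parse_ldd_line() are made-up names). It assumes paths without spaces and skips entries that have no "=>" part, such as linux-vdso.so.1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one "soname => path (address)" line. */
struct ldd_entry {
    char soname[128];
    char path[256];
};

/* Returns 1 and fills *e if the line maps a soname to an absolute
 * path; returns 0 for lines such as "linux-vdso.so.1 (0x...)" or
 * "libfoo.so.1 => not found". */
int parse_ldd_line(const char *line, struct ldd_entry *e)
{
    assert(line != NULL && e != NULL);

    /* Leading whitespace is skipped; "=>" must separate the fields. */
    if (sscanf(line, " %127s => %255s", e->soname, e->path) != 2)
        return 0;
    if (strstr(e->soname, ".so") == NULL)
        return 0;
    return e->path[0] == '/';
}
```

Feeding it the libssl.so.3 line from the output above yields the soname libssl.so.3 and the resolved path, while the vdso line is ignored.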

Coding Example 1 — Observing Shared Library Sharing

This example demonstrates how two programs share a single .so in memory.

/* prog_a.c - first program that uses libc */
#include <stdio.h>
#include <stdlib.h>   /* system() */
#include <unistd.h>   /* getpid() */

int main(void) {
    printf("prog_a PID = %d\n", (int)getpid());
    printf("printf() is from libc.so!\n");

    /* Check our own memory maps */
    char maps_cmd[64];
    snprintf(maps_cmd, sizeof(maps_cmd),
             "grep libc /proc/%d/maps", (int)getpid());
    printf("\n--- Memory map showing libc ---\n");
    fflush(stdout);  /* flush our output before the child process prints */
    system(maps_cmd);

    printf("\nPress Enter to exit...\n");
    getchar();  /* pause so we can inspect with pmap */
    return 0;
}

/* prog_b.c - second program that also uses libc */
#include <stdio.h>
#include <stdlib.h>   /* malloc(), free(), system() */
#include <unistd.h>   /* getpid() */

int main(void) {
    printf("prog_b PID = %d\n", (int)getpid());

    /* malloc is implemented in libc */
    void *p = malloc(1024);
    printf("malloc returned %p (allocated by libc's malloc)\n", p);
    free(p);

    /* Show libc in memory maps */
    char maps_cmd[64];
    snprintf(maps_cmd, sizeof(maps_cmd),
             "grep libc /proc/%d/maps", (int)getpid());
    printf("\n--- Memory map showing libc ---\n");
    fflush(stdout);  /* flush our output before the child process prints */
    system(maps_cmd);

    printf("\nPress Enter to exit...\n");
    getchar();
    return 0;
}

## Compile both programs
$ gcc -g -o prog_a prog_a.c
$ gcc -g -o prog_b prog_b.c

## Run them in two terminals simultaneously
## Terminal 1:
$ ./prog_a

## Terminal 2:
$ ./prog_b

## Both /proc/PID/maps will show libc at DIFFERENT virtual addresses
## but the same physical pages are shared in RAM by the kernel.

## Use pmap to see library mappings:
$ pmap $(pgrep prog_a)
$ pmap $(pgrep prog_b)

## Compare the file sizes of the dynamic and static builds:
$ gcc -static -o prog_a_static prog_a.c   # static link
$ ls -lh prog_a prog_a_static
## prog_a: ~16KB  (just code + reference to libc)
## prog_a_static: ~800KB+ (libc code embedded!)
## (approximate; exact sizes depend on the toolchain and libc version)
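
The /proc/PID/maps lines that both programs print can also be read programmatically. The following is a minimal sketch, assuming the usual "start-end perms offset dev inode path" layout of a maps line and a path without spaces. struct map_entry, parse_maps_line(), and is_shared_code() are hypothetical names for this illustration, and the addresses in the usage note are invented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one line of /proc/PID/maps. */
struct map_entry {
    unsigned long start, end;   /* virtual address range */
    char perms[5];              /* e.g. "r-xp" */
    char path[256];             /* empty for anonymous mappings */
};

/* Returns 1 if the line could be parsed, 0 otherwise. */
int parse_maps_line(const char *line, struct map_entry *m)
{
    int n;

    assert(line != NULL && m != NULL);
    m->path[0] = '\0';
    /* offset, device, and inode are skipped with %*s */
    n = sscanf(line, "%lx-%lx %4s %*s %*s %*s %255s",
               &m->start, &m->end, m->perms, m->path);
    return n >= 3;
}

/* Read-only, executable, file-backed mappings hold library code.
 * These are the pages that stay shared between processes. */
int is_shared_code(const struct map_entry *m)
{
    assert(m != NULL);
    return strncmp(m->perms, "r-x", 3) == 0 && m->path[0] == '/';
}
```

For a line such as "7f2a4000-7f2a8000 r-xp 00028000 08:01 1234 /usr/lib/x86_64-linux-gnu/libc.so.6", is_shared_code() returns 1. For a writable anonymous line ending in "rw-p 00000000 00:00 0" it returns 0, because those pages are private to the process.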

Coding Example 2 — The Value of Runtime Updates

This example simulates why shared libraries make bug fixes easy. We build a library, a program that uses it, fix a bug in the library, and show the program picks it up without relinking.

/* version_check.h */
#ifndef VERSION_CHECK_H
#define VERSION_CHECK_H

const char* get_library_version(void);
int safe_divide(int a, int b);

#endif

/* version_check_v1.c - Version 1.0 with a bug: no divide-by-zero check */
#include "version_check.h"
#include <stdio.h>

const char* get_library_version(void) {
    return "1.0.0 (BUGGY - no divide-by-zero check)";
}

int safe_divide(int a, int b) {
    /* BUG: no check for b == 0 ! */
    return a / b;  /* Undefined behavior if b == 0; on x86 Linux this typically dies with SIGFPE */
}

/* use_library.c - program that calls the library */
#include <stdio.h>
#include "version_check.h"

int main(void) {
    printf("Library version: %s\n", get_library_version());

    printf("10 / 2 = %d\n", safe_divide(10, 2));

    /* This would crash with v1.0 but should work after the fix */
    printf("Trying 10 / 0 ...\n");
    int result = safe_divide(10, 0);
    printf("Result: %d\n", result);
    return 0;
}

## Step 1: Build version 1.0 of the shared library (buggy)
$ gcc -g -fPIC -c version_check_v1.c -o version_check.o
$ gcc -g -shared -Wl,-soname,libvc.so.1 \
    -o libvc.so.1.0.0 version_check.o

## Create symlinks
$ ln -sf libvc.so.1.0.0 libvc.so.1    # soname symlink
$ ln -sf libvc.so.1     libvc.so       # linker symlink

## Step 2: Compile and link the main program
$ gcc -g -o use_library use_library.c -L. -lvc

## Step 3: Run it (will crash on safe_divide(10, 0))
$ LD_LIBRARY_PATH=. ./use_library

## ---- Now we "fix" the library ----

## version_check_v2.c with the fix:
$ cat > version_check_v2.c << 'EOF'
#include "version_check.h"
#include <stdio.h>

const char* get_library_version(void) {
    return "1.0.1 (FIXED - divide-by-zero is handled)";
}

int safe_divide(int a, int b) {
    if (b == 0) {
        fprintf(stderr, "Error: division by zero!\n");
        return -1;  /* Safe return value */
    }
    return a / b;
}
EOF

## Step 4: Rebuild the library with the fix (new patch version)
$ gcc -g -fPIC -c version_check_v2.c -o version_check.o
$ gcc -g -shared -Wl,-soname,libvc.so.1 \
    -o libvc.so.1.0.1 version_check.o

## Update the soname symlink to point to new version
$ ln -sf libvc.so.1.0.1 libvc.so.1

## Step 5: Run the SAME executable again — no recompile!
$ LD_LIBRARY_PATH=. ./use_library
## Library version: 1.0.1 (FIXED - divide-by-zero is handled)
## 10 / 2 = 5
## Trying 10 / 0 ...
## Error: division by zero!
## Result: -1

## The program was NOT recompiled — it just picked up the new .so!

💡 Real World Impact:
This is how Linux distributions handle security patches. When a vulnerability is found in libssl.so or libz.so, the distro ships a new .so file via the package manager (e.g., apt upgrade). Every program that uses the library picks up the fix the next time it starts, with no recompiling or relinking; processes that are already running keep the old copy mapped until they are restarted. With static linking, every affected application would need to be relinked and redistributed.

Interview Questions — Section 41.3

Q1. What are the three names associated with a shared library? What is each used for?
Real name (e.g., libfoo.so.1.2.3): The actual file on disk, named with full version info.

soname (e.g., libfoo.so.1): A name recorded inside the library (the DT_SONAME entry of the ELF dynamic section), usually matched by a symlink of the same name on disk. Contains only the major version. Used by the dynamic linker at run time to find the right version.

Linker name (e.g., libfoo.so): A symlink used only at compile/link time when you specify -lfoo. Points to the current soname symlink.

Q2. Why does the soname only contain the major version number?
The soname follows the rule that a major version change means ABI incompatibility — the library’s interface has changed in a way that would break existing programs. Minor and patch changes are backward-compatible (new functions added, bugs fixed) so existing programs can still use the newer library. By only including the major version in the soname, programs compiled against libfoo.so.1 will automatically use any patch/minor update (1.0.1, 1.2.0, etc.) but will never accidentally be loaded with the incompatible version 2.x.
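
The compatibility rule can be sketched as a selection over installed real names: anything whose name starts with the required soname is acceptable, and the newest minor/patch wins. pick_compatible() is a hypothetical function for this illustration. It is not how the dynamic linker is implemented; at run time the linker simply opens the file named by the soname, and the symlink is expected to point at the newest compatible file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns the index of the newest installed library that is compatible
 * with `soname`, or -1 if none is. Compatible means: same name and same
 * major version, i.e. the real name is "<soname>.<minor>.<patch>". */
int pick_compatible(const char *soname,
                    const char *const installed[], size_t count)
{
    size_t slen;
    int best = -1, best_minor = -1, best_patch = -1;

    assert(soname != NULL && installed != NULL);
    slen = strlen(soname);

    for (size_t i = 0; i < count; i++) {
        int minor = 0, patch = 0;

        if (strncmp(installed[i], soname, slen) != 0)
            continue;   /* different library or different major version */
        /* The soname must be followed by ".<minor>.<patch>". This also
         * rejects "libfoo.so.10.0.0" when the soname is "libfoo.so.1". */
        if (sscanf(installed[i] + slen, ".%d.%d", &minor, &patch) != 2)
            continue;
        if (minor > best_minor ||
            (minor == best_minor && patch > best_patch)) {
            best = (int)i;
            best_minor = minor;
            best_patch = patch;
        }
    }
    return best;
}
```

Given libfoo.so.1.0.1, libfoo.so.2.0.0, and libfoo.so.1.2.0, a program that needs libfoo.so.1 gets 1.2.0, and never 2.0.0.
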
Q3. How does shared library memory sharing actually work in the kernel?
When the dynamic linker loads a shared library, it uses mmap() to map the library file into the process’s virtual address space. The mappings are file-backed, so the kernel serves them from the page cache, and every process that maps the same file gets page-table entries pointing at the same physical RAM pages. The code (text) segment is mapped read-only and executable, so it is never written and its pages stay shared by all processes using the library. The data segment is mapped privately with copy-on-write (COW): its pages are also shared at first, but a process gets its own private copy of a page the first time it writes to it. BSS (zero-initialized globals) is private anonymous memory in each process.
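
The private, copy-on-write behaviour can be observed with a small experiment: map a file with MAP_PRIVATE, write through the mapping, and check that the file itself is unchanged. This is a minimal sketch under stated assumptions: it uses a temporary file in /tmp rather than a real library, and private_write_stays_private() is a hypothetical function name. It illustrates the kind of mapping used for library data, not the dynamic linker's own code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a write through a MAP_PRIVATE mapping left the file
 * untouched, 0 if the file changed, -1 if the setup failed. */
int private_write_stays_private(void)
{
    char path[] = "/tmp/cow_demo_XXXXXX";
    char on_disk = 0;
    char *view;
    int result = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                 /* the file lives until fd is closed */
    if (write(fd, "A", 1) != 1)
        goto out;

    view = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (view == MAP_FAILED)
        goto out;

    view[0] = 'B';                /* first write: kernel copies the page */

    /* Read the byte back from the file, bypassing the mapping. */
    if (pread(fd, &on_disk, 1, 0) == 1)
        result = (view[0] == 'B' && on_disk == 'A');

    munmap(view, 1);
out:
    close(fd);
    return result;
}
```

The mapping shows B while the file still contains A: the write went to a private copy of the page. The same mechanism lets every process modify a library's global variables without affecting any other process.
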
Q4. What is the difference between a soname and an ABI?
An ABI (Application Binary Interface) defines the binary-level contract of a library: function call conventions, data structure layouts, symbol names, parameter types. When a library change breaks existing binaries (e.g., changes a struct size, removes a function, changes a function signature), it’s an ABI break.

The soname is a naming mechanism that encodes which ABI version a library implements. When a library’s ABI breaks, the major version in the soname is incremented, giving the new library a different soname. Programs compiled against the old ABI continue using the old soname file, while new programs use the new one.
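
A struct layout change is the classic ABI break, and it is easy to see in code. In the sketch below, struct sensor_v1 and struct sensor_v2 are hypothetical versions of a public struct from an imaginary library. A program compiled against v1 has the old size and field offsets baked into its machine code, so it would read the wrong bytes if it were handed a v2 object.

```c
#include <stddef.h>

/* Hypothetical public struct as shipped in version 1 of a library. */
struct sensor_v1 {
    int id;
    int value;
};

/* Hypothetical version 2: a field was inserted in the middle. */
struct sensor_v2 {
    int id;
    long timestamp;
    int value;
};

/* Returns 1 if binaries built against v1 would disagree with v2 about
 * the struct's size or about where `value` lives. */
int layout_changed(void)
{
    return sizeof(struct sensor_v1) != sizeof(struct sensor_v2)
        || offsetof(struct sensor_v1, value)
               != offsetof(struct sensor_v2, value);
}
```

Here layout_changed() returns 1, so a library that made this change should bump its major version and therefore its soname. Adding a new function, by contrast, leaves every existing layout and symbol intact and only needs a minor version bump.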
