Interview Questions Fundamentals of Shared Libraries

 

Chapter 41: Fundamentals of Shared Libraries | EmbeddedPathashala

Section A: Object Libraries & Static Libraries (Fresher)

Q1. What is the difference between compiling and linking?
Compiling converts a single C source file (.c) into an object file (.o) containing machine code. It only processes one file at a time and does not resolve references to functions in other files. The compiler flags (-Wall, -g, -O2) apply at this stage.

Linking takes all the object files and library files and combines them into a final executable. The linker resolves all unresolved symbol references — for example, if file A calls a function defined in file B, the linker finds its address and patches it in. Use -c to compile-only; omit -c to compile+link.

Q2. What does the nm command tell you about an object file?
nm lists the symbols (functions and global variables) in an object file or executable. Symbol types include:

  • T: Symbol defined in the text (code) section of this file
  • U: Undefined; referenced here but defined elsewhere (resolved at link time)
  • D: Defined in the initialized data section
  • B: Defined in the BSS section (uninitialized or zero-initialized globals)
  • Lowercase letters (t, d, b) mark local symbols, such as static functions and variables
$ nm myobj.o | grep "printf\|main"
## T main      (main is defined here)
## U printf    (printf is undefined — comes from libc)
Q3. What are the naming conventions for static and shared library files on Linux?
Static library: lib<name>.a
Examples: libmath.a, libdemo.a, libc.a

Shared library (real name): lib<name>.so.<major>.<minor>.<patch>
Examples: libssl.so.3.0.2, libembedutils.so.1.2.0

The lib prefix and the file extension are implied when you use -l<name> with the linker. So -lssl looks for libssl.so first and falls back to libssl.a.

Q4. Write the commands to build a static library from a.c and b.c, and link it into myprog.
## Step 1: Compile source files to object files
$ gcc -g -Wall -c a.c b.c

## Step 2: Create the static library
$ ar rv libmyutils.a a.o b.o

## Step 3: Compile the main program
$ gcc -g -c main.c

## Step 4: Link against the static library
$ gcc -g -o myprog main.o -L. -lmyutils
## OR:
$ gcc -g -o myprog main.o libmyutils.a

## Verify contents of the library:
$ ar tv libmyutils.a
Q5. Why should the -l flag come AFTER object files in a gcc link command?
The GNU linker processes its inputs left to right. When it reaches a static library named by -l, it pulls in only the archive members that define symbols still unresolved at that point. If no object files have been processed yet, nothing is unresolved, so the linker takes nothing from the library and later references to its functions fail. To fix this, always place each -l flag after the .o files (and libraries) that use it:

## CORRECT: .o files first, then -l flags
$ gcc -o myprog main.o helper.o -L. -lfoo

## WRONG: library before object files
$ gcc -o myprog -lfoo main.o helper.o  # "undefined reference" errors!
Q6. What is the ar tv command and what does its output mean?
ar tv libname.a lists the contents of the archive in verbose mode. The output format is:

rw-r--r-- 1000/1000  2456  Jun 5 10:00 2025  gpio.o

Fields from left to right:

  • File permissions of the object when it was archived
  • Numeric user ID / group ID
  • Size in bytes
  • Date and time last modified
  • Name of the object file

Section B: Shared Library Concepts (Mid-Level)

Q7. Explain the three advantages of shared libraries over static libraries.
  1. Disk space saving: The library code exists once on disk. With static libraries, every executable contains its own copy of the code, multiplying disk usage by the number of programs.
  2. RAM saving: The OS maps the read-only code pages of a shared library into all processes that use it. The same physical RAM pages appear in multiple virtual address spaces simultaneously — no duplication.
  3. Runtime updates: If a bug or vulnerability is fixed in a shared library, you replace the .so file and every program picks up the fix the next time it starts, provided the new version is ABI-compatible. With static libraries, every program must be relinked against the fixed library.
Q8. What is Position-Independent Code (PIC)? Why is it required for shared libraries?
PIC is machine code generated with -fPIC that contains no absolute addresses needing a patch at load time. Instead, it reaches global data through the Global Offset Table (GOT) and calls external functions through the Procedure Linkage Table (PLT). The code locates both tables with addresses relative to the instruction pointer (RIP on x86-64).

It is required because a shared library can be loaded at any virtual address (not known at compile time). With absolute addresses, the code could only work at one specific address. With PIC, the same physical code pages can be mapped into different virtual addresses in different processes, enabling true memory sharing.

Q9. What is a soname? How is it embedded into a shared library?
A soname (shared object name) is a string stored in the ELF dynamic section of a shared library. It contains the library name with only the major version number (e.g., libfoo.so.1). When programs are linked against the library, they record this soname as their dependency (not the full filename). At runtime, the dynamic linker searches for a file with this name, which is normally a symlink pointing to the real file (e.g., libfoo.so.1 -> libfoo.so.1.2.3).

It is embedded at link time using:

$ gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.2.3 *.o

Verify with: readelf -d libfoo.so.1.2.3 | grep SONAME

Q10. Explain lazy binding in the context of shared libraries.
Lazy binding (also called lazy symbol resolution) means that external function symbols in a shared library are resolved only when the function is called for the first time, not when the program starts. This is implemented via the PLT (Procedure Linkage Table).

First call: PLT stub → dynamic linker → resolves address → stores in GOT → calls function.
Subsequent calls: PLT stub → GOT (already resolved) → calls function directly.

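This speeds up program startup because symbols for functions that are never called are never resolved. You can disable lazy binding with LD_BIND_NOW=1 at runtime or -Wl,-z,now at link time, which forces all symbols to be resolved at startup. Combined with -Wl,-z,relro (together known as full RELRO), this lets the dynamic linker make the GOT read-only after startup, which blocks attacks that overwrite GOT entries to hijack function calls.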

Q11. What happens if you try to link non-PIC object files into a shared library on x86-32?
On 32-bit x86, the link usually succeeds. Non-PIC code uses absolute addresses in the .text section, so the resulting library contains “text relocations”: places in the code that the dynamic linker must patch for each load address. The library works, but the patched pages cannot be shared between processes. Newer toolchains may print a warning such as "creating DT_TEXTREL in a shared object".

On x86-64, the linker normally refuses with an error like:

relocation R_X86_64_32 against `global_var' can not be used when
making a shared object; recompile with -fPIC

In both cases the fix is to recompile the objects with -fPIC.

Q12. What is the complete sequence of steps to create a properly-versioned shared library?
## 1. Compile with -fPIC
$ gcc -g -fPIC -c module.c

## 2. Link with -shared and embed soname
$ gcc -shared -Wl,-soname,libfoo.so.1 \
    -o libfoo.so.1.0.0 module.o

## 3. Create soname symlink (the dynamic linker uses this at runtime)
$ ln -sf libfoo.so.1.0.0 libfoo.so.1

## 4. Create linker name symlink (used by -lfoo at compile time)
$ ln -sf libfoo.so.1 libfoo.so

## 5. Install and update cache (for system-wide use)
$ sudo cp libfoo.so.1.0.0 /usr/local/lib/
$ sudo ln -sf libfoo.so.1.0.0 /usr/local/lib/libfoo.so.1
$ sudo ldconfig

Section C: Advanced & Senior Level (Senior)

Q13. What is the GOT and PLT? How do they work together for a printf() call in PIC code?
For a printf() call in a PIC shared library:

  1. The compiled code does a relative CALL to printf@PLT — the PLT entry for printf.
  2. The PLT entry contains a JMP through the GOT entry for printf.
  3. Initially, the GOT entry points back into the PLT resolver stub (lazy binding).
  4. First call: the PLT stub jumps to the dynamic linker’s resolver, passing a relocation index for the symbol and an identifier for the calling object.
  5. The dynamic linker finds printf’s address in libc and writes it into the GOT entry.
  6. Subsequent calls: PLT entry → GOT → printf directly (no resolver overhead).

This two-table design allows code pages (PLT) to be read-only and shared, while data pages (GOT) are private per-process and writable.

Q14. What is RPATH vs RUNPATH? What are their differences?
Both are paths embedded in the ELF binary’s dynamic section specifying where the dynamic linker should search for libraries. The difference is in search priority:

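RPATH (DT_RPATH): Searched before LD_LIBRARY_PATH. The older mechanism. Cannot be overridden by LD_LIBRARY_PATH, and is ignored if the binary also has a RUNPATH.

RUNPATH (DT_RUNPATH): Searched after LD_LIBRARY_PATH. The newer, preferred mechanism. Can be overridden by the user. It applies only to the binary’s direct (DT_NEEDED) dependencies, not to the dependencies of those libraries.

Set RPATH: -Wl,--disable-new-dtags,-rpath,/path (plain -Wl,-rpath,/path yields RPATH only where the linker defaults to the old tags)
Set RUNPATH: -Wl,--enable-new-dtags,-rpath,/path (the default on many modern distributions)
Use $ORIGIN for the directory of the binary itself; quote it so the shell does not expand it.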

Q15. When should you increment the major version of a shared library soname?
Increment the major version (and thus change the soname from e.g. libfoo.so.1 to libfoo.so.2) whenever you make a change that breaks ABI compatibility:

  • Removing a function from the public API
  • Changing a function’s signature (parameters, return type)
  • Changing the size or layout of a struct that programs use by value
  • Changing the meaning/behavior of a function parameter in an incompatible way
  • Removing or renaming a global variable in the public API

Minor version increments (libfoo.so.1.1.0) are for backward-compatible additions. Patch version increments (libfoo.so.1.0.1) are for bug fixes with no API/ABI change.

Q16. What is a text relocation and why is it problematic for shared libraries?
A text relocation occurs when code in a shared library’s text (code) segment contains absolute addresses that must be patched at load time for the specific address where the library is loaded. This is problematic because:

  • The text segment must be made writable to apply the patches, then made executable again — a security risk (W^X violation).
  • Since each process needs different addresses patched, the text pages cannot be shared between processes — defeating the main purpose of shared libraries.
  • On hardened systems (for example SELinux or PaX), loading a library that has text relocations may be denied.

Text relocations are avoided by always compiling shared library code with -fPIC.

Q17. How do you control which symbols are exported from a shared library?
By default, all non-static functions and globals are exported. To control exports:

  1. Use static: Mark internal-only functions as static. They get internal linkage, so they never appear in the dynamic symbol table and cannot be referenced from other files.
  2. GCC visibility attribute:
    __attribute__((visibility("default"))) void public_func(void);  /* exported */
    __attribute__((visibility("hidden")))  void internal_func(void); /* NOT exported */
    
  3. Compile with -fvisibility=hidden to hide all symbols by default, then mark public ones with visibility("default").
  4. Version script: Use -Wl,--version-script=mylib.map with a .map file listing exactly which symbols to export.

Limiting exported symbols reduces the attack surface, speeds up symbol lookup, and prevents accidental use of internal APIs.

Q18. How do __attribute__((constructor)) and __attribute__((destructor)) work in shared libraries?
These GCC attributes register functions to run automatically:

  • __attribute__((constructor)): the function runs when the shared library is loaded, either before main() or during the dlopen() call that loads it
  • __attribute__((destructor)): the function runs when the library is unloaded (when dlclose() drops the last reference, or at normal process exit)

#include <stdio.h>

__attribute__((constructor))
static void my_lib_init(void) {
    printf("Library loaded!\n");
    /* Initialize mutex, allocate resources, etc. */
}

__attribute__((destructor))
static void my_lib_cleanup(void) {
    printf("Library unloaded!\n");
    /* Free resources, close file handles, etc. */
}

These are placed in the .init_array and .fini_array ELF sections and invoked by the dynamic linker during library loading/unloading.

Section D: Practical & Troubleshooting Questions (Mid-Level)

Q19. A program gives “error while loading shared libraries: libfoo.so.1: cannot open shared object file”. What are the possible causes and how do you fix each?
Cause 1: Library not installed
Fix: Install the library package (apt install libfoo-dev or copy .so to /usr/local/lib) and run sudo ldconfig.

Cause 2: Library is in a non-standard directory
Fix: Add the directory to /etc/ld.so.conf.d/ and run sudo ldconfig. Or set LD_LIBRARY_PATH=/path/to/lib for testing.

Cause 3: ldconfig not run after installation
Fix: Run sudo ldconfig. The cache is stale.

Cause 4: Wrong soname symlink
Fix: Check that libfoo.so.1 symlink exists and points to the correct .so.x.y.z file. Use ls -la /path | grep libfoo.

Cause 5: Architecture mismatch
Fix: Check whether you are trying to load a 32-bit .so into a 64-bit program or vice versa. Run file libfoo.so.1 and file ./myprog and compare the results.

Diagnosis tool: Use ldd ./myprog to see which libraries are “not found”, and LD_DEBUG=libs ./myprog to trace the search paths.

Q20. What is the difference between -l and -L flags? Give examples.
-L<dir> adds a directory to the linker’s library search path. Multiple -L options can be given; directories are searched left to right.

-l<name> specifies the library to link. The linker expands this to lib<name>.so (shared, preferred) or lib<name>.a (static).

## Search /opt/mylibs for libembedutils.so, then link with it:
$ gcc -o myapp main.o -L/opt/mylibs -lembedutils

## Multiple libraries:
$ gcc -o myapp main.o -L. -lembedutils -lm -lpthread

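## Force static: -static links everything statically, libc included
$ gcc -o myapp main.o -L. -static -lembedutils

## Static for one library only: name the .a file directly
$ gcc -o myapp main.o ./libembedutils.a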
Q21. How would you build a shared library for an ARM embedded Linux target (cross-compilation)?
The process is identical to x86, but using the cross-compiler toolchain:

## Use the ARM cross-compiler instead of gcc
## (e.g., arm-linux-gnueabihf-gcc for ARM hard-float)
$ CROSS=arm-linux-gnueabihf-gcc
$ STRIP=arm-linux-gnueabihf-strip

## Compile with -fPIC (same flag, works for all architectures)
$ "$CROSS" -g -fPIC -c gpio.c -o gpio.o

## Build shared library
$ "$CROSS" -shared -Wl,-soname,libgpio.so.1 \
    -o libgpio.so.1.0.0 gpio.o

## Strip debug symbols for deployment (saves space on embedded)
$ "$STRIP" --strip-unneeded libgpio.so.1.0.0

## Create symlinks
$ ln -sf libgpio.so.1.0.0 libgpio.so.1
$ ln -sf libgpio.so.1 libgpio.so

## Verify it's ARM ELF:
$ file libgpio.so.1.0.0
## ARM, EABI5, dynamically linked, shared object
Q22. What is dlopen(), dlsym(), and dlclose()? When would you use them?
These functions from <dlfcn.h> provide explicit (dynamic) loading of shared libraries at runtime — different from the implicit loading done by the dynamic linker.

dlopen(path, flags): Load a .so file at runtime. Returns a handle.
dlsym(handle, "sym_name"): Get the address of a function or variable.
dlclose(handle): Unload the library.

#include <dlfcn.h>
#include <stdio.h>

int main(void) {
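    /* Load libm.so at runtime */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    /* Get the address of sin(); clear any stale error first */
    dlerror();
    double (*my_sin)(double) = (double (*)(double))dlsym(handle, "sin");
    const char *err = dlerror();
    if (err) { fprintf(stderr, "%s\n", err); dlclose(handle); return 1; }

    printf("sin(3.14) = %f\n", my_sin(3.14));
    dlclose(handle);
    return 0;
}
/* Compile: gcc -o prog main.c -ldl  (-ldl is unnecessary but harmless on glibc 2.34+) */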

Use case: plugin architectures, lazy loading of optional features, language runtimes.

Q23. What is the difference between -fPIC and -fPIE?
Both generate position-independent code (using RIP-relative addressing, GOT), but they are for different purposes:

-fPIC: For shared libraries. Generates code that can be loaded at any address, so its code pages can be shared between processes. The GOT and other data live in writable pages that are private to each process.

-fPIE: For executables. Generates position-independent code for the main program so that, when it is linked with -pie, ASLR can randomize the base address of the program itself (not just its libraries). Many modern Linux distributions enable this by default for security.

PIC objects can also be linked into an executable, but -fPIE lets the compiler assume that symbols defined in the program cannot be overridden by another module, so it can access the program’s own globals and functions more directly. PIE executables can be slightly larger and slower than non-PIE ones (the cost is more noticeable on 32-bit x86 than on x86-64), but the security benefit is usually worth it.

Q24. What does readelf -d libfoo.so | grep NEEDED show?
This shows the shared libraries that libfoo.so itself depends on. Just like programs can depend on shared libraries, shared libraries can also depend on other shared libraries. These “NEEDED” entries are recorded in the ELF dynamic section.

$ readelf -d /lib/x86_64-linux-gnu/libssl.so.3 | grep NEEDED
 0x0000000000000001 (NEEDED) Shared library: [libcrypto.so.3]
 0x0000000000000001 (NEEDED) Shared library: [libc.so.6]

This tells you that libssl.so depends on libcrypto.so and libc.so. The dynamic linker will automatically load these dependencies as well when libssl is loaded.

Q25. What is LD_PRELOAD? Give a legitimate use case and a security concern.
LD_PRELOAD is an environment variable that lists shared libraries to be loaded before all others when any program starts. Functions in the preloaded library override (interpose on) functions in other libraries, including libc.

Legitimate use case — memory debugging:

## Run program with AddressSanitizer or custom malloc wrapper:
$ LD_PRELOAD=/usr/lib/libasan.so.6 ./myprog

## Intercept malloc to log allocations:
$ LD_PRELOAD=./mymalloc_debug.so ./myprog

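Security concern: An attacker who can set LD_PRELOAD can inject arbitrary code into any program started with that environment, replacing functions like read() or write() with malicious versions. For this reason, the dynamic linker (not the kernel) restricts LD_PRELOAD in secure-execution mode, which applies to setuid/setgid programs and programs with elevated capabilities: entries containing slashes are ignored, and libraries are preloaded only from the standard directories and only if they have the set-user-ID bit set.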
