Run-Time Symbol Resolution
TLPI Chapter 41.12 — Symbol Interposition, Override Rules & -Bsymbolic

Key Concepts
Symbol Resolution, Symbol Interposition, Global Symbol, Symbol Override, -Bsymbolic, PLT / GOT, Weak Symbols, LD_PRELOAD

The Core Problem

What happens when the same function name (a “global symbol”) is defined in multiple places — for example, in both the main executable and in a shared library? Which version actually gets called at run time?

This is the symbol resolution problem. The answer might surprise you: the main program wins — its definition overrides the library’s definition, even when the library is calling a function defined within itself. This behavior is called symbol interposition.

Demonstrating Symbol Interposition

Consider a shared library that defines both xyz() and func(). func() calls xyz(). The main program also defines its own xyz(). Which xyz() does func() call?

Symbol Interposition: Who wins?
main program (prog.c): xyz() prints "main-xyz"; main() calls func()
shared library (libfoo.so): xyz() prints "foo-xyz"; func() calls xyz() (which xyz?)
▶ At run time: func() calls main’s xyz(), so the output is “main-xyz”.
The executable’s global symbol interposes over the library’s definition.
/* foo.c — shared library source */
#include <stdio.h>

/* xyz defined inside the library */
void xyz(void) {
    printf("foo-xyz\n");
}

/* func calls xyz — but which xyz? */
void func(void) {
    xyz();   /* This will call main's xyz, not foo's xyz! */
}
/* prog.c — main program */
#include <stdio.h>

/* xyz also defined here — same name as the library's xyz */
void xyz(void) {
    printf("main-xyz\n");
}

/* func() is declared in libfoo but called here */
void func(void);

int main(void) {
    func();   /* calls libfoo's func(), which calls xyz() */
    return 0;
}
# Build and run
gcc -g -c -fPIC -Wall foo.c
gcc -g -shared -o libfoo.so foo.o
gcc -g -o prog prog.c libfoo.so

LD_LIBRARY_PATH=. ./prog
# Output: main-xyz
# --> main program's xyz() OVERRIDES library's xyz()!

The Two Symbol Resolution Rules

Linux shared libraries follow these rules for global symbol resolution (inherited from early UNIX static library semantics):

Rule 1 — Executable Wins

A global symbol defined in the main program overrides any definition with the same name in a shared library.

Even if the library calls that symbol internally, the main program’s version is used.

Rule 2 — Left-to-Right Between Libraries

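If a global symbol is defined in multiple shared libraries, a reference to it binds to the first definition found when the libraries are searched in the order they were listed, left to right, on the link command line.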

Libraries are scanned left-to-right; first definition found wins.

# Rule 2 demonstration: two libraries both define greet()
# Link order: libfirst.so listed before libsecond.so
gcc -o prog main.c -L. -lfirst -lsecond

# At run time: libfirst.so's greet() is used
# (even if main.c calls greet() and greet() is also in libsecond.so)

# If you want libsecond.so's greet(), reverse the order:
gcc -o prog main.c -L. -lsecond -lfirst
📝 Historical Reason: These semantics were designed so that converting a static-linked application to use shared libraries would require no source changes. The behaviour exactly mirrors what happens with static libraries: symbols are resolved left-to-right, first definition wins.
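The two rules can be pictured as a single ordered search. The sketch below is a toy model only, not how the dynamic linker is implemented: the struct object type, the resolve() function, and the symbol lists (including xyz in libfirst.so and farewell in libsecond.so) are hypothetical names chosen for illustration. The array order stands in for the run-time search order, with the executable first and the libraries in link-command order.

```c
#include <stddef.h>
#include <string.h>

/* Toy model: each "object" is a name plus the global symbols it defines. */
struct object {
    const char *name;
    const char *symbols[4];   /* NULL-terminated list of defined globals */
};

/* Return the name of the first object in search order that defines sym,
 * or NULL if no object defines it. */
const char *resolve(const struct object *objs, size_t count, const char *sym)
{
    for (size_t i = 0; i < count; i++) {
        for (size_t j = 0; j < 4 && objs[i].symbols[j] != NULL; j++) {
            if (strcmp(objs[i].symbols[j], sym) == 0)
                return objs[i].name;
        }
    }
    return NULL;
}

/* Assumed search order for: gcc -o prog main.c -L. -lfirst -lsecond */
const struct object search_order[] = {
    { "prog",         { "main", "xyz", NULL } },        /* Rule 1: searched first */
    { "libfirst.so",  { "greet", "xyz", NULL } },       /* Rule 2: link order */
    { "libsecond.so", { "greet", "farewell", NULL } },
};
```

In this model, resolve(search_order, 3, "xyz") yields "prog" (Rule 1) and resolve(search_order, 3, "greet") yields "libfirst.so" (Rule 2). Swapping the two library entries models reversing -lfirst and -lsecond on the link line.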

Why Symbol Interposition Is a Problem

The default behaviour breaks the idea that a shared library is a self-contained subsystem. A library cannot guarantee that its own internal function calls use its own definitions.

/* Problem scenario: library's internal call gets hijacked */

/* libmath.so defines both square() and cube() */
/* cube() is implemented using square() */

double square(double x) { return x * x; }

double cube(double x) {
    return x * square(x);   /* normally calls libmath's square() */
}

/* main.c overrides square() with a buggy version */
#include <stdio.h>

double cube(double x);                     /* defined in libmath.so */

double square(double x) { return x + 1; }  /* BUG: wrong formula */

int main(void) {
    /* This calls libmath's cube(), which calls main's square()! */
    /* cube(3) should be 27, but returns 3 * (3+1) = 12 */
    printf("%.0f\n", cube(3));   /* prints 12, not 27! */
    return 0;
}
⚠️ This is the key problem: a shared library’s internal function calls can be silently hijacked by the main program (or another library), leading to subtle, hard-to-debug bugs when libraries are combined.

Solution: -Bsymbolic Linker Option

When you build a shared library with the -Bsymbolic linker option, you tell the linker: “when a global symbol is referenced from within this library, prefer the definition inside this library over any same-name definition outside.” This gives the library self-contained behavior.

# Build libfoo.so with -Bsymbolic
gcc -g -c -fPIC -Wall foo.c
gcc -g -shared -Wl,-Bsymbolic -o libfoo.so foo.o
gcc -g -o prog prog.c libfoo.so

LD_LIBRARY_PATH=. ./prog
# Output: foo-xyz
# Now func() in libfoo calls libfoo's own xyz(), not main's xyz()

# NOTE: calling xyz() DIRECTLY from main() still calls main's xyz()
# -Bsymbolic only affects calls WITHIN the library
/* Demonstrating -Bsymbolic effect */

/* libmath.so built WITH -Bsymbolic */
double square(double x) { return x * x; }

double cube(double x) {
    /* With -Bsymbolic: this calls libmath's own square() */
    /* main's override of square() does NOT affect this */
    return x * square(x);
}

/* Result: cube(3) correctly returns 27, even if main overrides square() */
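💡 Best practice: If you are writing a library that is intended to be a self-contained module, consider building it with -Wl,-Bsymbolic. This prevents accidental symbol interposition from breaking your library’s internal logic. Keep in mind that the option applies to every global symbol the library defines, so it also disables intentional interposition of those symbols (for example, via LD_PRELOAD) for calls made inside the library.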
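When changing the link flags is not an option, a similar guarantee can be obtained per function in the source itself. The following is a minimal sketch, assuming a hypothetical mathlib.c in which the real work lives in a static helper (square_impl is an illustrative name, not from the original example). Because a static function never enters the dynamic symbol table, nothing outside the library can interpose on it.

```c
/* mathlib.c: hypothetical library source */

/* Internal implementation: static, so it is not a global symbol
 * and cannot be overridden from outside this file. */
static double square_impl(double x)
{
    return x * x;
}

/* Exported entry point; code outside the library calls this name. */
double square(double x)
{
    return square_impl(x);
}

/* cube() calls the static helper directly, so a square() defined
 * in the main program cannot change its result. */
double cube(double x)
{
    return x * square_impl(x);
}
```

With this layout, a main program that defines its own square() still overrides the exported square() for its own calls, but cube(3) returns 27 regardless. The trade-off is that each protected function needs this treatment, whereas -Bsymbolic covers the whole library at once.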

Intentional Interposition with LD_PRELOAD

Symbol interposition is not always accidental. LD_PRELOAD lets you intentionally inject a library that overrides functions from any other library — without modifying either the application or its libraries. This is used for debugging, profiling, and testing.

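/* intercept_malloc.c: intercept malloc() calls */
#define _GNU_SOURCE          /* required for RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Override malloc with our version */
void *malloc(size_t size) {
    static void *(*real_malloc)(size_t);
    if (real_malloc == NULL)   /* look up the next malloc in search order, once */
        real_malloc = (void *(*)(size_t)) dlsym(RTLD_NEXT, "malloc");

    /* Log via snprintf + write: fprintf may itself call malloc and recurse */
    char buf[64];
    int n = snprintf(buf, sizeof(buf), "[intercept] malloc(%zu) called\n", size);
    if (n > 0)
        write(STDERR_FILENO, buf, (size_t) n);
    return real_malloc(size);
}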
# Build the interception library
gcc -fPIC -shared -o intercept_malloc.so intercept_malloc.c -ldl

# Run any program with malloc intercepted
LD_PRELOAD=./intercept_malloc.so ls
# Example output (sizes and number of calls vary):
#         [intercept] malloc(768) called
#         [intercept] malloc(120) called
#         ... (every malloc call logged)
# Another use: replace rand() with a deterministic version for testing
/* fake_rand.c */
#include <stdlib.h>
int rand(void) { return 42; }   /* always returns 42 */

# Build the library and preload it
gcc -fPIC -shared -o fake_rand.so fake_rand.c
LD_PRELOAD=./fake_rand.so ./my_game   # game always gets 42 from rand()
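⚠️ Like LD_LIBRARY_PATH, LD_PRELOAD is restricted for SUID/SGID programs for security reasons. In glibc’s secure-execution mode, preload pathnames containing slashes are ignored, and libraries are preloaded only from the standard search directories and only if they have the set-user-ID mode bit enabled, so an arbitrary LD_PRELOAD entry has no effect.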
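To make the effect of preloading fake_rand.so concrete, here is a small sketch of what the program side might look like. The roll_die() helper is hypothetical (the source of my_game is not shown in this article); the point is only that the program calls rand() normally and has no knowledge of the replacement.

```c
#include <stdlib.h>

/* Hypothetical helper from my_game: roll a six-sided die.
 * The call to rand() is bound at run time by the dynamic linker,
 * so a preloaded library that defines rand() is used instead of libc's. */
int roll_die(void)
{
    return rand() % 6 + 1;
}
```

Run normally, roll_die() returns values from 1 to 6. With the fake rand() that always returns 42 preloaded, the expression becomes 42 % 6 + 1, so every roll is 1, which gives a test a fixed value to assert against.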

Weak Symbols — Optional Override Points

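A weak symbol is a definition that the linker discards in favor of a strong (regular) definition of the same name. The library provides a default implementation as a weak symbol, allowing users to replace it simply by defining their own version. Note that the weak/strong distinction is applied at static link time (for example, when the default comes from an object file or a static library). Since glibc 2.2 the dynamic linker does not treat weak definitions specially, so when the default lives in a shared library, the application’s definition wins through ordinary interposition (Rule 1) whether or not the library’s symbol is weak.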

/* library.c — defines a weak default implementation */
#include <stdio.h>

/* Weak definition — can be overridden by the application */
__attribute__((weak)) void error_handler(const char *msg) {
    fprintf(stderr, "Default error: %s\n", msg);
}

void do_work(void) {
    /* calls error_handler — uses application's version if defined */
    error_handler("something went wrong");
}
/* app.c — override the weak symbol with a strong definition */
#include <stdio.h>

/* Strong definition overrides the weak one in the library */
void error_handler(const char *msg) {
    fprintf(stderr, "[APP ERROR LOG] %s\n", msg);
    /* maybe write to log file, send to syslog, etc. */
}

void do_work(void);  /* from library */

int main(void) {
    do_work();   /* calls our custom error_handler */
    return 0;
}
/* Output: [APP ERROR LOG] something went wrong */
💡 Weak symbols are common in embedded systems and RTOS libraries (e.g., ARM CMSIS, FreeRTOS) where hardware-specific handlers can be overridden by application code without modifying the library.
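A variation on the example above that is easier to check in a unit test is to have the weak hook return a value instead of printing. This is a sketch only: error_prefix() and format_error() are hypothetical names introduced for illustration, and the weak attribute syntax assumes GCC or Clang.

```c
#include <stdio.h>

/* Weak default: a strong error_prefix() defined in another object file
 * linked into the same program replaces this one. */
__attribute__((weak)) const char *error_prefix(void)
{
    return "Default error";
}

/* Writes "<prefix>: <msg>" into buf and returns the snprintf() result,
 * i.e. the length the full message needs (excluding the terminator). */
int format_error(char *buf, size_t size, const char *msg)
{
    return snprintf(buf, size, "%s: %s", error_prefix(), msg);
}
```

With no override present, format_error(buf, sizeof buf, "disk full") fills buf with "Default error: disk full". An application that defines its own non-weak error_prefix() changes the prefix without touching this file.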

Interview Questions & Answers
Q1. What is symbol interposition in shared libraries?
Symbol interposition is when a global symbol (function or variable) defined in the main executable overrides a same-named symbol in a shared library. The main program’s definition “interposes” over the library’s version. Even calls from within the library to that symbol will use the main program’s version (unless -Bsymbolic is used).
Q2. If both main.c and libfoo.so define xyz(), which is called when libfoo.so internally calls xyz()?
By default, main.c’s definition of xyz() is called — even for internal calls within libfoo.so. The main program’s global symbols take precedence. To make libfoo.so use its own internal xyz(), build it with gcc -shared -Wl,-Bsymbolic.
Q3. What does the -Bsymbolic linker option do?
-Bsymbolic makes a shared library preferentially bind global symbol references to definitions within the same library. This means internal calls within the library use the library’s own definitions instead of being intercepted by same-named symbols in the main program or other libraries. It restores self-contained behavior for the library.
Q4. What is LD_PRELOAD and how does it work?
LD_PRELOAD is an environment variable that specifies shared libraries to load before any others, including standard system libraries. Because preloaded libraries are searched ahead of every other shared library, their symbols override same-named symbols in those libraries; only definitions in the executable itself are found earlier. It allows intercepting, wrapping, or replacing library functions at run time without modifying any code. Example: LD_PRELOAD=./mylib.so ./prog will load mylib.so first, so its malloc() overrides glibc’s malloc().
Q5. Why did Linux implement these symbol interposition semantics?
For historical compatibility. The first shared library implementations were designed to mirror the behavior of static libraries exactly. With static linking, symbol resolution is left-to-right (first definition wins) and the executable’s symbols take priority. By matching this behavior, developers could switch from static to shared libraries without changing their code or experiencing different symbol resolution behavior.
Q6. What is the symbol resolution rule when two shared libraries both define the same global symbol?
The definition from the library listed first (leftmost) on the link command line is used. The link order is recorded in the executable, and the dynamic linker searches the libraries in that order at run time; the first definition found wins. So if you do gcc main.c -lA -lB and both libA and libB define func(), libA’s definition is used by default, including for calls made inside libB.
Q7. What is a weak symbol and when would you use one?
A weak symbol (declared with __attribute__((weak))) is a symbol whose definition can be overridden by any strong (normal) definition. A library provides a default implementation as a weak symbol; if the application defines its own version, the application’s version wins automatically. This is used in hardware abstraction layers, RTOS ports, and plugin/callback architectures where users need to customize behavior without modifying library code.
Q8. How can symbol interposition cause subtle bugs?
If two different libraries use the same name for an internal helper function (e.g., both define a non-static global function called init()), one will silently override the other. The affected library’s internal calls will use the wrong function, producing incorrect results that are hard to trace because no linker warning is issued. Using unique name prefixes (e.g., mylib_init()) or making internal functions static (which hides them from global symbol tables) avoids this.
Q9. How do you prevent symbol interposition for internal library symbols?
1. Make internal functions static — they won’t be in the global symbol table at all.
2. Use __attribute__((visibility("hidden"))) to hide symbols from external linking.
3. Build the library with -Wl,-Bsymbolic to bind internal references to internal definitions.
4. Use a linker version script to control which symbols are exported vs. kept internal.
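The first two options can be sketched in source as follows. The macro names MYLIB_API and MYLIB_LOCAL and the functions are hypothetical, and the visibility attribute assumes GCC or Clang. Options 3 and 4 are applied at link time and do not change the C code.

```c
/* Hypothetical export macros; the names are illustrative only. */
#define MYLIB_API   __attribute__((visibility("default")))
#define MYLIB_LOCAL __attribute__((visibility("hidden")))

/* Option 1: static, visible only inside this source file. */
static int clamp_percent(int v)
{
    return v < 0 ? 0 : (v > 100 ? 100 : v);
}

/* Option 2: hidden, callable from the library's other source files
 * but absent from the library's dynamic symbol table. */
MYLIB_LOCAL int mylib_scale(int v)
{
    return clamp_percent(v) * 2;
}

/* Exported API: the only symbol here that other objects can bind to. */
MYLIB_API int mylib_process(int v)
{
    return mylib_scale(v) + 1;
}
```

Calls from mylib_process() to mylib_scale() and clamp_percent() are bound inside the library, so a same-named function in the main program or another library cannot replace them. Only mylib_process() remains subject to the default interposition rules.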
Q10. How does calling xyz() from main() differ from calling xyz() from within libfoo.so, when both define xyz()?
Calling xyz() from main() always invokes main’s xyz() — trivially, because main’s code directly references its own definition. Calling xyz() from within libfoo.so without -Bsymbolic will also call main’s xyz() (interposition). With -Bsymbolic, calling xyz() from within libfoo.so calls libfoo’s own xyz(). But note: even with -Bsymbolic, calling xyz() from main() still calls main’s xyz().
