dlerror() and dlsym() Symbol Lookup & Function Pointer Casting

 

42.1.2 & 42.1.3 — dlerror() and dlsym()
Error Diagnosis, Symbol Lookup & Function Pointer Casting | Chapter 42 · TLPI
Why These Two Functions Are Critical

After calling dlopen(), the next step is to look up symbols (functions or variables) inside the loaded library. That is what dlsym() does. But there is a subtle trap: dlsym() returns NULL when the lookup fails, and NULL can also be the genuine value of a symbol that was found. You need dlerror() to distinguish between these two cases.

These two functions are used together in a specific pattern. Understanding the pattern correctly is what separates robust code from crash-prone code.

Key Concepts

dlerror() · dlsym() · void* to function pointer · *(void**)(&fp) cast · RTLD_DEFAULT · RTLD_NEXT · _GNU_SOURCE · NULL symbol check

42.1.2 — dlerror(): Diagnosing Errors

#include <dlfcn.h>
const char *dlerror(void);
/* Returns: pointer to error string, or NULL if no error has occurred
   since the last call to dlerror() */

Key behaviors to know:

  • If no error has occurred since the last call to dlerror(), it returns NULL.
  • Each call to dlerror() clears the error state. The next call returns NULL unless another error has occurred.
  • The returned string is statically allocated — do not free() it.
  • dlerror() is also called before dlsym() to clear any stale error, which is part of the correct dlsym pattern.
The error string is transient. If you need to use the error message later, copy it with strdup() immediately after calling dlerror().
Function Call Sequence | dlerror() Returns | Meaning
dlopen("noexist.so", RTLD_LAZY) → NULL, then dlerror() | "noexist.so: cannot open…" | Error occurred, message available
dlopen success, then dlerror() | NULL | No error, all good
dlerror() called twice in a row | NULL on 2nd call | First call clears the error state

42.1.3 — dlsym(): Looking Up Symbols

#include <dlfcn.h>
void *dlsym(void *handle, const char *symbol);
/* Returns: address of symbol, or NULL if not found */

dlsym() searches the library referred to by handle (and its dependency tree) for a symbol with the given name. If found, it returns the address as a void*.

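The NULL trap: dlsym() returns NULL when the symbol is NOT found, but NULL can also be the genuine value of a symbol that IS found. Note that dlsym() returns the symbol's address, not the contents of a variable, so a global pointer that merely holds NULL still has a non-NULL address; the ambiguous case is a symbol whose own value is zero, which is rare but possible. You cannot tell the two cases apart just by checking the return value. Always use the dlerror() pattern shown below.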

The Correct dlsym() Usage Pattern

This four-step pattern is the only reliable way to use dlsym():

Step 1: Call dlerror() to clear any previous error state
Step 2: Call dlsym() to look up the symbol
Step 3: Call dlerror() again to check for error
Step 4: If dlerror() returned non-NULL → symbol not found. Otherwise → use the address (a NULL result is then the symbol's real value, not an error)
/* Correct dlerror() + dlsym() pattern */
void *addr;
const char *err;

dlerror();              /* Step 1: clear any stale error */
addr = dlsym(handle, "my_symbol"); /* Step 2: look up symbol */
err  = dlerror();      /* Step 3: check for error */

if (err != NULL) {
    /* Step 4a: symbol was NOT found */
    fprintf(stderr, "dlsym failed: %s\n", err);
} else {
    /* Step 4b: symbol WAS found (addr may be NULL — that's OK) */
    /* use addr here */
}

Casting void* to a Function Pointer (The Tricky Part)

dlsym() returns a void*. ISO C, including C99, does not define conversion between void* and a function pointer, so a strict compiler may warn about or reject a direct assignment (gcc does so with -pedantic). This code is not portable C:

/* NOT PORTABLE: C99 does not allow assigning void* to a function pointer */
int (*funcp)(int);
funcp = dlsym(handle, "my_func");    /* warning or error in strict mode */

The portable way to write it (the technique given in the SUSv3 Technical Corrigendum) is:

/* CORRECT — SUSv3 approved technique */
int (*funcp)(int);

/* This says: treat &funcp as a void** and store the dlsym result there */
*(void **)(&funcp) = dlsym(handle, "my_func");

/* Now call through the function pointer */
int result = (*funcp)(42);

/* Alternative calling syntax (both are identical): */
int result2 = funcp(42);
Why *(void **)(&funcp) works:

  1. &funcp gives us the address of the function pointer variable (type: int(**)(int))
  2. (void **)(&funcp) reinterprets that address as a pointer-to-void* (type: void**)
  3. *(void **)(&funcp) dereferences it, giving us a void* lvalue
  4. We assign dlsym’s return value (also void*) to that lvalue — legal!
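  5. The stored bits are then read back as a function pointer. This relies on void* and function pointers being interchangeable, which ISO C does not promise but POSIX systems that provide dlsym() do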

dlsym() Pseudohandles: RTLD_DEFAULT and RTLD_NEXT

Instead of a handle from dlopen(), you can pass special pseudohandle constants. These require #define _GNU_SOURCE before including <dlfcn.h>.

Pseudohandle | Search Order | Use Case
RTLD_DEFAULT | Main program → all shared libraries loaded at startup → all RTLD_GLOBAL dlopen'd libs | Find any globally visible symbol by name at runtime
RTLD_NEXT | Only searches libraries loaded after the library that calls dlsym() | Wrapper functions: find the "real" version of a function you are wrapping

Classic RTLD_NEXT use case — wrapping malloc():

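#define _GNU_SOURCE    /* Required for RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Our replacement malloc that wraps the real one */
void *malloc(size_t size) {
    static void *(*real_malloc)(size_t) = NULL;
    char buf[64];
    int n;

    if (!real_malloc) {
        /* Find the NEXT malloc after this library, normally the libc one */
        dlerror();
        *(void **)(&real_malloc) = dlsym(RTLD_NEXT, "malloc");
        if (dlerror() || !real_malloc) {
            static const char msg[] = "Cannot find real malloc!\n";
            (void)write(STDERR_FILENO, msg, sizeof msg - 1);
            return NULL;
        }
    }

    void *ptr = real_malloc(size);

    /* printf() may call malloc() itself and recurse into this wrapper,
       so format into a stack buffer and emit it with write() instead */
    n = snprintf(buf, sizeof buf, "[DEBUG] malloc(%zu) = %p\n", size, ptr);
    if (n > 0) {
        size_t len = (size_t)n < sizeof buf ? (size_t)n : sizeof buf - 1;
        (void)write(STDERR_FILENO, buf, len);
    }
    return ptr;
}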
# Build as a shared library that can be LD_PRELOADed:
gcc -fPIC -shared -o mymalloc.so mymalloc.c -ldl

# Use it to wrap malloc in any program:
LD_PRELOAD=./mymalloc.so ls /tmp
RTLD_NEXT is the foundation of LD_PRELOAD interposition. When your wrapper is loaded first, it intercepts the symbol, does its work, and then calls the real function found via RTLD_NEXT.

Code Example 1: Complete dlsym with Variable and Function

/* mylib2.c — library with both a function and a global variable */
#include <stdio.h>

int global_counter = 42;  /* Global variable */

void print_hello(void) {
    printf("Hello! Counter = %d\n", global_counter);
}

double compute(double x) {
    return x * x + 2.5;
}
/* dlsym_demo.c — demonstrates looking up both functions and variables */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

int main(void) {
    const char *err;

    void *lib = dlopen("./mylib2.so", RTLD_LAZY);
    if (!lib) { fprintf(stderr, "%s\n", dlerror()); exit(1); }

    /* --- Look up a function --- */
    void (*hello_fn)(void);
    dlerror();
    *(void **)(&hello_fn) = dlsym(lib, "print_hello");
    err = dlerror();
    if (err) { fprintf(stderr, "dlsym(print_hello): %s\n", err); exit(1); }
    (*hello_fn)();

    /* --- Look up a function with parameters --- */
    double (*compute_fn)(double);
    dlerror();
    *(void **)(&compute_fn) = dlsym(lib, "compute");
    err = dlerror();
    if (!err) {
        printf("compute(5.0) = %f\n", (*compute_fn)(5.0));
    }

    /* --- Look up a global variable --- */
    int *counter_ptr;
    dlerror();
    counter_ptr = dlsym(lib, "global_counter");  /* data pointer: void* converts directly */
    err = dlerror();
    if (!err) {
        printf("global_counter = %d\n", *counter_ptr);

        /* Modify the variable through the pointer */
        *counter_ptr = 100;
        (*hello_fn)();  /* Now prints Counter = 100 */
    }

    dlclose(lib);
    return 0;
}
gcc -fPIC -shared -o mylib2.so mylib2.c
gcc dlsym_demo.c -ldl -o dlsym_demo
./dlsym_demo
# Hello! Counter = 42
# compute(5.0) = 27.500000
# global_counter = 42
# Hello! Counter = 100

Code Example 2: Using RTLD_DEFAULT to Search All Loaded Libraries

/* rtld_default_demo.c — find functions in already-loaded libraries */
#define _GNU_SOURCE
#include <stdio.h>
#include <math.h>      /* declarations only; libm is linked by -lm */
#include <dlfcn.h>
#include <stdlib.h>

int main(void) {
    /* sin() is findable only if libm was actually loaded at startup (see the build line) */
    double (*sin_fn)(double);
    dlerror();
    *(void **)(&sin_fn) = dlsym(RTLD_DEFAULT, "sin");
    const char *err = dlerror();

    if (err) {
        printf("sin not found via RTLD_DEFAULT: %s\n", err);
    } else {
        printf("Found sin() via RTLD_DEFAULT: sin(1.5707) = %f\n",
               sin_fn(1.5707963));
    }

    /* printf is in libc, also findable */
    int (*printf_fn)(const char *, ...);
    dlerror();
    *(void **)(&printf_fn) = dlsym(RTLD_DEFAULT, "printf");
    err = dlerror();
    if (!err) {
        printf_fn("Found printf() via RTLD_DEFAULT too!\n");
    }

    return 0;
}
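# The program never references a libm symbol directly, so a linker that defaults
# to --as-needed may drop -lm; --no-as-needed keeps libm loaded at startup:
gcc rtld_default_demo.c -Wl,--no-as-needed -lm -ldl -o rtld_default_demo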
./rtld_default_demo
# Found sin() via RTLD_DEFAULT: sin(1.5707) = 1.000000
# Found printf() via RTLD_DEFAULT too!

Interview Questions & Answers

Q1. Why is checking dlsym() == NULL not sufficient to detect an error?
dlsym() returns NULL in two distinct cases: (1) the symbol was not found, and (2) the symbol was found but its value, meaning the address dlsym() reports, is genuinely NULL. A global pointer variable that merely holds NULL is not such a case, because dlsym() returns the variable's address, not its contents. Since both cases return NULL from dlsym(), checking the return value alone cannot distinguish them. The correct approach is to call dlerror() to clear the error state before dlsym(), call dlsym(), then call dlerror() again. If the second dlerror() returns a non-NULL string, there was an error.
Q2. Why can we not write funcp = dlsym(handle, "name"); directly?
ISO C, including C99, does not define conversion between void* and a function pointer, and dlsym() returns void*. A direct assignment therefore draws a warning or error from a strict compiler (for example gcc with -pedantic). The SUSv3-approved workaround is the *(void **)(&funcp) = dlsym(…) pattern, which stores the returned pointer through a void* lvalue instead of converting it to a function pointer type.
Q3. What is RTLD_NEXT and what is its most common use case?
RTLD_NEXT is a pseudohandle for dlsym() that searches for a symbol starting only in the libraries loaded after the calling library, effectively finding the next definition of the symbol in the load order. Its most common use is implementing wrapper (interposition) functions. For example, if your library defines malloc(), it can use dlsym(RTLD_NEXT, "malloc") to find and call the real libc malloc, while adding its own debugging logic around it. This is also the mechanism behind LD_PRELOAD-based function interposition.
Q4. What does calling dlerror() do to the error state?
Calling dlerror() returns the current error message string and then clears the error state. The next call to dlerror() will return NULL unless another dlopen/dlsym/dlclose error has occurred in between. This is why you must call dlerror() before dlsym() to clear any stale error — otherwise you might incorrectly read an error from a previous failed operation.
Q5. What is the difference between RTLD_DEFAULT and RTLD_NEXT?
RTLD_DEFAULT searches from the beginning: main program first, then all libraries in load order. It finds the first (default) definition of a symbol globally. RTLD_NEXT searches only the libraries loaded after the library calling dlsym(). It finds the next definition after the caller’s position in the load order. RTLD_DEFAULT is used to find any symbol; RTLD_NEXT is used specifically in wrapper/interposer libraries to find the “original” symbol being wrapped.
