strangecpp/namecollision/README.md
2026-02-23 16:19:27 +01:00


# Name Collision / ODR Violation with Shared Libraries
## The Setup
`hello.cpp` defines a class `nt` with a method `print()`, used internally by
`print_obj()`. `main.cpp` defines a `namespace nt` with a free function
`print()`, and calls both `nt::print()` and `print_obj()`.
`hello.cpp` is compiled into a shared library (`libhello.so`), and `main.cpp` is
linked against it.
## The Problem
C++ name mangling encodes both **namespaces** and **classes** identically in the
Itanium ABI (used on Linux):
```
namespace nt { void print(); }   → _ZN2nt5printEv
class nt { void print(); };      → _ZN2nt5printEv
```
Both produce the exact same mangled symbol. By the time symbols are resolved,
only the mangled name is left, so neither the static linker nor the dynamic
linker can tell the two functions apart.
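The claim can be checked from the other direction by demangling the symbol. The following is a minimal sketch, assuming a GCC or Clang toolchain that ships `<cxxabi.h>`; the helper name `demangle` is hypothetical and not part of this repository.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: returns the human-readable form of an Itanium-mangled
// name, or an empty string if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result itself.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return std::string();
    }
    std::string result(text);
    std::free(text);  // the buffer was allocated with malloc()
    return result;
}
```

Calling `demangle("_ZN2nt5printEv")` yields `nt::print()`. Nothing in the symbol or in the demangled name says whether `nt` was a class or a namespace.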
## What Actually Happens
With `-O0` (no optimization):
- `libhello.so` exports `_ZN2nt5printEv` as a **weak** symbol (the class method;
a member function defined inside the class body is implicitly `inline`, and
inline functions are emitted as weak definitions)
- `main` defines `_ZN2nt5printEv` as a **strong/global** symbol (the namespace
function)
- When `print_obj()` in the shared library calls `nt::print()`, the call is not
bound inside the library. The dynamic linker resolves `_ZN2nt5printEv` at
runtime, searching the executable first, so the **definition in `main` wins**
- Result: `print_obj()` calls `main`'s `nt::print()` instead of the class method
- Output: `"Hello from namespace"` printed twice
With `-O3` (optimizations on):
- The compiler inlines the class `nt::print()` directly into `print_obj()`
- `_ZN2nt5printEv` is never emitted as a standalone symbol in `libhello.so`
- Nothing is left for the dynamic linker to resolve, so each call reaches the
intended function
- Output: looks correct, although the underlying name clash is still there
This is why the bug was **optimization-dependent** and hard to spot.
## Symbol Evidence
```
# -O0: libhello.so exports the class method as weak
00000000000011d0 w F .text _ZN2nt5printEv ← weak, can be overridden
# -O3: libhello.so does NOT export it at all (inlined away)
# only _Z9print_objv is present
```
## One Definition Rule (ODR)
The C++ standard ([basic.def.odr]) allows only certain entities (inline
functions, classes, templates) to be defined in more than one translation unit,
and then only if all definitions are **identical**. Everything else may be
defined once in the whole program.
Violating this is **ill-formed, no diagnostic required**: the compiler and
linker are not required to warn you.
The consequences are undefined behavior: the program may crash, produce wrong
output, or appear to work correctly depending on compiler flags, optimization
level, or link order.
## Linux Dynamic Linker Behavior
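On Linux, the dynamic linker (`ld.so`) uses a **flat symbol namespace** by
default. When a shared library references a symbol, the linker searches the
loaded objects in order (the main executable first, then its dependencies in
load order) and resolves to the **first definition found**, regardless of which
DSO the symbol logically belongs to. With current glibc the weak/strong
distinction plays no role at this stage unless `LD_DYNAMIC_WEAK` is set; it
matters mainly to the static linker.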
This is different from macOS (which uses two-level namespaces by default) and
Windows (where DLL symbols are explicitly imported/exported via import tables
and don't collide this way).
This flat namespace behavior means an exported symbol in a `.so` can be silently
preempted by a same-named symbol defined earlier in the lookup order, including
one in the main executable.
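To see which loaded object actually supplies a symbol at run time, one option is to ask the dynamic loader directly. The sketch below assumes Linux with glibc, where `dlsym` with `RTLD_DEFAULT` searches the global lookup order and `dladdr` maps an address back to its object; the helper name `providing_object` is hypothetical. On older glibc versions the program may also need to be linked with `-ldl`.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for RTLD_DEFAULT and dladdr on glibc (g++ defines it already)
#endif
#include <dlfcn.h>   // dlsym, dladdr, Dl_info
#include <string>

// Hypothetical helper: returns the path of the loaded object whose definition
// of `symbol` is found first in the global lookup order, or "" if the symbol
// is not found in any dynamic symbol table.
std::string providing_object(const char* symbol) {
    void* address = dlsym(RTLD_DEFAULT, symbol);
    if (address == nullptr) {
        return std::string();
    }
    Dl_info info;
    if (dladdr(address, &info) == 0 || info.dli_fname == nullptr) {
        return std::string();
    }
    return std::string(info.dli_fname);
}
```

In the scenario described here, `providing_object("_ZN2nt5printEv")` would be expected to name the executable rather than `libhello.so` in the `-O0` build, provided the executable exports the symbol in its dynamic symbol table.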
## The Fix
Wrap internal-use classes in an **anonymous namespace** in `hello.cpp`:
```cpp
namespace {
class nt {
public:
    void print() { ... }
};
}
```
This gives the class **internal linkage**. The mangled symbol becomes:
```
_ZN12_GLOBAL__N_12nt5printEv
```
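The `_GLOBAL__N_1` component is the ABI encoding for the anonymous namespace.
The string itself is the same in every translation unit; what prevents the
collision is that the symbol now has local binding, so it never appears in the
library's dynamic symbol table and cannot be preempted by the executable.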
## Key Takeaways
- Class and namespace names mangle identically in the Itanium C++ ABI
- Exported symbols in shared libraries can be silently preempted by same-named
symbols in the executable
- ODR violations are UB with no required diagnostic, so bugs may only appear at
certain optimization levels
- Internal implementation details in shared libraries should use anonymous
namespaces to prevent symbol leakage
- Use `objdump -T` or `nm -D` to inspect a library's dynamic symbol table
(`objdump -t` or `nm` for the full symbol table) and catch these issues early