Page faults and libgcj.so startup time
Tue Apr 3 15:03:00 GMT 2001
On Tue, 3 Apr 2001, Bryce McKinlay wrote:
> The LD_DEBUG=profile output from was reporting 35 million cycles or so spent
> resolving 6000 or so symbols, IIRC, and much longer when libgcj.so wasn't
> already cached in memory. That seems pretty efficient, but it would certainly
> do a lot better if it didn't have to resolve so many symbols.
Yep. I was actually looking at LD_DEBUG=relocs. The output is
shockingly large, and worsens with every new shared object.
Only static objects can be completely resolved at link time, apparently.
(I'd guess that's obvious to someone more familiar with linkers than I.)
> > It appears that RTLD_LAZY is used all right. Certain symbols (_exit)
> > still aren't resolved until the process completes. But the dynamic linker
> > appears to be overly aggressive, and processes far more symbols up front
> > than absolutely necessary. I'd like to understand why.
> Me too. Could it have to do with the symbols in the class vtables perhaps? i.e.
> maybe it's just the virtual method symbols that are being looked up at load
> time?
Vtables are part of the answer. I see there are four types of
relocations for ELF on x86, and only one of those is lazy. Since C++ has
vtables, it should exhibit some of the same bloat. Here's what I find for
three different shared libraries on my Linux/x86 machine:
                   libgcj  libstdc++   libc
R_386_32            27006       1085    146
R_386_GLOB_DAT       1340        174    149
R_386_JUMP_SLOT      2299        270    452
R_386_RELATIVE      41010       4374   1576
The first of these, R_386_32, is taken up by vtbl entries. The
R_386_JUMP_SLOT entries are lazily computed. The R_386_RELATIVE entries
seem to be fixups that are not based on a symbol, but rather on the image's
load address.
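For anyone who wants to reproduce counts like these, here is a rough sketch that tallies relocation types from text shaped like `readelf -r` output. The SAMPLE text and the symbol names in it are made up for illustration (they are not from libgcj), and the regular expression assumes the type name appears literally on each entry line.

```python
import re
from collections import Counter

# Matches relocation type names such as R_386_32 or R_386_JUMP_SLOT.
RELOC_TYPE = re.compile(r"\bR_386_[A-Z0-9_]+\b")


def count_relocs(readelf_output: str) -> Counter:
    """Count relocation entries by type in `readelf -r` style text."""
    counts = Counter()
    for line in readelf_output.splitlines():
        match = RELOC_TYPE.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


# Hypothetical sample shaped like readelf -r output; not real libgcj data.
SAMPLE = """\
Relocation section '.rel.dyn' at offset 0x2f4 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00001f04  00000008 R_386_RELATIVE
00001ff0  00000106 R_386_GLOB_DAT    00000000   __gmon_start__
00002010  00000301 R_386_32          00001234   some_vtable_slot

Relocation section '.rel.plt' at offset 0x30c contains 1 entry:
 Offset     Info    Type            Sym.Value  Sym. Name
0000200c  00000207 R_386_JUMP_SLOT   00000000   puts
"""

if __name__ == "__main__":
    for name, n in sorted(count_relocs(SAMPLE).items()):
        print(f"{name:18} {n}")
```

In practice the text would come from running readelf on the shared object and feeding its output to count_relocs() in place of SAMPLE.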
The numbers are not badly out of proportion, actually, considering the
size of .data in libgcj is nearly twenty times that in libstdc++. I
expected that libgcj would have more virtual methods and metadata than C++.
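To put rough numbers on "out of proportion", the arithmetic can be sketched as below. The counts are copied from the table above; the helper names lazy_share and ratio are just illustrative.

```python
# Relocation counts copied from the table above.
COUNTS = {
    "libgcj": {
        "R_386_32": 27006,
        "R_386_GLOB_DAT": 1340,
        "R_386_JUMP_SLOT": 2299,
        "R_386_RELATIVE": 41010,
    },
    "libstdc++": {
        "R_386_32": 1085,
        "R_386_GLOB_DAT": 174,
        "R_386_JUMP_SLOT": 270,
        "R_386_RELATIVE": 4374,
    },
    "libc": {
        "R_386_32": 146,
        "R_386_GLOB_DAT": 149,
        "R_386_JUMP_SLOT": 452,
        "R_386_RELATIVE": 1576,
    },
}


def lazy_share(lib: str) -> float:
    """Fraction of a library's relocations that are lazy (JUMP_SLOT)."""
    relocs = COUNTS[lib]
    return relocs["R_386_JUMP_SLOT"] / sum(relocs.values())


def ratio(reloc_type: str, lib_a: str = "libgcj", lib_b: str = "libstdc++") -> float:
    """How many times more relocations of one type lib_a has than lib_b."""
    return COUNTS[lib_a][reloc_type] / COUNTS[lib_b][reloc_type]


if __name__ == "__main__":
    for lib in COUNTS:
        print(f"{lib:10} lazy share: {lazy_share(lib):.1%}")
    for reloc_type in COUNTS["libgcj"]:
        print(f"{reloc_type:16} libgcj/libstdc++: {ratio(reloc_type):.1f}x")
```

On these figures the lazy JUMP_SLOT entries are only about 3% of libgcj's relocations (against roughly 19% for libc), and libgcj has about 25 times as many R_386_32 entries as libstdc++ but only 8 to 9 times as many of the other three types.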
Using the -Bsymbolic flag helps, but doesn't actually eliminate any
relocations. It does prevent searching the chain of shared libs for each
symbol. Condensing metadata should make a real difference though.
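A toy model of why skipping the chain search helps, as I understand it: without symbolic binding each symbol is looked for in every loaded object in order, while with it the referencing library's own definitions are tried first. This is only an illustration. The library contents and symbol names below are invented, and the real loader's hashing and versioning are ignored.

```python
def lookup(symbol, referencing_lib, search_order, symbolic=False):
    """Return (defining_lib_name, number_of_symbol_tables_probed).

    Toy model: each library is a (name, set_of_defined_symbols) pair.
    """
    probes = 0
    if symbolic:
        # Symbolic binding: try the referencing library's own symbols first.
        probes += 1
        name, symbols = referencing_lib
        if symbol in symbols:
            return name, probes
    # Otherwise walk the global search order, one object at a time.
    for name, symbols in search_order:
        probes += 1
        if symbol in symbols:
            return name, probes
    return None, probes


# Hypothetical load order and symbol sets, for illustration only.
APP = ("app", {"main"})
LIBC = ("libc", {"puts", "_exit"})
LIBSTDCXX = ("libstdc++", {"operator_new"})
LIBGCJ = ("libgcj", {"hypothetical_vtable_symbol"})
SEARCH_ORDER = [APP, LIBC, LIBSTDCXX, LIBGCJ]

if __name__ == "__main__":
    sym = "hypothetical_vtable_symbol"
    print(lookup(sym, LIBGCJ, SEARCH_ORDER))                 # full chain walk
    print(lookup(sym, LIBGCJ, SEARCH_ORDER, symbolic=True))  # own library first
```

In this model the relocation is still processed either way, which matches the observation that the count doesn't drop; only the number of tables probed per symbol changes (four versus one for the last library in the chain).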
(Bryce's experiment to reduce the cost of global constructors still has
merit IMO, but those results are apparently dwarfed by these loader costs.)