Page faults and startup time

Jeff Sturm
Tue Apr 3 15:03:00 GMT 2001

On Tue, 3 Apr 2001, Bryce McKinlay wrote:
> The LD_DEBUG=profile output from was reporting 35 million cycles or so spent
> resolving 6000 or so symbols, IIRC, and much longer when wasn't
> already cached in memory. That seems pretty efficient, but it would certainly
> do a lot better if it didn't have to resolve so many symbols.

Yep.  I was actually looking at LD_DEBUG=relocs.  The output is
shockingly large, and worsens with every new shared object.

Only static objects can be completely resolved at link time, apparently.
(I'd guess that's obvious to someone more familiar with linkers than I.)

> > It appears that RTLD_LAZY is used all right.  Certain symbols (_exit)
> > still aren't resolved until the process completes.  But the dynamic linker
> > appears to be overly aggressive, and processes far more symbols up front
> > than absolutely necessary.  I'd like to understand why.
> Me too. Could it have to do with the symbols in the class vtables perhaps? i.e.
> maybe it's just the virtual method symbols that are being looked up at load
> time.

Vtables are part of the answer.  I see there are four types of
relocations in use for ELF on x86, and only one of those is lazy.  Since
C++ has vtables, it should exhibit some of the same bloat.  Here's what I
find for three different shared libraries on my Linux/x86 machine:

                 libgcj    libstdc++    libc
R_386_32          27006       1085       146
R_386_GLOB_DAT     1340        174       149
R_386_JUMP_SLOT    2299        270       452
R_386_RELATIVE    41010       4374      1576
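
The post does not say how these counts were gathered.  One plausible way
is to tally the relocation type column of `readelf -r` output.  The sketch
below assumes that approach; the function name and the SAMPLE text are
hypothetical, hand-written to approximate readelf's layout, and are not
taken from the libraries above.

```python
import re
from collections import Counter

# Matches x86 relocation type names such as R_386_32 or R_386_JUMP_SLOT.
RELOC_TYPE = re.compile(r"\bR_386_[A-Z0-9_]+\b")


def count_relocation_types(readelf_output: str) -> Counter:
    """Count how often each relocation type appears, one entry per line."""
    counts = Counter()
    for line in readelf_output.splitlines():
        match = RELOC_TYPE.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


# Illustrative input only: an approximation of `readelf -r` output.
SAMPLE = """\
Relocation section '.rel.dyn' at offset 0x2f4 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00001ef8  00000008 R_386_RELATIVE
00001ff0  00000106 R_386_GLOB_DAT    00000000   __gmon_start__
00002010  00000201 R_386_32          00000000   _ZTV3Foo

Relocation section '.rel.plt' at offset 0x30c contains 1 entry:
 Offset     Info    Type            Sym.Value  Sym. Name
0000200c  00000307 R_386_JUMP_SLOT   00000000   puts
"""

if __name__ == "__main__":
    for reloc_type, n in sorted(count_relocation_types(SAMPLE).items()):
        print(f"{reloc_type:<18}{n:>8}")
```

On a real system the input text would come from running `readelf -r` on
the shared object instead of from SAMPLE.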

The first of these, R_386_32, is occupied by vtbl entries.  The
R_386_JUMP_SLOT entries are lazily computed.  The R_386_RELATIVE entries
seem to be fixups that are not based on a symbol, but rather the image
base address.
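
To put a number on "only one of those is lazy", here is a small sketch
that computes the share of lazily resolved relocations per library.  The
counts are copied from the table above; the names RELOCS, LAZY_TYPES and
lazy_share are made up for illustration.

```python
# Relocation counts copied from the table in this post.
RELOCS = {
    "libgcj": {
        "R_386_32": 27006, "R_386_GLOB_DAT": 1340,
        "R_386_JUMP_SLOT": 2299, "R_386_RELATIVE": 41010,
    },
    "libstdc++": {
        "R_386_32": 1085, "R_386_GLOB_DAT": 174,
        "R_386_JUMP_SLOT": 270, "R_386_RELATIVE": 4374,
    },
    "libc": {
        "R_386_32": 146, "R_386_GLOB_DAT": 149,
        "R_386_JUMP_SLOT": 452, "R_386_RELATIVE": 1576,
    },
}

# Per the post, only the jump slot entries are resolved lazily.
LAZY_TYPES = {"R_386_JUMP_SLOT"}


def lazy_share(library: str) -> float:
    """Fraction of a library's relocations that are lazily resolved."""
    counts = RELOCS[library]
    total = sum(counts.values())
    lazy = sum(n for kind, n in counts.items() if kind in LAZY_TYPES)
    return lazy / total


if __name__ == "__main__":
    for name, counts in RELOCS.items():
        total = sum(counts.values())
        print(f"{name}: {lazy_share(name):.1%} lazy of {total} relocations")
```

By this measure roughly 3% of libgcj's 71655 relocations are lazy, against
about 5% for libstdc++ and about 19% for libc, so nearly all of the libgcj
entries have to be processed when the library is loaded.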

The numbers are not badly out of proportion, actually, considering the
size of .data in libgcj is nearly twenty times that in libstdc++.  I
expected libgcj to have more virtual methods and metadata than C++.

Using the -Bsymbolic flag helps, but doesn't actually eliminate any
relocations.  It does prevent searching the chain of shared libs for each
symbol.  Condensing metadata should make a real difference though.

(Bryce's experiment to reduce the cost of global constructors still has
merit IMO, but those results are apparently dwarfed by these loader

