The cost of stack traces

Wed May 10 15:33:00 GMT 2006

Andrew Haley wrote:
> As some of you may be aware, I've been doing benchmarking of servers
> with a mind to improving gcj performance.
> 
> I've recently come across a situation where stack traces greatly
> slowed down an application.  Basically, this all comes down to an
> interaction between the way that loggers work and the way we do stack
> traces.  We're using addr2line to get the line number and filename,
> but this leads to a huge number of addr2line processes being spawned.
> 
> In the past we have considered building a reader for DWARF debug info
> into libgcj, but this may not help as much as we imagine: a perusal of
> the profile data for addr2line reveals that much of the CPU time is in
> libbfd itself -- the big effort isn't just spawning addr2line
> processes, it's actually doing the work of reading the debug info.

Building a DWARF reader into libgcj is what I have been wanting to do.
I think the speedup would be quite large in my usage cases (slow CPU,
limited memory where running addr2line causes thrashing).  On my little
MIPS systems, I can count the depth of a stack trace by listening to the
disk.  There is a flurry of disk activity for each level as addr2line is
started.  It goes brrt...brrt...brrt...  Without addr2line, it is
literally 10-20 times faster.

With an in-library line number decoder, if you cached knowledge about 
line number availability for each .text section, you could short circuit 
a lot of work on the second stack trace from a section with no line 
number information.
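
The caching idea above could look roughly like the following.  This is
only a sketch in present-day Java, not anything taken from libgcj: the
class name, the string key for a section, and the probe callback are all
made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: remembers, per section (or binary) name, whether
// line number information is available, so that the expensive probe
// runs at most once for each name.
final class LineInfoCache {
    private final Map<String, Boolean> known = new HashMap<>();
    private final Predicate<String> probe; // the expensive check (assumed)
    private int probeCount = 0;

    LineInfoCache(Predicate<String> probe) {
        this.probe = probe;
    }

    // Returns the cached answer if there is one; otherwise runs the
    // probe once and remembers the result, including a negative one.
    synchronized boolean hasLineInfo(String section) {
        Boolean cached = known.get(section);
        if (cached == null) {
            probeCount++;
            cached = probe.test(section);
            known.put(section, cached);
        }
        return cached;
    }

    // How many times the expensive probe actually ran.
    synchronized int probeCount() {
        return probeCount;
    }
}
```

The point is that a negative answer is cached too, so the second stack
trace through a section with no line numbers costs one map lookup.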

I have also been thinking about emitting the raw addresses (perhaps with
file and base address information) so that off-line analysis of traces
is easier.  This would give fast runtime performance, but allow detailed
postmortem analysis.  For statically linked applications all you need is
the address, since the addresses are all absolute.  Back in GCC-3.4 you
could easily get the address to print out, but that capability seems to
have disappeared in the most recent libgcj (and I want it back :-)).
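
To make the raw-address idea concrete, here is one possible shape for a
trace line.  It is a sketch only: the RawFrame class and the output
format are invented for this example and are not what libgcj prints.

```java
// Hypothetical sketch: formats one stack frame as raw numbers only, so
// that no symbol or line lookup happens at run time.
final class RawFrame {
    // binary:   name of the executable or shared library (assumed known)
    // loadBase: address at which that binary was mapped
    // pc:       the frame's program counter
    static String format(String binary, long loadBase, long pc) {
        return binary
            + " base=0x" + Long.toHexString(loadBase)
            + " pc=0x" + Long.toHexString(pc)
            + " off=0x" + Long.toHexString(pc - loadBase);
    }
}
```

For example, RawFrame.format("libfoo.so", 0x40000000L, 0x40001234L)
gives "libfoo.so base=0x40000000 pc=0x40001234 off=0x1234".  The offset
(or, for a statically linked program, the absolute pc) could then be fed
to addr2line by hand after the fact.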

> 
> And all of this is often supremely pointless in Fedora, where many of
> the gcj libraries don't have any debuginfo to read.
> 
> The effect of all this is dramatic: running a test load in JOnAS is
> 
>    38m18s with addr2line, and
>    22m49s without.

I have noted this as well, and would like to see improvement.

> 
> I suggest that we set the default use_addr2line to false.  People will
> still get the method info, just not the filename and line number of
> the source file.
> 

I have done this, and it does speed things up.  One thing I was thinking 
was that you could leave use_addr2line set to true by default, and have 
a way to override via an environment variable.  That way in your JOnAS 
startup script, you could turn it off if you wanted to, but the default 
behavior would be similar to the JDK where you get line numbers if 
available.
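
A sketch of how such an override might be read is below.  The variable
name GCJ_USE_ADDR2LINE and the Addr2LineConfig class are assumptions for
the sake of the example; no such variable exists in libgcj that I know
of.

```java
import java.util.Locale;

// Hypothetical sketch: addr2line lookups stay on by default and are
// switched off only when the environment explicitly says so.
final class Addr2LineConfig {
    // Made-up variable name, used here for illustration only.
    static final String ENV_VAR = "GCJ_USE_ADDR2LINE";

    // Pure helper so the policy can be checked without touching the
    // real environment.  Unset or unrecognized values mean "enabled".
    static boolean useAddr2line(String envValue) {
        if (envValue == null) {
            return true;
        }
        String v = envValue.trim().toLowerCase(Locale.ROOT);
        return !(v.equals("0") || v.equals("false")
                 || v.equals("no") || v.equals("off"));
    }

    // Reads the process environment once per call.
    static boolean useAddr2line() {
        return useAddr2line(System.getenv(ENV_VAR));
    }
}
```

A JOnAS startup script would then export GCJ_USE_ADDR2LINE=false before
launching, and everyone else keeps the JDK-like default.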

> Another possibility is to keep the addr2line processes alive between
> invocations of stacktrace.  This is a fairly small change to
> NameFinder, but has the disadvantage of consuming a lot of long-term
> resources.
> 

How is this much different from having the DWARF debug reader
integrated in libgcj?  Not that we seem to have enough information to
know, but this would contradict your assertion above that it may not be
worthwhile.

David Daney