This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Excessive calls to iterate_phdr during exception handling


On Mon, May 27, 2013 at 3:20 PM, Ryan Johnson
<ryan.johnson@cs.utoronto.ca> wrote:
>
> I have a large C++ app that throws exceptions to unwind anywhere from 5-20
> stack frames when an error prevents the request from being served (which
> happens rather frequently). Works fine single-threaded, but performance is
> terrible for 24 threads on a 48-thread Ubuntu 10 machine. Profiling points
> to a global mutex acquire in __GI___dl_iterate_phdr as the culprit, with
> _Unwind_Find_FDE as its caller.
>
> Tracing the attached test case with the attached gdb script shows that the
> -DDTOR case executes ~20k instructions during unwind and calls iterate_phdr
> 12 times. The -DTRY case executes ~33k instructions and calls iterate_phdr
> 18 times. The exception in this test case only affects three stack frames,
> with minimal cleanup required, and the trace is taken on the second call to
> the function that swallows the error, to warm up libgcc's internal caches
> [1].
>
> The instruction counts aren't terribly surprising---I know unwinding is
> complex---but might it be possible to throw and catch a previously-seen
> exception through a previously-seen stack trace, with something fewer than
> 4-6 global mutex acquires for each frame unwound? As it stands, the deeper
> the stack trace (= the more desirable to throw rather than return an error),
> the more of a scalability bottleneck unwinding becomes. My actual app would
> apparently suffer anywhere from 25 to 80 global mutex acquires for each
> exception thrown, which probably explains why the bottleneck arises...
>
> I'm bringing the issue up here, rather than filing a bug, because I'm not
> sure whether this is an oversight, a known problem that's hard to fix, or a
> feature (e.g. somehow required for reliable unwinding). I suspect the
> former, because _Unwind_Find_FDE tries a call to _Unwind_Find_registered_FDE
> before falling back to dl_iterate_phdr, but the former never succeeds in my
> trace (iterate_phdr is always called).

The issue is dlclose followed by dlopen.  If we had a cache ahead of
dl_iterate_phdr, we would need some way to clear out any information
cached from a dlclose'd library.  Otherwise we might pick up the old
information when looking up an address from a new dlopen.  So 1)
locking will always be required; 2) any caching system to reduce the
number of locks will require support for dlclose, somehow.  It's worth
working on but there isn't going to be a simple solution.
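
To make the dlclose requirement concrete, here is a minimal sketch of one possible shape for such a cache. It is not libgcc's code: the names `unload_generation`, `notify_dlclose`, `ObjectRange` and `RangeCache` are hypothetical, and it assumes the dynamic loader could be made to bump a global counter on every dlclose. Each thread keeps its own one-entry cache and treats it as stale as soon as the counter moves, so a new dlopen at the same address can never pick up old information.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical counter: assumed to be incremented on every dlclose.
std::atomic<std::uint64_t> unload_generation{0};

// Hypothetical hook the loader would call from dlclose.
void notify_dlclose() {
    unload_generation.fetch_add(1, std::memory_order_release);
}

// Address range of one loaded object, plus an illustrative identifier
// standing in for the unwind data that a real cache would store.
struct ObjectRange {
    std::uintptr_t begin;
    std::uintptr_t end;   // one past the last byte
    int object_id;
};

// One-entry cache, intended to be used as a thread_local so that the
// fast path takes no lock.
class RangeCache {
public:
    // Returns true and sets *object_id only if the cached entry covers pc
    // and no dlclose has been observed since the entry was stored.
    bool lookup(std::uintptr_t pc, int* object_id) {
        const std::uint64_t now =
            unload_generation.load(std::memory_order_acquire);
        if (now != generation_) {
            valid_ = false;  // something was unloaded: drop the entry
        }
        if (!valid_ || pc < entry_.begin || pc >= entry_.end) {
            return false;    // caller falls back to the locked slow path
        }
        *object_id = entry_.object_id;
        return true;
    }

    // Stores the result of a slow-path lookup. The generation must be the
    // value read while the slow path still held the loader lock.
    void remember(const ObjectRange& range, std::uint64_t generation) {
        assert(range.begin < range.end);
        entry_ = range;
        generation_ = generation;
        valid_ = true;
    }

private:
    ObjectRange entry_{0, 0, 0};
    std::uint64_t generation_ = 0;
    bool valid_ = false;
};

}  // namespace sketch
```

A caller would try `lookup` first and only call dl_iterate_phdr on a miss, passing the counter value it read under the lock to `remember`. The sketch deliberately ignores the window between the generation check and the use of the cached data, and it only helps when dlclose is rare, since every unload invalidates every thread's cache.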

Ian

