Mainline merge part 14 - profiler improvements

Tue May 7 14:34:00 GMT 2002

Hello.

> > Where? I only call them from libgcc2.c, which IMHO should be OK.
> 
> I was mistaken about the libcalls.
> 
> > I don't think putting additional instructions, one of them jump, to
> > every edge is a good idea concerning the performance.
> 
> It's not every edge.  Just the minimal spanning tree.
> 
> And what about the extra function call you wind up adding at
> the beginning of every function?  You're suggesting that that
> won't have a performance impact?

Yes, it has quite a drastic impact (about 4x the normal overhead on genattrtab).
This number would be much better in programs that spend most of their time in
one tight loop, and that is exactly where your approach will lose a lot of
performance: you must instrument at least one edge of every cycle, and by doing
so you add an inner loop to it, which reduces the effectiveness of other
optimizations.
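
To illustrate why every cycle ends up with at least one instrumented edge,
here is a minimal sketch of the spanning-tree selection. It is not the actual
profile.c code; the names (struct edge, pick_instrumented_edges) and the
fixed MAX_BLOCKS limit are made up for the example. Edges are treated as
undirected while the tree is built, and any edge that would close a cycle gets
a counter.

```c
#define MAX_BLOCKS 64   /* hypothetical limit, only for this sketch */

struct edge { int src, dest; };   /* one CFG edge between two basic blocks */

static int parent[MAX_BLOCKS];    /* union-find forest over basic blocks */

/* Find the representative of the tree containing block b (path halving). */
static int find_root(int b)
{
    while (parent[b] != b) {
        parent[b] = parent[parent[b]];
        b = parent[b];
    }
    return b;
}

/* Build a spanning tree greedily.  An edge whose endpoints are already
   connected would close a cycle, so it is left out of the tree and marked
   as needing a counter.  Returns the number of edges that need counters. */
static int pick_instrumented_edges(const struct edge *edges, int n_edges,
                                   int n_blocks, int *needs_counter)
{
    int count = 0;

    for (int b = 0; b < n_blocks; b++)
        parent[b] = b;

    for (int i = 0; i < n_edges; i++) {
        int a = find_root(edges[i].src);
        int c = find_root(edges[i].dest);

        if (a == c) {
            needs_counter[i] = 1;   /* closes a cycle: instrument it */
            count++;
        } else {
            parent[a] = c;          /* joins two trees: no counter needed */
            needs_counter[i] = 0;
        }
    }
    return count;
}
```

For a function with blocks 0..3 and edges 0->1, 1->2, 2->1, 2->3, the only edge
that gets a counter is the back edge 2->1, i.e. the one inside the loop. For a
connected graph the count is always edges - blocks + 1.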

> > I believe Honza defined it, but it is (number of edges - number of blocks) *
> > sizeof(counter). In my implementation it must be multiplied by number of
> > running threads.
> 
> No, I wanted a number like 10K per thread.  Or whatever.
> 
> What's the actual measured overhead of this on some real-life
> thread-using application?  Say, mozilla.

What is the typical number of basic blocks in such an application? Then
multiply it by 8.
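
As a rough sketch of the arithmetic, the formula quoted above can be written
down directly. The function name counter_bytes is invented for the example,
and reading the 8 as the size of one counter in bytes is an assumption.

```c
#include <stddef.h>

/* Memory taken by the counters, following the formula quoted above:
   (number of edges - number of blocks) * sizeof(counter) * threads.
   counter_size = 8 assumes a 64-bit counter.  */
static size_t counter_bytes(size_t n_edges, size_t n_blocks,
                            size_t counter_size, size_t n_threads)
{
    if (n_edges < n_blocks)
        return 0;   /* guard against unsigned wrap-around */
    return (n_edges - n_blocks) * counter_size * n_threads;
}
```

With purely hypothetical numbers, 150000 edges and 100000 blocks with 8-byte
counters come to 400000 bytes per thread, so 1600000 bytes for 4 threads.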

Zdenek