This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: gcov: option to instrument whole basic block graph, no use of spanning tree optimization
- From: Andi Kleen <andi at firstfloor dot org>
- To: Zdenek Dvorak <rakdver at kam dot mff dot cuni dot cz>
- Cc: Holger Blasum <hbl at sysgo dot com>, gcc-patches at gcc dot gnu dot org
- Date: Mon, 08 Nov 2010 12:17:30 +0100
- Subject: Re: gcov: option to instrument whole basic block graph, no use of spanning tree optimization
- References: <87oca0sppn.fsf@basil.nowhere.org> <20101108104552.GA26559@kam.mff.cuni.cz>
Zdenek Dvorak <rakdver@kam.mff.cuni.cz> writes:
>
> I had a patch implementing this a few years back:
>
> http://gcc.gnu.org/ml/gcc-patches/2002-01/msg00195.html
>
> which however did not make it to mainline (iirc, there were objections against
> the extra memory and runtime costs -- if you have a program with hundreds of thousands of
> basic blocks, you do not really want to use the corresponding megabytes of
> memory per thread for profiling, and spend time initializing them
> on the thread start and summing them on the thread end).
Doesn't seem like a big problem to me on modern systems.
I guess one could make it optional for the case of someone
running on a memory-constrained target.
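
For a rough sense of scale, here is a back-of-the-envelope sketch of the per-thread memory cost being discussed. The helper name is hypothetical (it is not a gcov function), and the block count and 8-byte counter size used below are assumptions for illustration, not figures from either patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes of counter storage one thread needs if every basic block gets
   its own counter. Hypothetical helper, not part of gcov. */
size_t profile_bytes_per_thread(size_t basic_blocks, size_t counter_size)
{
    /* Guard against overflow in the multiplication. */
    assert(counter_size == 0 || basic_blocks <= SIZE_MAX / counter_size);
    return basic_blocks * counter_size;
}
```

Assuming 300,000 basic blocks and 8-byte counters, this comes to 2,400,000 bytes, or roughly 2.3 MiB per thread, which also has to be zeroed at thread start and summed at thread end.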
> Probably a better approach
> (using atomic primitives for manipulating the profile) was proposed in
>
> http://gcc.gnu.org/ml/gcc-patches/2007-03/msg00950.html
>
> which also never got merged, it seems,
Atomics are not a better approach. They will never scale well and can
suffer catastrophic performance degradation on larger systems, because of
excessive communication overhead when lots of CPUs try to update
the same counters in parallel.
On many NUMA systems memory access is also unfair, which makes them even
worse.
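
To make the contrast concrete, here is a minimal C sketch of the two schemes under discussion. It is an illustration only, not code from either patch: all names are hypothetical, and a single counter stands in for the per-arc counter arrays that real instrumentation would use. Build with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 64

/* Variant A: one counter shared by all threads, updated atomically.
   Every increment needs exclusive ownership of the counter's cache line. */
static atomic_long shared_counter;

/* Variant B: each thread bumps a private counter and folds it into the
   global total once, when the thread finishes. */
static long merged_total;
static pthread_mutex_t merge_lock = PTHREAD_MUTEX_INITIALIZER;

static void *atomic_worker(void *arg)
{
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++)
        atomic_fetch_add_explicit(&shared_counter, 1, memory_order_relaxed);
    return NULL;
}

static void *private_worker(void *arg)
{
    long iters = *(long *)arg;
    long local = 0;                    /* private to this thread */
    for (long i = 0; i < iters; i++)
        local++;
    pthread_mutex_lock(&merge_lock);   /* summed once, at thread end */
    merged_total += local;
    pthread_mutex_unlock(&merge_lock);
    return NULL;
}

/* Start nthreads copies of fn, each counting iters events, and wait
   for all of them to finish. */
static void run_threads(void *(*fn)(void *), int nthreads, long iters)
{
    pthread_t tid[MAX_THREADS];
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, fn, &iters);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
}

long count_with_atomics(int nthreads, long iters)
{
    atomic_store(&shared_counter, 0);
    run_threads(atomic_worker, nthreads, iters);
    return atomic_load(&shared_counter);
}

long count_per_thread(int nthreads, long iters)
{
    merged_total = 0;
    run_threads(private_worker, nthreads, iters);
    return merged_total;
}
```

Both variants produce the same totals. The difference is in where the traffic goes: variant A touches shared memory on every increment, while variant B touches it once per thread, at the price of private storage for each thread.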
-Andi
--
ak@linux.intel.com -- Speaking for myself only.