This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Gprof can account for less than 1/3 of execution time?!?!
- From: Anthony Foiani <tkil at scrye dot com>
- To: Jon Turner <jon dot turner at wustl dot edu>
- Cc: Nathan Froyd <froydnj at codesourcery dot com>, "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>, Dave Korn <dave dot korn dot cygwin at googlemail dot com>
- Date: Wed, 24 Feb 2010 17:15:42 -0700
- Subject: Re: Gprof can account for less than 1/3 of execution time?!?!
- References: <4B82F5E8.2020805@wustl.edu> <20100222214702.GO18541@codesourcery.com> <4B830EC1.1060506@wustl.edu>
- Reply-to: Anthony Foiani <tkil at scrye dot com>
[Apologies for the somewhat-off-topic nature of this post, but I felt
it would still be of interest to others on this thread.]
>>>>> "Jon" == Jon Turner <jon.turner@wustl.edu> writes:
Jon> I was well aware of the overhead that gprof can introduce,
Jon> but did not recognize that this overhead was not being
Jon> counted by gprof.
Since you're already on Linux, have you looked at the profiling tools
that have been added to the kernel lately? ("perf" and friends?)
http://perf.wiki.kernel.org/index.php/Main_Page
It requires a recent kernel (preferably 2.6.32 or later). 2.6.31 has
it, but I ran into issues when trying to test it; those issues are
fixed by 2.6.32.8 at the latest.
The nice thing is that it can show shared library and kernel time as
well as your own code. I set CXXFLAGS to "-g --std=c++0x" and
rebuilt your "maxFlo" executable, then captured statistics with:
    perf record -f ./maxFlo
And generated a report with:
    perf report -C maxFlo
========================================================================
# comm: maxFlo
# Samples: 65087
#
# Overhead  Shared Object                   Symbol
# ........  ..............................  ......
#
    15.21%  ./maxFlo                        [.] shortPath::findPath()
    14.98%  ./maxFlo                        [.] flograph::res(int, int) const
    14.14%  ./maxFlo                        [.] graph::next(int, int) const
    13.94%  ./maxFlo                        [.] graph::mate(int, int) const
     7.06%  ./maxFlo                        [.] graph::term(int) const
     5.92%  ./maxFlo                        [.] graph::n() const
     5.54%  ./maxFlo                        [.] graph::left(int) const
     5.35%  ./maxFlo                        [.] digraph::tail(int) const
     3.44%  ./maxFlo                        [.] list::operator<<=(int)
     3.30%  ./maxFlo                        [.] graph::m() const
     2.12%  ./maxFlo                        [.] list::operator&=(int)
     1.45%  ./maxFlo                        [.] graph::first(int) const
     1.18%  ./maxFlo                        [.] flograph::addFlow(int, int, int)
     1.09%  ./maxFlo                        [.] flograph::src() const
     1.03%  ./maxFlo                        [.] list::makeSpace()
     0.95%  ./maxFlo                        [.] augPath::augment()
     0.92%  ./maxFlo                        [.] flograph::snk() const
     0.62%  ./maxFlo                        [.] list::operator[](int) const
     0.48%  ./maxFlo                        [.] list::empty() const
     0.26%  ./maxFlo                        [.] min(int, int)
     0.25%  [kernel]                        [k] __do_softirq
     0.13%  [kernel]                        [k] _spin_unlock_irqrestore
     0.11%  /lib64/libc-2.11.90.so          [.] _int_free
     0.08%  /lib64/libc-2.11.90.so          [.] _int_malloc
     0.05%  /lib64/libc-2.11.90.so          [.] __GI___libc_malloc
     0.04%  [kernel]                        [k] kvm_mmu_op
     0.04%  /lib64/libc-2.11.90.so          [.] __cfree
     0.03%  /usr/lib64/libstdc++.so.6.0.13  [.] operator delete[](void*)
     0.03%  /usr/lib64/libstdc++.so.6.0.13  [.] operator delete(void*)
[...]
========================================================================
You can see that most of the work was in your program, but we also saw
the kernel stepping in occasionally (even the hypervisor once, with
that "kvm_" call).
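For the record, the build-and-profile steps above can be sketched as
one small script. This is only a sketch, not something from the
original mail: it assumes the program builds from a single source file
(here called "maxFlo.cpp", a name I'm inventing for illustration) and
that a "perf" new enough to accept these options is installed; each
step is guarded so it degrades gracefully on machines where those
assumptions don't hold.

```shell
# Sketch of the workflow from this mail; assumptions are noted above.
status="ok"

if [ -f maxFlo.cpp ] && command -v g++ >/dev/null 2>&1; then
    # -g keeps debug info so perf can resolve your function names.
    g++ -g --std=c++0x -o maxFlo maxFlo.cpp || status="build failed"
else
    status="skipped: no maxFlo.cpp or g++ on this machine"
fi

if [ "$status" = "ok" ] && command -v perf >/dev/null 2>&1; then
    perf record -f ./maxFlo    # sample the run; writes perf.data
    perf report -C maxFlo      # report only samples from the maxFlo command
else
    echo "profiling step skipped ($status)"
fi
```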
Anyway. One more tool.
Happy hacking,
t.