Hi,
I've recently done some profiling and analysis of gcc and g++, using
the oprofile hardware-assisted profiling module in Linux 2.5. I
collected stats on a variety of versions of gcc (some only of
historical interest), along with several branches and workloads, and
characterized some aspects of the cache and branch-mispredict
penalties in addition to cycle hotspots.
The results of the work are posted at
http://people.redhat.com/graydon/gcc-optimizing/
and http://people.redhat.com/graydon/g++-report/
Any comments or questions are welcome (including instructions and
hand-holding on how to set up the profiler for your own runs, if you
want to do more of this stuff).