This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: gcc 3.1 is still very slow, compared to 2.95.3
- From: Richard Earnshaw <rearnsha at arm dot com>
- To: Jan Hubicka <jh at suse dot cz>
- Cc: Marc Espie <espie at nerim dot net>, gcc at gcc dot gnu dot org, Richard dot Earnshaw at arm dot com
- Date: Sat, 18 May 2002 13:12:23 +0100
- Subject: Re: gcc 3.1 is still very slow, compared to 2.95.3
- Organization: ARM Ltd.
- Reply-to: Richard dot Earnshaw at arm dot com
> I am sorry to say that according to the profiles, there is no single
> place in GCC where we burn most of the CPU cycles. The slowdown is the cumulative
> result of many patches, and it is clear that compile-time performance has not
> been treated seriously during GCC development (3.0 had a number of other problems
> that were addressed). I personally will care more about compile-time performance
> in the next development cycle and hope we will set up some periodic tester to check this
> (this has proved to be effective for runtime performance, where 3.1 is very well off).
I can't prove any of the following, but it seemed to me that the major
slowdown in the compiler was when we switched from obstacks to ggc. On a
machine I regularly use to bootstrap the compiler (StrongARM 110, which
has 2x16K caches) bootstrap times went from ~3 hours to ~6 hours at around
the time of that change; despite tuning the GC code, they've never
recovered, since we've added additional languages etc as well.
My suspicions are:
Memory use efficiency: I suspect we have many partially used pages. Since
each page of memory is only used for objects of a single size, we end up
with many pages with just a few items in them; in particular persistent
objects can now be scattered anywhere across that memory, rather than
being gathered in a single block. We now have to allocate far more pages
for a small compilation than we did before; I rarely see compilation of a
C file requiring less than 8M now, it used to be around 3-4M for a typical
file in GCC. To make matters worse we regularly touch most of those pages
rather than just a subset of them, which means the OS can't usefully page
any of them out.
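
A rough way to see the page-count effect is to compare packing objects back to back (as obstacks roughly did) with giving every size class its own pages. The sketch below is only an illustration: the 4096-byte page, the power-of-two size classes, and the function names are assumptions made for the example, not a description of what ggc actually does.

```python
import math

PAGE_SIZE = 4096  # assumed page size in bytes


def pages_bump(sizes):
    """Pages needed when objects are packed back to back (obstack-like)."""
    return math.ceil(sum(sizes) / PAGE_SIZE)


def pages_size_segregated(sizes):
    """Pages needed when each page only holds objects of one size class.

    Size classes are assumed to be powers of two, purely for illustration.
    """
    counts = {}
    for s in sizes:
        cls = 1 << (s - 1).bit_length()  # round the size up to a power of two
        counts[cls] = counts.get(cls, 0) + 1
    # Each size class rounds up to whole pages independently.
    return sum(math.ceil(n * cls / PAGE_SIZE) for cls, n in counts.items())


# A made-up mix of small and large objects.
sizes = [24] * 10 + [40] * 5 + [100] * 3 + [600] * 2 + [3000]
print(pages_bump(sizes), pages_size_segregated(sizes))
```

With this made-up mix, the packed layout needs 2 pages and the size-segregated layout needs 5, most of them nearly empty. Real numbers depend entirely on the object mix and the allocator's real size classes.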
Cache locality: since things are scattered all over the place we need a
far larger cache to achieve the same hit rate -- gcc has never been
particularly kind to caches, but I suspect it is worse now.
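
The same kind of back-of-envelope count works for the cache argument: count how many distinct cache lines a set of live objects touches when they are adjacent and when they are spread out. This is a sketch under stated assumptions: the 32-byte line size, the 16-byte objects, and the "one live object per page" layout are chosen for illustration (the latter is a deliberately extreme case), and the model ignores associativity and replacement.

```python
LINE_SIZE = 32          # assumed cache line size in bytes
CACHE_SIZE = 16 * 1024  # one 16K cache, as on the machine described above


def lines_touched(addresses, obj_size):
    """Count distinct cache lines covered by objects at the given addresses."""
    lines = set()
    for addr in addresses:
        first = addr // LINE_SIZE
        last = (addr + obj_size - 1) // LINE_SIZE
        lines.update(range(first, last + 1))
    return len(lines)


# 1000 hypothetical 16-byte objects, packed versus one per 4096-byte page.
packed = [i * 16 for i in range(1000)]
scattered = [i * 4096 for i in range(1000)]

for name, addrs in (("packed", packed), ("scattered", scattered)):
    n = lines_touched(addrs, 16)
    print(name, n, "lines,", n * LINE_SIZE, "bytes, fits:", n * LINE_SIZE <= CACHE_SIZE)
```

In this toy case the packed objects share lines and fit in 16K, while the scattered ones each occupy a line of their own and need roughly twice the cache for the same data.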
Neither of these is going to show up on the standard sampling statistics
that we measure since the cost is distributed across the entire
compilation.
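
To illustrate why a flat profile can hide this: if every part of the compiler slows down by about the same factor, each function's share of the samples stays the same even though total time doubles. The function names and timings below are invented for the example.

```python
def profile_shares(times):
    """Fraction of total time per function, which is what a sampling profile reports."""
    total = sum(times.values())
    return {name: t / total for name, t in times.items()}


# Hypothetical per-pass times in seconds (made-up numbers).
before = {"parse": 2.0, "optimize": 5.0, "emit": 3.0}
# Assume a uniform 2x slowdown spread across everything.
after = {name: t * 2 for name, t in before.items()}

print(profile_shares(before))
print(profile_shares(after))
print(sum(before.values()), sum(after.values()))
```

Both profiles show the same percentages; only the total wall-clock time reveals the slowdown.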
I used to be able to bootstrap on the machine I have with -j2 to get about
a 5% reduction in compile time. The growth in memory usage of the
compiler means that now I get about a 20% increase in compile time because
the machine just thrashes at times.
R.