This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: PATCH: [gcc3.5 improvement branch] Very Simple constant propagation

From: Geoff Keating <geoffk at apple dot com>
To: Roger Sayle <roger at eyesopen dot com>
Cc: Caroline Tice <ctice at apple dot com>, gcc-patches at gcc dot gnu dot org
Date: Mon, 2 Feb 2004 15:02:04 -0800
Subject: Re: PATCH: [gcc3.5 improvement branch] Very Simple constant propagation
References: <Pine.LNX.4.44.0402021220040.15940-100000@www.eyesopen.com>

On Feb 2, 2004, at 1:03 PM, Roger Sayle wrote:

Hi Caroline,
However, I have discovered that there seems to be a major difference in compile-time performance between them. All of these measurements were taken using the "time" command, and done at least three times. I list here the compile times I measured for the SPECInt 2000 benchmarks. I compiled each benchmark directly, using "-O3 -ffast-math -save-temps", and timed the compilation with the "time" command (i.e. I did NOT use the SPEC harness to do the compilation). In addition, I passed the "-fss-const-prop" flag to mine to turn on my version (yours is on by default).

Benchmark       My Patch       Your Patch
gzip                6.65            12.74
vpr                20.83            36.98
gcc               244.13           399.41
mcf                 1.75             3.28
crafty             31.95            61.89
parser             15.61            31.29
All compile times are in seconds, averaged over three runs.    As you
can see, in every case your version takes significantly longer than my
version.
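
To put the table above in relative terms, here is a small sketch that turns each pair of timings into a percentage slowdown. The figures are copied from the table; the names `times` and `slowdown_percent` are made up for this illustration. By this arithmetic the slowdowns range from roughly 64% (gcc) to roughly 100% (parser).

```python
# Compile times in seconds, copied from the table above:
# (first column "My Patch", second column "Your Patch").
times = {
    "gzip": (6.65, 12.74),
    "vpr": (20.83, 36.98),
    "gcc": (244.13, 399.41),
    "mcf": (1.75, 3.28),
    "crafty": (31.95, 61.89),
    "parser": (15.61, 31.29),
}


def slowdown_percent(base, other):
    """Return the percentage by which `other` exceeds `base`."""
    return (other - base) / base * 100.0


slowdowns = {name: slowdown_percent(a, b) for name, (a, b) in times.items()}

for name, pct in slowdowns.items():
    print(f"{name:<8} {pct:6.1f}%")
```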
Could you double check these results? I've just timed 3-stage bootstraps of GCC, all languages except treelang, including building the libraries, and I see no appreciable difference in compilation times.

To match Caroline's results, you should use --enable-intermodule and -O3, not just a regular bootstrap.

The concern is compile-time performance on large-to-very-large functions, like those created by intermodule inlining in SPEC.

Without my patch:
real    58m45.660s
user    42m24.880s
sys     8m49.510s

With it:
real    56m39.610s
user    42m50.360s
sys     8m39.570s
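
The two sets of bootstrap timings above can be compared with a few lines of arithmetic. This is only a sketch: the helper names `to_seconds` and `percent_change` are invented here, and the values are copied verbatim from the "time" output quoted above.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style value such as '42m24.880s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Figures copied from the two bootstrap runs quoted above.
without_patch = {"real": "58m45.660s", "user": "42m24.880s", "sys": "8m49.510s"}
with_patch = {"real": "56m39.610s", "user": "42m50.360s", "sys": "8m39.570s"}


def percent_change(key):
    """Percentage change of one timing from the unpatched to the patched run."""
    before = to_seconds(without_patch[key])
    after = to_seconds(with_patch[key])
    return (after - before) / before * 100.0


for key in ("real", "user", "sys"):
    print(f"{key:<5} {percent_change(key):+.2f}%")
```

Only the user time goes up, by about 1%; the real and sys times are lower with the patch.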

Even the 26 second increase in user-time, which I believe is in the
noise, accounts for only a 1.02% increase in compile-time.  And I
think some of this could be addressed using rtx_equal_p to ignore
insns whose REG_EQUAL note is the same as the SET_SRC.

rth suggested to me privately that it might be a good idea to also ignore REG_EQUAL notes that only specify that a register is equal to some constant, since combine already handles this.
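
As a rough illustration of the two filters discussed above (skip a note that merely repeats the SET_SRC, and skip a note that only says the register equals a constant), here is a toy model. It does not use GCC's RTL data structures or its rtx_equal_p; the `Insn` class, the tuple encoding of expressions, and `note_worth_trying` are all hypothetical stand-ins invented for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# Toy expression encoding (an assumption of this sketch, not GCC's RTL):
# an int is a constant, a str is a register name, a tuple is (op, operand, ...).
Expr = Union[int, str, Tuple]


@dataclass
class Insn:
    dest: str                        # register being set
    set_src: Expr                    # stands in for SET_SRC
    reg_equal: Optional[Expr] = None  # stands in for a REG_EQUAL note, if any


def note_worth_trying(insn, skip_constants=True):
    """Decide whether the extra substitution attempt is worth making."""
    note = insn.reg_equal
    if note is None:
        return False                 # nothing to substitute
    if note == insn.set_src:
        return False                 # note adds nothing beyond the SET_SRC
    if skip_constants and isinstance(note, int):
        return False                 # plain constant: left to existing handling
    return True


# Usage: a note identical to the source is skipped, a different one is kept.
same = Insn("r1", ("plus", "r2", 4), ("plus", "r2", 4))
different = Insn("r1", ("mult", "r2", 2), ("ashift", "r2", 1))
print(note_worth_trying(same), note_worth_trying(different))
```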

But this does nothing to explain the 100+% slowdowns you're seeing
with your benchmarks.

Combine has multiple passes, firstly an O(N) inspection of earlier
instructions to combine two instructions together, and if that doesn't
help two O(N^2) passes, to combine three instructions, the current one
and two earlier ones.  My patch adds an additional O(N) pass to this
already O(N^2) algorithm, where N here is the number of operands in an
instruction, typically two or three.  HAVE_cc0 targets, for example,
have two additional O(N) passes in combine in addition to those on
non-HAVE_cc0 targets.

This factor may make a difference; powerpc, which is what Caroline used, is not a CC0 target, and it's quite possible that combine is a significant fraction of compile time (the usual optimizer suspects are combine and gcse).

--
Geoff Keating <geoffk@apple.com>
