This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PR 23551: why should we coalesce inlined variables?


On 6/27/07, Alexandre Oliva <aoliva@redhat.com> wrote:
Sorry that it took me so long to get back to this.

On Jun 1, 2007, "Andrew Pinski" <pinskia@gmail.com> wrote:

> On 6/1/07, Alexandre Oliva <aoliva@redhat.com> wrote:
>> I didn't.  Today I set some time aside to try to get SPEC2000 to run
>> on an x86_64 box with Fedora 7.  I still haven't got all of the
>> benchmarks to compile and run successfully with the GCC trunk, but the
>> results I got for the patch are promising:

>> Again, the left column is the run-time WITH the patch, the right
>> column is the run-time WITHOUT the patch.  That's right, removing the
>> patch actually slows things down.  I couldn't quite believe it, after
>> what you said, so I triple-checked.

> Oh I can believe it for x86_64 :)  You might want to try on more than
> x86_64, the register allocator might not be doing a good job for
> x86_64.  Without the patch is causing more register pressure than with
> the patch.  Try either on ia64 or PPC where you have more registers.
> In that case with the patch might slow down the runtime.

Looks like variations for worse are mostly in the noise, and there are
some variations for better that look consistent.  Here's what I got on
ppc32 SPEC2K with -O3 -fomit-frame-pointer.  Left column is pristine,
right column is patched to avoid coalescing of inlined variables:

164_gzip.000.reported_time: 201.922768 201.103212
164_gzip.001.reported_time: 198.960087 198.624947
164_gzip.002.reported_time: 199.073588 198.71297
175_vpr.000.reported_time: 298.687033 297.300598
175_vpr.001.reported_time: 297.122948 297.58685
175_vpr.002.reported_time: 295.630361 297.723687
176_gcc.000.reported_time: 123.435803 126.320587
176_gcc.001.reported_time: 122.343189 124.88423
176_gcc.002.reported_time: 122.800357 123.864006
181_mcf.000.reported_time: 361.04524 360.337646
181_mcf.001.reported_time: 359.981651 360.322147
181_mcf.002.reported_time: 359.908608 360.219215
186_crafty.000.reported_time: 121.013495 117.454093
186_crafty.001.reported_time: 117.082273 117.642486
186_crafty.002.reported_time: 117.416734 117.723788
197_parser.000.reported_time: 281.625919 279.478479
197_parser.001.reported_time: 281.570151 279.82866
197_parser.002.reported_time: 281.804008 279.595665
253_perlbmk.000.reported_time: 287.251103 286.796575
253_perlbmk.001.reported_time: 286.697103 287.089939
253_perlbmk.002.reported_time: 286.286923 286.806686
254_gap.000.reported_time: 159.230154 153.683678
254_gap.001.reported_time: 157.404963 152.992977
254_gap.002.reported_time: 158.465634 150.772173
256_bzip2.000.reported_time: 257.393719 256.039831
256_bzip2.001.reported_time: 256.692201 254.992797
256_bzip2.002.reported_time: 256.105407 255.93728

177_mesa.000.reported_time: 163.103962 163.715843
177_mesa.001.reported_time: 162.948224 163.274986
177_mesa.002.reported_time: 162.827828 163.231781
179_art.000.reported_time: 433.203321 428.189829
179_art.001.reported_time: 434.796632 433.193104
179_art.002.reported_time: 433.542143 431.804417
183_equake.000.reported_time: 124.691579 124.512015
183_equake.001.reported_time: 124.418219 124.450906
183_equake.002.reported_time: 124.455124 124.621132
188_ammp.000.reported_time: 595.206248 595.92599
188_ammp.001.reported_time: 596.071531 595.285807
188_ammp.002.reported_time: 595.5348 595.350713

vortex output was wrong for both builds, so I cut it out from the
report above.  A few other testcases failed to compile and are not
reported; most (all?) of them were just missing an f77 compiler (I
didn't even enable fortran in the tools I built).

Is this enough evidence that the patch is not harmful to run-time
performance, and that it may actually help debugging?

http://gcc.gnu.org/ml/gcc-patches/2007-05/msg00703.html

Can you show me one example of code where this patch helps (e.g. in gap), and explain why it helps run-time performance?

The performance difference could simply be due to
a small number of inlining decisions on critical paths.
And the amount of performance variation suggests
that this might simply be *luck*
(e.g. gap is known to be a bit volatile in performance).
Also, the slowdown of gcc worries me more,
as it has one of the flattest profiles in the SPEC int suite
and presumably has more automatic inlining happening.
Alas, it's unfortunate that we don't have data for 252.eon.
Most of the fp benchmarks are not interesting (except fma3d and sixtrack)
since they spend most of their time in their hot loops.
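
One rough way to judge whether a difference is more than luck is to
compare, per benchmark, the change in mean run time against the
run-to-run spread of each build.  The sketch below does that in Python
for three of the benchmarks, using the times copied from the report
above.  The summarize helper and the use of (max - min) / mean as a
noise estimate are illustrative assumptions, not anything used in this
thread, and three runs per build is too few for a real significance test.

```python
from statistics import mean

# Run times in seconds copied from the report above,
# as (pristine, patched) pairs for runs 000, 001 and 002.
runs = {
    "176_gcc": [(123.435803, 126.320587),
                (122.343189, 124.88423),
                (122.800357, 123.864006)],
    "186_crafty": [(121.013495, 117.454093),
                   (117.082273, 117.642486),
                   (117.416734, 117.723788)],
    "254_gap": [(159.230154, 153.683678),
                (157.404963, 152.992977),
                (158.465634, 150.772173)],
}


def spread_pct(times):
    """Crude noise estimate: (max - min) as a percentage of the mean."""
    return 100.0 * (max(times) - min(times)) / mean(times)


def summarize(pairs):
    """Return (delta_pct, pristine_spread_pct, patched_spread_pct).

    delta_pct is the change in mean run time relative to the pristine
    build; a positive value means the patched build is slower.
    """
    pristine = [p for p, _ in pairs]
    patched = [q for _, q in pairs]
    delta_pct = 100.0 * (mean(patched) - mean(pristine)) / mean(pristine)
    return delta_pct, spread_pct(pristine), spread_pct(patched)


for name, pairs in runs.items():
    delta, noise_pristine, noise_patched = summarize(pairs)
    print(f"{name}: delta {delta:+.2f}%  "
          f"spread pristine {noise_pristine:.2f}%  "
          f"patched {noise_patched:.2f}%")
```

As a rule of thumb, a delta that is smaller than the spread of either
build is hard to tell apart from noise; more runs per build would be
needed before drawing a firmer conclusion.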

So while your data is still very useful and interesting,
it doesn't give nearly enough information to determine
whether this is the right thing to do in general,
especially when in theory this isn't "The Right Thing To Do".

This is not to say I object to your patch.  Rather, I'm more flabbergasted
than anything that disabling coalescing at gimple has such a low impact.
I have to wonder if there are other passes that coalesce names later
(or if RTL level coalescing/register assignment makes the lack of tree level
coalescing less of a problem).
--
#pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";

