[patch] enabling vectorization by default at -O3
H.J. Lu
hjl@lucon.org
Thu Sep 6 14:48:00 GMT 2007
On Thu, Sep 06, 2007 at 09:23:31AM -0400, Daniel Berlin wrote:
> On 9/6/07, H.J. Lu <hjl@lucon.org> wrote:
> > On Thu, Sep 06, 2007 at 01:49:52PM +0200, Uros Bizjak wrote:
> > > >
> > > > * Hmm, why is -ffast-math slower? And with vectorization that much
> >
> > Also see
> >
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32183
> >
> > With -O2 -ffast-math, we turn a faster loop:
> >
> > float sf;
> > ...
> > sf = 500 * sf;
> > for (i = 0; i < ceplen; i++)
> >   sum[i] *= sf;
> >
> > into a slower loop:
> >
> > for (i = 0; i < ceplen; i++)
> >   sum[i] = (sum[i] * 500) * sf;
> >
> > > > slower? I rechecked induct (V.F, NV.F) and could reproduce the timings.
> > > >
> > >
> > > > that is indeed interesting (I'd be happy to look at a testcase)
> > >
> > > This is PR 32084, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32084
> > >
>
> I still don't remember why we have reassoc2. I'm in favor of removing
> it unless someone can show it's producing performance improvements :)
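
For reference, the loop rewrite quoted above from PR 32183 can be sketched
in C roughly as follows. This is only an illustration: the function names
scale_hoisted and scale_reassociated, the explicit parameters, and the use
of size_t are assumptions made here, not code from the PR. Because float
multiplication is not associative, a compiler may only move between these
two forms under flags such as -ffast-math.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the form described above as faster.
 * The constant is folded into sf once, before the loop. */
void scale_hoisted(float *sum, size_t ceplen, float sf)
{
    assert(sum != NULL || ceplen == 0);
    sf = 500 * sf;                       /* one multiply, outside the loop */
    for (size_t i = 0; i < ceplen; i++)
        sum[i] *= sf;                    /* one multiply per element */
}

/* Hypothetical helper: the form described above as slower.
 * The constant multiply is repeated for every element. */
void scale_reassociated(float *sum, size_t ceplen, float sf)
{
    assert(sum != NULL || ceplen == 0);
    for (size_t i = 0; i < ceplen; i++)
        sum[i] = (sum[i] * 500) * sf;    /* two multiplies per element */
}
```

The two functions agree exactly only when no rounding occurs in either
order of evaluation; in general their results may differ in the last bits.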
Here are the SPEC CPU 2006 -O2 -ffast-math differences on Intel64 between
revision 125281 without the second reassoc pass (reassoc2) and unmodified
revision 125281, computed as
(r125281 w/o reassoc2 - r125281)/r125281:

400.perlbench              0.492611%
401.bzip2                  0.613497%
403.gcc                           0%
429.mcf                           0%
445.gobmk                         0%
456.hmmer                 -0.787402%
458.sjeng                   1.14286%
462.libquantum                    0%
464.h264ref                       0%
471.omnetpp               -0.869565%
473.astar                         0%
483.xalancbmk              0.105374%
Est. SPECint(R)_base2006          0%

410.bwaves                   1.8018%
416.gamess                 -1.14286%
433.milc                  -0.840336%
434.zeusmp                 0.657895%
435.gromacs                -1.73348%
436.cactusADM             -0.952381%
437.leslie3d               -0.21692%
444.namd                          0%
447.dealII                 -2.33463%
450.soplex                        0%
453.povray                        0%
454.calculix               -26.3852%
459.GemsFDTD                      0%
465.tonto                  0.704225%
470.lbm                           0%
481.wrf                      8.9613%
482.sphinx3                       0%
Est. SPECfp(R)_base2006    -1.51515%

That is, without reassoc2, the SPEC CPU 2006 FP score with -O2 -ffast-math
is down by about 1.5% on Core 2 Duo.
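
The percentages above follow the formula
(r125281 w/o reassoc2 - r125281)/r125281. A minimal C sketch of that
calculation is below. The helper name relative_diff_percent is made up for
illustration, and it assumes the inputs are scores where higher is better,
so a negative result means the build without reassoc2 did worse.

```c
#include <assert.h>

/* Hypothetical helper: relative difference of a modified build's score
 * against the baseline score, expressed in percent. */
double relative_diff_percent(double without_reassoc2, double baseline)
{
    assert(baseline != 0.0);   /* the baseline is the divisor */
    return (without_reassoc2 - baseline) / baseline * 100.0;
}
```

For example, with a made-up baseline score of 100 and a made-up modified
score of 75, relative_diff_percent(75.0, 100.0) gives -25.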
H.J.