This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [patch] enabling vectorization by default at -O3
On 9/6/07, H.J. Lu <hjl@lucon.org> wrote:
> On Thu, Sep 06, 2007 at 09:23:31AM -0400, Daniel Berlin wrote:
> > On 9/6/07, H.J. Lu <hjl@lucon.org> wrote:
> > > On Thu, Sep 06, 2007 at 01:49:52PM +0200, Uros Bizjak wrote:
> > > > >
> > > > > * Hmm, why is -ffast-math slower? And with vectorization that much
> > >
> > > Also see
> > >
> > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32183
> > >
> > > With -O2 -ffast-math, we turn a faster loop:
> > >
> > > float sf;
> > > ...
> > > sf = 500 * sf;
> > > for (i = 0; i < ceplen; i++)
> > >   sum[i] *= sf;
> > >
> > > into a slower loop:
> > >
> > > for (i = 0; i < ceplen; i++)
> > >   sum[i] = (sum[i] * 500) * sf;
> > >
> > > > > slower? I recheck induct (V.F, NV.F) and I could reproduce the timings.
> > > > >
> > > >
> > > > > that is indeed interesting (I'd be happy to look at a testcase)
> > > >
> > > > This is PR 32084, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32084
> > > >
> >
> > I still don't remember why we have reassoc2. I'm in favor of removing
> > it unless someone can show it's producing performance improvements :)
>
> I got
>
> Here are SPEC CPU 2006 -O2 -ffast-math differences between revision
> 125281 without the second reassoc and revision 125281 on Intel64:
Okay, then I guess we should fix it. I think we should just use
Zdenek's patch for now, and if anyone complains about lack of
reassociation across loop boundaries, we fix that then.
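
For reference, a minimal sketch of the two loop shapes quoted above from
PR 32183, written as standalone C functions. The function names and the
size_t index are assumptions made for illustration; only the loop bodies
come from the quoted text. The first form does one multiply per element
because 500 * sf is computed once outside the loop. The second form does
two multiplies per element. The two are not equivalent under strict
floating-point rounding in general, which is why the rewrite is only
permitted with -ffast-math.

```c
#include <stddef.h>

/* Hypothetical helper names; loop bodies follow the quoted PR 32183 text. */

/* Faster form: the constant multiply is hoisted out of the loop,
   so each iteration performs a single multiply. */
void scale_hoisted(float *sum, size_t ceplen, float sf)
{
    size_t i;

    sf = 500 * sf;
    for (i = 0; i < ceplen; i++)
        sum[i] *= sf;
}

/* Slower form: the shape the loop ends up in after reassociation,
   with two multiplies per iteration. */
void scale_reassociated(float *sum, size_t ceplen, float sf)
{
    size_t i;

    for (i = 0; i < ceplen; i++)
        sum[i] = (sum[i] * 500) * sf;
}
```

Timing both functions over a large array at -O2, with and without
-ffast-math, is one way to check whether a given compiler revision still
shows the regression.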