This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] Split -mrecip
- From: Michael Matz <matz at suse dot de>
- To: Uros Bizjak <ubizjak at gmail dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Sat, 3 Sep 2011 17:42:11 +0200 (CEST)
- Subject: Re: [RFC] Split -mrecip
- References: <Pine.LNX.4.64.1108311805470.25354@wotan.suse.de> <CAFULd4YxePQXnBnojpzdXJxb7seeOO6VCc_QOfdHFHVugGb6bA@mail.gmail.com>
Hi,
On Sat, 3 Sep 2011, Uros Bizjak wrote:
> > I've decided to not use four new bits from target_flags, and instead
> > created a new mask (recip_mask). Four bits would have fit in target
> > bits right now, but in the future we might want to add more
> > specialization, like modes for which the reciprocals are active.
> >
> > What do you think?
>
> These new flags look like a nice addition, but I wonder why we need
> separate options to handle vector recip. A vector rsqrt or rdiv is
> generated automatically in the same way as scalar rsqrt or rdiv is
> generated, so IMO, -mrecip-sqrt and -mrecip-div should be enough.
No, the difference does matter. Using reciprocal estimates for scalar
divs often results in errors in benchmarks, because the results are
sometimes used to feed integer conversions for either index calculations
or printouts. The small rounding errors of the reciprocals then lead to
incorrect outputs. Contexts where the div can be vectorized often
don't have this problem (the results are then used purely for calculations
over arrays of float data). For instance, spec2006 and polyhedron break
with -mrecip purely because of the scalar reciprocals, but work with only
the vectorized ones. I.e. users really want to distinguish between the two.
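
To illustrate the integer-conversion problem, here is a minimal C sketch.
It does not use the real hardware estimate: the helper one_ulp_below() is
a hypothetical stand-in that makes the reciprocal wrong by one unit in the
last place, which is an assumption chosen only to keep the example
deterministic. The function names are likewise made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the next representable float below a
   positive, finite x.  It stands in for a reciprocal estimate that is
   slightly too small; it is NOT what the hardware instruction computes. */
static float one_ulp_below(float x)
{
    uint32_t bits;
    assert(x > 0.0f);
    memcpy(&bits, &x, sizeof bits);
    bits -= 1;                      /* step one ulp toward zero */
    memcpy(&x, &bits, sizeof bits);
    return x;
}

/* Exact IEEE division, then truncation to an integer index. */
static int index_exact(float a, float b)
{
    return (int)(a / b);
}

/* Division replaced by multiplication with a slightly-off reciprocal,
   then the same truncation. */
static int index_recip(float a, float b)
{
    float r = one_ulp_below(1.0f / b);
    return (int)(a * r);
}
```

With a = 6.0f and b = 3.0f, index_exact() yields 2, while index_recip()
yields 1, because the product lands just below 2.0 and the conversion
truncates. In a vectorized loop over float data the same tiny error would
normally stay a tiny error instead of becoming an off-by-one index.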
Also, when this patch goes in I plan to submit another one that activates
vectorized rcp/rsqrt under -ffast-math already (that's what ICC happens to
do too).
> For the future - could rs6000 and x86 use the same compile options to
> handle reciprocals?
I'd guess so. rs6000 uses a hand-written comma-splitter, which we could
reuse.
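
As a rough idea of what such a comma-splitter feeding a mask like
recip_mask could look like, here is a sketch in C. It is not the rs6000
code and not the code from the patch: the bit layout, the sub-option names
and the function parse_recip_list() are all assumptions made up for
illustration.

```c
#include <string.h>

/* Hypothetical bit assignments; the real layout is not shown in this
   thread. */
enum {
    RECIP_DIV      = 1 << 0,
    RECIP_SQRT     = 1 << 1,
    RECIP_VEC_DIV  = 1 << 2,
    RECIP_VEC_SQRT = 1 << 3
};

/* Hypothetical sub-option names, chosen only for this sketch. */
static const struct { const char *name; unsigned mask; } recip_names[] = {
    { "div",      RECIP_DIV },
    { "sqrt",     RECIP_SQRT },
    { "vec-div",  RECIP_VEC_DIV },
    { "vec-sqrt", RECIP_VEC_SQRT }
};

/* Splits ARG at commas and ORs the bit of each name into *MASK_OUT.
   Returns 0 on success, -1 on an unknown or empty name (in which case
   *MASK_OUT is left untouched). */
static int parse_recip_list(const char *arg, unsigned *mask_out)
{
    const size_t n = sizeof recip_names / sizeof recip_names[0];
    unsigned mask = 0;
    const char *p = arg;

    for (;;) {
        size_t len = strcspn(p, ",");   /* length of the current item */
        size_t i;

        for (i = 0; i < n; i++)
            if (strlen(recip_names[i].name) == len
                && memcmp(recip_names[i].name, p, len) == 0)
                break;
        if (i == n)
            return -1;                  /* no table entry matched */
        mask |= recip_names[i].mask;

        if (p[len] == '\0')
            break;                      /* last item consumed */
        p += len + 1;                   /* skip past the comma */
    }
    *mask_out = mask;
    return 0;
}
```

A table-driven splitter like this keeps the parsing independent of the
target, so adding further specializations later (such as per-mode bits)
would only mean adding table rows and mask bits.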
Ciao,
Michael.