This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: eliminate more float extensions
- From: Jan Hubicka <jh at suse dot cz>
- To: Segher Boessenkool <segher at koffie dot nl>
- Cc: Jan Hubicka <jh at suse dot cz>, gcc-patches at gcc dot gnu dot org, rth at cygnus dot com
- Date: Tue, 14 Jan 2003 14:09:16 +0100
- Subject: Re: eliminate more float extensions
- References: <20030111135229.GB26621@kam.mff.cuni.cz> <3E2185A6.A46DFE2E@koffie.nl>
> Jan Hubicka wrote:
> >
> > Hi,
> > I've been looking into the causes of cvtss2sd (which is slow) in the Mesa
> > sources, and it looks like, after installing the floor conversion patches on
> > the hammer branch, most of these come from testcases like this:
> >
> > /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
> > /* { dg-options "-O2 -msse2 -march=athlon" } */
> > /* { dg-final { scan-assembler-not "cvtss2sd" } } */
> > float a,b;
> > main()
> > {
> > a=b*3.0;
> > }
> >
> > We can optimize the cvtss2sd away. I am doing this by teaching strip_float_extensions
> > to find the narrowest mode holding the FP constant, so the rest of the code will
> > choose the narrowest mode holding the operands and result of the operation.
>
> Does this always work correctly? Consider:
>
> float aiee(float a, float b)
> {
> return a * 3.0 * b;
> }
>
> int main()
> {
> aiee(2e+38f, 0.1f);
> }
>
> This will overflow if the (a * 3.0) is calculated in single precision.
>
> People who care about optimization at this low level should know to
> write 3.0f when they want it, imho.
The optimization won't happen in this case, as a * 3.0 will not be cast back to
single precision immediately.
Unfortunately, people often forget to write the 'f' suffix, and in 3D programs
the conversions are surprisingly common (Mesa spends about 8-15% of its time in
conversions, for instance).
Honza
>
>
> Segher
>
> (The CMP and ABS/NEGATE patches look safe to me, btw).
>