This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Is SSE FP ABI breaking on i386?


Hi
I've received email from Ulrich mentioning problems with SSE and
FLT_EVAL_METHOD.  I am not sure his point is right, as mentioned in
the later emails I am attaching.  Can someone familiar with these
issues comment?

Honza

I've looked through the code generated with -msse a bit.  The result
is that it's pretty good but it will greatly affect existing
implementations.

The x86 libm is expecting that all computations are carried out in the
long double format (see the <float.h> file in gcc).  Making the
computations using the SSE instructions reduces the precision to
float/double.  The result is higher error values.

The -msse flag is therefore effectively changing the ABI.  Speaking in
C99 terms, the FLT_EVAL_METHOD value is changed.  If the -msse option
is enabled FLT_EVAL_METHOD would have to be set to 0.

I'd say it's a nice optimization (is it faster if you have only one
FP value in an XMM register?) but it should be enabled only for
-ffast-math.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------


> I'd say it's a nice optimization (is it faster if you have only one
> FP value in an XMM register?) but it should be enabled only for
Whether it is faster or not depends on the implementation.  On the P4
it is faster, since the SSE unit is pipelined.

Honza

> The x86 libm is expecting that all computations are carried out in the
> long double format (see the <float.h> file in gcc).  Making the
> computations using the SSE instructions reduces the precision to
> float/double.  The result is higher error values.
Is gcc required to mandate the precision for FLT_EVAL_METHOD == 2?
I was thinking about this issue and my conclusion is that gcc is
currently free to throw away the extra precision any time it wants (it
does so on each spill).  SSE code is equivalent to i387 code if gcc gets
crazy enough to spill after each instruction.  gcc has the right to do
so, so it has the right to use SSE instructions too.

If this is not the case, we should set FLT_EVAL_METHOD to -1 or teach
gcc to do XFmode spills (which would be a performance killer).
> 
> The -msse flag is therefore effectively changing the ABI.  Speaking in
> C99 terms, the FLT_EVAL_METHOD value is changed.  If the -msse option
> is enabled FLT_EVAL_METHOD would have to be set to 0.
> 
> I'd say it's a nice optimization (is it faster if you have only one
> FP value in an XMM register?) but it should be enabled only for
> -ffast-math.
That would be somewhat unfortunate.  Perhaps we can make cpp supply
FLT_EVAL_METHOD depending on the -msse/-mno-sse switch.

Honza


Jan Hubicka <jh@suse.cz> writes:

> Is gcc required to mandate the precision for FLT_EVAL_METHOD == 2?

This is the only mode available across all variants.

> I was thinking about this issue and my conclusion is, that currently gcc
> is free to throw away the extra precision anytime it wants (it does so
> on each spill).

The floating-point evaluation mechanism is part of the ABI.  One might
want to define an ABI where FLT_EVAL_METHOD == 0 but this means that

a) you break the existing ABI
b) you can support this ABI only for >= PIII for float and >= P4 for
   double (at least with reasonable cost; all other variants would
   have to store the result of a computation after every operation,
   which is not acceptable as the default mode)

> SSE code is equivalent to i387 code if gcc gets crazy enough to
> spill after each instruction.  gcc has the right to do so, so it
> has the right to use SSE instructions too.

I know, but this is not gcc's practice and people rely on this.  Just
take libm where we get better (sometimes much better) results with the
old mode.

> If this is not the case, we should set FLT_EVAL_METHOD to -1 or teach
> gcc to do XFmode spills (which would be a performance killer).

You are breaking the ABI.

> That would be somewhat unfortunate.  Perhaps we can make cpp supply
> FLT_EVAL_METHOD depending on the -msse/-mno-sse switch.

This won't help.  Again, you are breaking the ABI.  You'll get
complaints from everybody doing floating-point computation on x86
(agreed, these people could make a much better choice of processor,
but still).


You can perhaps make -ffast-math the default on x86 (or create
something similar which does not have the negative side effects of
-ffast-math) and tell people to disable this if they want the old
behavior.  But a mode with the old behavior must be easily accessible.
And yes, some preprocessor macro should be defined to get <float.h>
to define the right value.


> The floating-point evaluation mechanism is part of the ABI.  One might
> want to define an ABI where FLT_EVAL_METHOD == 0 but this means that
> 
> a) you break the existing ABI
I don't remember seeing this in the i386 PS ABI, but I may be mistaken.
> b) you can support this ABI only for >= PIII for float and >= P4 for
>    double (at least with reasonable cost; all other variants would
>    have to store the result of a computation after every operation,
>    which is not acceptable as the default mode)
That's why I was suggesting -1.
> 
> > SSE code is equivalent to i387 code if gcc gets crazy enough to
> > spill after each instruction.  gcc has the right to do so, so it
> > has the right to use SSE instructions too.
> 
> I know, but this is not gcc's practice and people rely on this.  Just
> take libm where we get better (sometimes much better) results with the
> old mode.
But the results depend on the gcc version and gcc switches, so I guess
you can't rely on that.
> 
> > If this is not the case, we should set FLT_EVAL_METHOD to -1 or teach
> > gcc to do XFmode spills (which would be a performance killer).
> 
> You are breaking the ABI.
> 
> > That would be somewhat unfortunate.  Perhaps we can make cpp supply
> > FLT_EVAL_METHOD depending on the -msse/-mno-sse switch.
> 
> This won't help.  Again, you are breaking the ABI.  You'll get
> complaints from everybody doing floating-point computation on x86
> (agreed, these people could make a much better choice of processor,
> but still).
On the other hand, lots of people complain about the current
non-deterministic behaviour of the compiler randomly throwing away the
extra precision.
> 
> You can perhaps make -ffast-math the default on x86 (or create
> something similar which does not have the negative side effects of
> -ffast-math) and tell people to disable this if they want the old
Something similar could be an option.  It can just default to
-mno-sse/-mno-sse2, and I see I will need to develop a mechanism to
disable the builtins and the code generation separately.  This is
unfortunate, but possible.
> behavior.  But a mode with the old behavior must be easily accessible.
> And yes, some preprocessor macro should be defined to get <float.h>
> to define the right value.
This should not be that difficult.

Honza


Jan Hubicka <jh@suse.cz> writes:

> I don't remember seeing this in the i386 PS ABI, but I may be mistaken.

It's not written down but it's what is implemented.

> But the results depend on the gcc version and gcc switches, so I guess
> you can't rely on that.

Of course you can.  The people writing the code decide about the
makefiles.

> On the other hand, lots of people complain about the current
> non-deterministic behaviour of the compiler randomly throwing away the
> extra precision.

Right, and these people explicitly use -ffloat-store.  But this is not
the default.  You can tell them now that they can use -msse -msse2, but
this is good news only for them.

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------

