This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] Fix PR28684
Richard,
>"IEEE compliance" is still somewhat a vague statement, but I guess you
>refer to IEEE 754/854? For example -ffast-math on x86_64 disables
>the handling of denormals and flush-to-zero behavior, which violates
>IEEE - do you care?
Yes, I do. I can't throw any flag that says it is free to get less
accuracy than IEEE 754 *unless it tells me exactly how it will do so*.
I.e., in a reordering of a sum, IEEE compliance is *not* lost, because both
answers are the result of a sequence of IEEE-compliant operations. Using a
reciprocal does not violate IEEE, in that the reciprocal and the multiplication
are both IEEE-compliant operations. Now, I need to know if it is occurring,
because it *will* increase the expected error [crudely, (1/y * x) expects
2*epsilon error whereas (x/y) expects epsilon error]. This is why, for
the originally proposed flags, which included the use of reciprocals,
I suggested the man page mention it. I'm not saying that no one cares about
order (i.e. for some applications, you might sort the data into the optimal
order), and if they did, they would still not be able to throw the vectorization
flag. However, for the vast majority of numerical software, which doesn't
care about order but does care that the answer is correct to a provable
degree, a vectorization flag not grouped under the non-IEEE options could be used.
>It also specifies basic arithmetic and rounding,
>which makes association and contraction a violation of IEEE.
>
>So apparently you have a more "useful" subset of IEEE you care about?
It is certainly true that things become confusing when optimization is
applied. I take your point that indeed, some versions of an FMAC are
also not IEEE-compliant, because they don't do the intermediate rounding
suggested by two separate operations. However, all the FMACs I'm aware
of thereby give you *greater* precision than two separate IEEE operations.
People with this need for control obviously must not only avoid
advanced optimizations, but must often throw special flags to keep things
OK (for instance, to round each 80-bit result on the x87, to round in the
middle of an FMAC, etc.). I believe that GCC already enables FMACs on machines
like the PowerPC, and I know it doesn't round each 80-bit result on
the x87. Instead, a special flag is required to override this
extra-precision behavior, on the rather likely assumption that the pool of
users who are upset about extra accuracy is smaller than the pool of users
upset about lost accuracy. I am confident that indeed lack of IEEE compliance
resulting in lower accuracy is a more useful set of IEEE violations for
most users than those that result in increased accuracy. The fact that
normal compilation on the x87 carries no warning that it may violate
IEEE tells me GCC's developers share this understanding of IEEE compliance.
>We're trying to provide that with -funsafe-math-optimizations -- what in
>-funsafe-math-optimizations is that you cannot use it?
From the manpage:
| (b) may violate IEEE or ANSI standards
This is the killer for me. It allows NaNs, underflow/overflow,
etc. to go unhandled, which causes all the problems I have discussed. Therefore, this flag
cannot be used by any serious numerical lib. Many people do use it,
because they go by what is actually done, and most of these optimizations
don't hurt normal code. But if the flag definition allows it, someone is
free to add 3DNow! with saturating arithmetic to this flag (or SSE with the
IEEE-compliant bit turned off), which would then make the library dangerous
in real simulation.
>-ffast-math
>also "disables" NaNs and Infs and floating point exceptions, which makes
>it a "stronger" violator of IEEE.
Turning off IEEE handling of underflow/overflow is certainly a killer
for most numerical apps, as is improper handling of NaNs (for instance,
ATLAS, a performance-centric package if ever there was one, takes several
performance hits to guarantee correct NaN handling). At present,
ATLAS cannot throw either of the flags you discuss, as you would expect.
My request involves getting vectorization out from under a flag whose definition
disallows numerical libs from using it. The flag should ideally specify
what transformations are allowed, so that the user can audit the code to
see if it is a problem. Reordering in general does not produce less accuracy,
as discussed before. Anytime an individual FLOP is freed from the contract of
IEEE, I have no assurance of the accuracy of that FLOP, much less of the sum.
So, I guess my definition of IEEE-compliant arithmetic is that each computation
is at *least* as accurate as IEEE arithmetic of that precision, and in cases
where the expected error is increased by doing extra operations, the flag
definition (on info or man page) mentions it, so that the user can audit
the code to see if it can take those changes.
I understand there are practically existential questions here (what is
IEEE compliant, what's good, what's bad, etc.). That is why I was
originally quite happy with the proposed changes, because it essentially
said reorderings & reciprocals, which is easily understood and audited for,
without getting too deep into the existential questions.
Thanks,
Clint