This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [RFC] Fix PR28684
- From: Richard Guenther <rguenther at suse dot de>
- To: Roger Sayle <roger at eyesopen dot com>
- Cc: gcc-patches at gcc dot gnu dot org, Revital1 Eres <ERES at il dot ibm dot com>, "R. Clint Whaley" <whaley at cs dot utsa dot edu>
- Date: Tue, 14 Nov 2006 19:02:21 +0100 (CET)
- Subject: Re: [RFC] Fix PR28684
- References: <Pine.LNX.firstname.lastname@example.org>
On Tue, 14 Nov 2006, Roger Sayle wrote:
> Perhaps Brad Lucier can help here, but rearranging A+(B+C) as (A+B)+C
> for many finite values of A, B and C can introduce errors of many
> thousands of ulp, when the exponents of A, B and C differ signifcantly.
> Consider for example, A=10*10, B=-10*10, C=0.5.
> The exercise to the reader is to determine whether the average error
> of reassociating A+(B+C) as (A+B)+C, over the domains of A, B and C is
> any worse than the averaged ulp error of fsin or fcos over the domain of
> its argument. Statistically, even without the extra accuracy of fsin
> or fcos, is it any safer or worse than reassociating just a single pair
> of additions, let alone reversing a dot product or decomposing TRSM.
Well, sure. Following this reasoning we end up with -fiec60559-math (or whatever
written piece of paper we want to follow exactly) and -ffast-math
(or however we want to call it - "precise" or "exact" are not what I would
name it in the context of the above example).
Of course I like the way Fortran specifies that unless bracketed
in the source, operands can be reordered by the compiler.
Aside from what we present to the user, folding and optimization
passes also need to adhere to some rules - sticking everything under
a flag_unsafe_math_optimizations is easy, getting the various
HONOR_* macros right is not.
> Here's where we're cheating...
> Of course, the bias is that uniformly over the domain of possible
> inputs it's extremely likely that the exponents of A and B will differ
> by more than the available mantissa bits. In the real world, the
> values are not uniformly distributed, so vectorizing reduction becomes
> reasonable, use of fsin/fcos is reasonable, and sub O(N^3) matrix
> multiplication is reasonable. It's a trade-off. As soon as we allow
> any change, the worst case behaviour often goes to hell, but hopefully
> it's the average or median case we care about.
If the user can identify a class of transformations that hurt them
(like CSEing of reciprocals I ran into multiple times), being able
to just disable that kind of transformations can help them rescue
some of the performance advantage of -ffast-math. [I have seen
cleverly placed "volatile" temporaries for intermediate results
used to work around exactly this kind of transformation.]
Richard Guenther <email@example.com>
Novell / SUSE Labs