This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: patch to bug #86829


On Tue, Aug 21, 2018 at 11:27 PM Jeff Law <law@redhat.com> wrote:
>
> On 08/21/2018 02:08 PM, Giuliano Augusto Faulin Belinassi wrote:
> >> Just as an example, compare the results for
> >> x = 0x1.fffffffffffffp1023
> >
> > Thank you for your answer and the counterexample. :-)
> >
> >> If we had useful range info on floats we might conditionalize such
> >> transforms appropriately.  Or we can enable it on floats and do
> >> the sqrt (x*x + 1) in double.
> >
> > I think I managed to find a bound where the transformation can be done
> > without harmful overflow; I don't know about rounding problems,
> > however.
> >
> > Suppose we are handling double precision floats for now. The function
> > x/sqrt(1 + x*x) approaches 1 when x is big enough. How big must x be
> > for the result to round to 1?
> >
> > Since sqrt(1 + x*x) > x when x > 1, we must find a value of x such
> > that x/sqrt(1 + x*x) > eps, where eps is the biggest double smaller
> > than 1. Such eps must be around 1 - 2^-53 in IEEE double because the
> > mantissa has 52 bits. Solving for x yields that x must be somewhat
> > bigger than 6.7e7, so let's take 1e8. Therefore if abs(x) > 1e8, it is
> > enough to return copysign(1, x). Notice that this argument is also
> > valid for x = +-inf (if the target supports that) because
> > sin(atan(+-inf)) = +-1, and it can be extended to other floating
> > point formats. The following test code illustrates my point:
> > https://pastebin.com/M4G4neLQ
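> >
> > In plain C the idea would be roughly the following (just an untested
> > sketch of the idea above; the helper name sin_atan is only for
> > illustration and 1e8 is the bound derived above):
> >
> >     #include <math.h>
> >
> >     /* sin(atan(x)) without overflow: for |x| > 1e8 the exact result
> >        is already indistinguishable from +-1 in double precision.  */
> >     double sin_atan (double x)
> >     {
> >       if (fabs (x) > 1.0e8)
> >         return copysign (1.0, x);
> >       return x / sqrt (x*x + 1.0);
> >     }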
> >
> > This might still be faster than calculating sin(atan(x)) explicitly.
> >
> > Please let me know if this is unfeasible. :-)
> The problem is our VRP implementation doesn't handle any floating point
> types at this time.   If we had range information for FP types, then
> this kind of analysis is precisely what we'd need to do the
> transformation regardless of -ffast-math.

I think his idea was to emit a runtime test?  You'd have to use a
COND_EXPR and evaluate both arms at the same time because
match.pd doesn't allow you to create control flow.
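
Concretely the generated code would then have a shape roughly like
this (sketch only; the 1.0e8 threshold is the one from Giuliano's
mail, and both arms are computed unconditionally):

    t = x / __builtin_sqrt (x*x + 1.0);
    r = __builtin_fabs (x) > 1.0e8 ? __builtin_copysign (1.0, x) : t;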

Note the rounding issue is also real, since for large x you strip
away lower mantissa bits when computing x*x.
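(For example, already for x = 2^27 + 1 the exact square
2^54 + 2^28 + 1 needs 55 significant bits, so x*x is rounded
in double.)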

Richard.

> jeff
> >
> > Giuliano.
> >
> > On Tue, Aug 21, 2018 at 11:27 AM, Jeff Law <law@redhat.com> wrote:
> >> On 08/21/2018 02:02 AM, Richard Biener wrote:
> >>> On Mon, Aug 20, 2018 at 9:40 PM Jeff Law <law@redhat.com> wrote:
> >>>>
> >>>> On 08/04/2018 07:22 AM, Giuliano Augusto Faulin Belinassi wrote:
> >>>>> Closes bug #86829
> >>>>>
> >>>>> Description: Adds substitution rules for both sin(atan(x)) and
> >>>>> cos(atan(x)). These calls are replaced by x / sqrt(x*x + 1) and
> >>>>> 1 / sqrt(x*x + 1) respectively, providing up to a 10x speedup. These
> >>>>> identities can be proved mathematically.
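> >>>>> (For t = atan(x) we have tan(t) = x with t in (-pi/2, pi/2), hence
> >>>>> sin(t) = x/sqrt(x*x + 1) and cos(t) = 1/sqrt(x*x + 1).)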
> >>>>>
> >>>>> Changelog:
> >>>>>
> >>>>> 2018-08-03  Giuliano Belinassi <giuliano.belinassi@usp.br>
> >>>>>
> >>>>>     * match.pd: Add simplification rules for sin(atan(x)) and cos(atan(x)).
> >>>>>
> >>>>> Bootstrap and Testing:
> >>>>> There were no unexpected failures in proper testing with GCC 8.1.0
> >>>>> on an x86_64 machine running Ubuntu 18.04.
> >>>> I understand these are mathematical identities.  But floating point
> >>>> arithmetic in a compiler isn't nearly that clean :-)  We have to worry
> >>>> about overflows, underflows, rounding, and the simple fact that many
> >>>> floating point numbers can't be exactly represented.
> >>>>
> >>>> Just as an example, compare the results for
> >>>> x = 0x1.fffffffffffffp1023
> >>>>
> >>>> I think sin(atan (x)) is well defined in that case.  But the x*x isn't
> >>>> because it overflows.
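> >>>> With that x (i.e. DBL_MAX), x*x rounds to +Inf, so x / sqrt (x*x + 1)
> >>>> evaluates to DBL_MAX / Inf == 0.0, while sin (atan (x)) is
> >>>> essentially 1.0.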
> >>>>
> >>>> So  I think this has to be somewhere under the -ffast-math umbrella.
> >>>> And the testing requirements for that are painful -- you have to verify
> >>>> it doesn't break the spec benchmark.
> >>>>
> >>>> I know Richi acked in the PR, but that might have been premature.
> >>>
> >>> It's under the flag_unsafe_math_optimizations umbrella, but sure,
> >>> a "proper" way to optimize this would be to further expand
> >>> sqrt (x*x + 1) to fabs(x) + ... (extra terms) that are precise enough
> >>> and do not have this overflow issue.
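> >>> (For large |x| one candidate would be
> >>> sqrt (x*x + 1) ~= fabs (x) + 1/(2*fabs (x)),
> >>> from sqrt (x*x + 1) = fabs (x) * sqrt (1 + 1/(x*x)).)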
> >>>
> >>> But yes, I do not find (quickly skimming) other simplifications that
> >>> have this kind of overflow issue (in fact I do remember raising
> >>> overflow/underflow issues for other patches).
> >>>
> >>> Thus approval withdrawn.
> >> At least until we can do some testing around spec.  There's also a patch
> >> for logarithm addition/subtraction from MCC CS and another from Giuliano
> >> for hyperbolics that need testing with spec.  I think that getting that
> >> testing done anytime between now and stage1 close is sufficient -- none
> >> of the 3 patches is particularly complex.
> >>
> >>
> >>>
> >>> If we had useful range info on floats we might conditionalize such
> >>> transforms appropriately.  Or we can enable it on floats and do
> >>> the sqrt (x*x + 1) in double.
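> >>>
> >>> For the float case that would look roughly like the following
> >>> (sketch only, not a proposed pattern; since even FLT_MAX squared is
> >>> far below DBL_MAX, x*x cannot overflow in double):
> >>>
> >>>     float sin_atanf (float x)
> >>>     {
> >>>       double xd = x;
> >>>       return (float) (xd / __builtin_sqrt (xd*xd + 1.0));
> >>>     }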
> >> Yea.  I keep thinking about what it might take to start doing some light
> >> VRP of floating point objects.  I'd originally been thinking to just
> >> track 0.0 and exceptional value state.  But the more I ponder the more I
> >> think we could use the range information to allow transformations that
> >> are currently guarded by the -ffast-math family of options.
> >>
> >> jeff
> >>
>

