Bug 95115 - RISC-V 64: inf/inf division optimized out, invalid operation not raised
Summary: RISC-V 64: inf/inf division optimized out, invalid operation not raised
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 10.1.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2020-05-13 20:08 UTC by Aurelien Jarno
Modified: 2020-07-02 19:52 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-05-14 00:00:00


Attachments
Testcase (295 bytes, text/x-csrc)
2020-05-13 20:08 UTC, Aurelien Jarno
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Aurelien Jarno 2020-05-13 20:08:23 UTC
Created attachment 48525 [details]
Testcase

On 64-bit RISC-V, the acos/asin tests from the glibc testsuite fails when it is built with GCC 10, as the invalid operation flag is not raised for invalid input values. The glibc code uses the following code to generate a NaN and raise an invalid input [1]:

  else {
    u.i[HIGH_HALF]=0x7ff00000;
    v.i[HIGH_HALF]=0x7ff00000;
    u.i[LOW_HALF]=0;
    v.i[LOW_HALF]=0;
    return u.x/v.x;
  }

With GCC 9, this results in the following code:
        li      a1,2047
        slli    a1,a1,52
        fmv.d.x fa5,a1
        li      a0,16
        fdiv.d  fs0,fa5,fa5

With GCC 10, the division is optimized out and the result is directly loaded as a constant, causing the invalid operation not to be raised.

I have attached a small testcase to reproduce the issue.
Comment 1 Marc Glisse 2020-05-13 20:45:52 UTC
I am seeing the same thing on x86_64, happens during FRE1, so it looks like tree-optimization.
Comment 2 Marc Glisse 2020-05-13 20:58:33 UTC
Or during CCP with the simpler

double f(){
  double d=__builtin_inf();
  return d/d;
}

and all the -frounding-math -ftrapping-math -fsignaling-nans don't seem to help.
Comment 3 Jim Wilson 2020-05-14 03:51:06 UTC
Marc Glisse's testcase fails even with old gcc versions.  My x86_64 Ubuntu 16.04 gcc-5.4.0 also removes the divide with -O.  My Ubuntu 18.04 gcc-7.5.0 gives the same result.  This seems to be simple constant folding that we have always done.  The assumption here seems to be that if the user is dividing constants, then we don't need to worry about setting exception bits.  If I write (4.0 / 3.0) for instance, the compiler just folds it and doesn't worry about setting the inexact bit.

Aurelien Jarno's testcase in the attachment is more interesting, as that works with older gcc versions, just not gcc-10.  I did a bisect, and tracked this down to the Richard Biener's patch for pr83518.  It looks like the glibc code was obfuscated a bit to try to avoid the usual trivial constant folding, and the patch for pr83518 just made gcc smart enough to recognize that constants are involved, and then optimize this case the same way we have always optimized FP constant divides.

Newlib incidentally uses (x-x)/(x-x) where x is the input value, so there are no constants involved, and the divide does not get optimized away.  This still works with gcc-10.  The result is a subtract followed by a divide.

At first glance, this looks more like a glibc problem to me than a gcc problem.  But maybe the fact that constants were written to memory and then read back in should prevent the usual trivial FP constant divide folding.

I can almost make the glibc testcase work if I mark the unions as volatile.  That prevents the union reads and writes from being optimized away, but the divide gets moved after the fetestexcept call.  That looks like a gcc bug though I think a different problem that this pr.  The 234t.optimized dump is correct.  The 236r.expand dump is wrong.  This happens for both x86_64 and RISC-V.  The resulting code is bigger than what the newlib trick generates though.
Comment 4 Marc Glisse 2020-05-14 05:33:41 UTC
(In reply to Jim Wilson from comment #3)
> The assumption here seems to be that if the user is
> dividing constants, then we don't need to worry about setting exception
> bits.  If I write (4.0 / 3.0) for instance, the compiler just folds it and
> doesn't worry about setting the inexact bit.

We don't fold 0./0. though (unless -fno-trapping-math), it would make sense for Inf/Inf to behave the same. And -frounding-math protects 4./3.

> divide gets moved after the fetestexcept call.  That looks like a gcc bug

Yes, that one is known.
Comment 5 Aurelien Jarno 2020-05-14 06:05:33 UTC
(In reply to Jim Wilson from comment #3)
> Newlib incidentally uses (x-x)/(x-x) where x is the input value, so there
> are no constants involved, and the divide does not get optimized away.  This
> still works with gcc-10.  The result is a subtract followed by a divide.

This is also what is used for acosf and asinf, that's why the problem only appears with the double variant.
Comment 6 Richard Biener 2020-05-14 06:33:11 UTC
(simplify
 (rdiv @0 @0)
 (if (FLOAT_TYPE_P (type)
      && ! HONOR_NANS (type)
      && ! HONOR_INFINITIES (type))
  { build_one_cst (type); }))

so that's not it, possibly constant folding instead in const_binop.
There we only have

1276          /* Don't perform operation if we honor signaling NaNs and
1277             either operand is a signaling NaN.  */
1278          if (HONOR_SNANS (mode)
1279              && (REAL_VALUE_ISSIGNALING_NAN (d1)
1280                  || REAL_VALUE_ISSIGNALING_NAN (d2)))
1281            return NULL_TREE;

and

1283          /* Don't perform operation if it would raise a division
(gdb) 
1284             by zero exception.  */
1285          if (code == RDIV_EXPR
1286              && real_equal (&d2, &dconst0)
1287              && (flag_trapping_math || ! MODE_HAS_INFINITIES (mode)))
1288            return NULL_TREE;

which both don't trigger.  Afterwards

1309          inexact = real_arithmetic (&value, code, &d1, &d2);

even returns false and the result is a qNaN.

For the specific regression in this bug we now simply are able to
turn

    return u.x/v.x;

into a division of two constants.  That's nothing we're going to "fix",
so we have to fix the above instead which is a much older issue.
Comment 7 Vineet Gupta 2020-07-02 19:52:25 UTC
FWIW this is also applicable to ARC glibc + upstream gcc 10

https://sourceware.org/pipermail/libc-alpha/2020-July/115679.html
https://sourceware.org/pipermail/libc-alpha/2020-July/115694.html