$ cat test-float.c
#include <float.h>
#include <assert.h>

union gl_long_double_union
{
  struct { double hi; double lo; } dd;
  long double ld;
};

const union gl_long_double_union gl_LDBL_MAX =
  { { DBL_MAX, DBL_MAX / (double)134217728UL / (double)134217728UL } };
# undef LDBL_MAX
# define LDBL_MAX (gl_LDBL_MAX.ld)

int
main ()
{
  volatile long double m = LDBL_MAX;

  assert (m + m > m);
}
$ gcc -O2 test-float.c
$ ./a.out
a.out: test-float.c:20: main: Assertion `m + m > m' failed.
Aborted

test-float.c.234t.optimized contains:

m ={v} 1.79769313486231580793728971405302307166001572487e+308;

but that evaluates to Inf.  DBL_MAX is 1.79769313486231570814527423731704e+308L.
Does it work with -O0? I guess we fold the read from gl_LDBL_MAX in a "wrong" way, so perhaps native_interpret_expr is wrong?
No, it doesn't.
GCC 10.2 is released, adjusting target milestone.
(In reply to Andreas Schwab from comment #0)
> $ cat test-float.c
> #include <float.h>
> #include <assert.h>
>
> union gl_long_double_union
> {
>   struct { double hi; double lo; } dd;
>   long double ld;
> };
>
> const union gl_long_double_union gl_LDBL_MAX =
>   { { DBL_MAX, DBL_MAX / (double)134217728UL / (double)134217728UL } };
> # undef LDBL_MAX
> # define LDBL_MAX (gl_LDBL_MAX.ld)
>
> int
> main ()
> {
>   volatile long double m = LDBL_MAX;
>
>   assert (m + m > m);
> }
> $ gcc -O2 test-float.c
> $ ./a.out
> a.out: test-float.c:20: main: Assertion `m + m > m' failed.
> Aborted
>
> test-float.c.234t.optimized contains:
>
> m ={v} 1.79769313486231580793728971405302307166001572487e+308;
>
> but that evaluates to Inf.  DBL_MAX is
> 1.79769313486231570814527423731704e+308L.

This comes from gnulib's use of lib/float.h. My question is: why is gnulib using its own float.h on power? What makes the system float.h unusable?

Even if you fix this for your package including gnulib, the next failure you run into is this one:

test-float.c:324: assertion 'x + x == x' failed
Aborted (core dumped)

Extracting from the test case:

#include <stdio.h>
#include <assert.h>
#include <float.h>
#include <math.h>

int
main (void)
{
  int n = 107;
  volatile long double m = LDBL_MAX;
  volatile long double pow2_n = powl (2, n);
  volatile long double x = m + (m / pow2_n);

  printf ("n = %d\n", n);
  printf ("m = %Lf (%La)\n", m, m);
  printf ("pow2_n = %Lf (%La)\n", pow2_n, pow2_n);
  printf ("m / pow2_n = %Lf (%La)\n", (m / pow2_n), (m / pow2_n));
  printf ("x = %Lf (%La)\n", x, x);

  if (x > m)
    assert (x + x == x);

  return 0;
}

gcc -o ~/test-ldbl-max ~/test-ldbl-max.c -lm
~/test-ldbl-max
n = 107
m = 179769313486231580793728971405301199252069012264752390332004544495176179865349768338004270583473493681874097135387894924752516923758125018237039690323659469736010689648748751591634331824498526377862231967249520608291850653495428451067676993116107021027413767397958053860876625383538022115414866471826801819648.000000 (0x1.fffffffffffff7ffffffffffff8p+1023)
pow2_n = 162259276829213363391578010288128.000000 (0x1p+107)
m / pow2_n = 1107913932560222581216724223049124694376931327937918798971295069363205703164244740389102844506567402654244799528342026118673562844811584683014545030137100678976901567468093855075985516353544747282849589098225960074532039651619564827101237983225846137075291097947344654582153216.000000 (0x1.fffffffffffff7ffffffffffff8p+916)
x = 179769313486231580793728971405301199252069012264752390332004544495176179865349768338004270583473493681874097135387894924752516923758125018237039690323659469736010689648748751591634331824498526377862231967249520608291850653495428451067676993116107021027413767397958053860876625383538022115414866471826801819648.000000 (0x1.fffffffffffff7ffffffffffffcp+1023)
test-ldbl-max: /root/test-ldbl-max.c:21: main: Assertion `x + x == x' failed.
Aborted (core dumped)

Is this just a function of double double? That there is something representable that is larger than LDBL_MAX, but isn't valid given the double-double rules?
Confirmed.
The problem is that this gl_LDBL_MAX.ld really is the largest normalized double-double number, but it is one ulp larger than GCC's __LDBL_MAX__. The former is:

0x1.fffffffffffff7ffffffffffffc000p+1023

and the latter is:

0x1.fffffffffffff7ffffffffffff8000p+1023

The reason GCC doesn't like the former and treats it as infinity is that GCC internally treats double double as having 106-bit precision, but the former number is too large for 106-bit precision; it requires 107-bit precision. If we wanted to handle double double "properly" in GCC, we'd need to emulate it the way it is actually implemented, as a pair of doubles, and have all the operations defined on those pairs (the question is what to do for transcendentals etc.), by recursing into real_* operations on both doubles.
Bisection points to my change, r280141 aka r10-5900-gea69031c5facc70e4a96df83cd58702900fd54b6. That changed:

-  _1 = gl_LDBL_MAX.ld;
-  m ={v} _1;

to:

+  m ={v} 1.79769313486231580793728971405302307166001572487395108634e+308;

So, either on the gnulib side one can drop the const from gl_LDBL_MAX so that nothing tries to optimize it (or make it const volatile?), or perhaps the compiler could completely punt on all optimizations with double double in the

+  if (len > 0)
+    return native_interpret_expr (type, buf, len);

gimple-fold.c code (i.e. when using native_encode_initializer first) when the target format is double double, or just punt for this specific case?
Or, as an ugly hack, for floating types with MODE_COMPOSITE_P (TYPE_MODE (type)) in that spot, after using native_interpret_expr do native_encode_expr again and compare whether the bits are identical (or perhaps do it for all floating point values, e.g. to deal with Intel magic values, NaN canonicalization etc.?)
Created attachment 49045 [details]
gcc11-pr95450.patch

Untested fix. Or, as I said, it could be limited with an additional && MODE_COMPOSITE_P (element_mode (type)) check too.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:9f2f79df19fbfaa1c4be313c2f2b5ce04646433e

commit r11-2830-g9f2f79df19fbfaa1c4be313c2f2b5ce04646433e
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Aug 25 07:17:10 2020 +0200

    gimple-fold: Don't optimize wierdo floating point value reads [PR95450]
    
    My patch to introduce native_encode_initializer into fold_ctor_reference
    apparently broke gnulib/m4 on powerpc64.  There it uses a const union
    with two doubles and the corresponding IBM double double long double,
    which actually is the largest normalizable long double value (1 ulp
    higher than __LDBL_MAX__).  The reason our __LDBL_MAX__ is smaller is
    that we internally treat the double double type as one having 106-bit
    precision, but it actually has a variable 53-bit to 2000-ish bit
    precision, and for the 0x1.fffffffffffff7ffffffffffffc000p+1023L value
    gnulib uses we need 107-bit precision; therefore for GCC __LDBL_MAX__
    is 0x1.fffffffffffff7ffffffffffff8000p+1023L.
    
    Before my changes, we wouldn't be able to fold_ctor_reference it and it
    worked fine at runtime, but with the change we are able to do that, and
    because it is larger than anything we can handle internally, we treat
    it weirdly.  A similar problem would arise if somebody creates this way
    a valid value with much more than 106-bit precision, e.g.
    1.0 + 1.0e-768.
    
    Now, I think a similar problem could happen e.g. on i?86/x86_64 with
    the long double there; it also has some weird values in the format,
    e.g. the unnormals, pseudo infinities and various other magic values.
    
    This patch, for floating point types (including vector and complex
    types with such elements), will try to encode the returned value again
    and punt if it has a different memory representation from the original.
    Note, this is only done in the path where native_encode_initializer was
    used, in order not to affect e.g. just reading an unpunned long double
    value; the value should be compiler generated in that case and thus
    should be properly representable.  It will punt also if e.g. the
    padding bits are initialized to non-zero values.
    
    I think the verification that what we encode can be interpreted back
    would only be an internal consistency check (so perhaps only with
    ENABLE_CHECKING if flag_checking, but if both directions perform it,
    then we need to avoid mutual recursion).  While for the other direction
    (interpretation), at least for the broken-by-design long doubles we
    just know we can't represent all valid values in GCC.  The other
    floating point formats are just a theoretical case; perhaps we would
    canonicalize something to a value that wouldn't trigger an invalid
    exception when without canonicalization it would trigger it at runtime,
    so let's just ignore those.
    
    Adjusted (so far untested) patch to do it in native_interpret_real
    instead and limit it to the MODE_COMPOSITE_P cases, for which e.g.
    fold-const.c/simplify-rtx.c punts in several other places too, because
    we just know we can't represent everything.  E.g.
    
          /* Don't constant fold this floating point operation if the
             result may dependent upon the run-time rounding mode and
             flag_rounding_math is set, or if GCC's software emulation
             is unable to accurately represent the result.  */
          if ((flag_rounding_math
               || (MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations))
              && (inexact || !real_identical (&result, &value)))
            return NULL_TREE;
    
    Or perhaps guard it with MODE_COMPOSITE_P (mode) &&
    !flag_unsafe_math_optimizations too, thus breaking what gnulib/m4 does
    with -ffast-math, but not normally?
    
    2020-08-25  Jakub Jelinek  <jakub@redhat.com>
    
    	PR target/95450
    	* fold-const.c (native_interpret_real): For MODE_COMPOSITE_P modes
    	punt if the to be returned REAL_CST does not encode to the bitwise
    	same representation.
    
    	* gcc.target/powerpc/pr95450.c: New test.
The releases/gcc-10 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:7e53436da1902061af797d0aaa744c52bd9829ae

commit r10-8669-g7e53436da1902061af797d0aaa744c52bd9829ae
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Aug 25 07:17:10 2020 +0200

    gimple-fold: Don't optimize wierdo floating point value reads [PR95450]
    
    My patch to introduce native_encode_initializer into fold_ctor_reference
    apparently broke gnulib/m4 on powerpc64.  There it uses a const union
    with two doubles and the corresponding IBM double double long double,
    which actually is the largest normalizable long double value (1 ulp
    higher than __LDBL_MAX__).  The reason our __LDBL_MAX__ is smaller is
    that we internally treat the double double type as one having 106-bit
    precision, but it actually has a variable 53-bit to 2000-ish bit
    precision, and for the 0x1.fffffffffffff7ffffffffffffc000p+1023L value
    gnulib uses we need 107-bit precision; therefore for GCC __LDBL_MAX__
    is 0x1.fffffffffffff7ffffffffffff8000p+1023L.
    
    Before my changes, we wouldn't be able to fold_ctor_reference it and it
    worked fine at runtime, but with the change we are able to do that, and
    because it is larger than anything we can handle internally, we treat
    it weirdly.  A similar problem would arise if somebody creates this way
    a valid value with much more than 106-bit precision, e.g.
    1.0 + 1.0e-768.
    
    Now, I think a similar problem could happen e.g. on i?86/x86_64 with
    the long double there; it also has some weird values in the format,
    e.g. the unnormals, pseudo infinities and various other magic values.
    
    This patch, for floating point types (including vector and complex
    types with such elements), will try to encode the returned value again
    and punt if it has a different memory representation from the original.
    Note, this is only done in the path where native_encode_initializer was
    used, in order not to affect e.g. just reading an unpunned long double
    value; the value should be compiler generated in that case and thus
    should be properly representable.  It will punt also if e.g. the
    padding bits are initialized to non-zero values.
    
    I think the verification that what we encode can be interpreted back
    would only be an internal consistency check (so perhaps only with
    ENABLE_CHECKING if flag_checking, but if both directions perform it,
    then we need to avoid mutual recursion).  While for the other direction
    (interpretation), at least for the broken-by-design long doubles we
    just know we can't represent all valid values in GCC.  The other
    floating point formats are just a theoretical case; perhaps we would
    canonicalize something to a value that wouldn't trigger an invalid
    exception when without canonicalization it would trigger it at runtime,
    so let's just ignore those.
    
    Adjusted (so far untested) patch to do it in native_interpret_real
    instead and limit it to the MODE_COMPOSITE_P cases, for which e.g.
    fold-const.c/simplify-rtx.c punts in several other places too, because
    we just know we can't represent everything.  E.g.
    
          /* Don't constant fold this floating point operation if the
             result may dependent upon the run-time rounding mode and
             flag_rounding_math is set, or if GCC's software emulation
             is unable to accurately represent the result.  */
          if ((flag_rounding_math
               || (MODE_COMPOSITE_P (mode) && !flag_unsafe_math_optimizations))
              && (inexact || !real_identical (&result, &value)))
            return NULL_TREE;
    
    Or perhaps guard it with MODE_COMPOSITE_P (mode) &&
    !flag_unsafe_math_optimizations too, thus breaking what gnulib/m4 does
    with -ffast-math, but not normally?
    
    2020-08-25  Jakub Jelinek  <jakub@redhat.com>
    
    	PR target/95450
    	* fold-const.c (native_interpret_real): For MODE_COMPOSITE_P modes
    	punt if the to be returned REAL_CST does not encode to the bitwise
    	same representation.
    
    	* gcc.target/powerpc/pr95450.c: New test.
    
    (cherry picked from commit 9f2f79df19fbfaa1c4be313c2f2b5ce04646433e)
Fixed for 10.3+ and 11+.