Building gmp-5.0.5 for m68k (configure for --host=m68020-unknown-linux), with gcc 4.6, 4.7, or 4.8, and then running gmp's test suite results in: make[4]: Entering directory `/home/mikpe/objdir-gmp/tests/mpq' ... ERROR (check_random test 1): bad mpq_set_d results 3.098073531878046e-120 3.098073531878046e-120 ... 1 of 11 tests failed The failure is 100% repeatable. Building with gcc-4.5.4 instead eliminates the failure, so this is a regression. I don't have a reduced test case yet. I'm going to start a bisection on gcc trunk to hopefully identify when this regression started.
I cut out one line too many in the build log, it's tests/mpq/t-get_d that fails. On the surface the problem started with Jan Hubicka's "Inline heuristic 4/4" patch in r166657: http://gcc.gnu.org/ml/gcc-patches/2010-11/msg01305.html http://gcc.gnu.org/ml/gcc-cvs/2010-11/msg00546.html However, all that did was to bump the default value of early-inlining-insns from 8 to 10. Compiling gmp with gcc-4.5.4 --param early-inlining-insns=10 also causes tests/mpq/t-get_d to fail, so the real problem is older.
The regression started with Jan Hubicka's "Pretty-ipa merge: Inliner heruistics reorg" patch in r147852: http://gcc.gnu.org/ml/gcc-patches/2009-05/msg00812.html http://gcc.gnu.org/ml/gcc-cvs/2009-05/msg00829.html However, that just changed inlining heuristics so most likely it exposed a latent problem. I'll start working on a reduced test case now.
Mikael -- can you try adding -fno-rename-registers to the cflags used to compile gmp and see if that changes anything? It'd be greatly appreciated.
(In reply to comment #3) > Mikael -- can you try adding -fno-rename-registers to the cflags used to > compile gmp and see if that changes anything? It'd be greatly appreciated. Done, -fno-rename-registers made no difference with any of 4.6/4.7/4.8.
Created attachment 29198 [details] standalone test case Here's a standalone test case, extracted from gmp's t-get_d.c. It's largish (444 lines) but most of that are support functions needed for the program logic but otherwise unrelated to the wrong-code issue. The wrong-code occurs in check_random() as a side-effect of the inlining of my_ldexp().
Thanks. The fact that -fno-rename-registers does not affect the result indicates this is a separate code generation issue than the one I'm working on. The reduced testcase should help considerably. Thanks, Jeff
I'm going to throw this out there before I start doing various pirouettes to get some kind of emulator going to reproduce this problem... Mikael, is there some 68k emulator with a Linux image available somewhere? Are you testing on actual hardware? Is this hardware available for debugging remotely? I'm just basically trying to find some sort of platform where I can reproduce and debug this.
I'm using the ARAnym full-system m68040 emulator -- that's the only realistic option for testing gcc on Linux/m68k at the moment. I maintain my own small Fedora-based distro, so if you like I can supply a disk image with that, plus the scripts and ARAnym parameter files I use. Otherwise you can probably find HOWTOs and pointers to images on the Debian/m68k wiki.
An image with parameters/instructions would be ideal. Heck...not ideal...great! Thanks so much. "mikpe at it dot uu.se" <gcc-bugzilla@gcc.gnu.org> wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55939 --- Comment #8 from Mikael Pettersson <mikpe at it dot uu.se> 2013-01-21 23:18:34 UTC --- I'm using the ARAnym full-system m68040 emulator -- that's the only realistic option for testing gcc on Linux/m68k at the moment. I maintain my own small Fedora-based distro, so if you like I can supply a disk image with that, plus the scripts and ARAnym parameter files I use. Otherwise you can probably find HOWTOs and pointers to images on the Debian/m68k wiki.
Confirmed with testcase in comment #5, compiling with: [guest@aranym2 gcc]$ ./xgcc -B./ a.c -O2 [guest@aranym2 gcc]$ ./a.out Aborted
In adding debugging printfs throughout a locally reduced testcase, I ran into something peculiar, that may or may not be related to this PR. Perhaps some floating point savvy person comment? double d, d2; ... if (d != d2) { dumpd(d,d2); return -1; } By this point, "d" and "d2" are in fp2/fp3, and we compare them in extended-precision real mode (fcmp.x) below: (gdb) si 0x80000670 154 if (d != d2) 1: x/i $pc => 0x80000670 <check_random+260>: fcmpx %fp3,%fp2 (gdb) info reg fp2 fp2 9.5869038983080764131982923476680188e-94 (raw 0x3ec90000fff801fffff84000) (gdb) info reg fp3 fp3 9.5869038983080764090401284414100591e-94 (raw 0x3ec90000fff801fffff83ff8) Notice above that the registers are slightly different, so d != d2. (gdb) si 0x80000674 154 if (d != d2) 1: x/i $pc => 0x80000674 <check_random+264>: fbne 0x80000692 <check_random+294> (gdb) 156 dumpd(d,d2); 1: x/i $pc => 0x80000692 <check_random+294>: fmoved %fp2,%sp@- (gdb) si 0x80000696 156 dumpd(d,d2); 1: x/i $pc => 0x80000696 <check_random+298>: fmoved %fp3,%sp@- (gdb) 0x8000069a 156 dumpd(d,d2); 1: x/i $pc => 0x8000069a <check_random+302>: jsr 0x80000508 <dumpd> Notice we pass fp2/fp3 in the stack as arguments to dumpd(). Interestingly, we move them with fmove.d which is double-precision (instead of the compare we did prior in .x mode). With dumpd() implemented as below, "a" will be the same as "b", since we chopped off some bits with the fmove.d explained above. void dumpd(double a, double b) { ... if (a == b) dump_same(); ... ... } dumpd() gets executed as follows: 1: x/i $pc => 0x8000050c <dumpd+4>: fmoved %fp@(8),%fp1 (gdb) 0x80000512 90 { 1: x/i $pc => 0x80000512 <dumpd+10>: fmoved %fp@(16),%fp0 (gdb) info reg fp1 fp1 9.5869038983080764131982923476680188e-94 (raw 0x3ec90000fff801fffff84000) (gdb) info reg fp0 fp0 9.5869038983080764131982923476680188e-94 (raw 0x3ec90000fff801fffff84000) Notice fp0 and fp1 are now the same. (gdb) 93 if (a == b) 1: x/i $pc => 0x80000528 <dumpd+32>: fcmpx %fp0,%fp1 (gdb) 0x8000052c 93 if (a == b) 1: x/i $pc => 0x8000052c <dumpd+36>: fbeq 0x80000538 <dumpd+48> (gdb) 110 } 1: x/i $pc => 0x80000538 <dumpd+48>: unlk %fp etc etc etc. I'm a bit flaky as to FP standards, but it seems suspect that two values that are different, suddenly become the same when passed to a function. Is this one of those FP quirks, and I'm horrendously misguided?
On 01/29/2013 10:26 AM, aldyh at gcc dot gnu.org wrote: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55939 > > --- Comment #11 from Aldy Hernandez <aldyh at gcc dot gnu.org> 2013-01-29 17:26:46 UTC --- > In adding debugging printfs throughout a locally reduced testcase, I ran into > something peculiar, that may or may not be related to this PR. Perhaps some > floating point savvy person comment? > > double d, d2; > ... > if (d != d2) { > dumpd(d,d2); > return -1; > } > > By this point, "d" and "d2" are in fp2/fp3, and we compare them in > extended-precision real mode (fcmp.x) below: At this point alarm bells are already ringing... When I see EQ/NE comparisons of floating points, there's a good bet something is wrong. > > (gdb) si > 0x80000670 154 if (d != d2) > 1: x/i $pc > => 0x80000670 <check_random+260>: fcmpx %fp3,%fp2 > (gdb) info reg fp2 > fp2 9.5869038983080764131982923476680188e-94 (raw > 0x3ec90000fff801fffff84000) > (gdb) info reg fp3 > fp3 9.5869038983080764090401284414100591e-94 (raw > 0x3ec90000fff801fffff83ff8) > > Notice above that the registers are slightly different, so d != d2. Right. > > (gdb) si > 0x80000674 154 if (d != d2) > 1: x/i $pc > => 0x80000674 <check_random+264>: fbne 0x80000692 <check_random+294> > (gdb) > 156 dumpd(d,d2); > 1: x/i $pc > => 0x80000692 <check_random+294>: fmoved %fp2,%sp@- > (gdb) si > 0x80000696 156 dumpd(d,d2); > 1: x/i $pc > => 0x80000696 <check_random+298>: fmoved %fp3,%sp@- > (gdb) > 0x8000069a 156 dumpd(d,d2); > 1: x/i $pc > => 0x8000069a <check_random+302>: jsr 0x80000508 <dumpd> > > Notice we pass fp2/fp3 in the stack as arguments to dumpd(). Interestingly, we > move them with fmove.d which is double-precision (instead of the compare we did > prior in .x mode). > > With dumpd() implemented as below, "a" will be the same as "b", since we > chopped off some bits with the fmove.d explained above. Right. This is very similar to the problems with the 80bit FP precision on i387. The in-register format is 80 bits, but in memory the values are just 64 bits. So a value can change simply by storing it into memory and reading it back out. This causes all kinds of subtle problems -- think reload. A programmer can't predict what the allocator is going to do, so a small change in the source might lead to changes in register allocation and a value which previously lived in a register might now be on the stack and reloaded. And if the code is not tolerant of those extra bits appearing/disappearing causes problems. A great example was Dreamworks where this extra precision suddenly appearing/disappearing was causing a single pixel to unexpectedly change color in one of their frame renders. > Is this one of those FP quirks, and I'm horrendously misguided? Yup, this is definitely one of those FP quirks. It might be helpful to see the entire function. Jeff
All hail extended precision, and lack of proper rounding operations (until the 68040 anyway). Aldy's analysis in #c11 suggests that compiling (all of gmp, including the testcase) with -ffloat-store will probably fix this test case. I can't immediately see the bug in gmp, as it looks as if most of the double extraction should be done via integer arithmetic, but perhaps if something went wrong in the configuration one of the fp arithmetic fallback paths is being used accidentially...
The wrong-code occurs in the test wrapper rather than in gmp proper, and the test wrapper does contain a helper function (my_ldexp) which computes and returns 'double'. So the problem might simply be caused by excess precision.
Perhaps -fexcess-precision=standard might fix this too (and be less expensive).
>> double d, d2; >> ... >> if (d != d2) { >> dumpd(d,d2); >> return -1; >> } >> >> By this point, "d" and "d2" are in fp2/fp3, and we compare them in >> extended-precision real mode (fcmp.x) below: > At this point alarm bells are already ringing... When I see EQ/NE > comparisons of floating points, there's a good bet something is wrong. For the record, only the EQ/NE above comes from GMP. The other one (in dumpd()) I added myself in testing. But yes, that one "d != d2" is in the original testcase. I will try the various suggestions and report back. Thanks everyone.
Both -fexcess-precision=standard and -ffloat-store fix the testcase. This is perhaps a duplicate of PR323. Although, Jakub mentions that according to: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323#c127 ...-fexcess-precision=standard might not be 100% reliable on m68k, so you may have to go for -ffloat-store. Mikael, can you check the GMP testsuite with -ffloat-store? And perhaps see if there is a floating point EQ/NE compare in the GMP test assembly? Thanks.
gmp-5.0.5 builds and tests Ok on m68k-linux when configured with -ffloat-store in CFLAGS. There is one "fcmp;fblt" and three "fcmp;fbne" sequences in the previously failing test case (just the .o file, so excluding libgmp.a). There are 24 fcmps in libgmp.a, but they are all followed by relational conditional jumps, none is followed by an equality/inequality conditional jump. Should I close this as a duplicate of PR323?
Dup. *** This bug has been marked as a duplicate of bug 323 ***
On Tue, 29 Jan 2013, jakub at gcc dot gnu.org wrote: > Perhaps -fexcess-precision=standard might fix this too (and be less > expensive). Note that -fexcess-precision=standard is not fully implemented for m68k: the back-end patterns claiming to carry out SFmode/DFmode operations, that actually do XFmode arithmetic, haven't been disabled for -fexcess-precision=standard as I did for x86 (along with some related fixes for conversions to SFmode/DFmode). While the disabling of SFmode / DFmode operations is just for safety - if the architecture-independent parts of GCC are working correctly, it shouldn't matter whether they are enabled or not because there will be no arithmetic in those modes in the GIMPLE anyway - I don't know if m68k needs similar fixes to conversion operations.
Another less invasive option could be to force the return value of my_ldexp() to go through memory, thus chopping off the excess precision before returning: --- a.c 2013-02-18 20:37:23.000000000 +0100 +++ b.c 2013-02-18 20:36:42.000000000 +0100 @@ -404,7 +404,11 @@ e += 1; } } else - return d; + { + volatile double force_round = d; + return force_round; + + } } } But basically, you'd have to carefully keep track of when you have to use volatiles.