Bug 44631 - [sparc] long long to double conversion error
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.4.4
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Duplicates: 45559
Depends on:
Blocks:
 
Reported: 2010-06-22 13:30 UTC by Matthias Klose
Modified: 2010-09-07 11:47 UTC (History)

See Also:
Host:
Target: sparc-linux-gnu sparc64-linux-gnu
Build:
Known to work:
Known to fail: 4.3.5 4.4.4 4.5.0 4.6.0
Last reconfirmed:


Attachments
test long long to double runtime conversions (325 bytes, text/plain)
2010-06-23 12:12 UTC, Mikael Pettersson
updated long long to double conversion test (470 bytes, text/plain)
2010-07-15 21:30 UTC, Mikael Pettersson
fix Linux kernel math emulation FP_FROM_INT macro (340 bytes, patch)
2010-07-18 20:58 UTC, Mikael Pettersson

Description Matthias Klose 2010-06-22 13:30:28 UTC
[forwarded from http://bugs.debian.org/581571]

seen with current 4.3, 4.4, 4.5, trunk

gcc -O0 -mcpu=v9 foo.c && ./a.out prints 2.47804e+17 instead of 9.79798e+16.  It works with -mcpu=v8.
The same error occurs with gcc -m64 -O0 foo.c && ./a.out

#include <stdio.h>
int main(void) {
    unsigned long long l = 97979797979797980LL;

    printf("%g\n", (double)l);
    return(0);
}
Comment 1 Richard Biener 2010-06-22 13:39:42 UTC
Please make sure this is not a glibc bug.  Does it work with -O1, with -O1 -ffast-math?
Comment 2 Matthias Klose 2010-06-22 13:53:24 UTC
Yes, both gcc -mcpu=v9 and gcc -m64 work with -O1 and with -O1 -ffast-math.
Comment 3 Jakub Jelinek 2010-06-22 17:13:06 UTC
Well, for -O1 the ullong -> double conversion is done at compile time instead of runtime.
Does the problem occur also when l is long long instead of unsigned long long?
Can you check what value the fxtod insn computes?
Comment 4 Mikael Pettersson 2010-06-23 12:12:15 UTC
Created attachment 20986
test long long to double runtime conversions

Making the constant signed rather than unsigned makes no difference.

I converted the test case to do the conversions at runtime and to print the hex representations of the long long and double values.  Here are some results:

> gcc -O2 -m32 -mcpu=v8 pr44631.c ; ./a.out
97979797979797980 (0x015c181b6dc019dc) -> 9.79798e+16 (0x4375c181b6dc019e)
72057594037927936 (0x0100000000000000) -> 7.20576e+16 (0x4370000000000000)
72057594037927935 (0x00ffffffffffffff) -> 7.20576e+16 (0x4370000000000000)

This looks fine, but the first and last values have been rounded.

> gcc -O2 -m32 -mcpu=v9 pr44631.c ; ./a.out
97979797979797980 (0x015c181b6dc019dc) -> 2.47804e+17 (0x438b83036db8033c)
72057594037927936 (0x0100000000000000) -> 1.44115e+17 (0x4380000000000000)
72057594037927935 (0x00ffffffffffffff) -> 7.20576e+16 (0x4370000000000000)

Note the discontinuity.  Looks to me like fxtod fails to round and instead produces a large jump in the exponent.

Does gcc assume some specific setting in FSR?
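
A runtime conversion test of the kind described in this comment could look roughly like the sketch below. It is not the attached pr44631.c; the helper names (double_bits, to_double_at_runtime, report) are made up for illustration, and it assumes a 64-bit IEEE-754 double.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a double
   (assumes double is 64-bit IEEE-754). */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);   /* avoids aliasing problems */
    return bits;
}

/* Hypothetical helper: read the value through a volatile so the compiler
   cannot fold the conversion at compile time, even at -O1 or -O2. */
double to_double_at_runtime(long long value)
{
    volatile long long v = value;
    return (double)v;
}

/* Print one line in the same style as the results quoted above. */
void report(long long value)
{
    double d = to_double_at_runtime(value);
    printf("%lld (0x%016llx) -> %g (0x%016llx)\n",
           value, (unsigned long long)value,
           d, (unsigned long long)double_bits(d));
}
```

Calling report(97979797979797980LL) from main() on an affected and an unaffected system should make any difference in the resulting bit pattern visible directly.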
Comment 5 Mikael Pettersson 2010-07-15 21:30:32 UTC
Created attachment 21219
updated long long to double conversion test

I've updated the test case to try conversions of a larger range of values, and to convert them back to calculate the diffs due to precision loss.
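
The round-trip check described here can be sketched as follows. This is an illustration under assumptions, not the attached test: roundtrip_diff is a hypothetical name, and the function is only meaningful for inputs whose rounded double still fits in a long long.

```c
/* Hypothetical helper: convert x to double and back, and return the
   signed error introduced by the round trip.  Only valid while the
   rounded double still fits in a long long (|x| well below 2^63). */
long long roundtrip_diff(long long x)
{
    volatile long long in = x;        /* force a runtime conversion */
    volatile double d = (double)in;   /* keep the intermediate as a true double */
    long long back = (long long)d;
    return back - x;
}
```

A correct conversion can only be off by the rounding error, so for 57-bit inputs the diff should stay within a few units; a diff on the order of the input itself indicates a wrong exponent rather than lost precision.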

When testing this on a couple of machines I noticed that the -mcpu=v9 code (fxtod) behaves differently depending on processor generation and OS kernel.

On a USIIIi (Sun Blade 2500 Red) with Linux kernel 2.6.35-rc5 I get:
> ./pr44631v2-v9 
0x015c181b6dc019dc -> 2.47804e+17 (0x438b83036db8033c) -> 0x0370606db7006780, diff 149824205863538084
0x0100000000000000 -> 1.44115e+17 (0x4380000000000000) -> 0x0200000000000000, diff 72057594037927936
0x00ffffffffffffff -> 7.20576e+16 (0x4370000000000000) -> 0x0100000000000000, diff 1
0x0020000000000001 -> 9.00720e+15 (0x4340000000000000) -> 0x0020000000000000, diff -1
0x0020000000000000 -> 9.00720e+15 (0x4340000000000000) -> 0x0020000000000000, diff 0
0x001fffffffffffff -> 9.00720e+15 (0x433fffffffffffff) -> 0x001fffffffffffff, diff 0

That is, going from 0x00ffffffffffffff to the next higher integer results in a huge difference in the resulting double.

On a USIIi (Ultra5) with Linux kernel 2.6.35-rc5 the same binary gives:
> ./pr44631v2-v9 
0x015c181b6dc019dc -> 9.79798e+16 (0x4375c181b6dc019e) -> 0x015c181b6dc019e0, diff 4
0x0100000000000000 -> 7.20576e+16 (0x4370000000000000) -> 0x0100000000000000, diff 0
0x00ffffffffffffff -> 7.20576e+16 (0x4370000000000000) -> 0x0100000000000000, diff 1
0x0020000000000001 -> 9.00720e+15 (0x4340000000000000) -> 0x0020000000000000, diff -1
0x0020000000000000 -> 9.00720e+15 (0x4340000000000000) -> 0x0020000000000000, diff 0
0x001fffffffffffff -> 9.00720e+15 (0x433fffffffffffff) -> 0x001fffffffffffff, diff 0

That is, while rounding occurs there are no huge jumps in the intermediate double representation.  In fact, the output exactly matches the output for the -mcpu=v8 case which uses pure SW conversions instead of fxtod.

So USIIIi and USIIi behave differently.

On another USIIIi (Sun Fire V240) running Solaris 10, a gcc-4.4.4 -mcpu=v9 binary again gives the exact same results as -mcpu=v8 or USIIi runs.

So Linux and Solaris behave differently on USIIIi.

Both the Linux and Solaris kernels for SPARC contain FP emulation support for various cases the HW doesn't like to handle.  According to comments in the Linux kernel one of the changes in USIII from earlier generations was that fxtod started to trap for certain cases.  According to debugging code I added to the Linux kernel, fxtod does trap and get emulated on USIIIi for many (all?) cases where the conversion is inexact, including the test cases where fxtod produced very wrong values.

So it appears the Linux kernel's emulation of fxtod is broken.

My Linux kernel on the USIIIi was compiled by gcc-4.4.4.  As a final test I recompiled it with gcc-3.4.6, but that made no difference.

So I suspect a kernel logic error, not a miscompilation.

BTW, in an interim version of the test case I did log the value of FSR, but all three systems (Linux USIIi, Linux USIIIi, and Solaris USIIIi) did run with the same rounding and exception control settings so that's not the issue.
Comment 6 Mikael Pettersson 2010-07-18 20:58:58 UTC
Created attachment 21244
fix Linux kernel math emulation FP_FROM_INT macro

The bug is in the Linux kernel math-emu code.  The _FP_FROM_INT macro, which converts integers to raw floats, is supposed to produce normalized floats, but due to an error in a boundary condition it fails to do so for a range of numbers, resulting in very incorrect floats for those numbers.

The fix is syntactically trivial (s/</<=/ in one place) but requires analysis to show that it's needed.  I'll try to get it into the Linux kernel ASAP; meanwhile it's attached to this PR.
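
To illustrate why the boundary between "shift down" and "shift up" matters when normalizing, here is a minimal software sketch of an unsigned 64-bit integer to double conversion. It is not the kernel's _FP_FROM_INT macro and does not reproduce the attached patch; u64_to_double_bits is a hypothetical name, and the code assumes IEEE-754 binary64 with round-to-nearest, ties-to-even.

```c
#include <stdint.h>

/* Hypothetical sketch: return the IEEE-754 binary64 bit pattern for v,
   rounding to nearest with ties to even. */
uint64_t u64_to_double_bits(uint64_t v)
{
    if (v == 0)
        return 0;

    /* Find the position of the most significant set bit (0..63);
       it determines the unbiased exponent. */
    int msb = 63;
    while (!(v >> msb))
        msb--;

    uint64_t frac;
    if (msb <= 52) {
        /* Narrow value: shift up so the MSB lands on bit 52. */
        frac = v << (52 - msb);
    } else {
        /* Wide value: shift down and round the discarded bits.
           In this sketch, sending msb == 52 down this branch would give
           shift == 0 and an invalid shift count for 'half' below, so the
           comparison above has to be exact at the boundary. */
        int shift = msb - 52;                         /* 1..11 */
        uint64_t lost = v & ((UINT64_C(1) << shift) - 1);
        uint64_t half = UINT64_C(1) << (shift - 1);
        frac = v >> shift;
        if (lost > half || (lost == half && (frac & 1)))
            frac++;
        if (frac >> 53) {                             /* rounding carried out */
            frac >>= 1;
            msb++;
        }
    }

    /* Assemble biased exponent and the 52 stored fraction bits
       (the leading 1 on bit 52 is implicit). */
    return ((uint64_t)(msb + 1023) << 52)
         | (frac & ((UINT64_C(1) << 52) - 1));
}
```

For the values used earlier in this report, the sketch yields the same bit patterns as the -mcpu=v8 runs, e.g. 0x4375c181b6dc019e for 0x015c181b6dc019dc.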
Comment 7 Mikael Pettersson 2010-07-23 16:44:30 UTC
The Linux kernel math-emu fix is included in kernel 2.6.35-rc6.  I've re-checked that the test cases work correctly on USIIIi with -mcpu=v9 and this kernel.

The fix is scheduled for backporting to the official stable kernels, and should be trivial to backport to just about any 2.6-based kernel.

Matthias, can you please close this bug now?  (It wasn't even a gcc bug.)
Comment 8 Richard Biener 2010-07-23 17:24:52 UTC
Thus, invalid.
Comment 9 Jakub Jelinek 2010-07-24 08:07:38 UTC
Well, the soft-fp code is also in glibc and gcc, so it will likely need fixing there as well.  Do you have a reference to the lkml post?
Comment 10 Mikael Pettersson 2010-07-24 08:45:58 UTC
The lkml post is:
http://marc.info/?l=linux-kernel&m=127957675305013&w=2

I did look briefly at glibc's soft-fp, but (a) it was substantially updated in February 2006, and (b) none of my systems seemed to enable it (i.e., gcc -msoft-float appeared to generate calls to libgcc rather than libc) so I didn't know how to test it short of building a custom glibc.
Comment 11 Jakub Jelinek 2010-07-24 19:59:50 UTC
In gcc, soft-fp is used e.g. for x86_64 __float128 support.
In glibc, soft-fp is used on several architectures for long double support, e.g. sparc* or alpha.
Comment 12 Paul Zimmermann 2010-09-07 11:47:10 UTC
*** Bug 45559 has been marked as a duplicate of this bug. ***