Bug 60413 - extra precision not properly removed on assignment of return value
Status: RESOLVED INVALID
Product: gcc
Classification: Unclassified
Component: c
Version: 4.8.2
Importance: P3 normal
 
Reported: 2014-03-04 15:15 UTC by Ryan Lortie
Modified: 2014-03-04 17:00 UTC

Description Ryan Lortie 2014-03-04 15:15:57 UTC
This problem has been seen with at least:

  gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) 

and

  gcc version 4.8.2 (Ubuntu 4.8.2-16ubuntu4)

so I believe it to be an upstream problem.

This problem has only been observed on 32-bit compilations.  There doesn't seem to be a problem with 64-bit.

Consider this code:

==> get-value.h <==
double get_value (void);

==> get-value.c <==
#include "get-value.h"

#include <stdint.h>

int x = 1;

double
get_value (void)
{
  return x / 1e6;
}

==> main.c <==
#include "get-value.h"

#include <stdlib.h>

int
main (void)
{
  double a, b;

  a = get_value ();
  b = get_value ();

  if (a != b)
    abort ();

  return 0;
}

and build it with -O2 -m32.

You will get an abort.

The reason is that the return value of the get_value() function arrives in a floating-point register.  These registers have a higher precision than IEEE double.  The C standard permits "extra range and precision":


"""

8  Except for assignment and cast (which remove all extra range and precision),
the values of operations with floating operands and values subject to the usual
arithmetic conversions and of floating constants are evaluated to a format
whose range and precision may be greater than required by the type.

"""


It seems that GCC is failing to remove the extra precision on the assignment "b = get_value();".

Indeed, looking at the code that is output:

        call    get_value
        movsd   %xmm0, 8(%rsp)
        call    get_value
        movsd   8(%rsp), %xmm1
        ucomisd %xmm0, %xmm1

we see that the first call has the return value stored in memory, but the comparison uses the value from the second call directly, without truncating the precision.
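
The mismatch described here, one value rounded through memory and one kept at register width, can be emulated without depending on the 32-bit code generator. The following is only a sketch: it uses long double as a stand-in for the wider register format, and the helper names rounding_changes_value and long_double_is_wider are made up for illustration. On targets where long double is no wider than double, the effect cannot be shown this way.

```c
#include <float.h>

/* Hypothetical helper: reports whether long double has more mantissa
 * bits than double on this target. */
int long_double_is_wider(void)
{
    return LDBL_MANT_DIG > DBL_MANT_DIG;
}

/* Hypothetical helper: computes numerator / 1e6 in long double (standing
 * in for the wider register), rounds a copy to double (standing in for
 * the value that went through memory), and returns 1 if the two differ. */
int rounding_changes_value(int numerator)
{
    long double wide = (long double)numerator / 1e6L;
    double narrow = (double)wide;   /* the cast removes the extra precision */

    return (long double)narrow != wide;
}
```

Calling rounding_changes_value(1) mirrors the x = 1 case from the report. A result of 1 means the wide and rounded values no longer compare equal, which is the same kind of mismatch that triggers the abort.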

Adding 'volatile' to the local variables involved is an effective workaround for the problem.
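
A minimal sketch of that workaround follows. It is a single-file variant for illustration only: the report keeps get_value() in its own translation unit, and the wrapper name results_match is an assumption, not part of the original test case. The volatile qualifier forces both results to be stored as doubles and reloaded before the comparison.

```c
/* Same global and function as in the report, folded into one file here
 * purely for illustration. */
int x = 1;

double get_value(void)
{
    return x / 1e6;
}

/* Hypothetical wrapper: returns 1 when two successive results compare
 * equal. Both locals are volatile, so each assignment is a real store
 * of a double and each read is a real load of that stored double. */
int results_match(void)
{
    volatile double a, b;

    a = get_value();
    b = get_value();

    return a == b;
}
```

With both locals volatile, the comparison only ever sees values that have been rounded to double, so results_match() is expected to return 1.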
Comment 1 Jakub Jelinek 2014-03-04 16:14:37 UTC
Use -fexcess-precision=standard or -std=c99 if you want the slower, but standard-conforming, behaviour that rounds to get rid of excess precision.
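
One way to see which evaluation format a given build claims is the C99 macro FLT_EVAL_METHOD from <float.h>. The sketch below interprets its standard values (0, 1, 2, and negative for indeterminable); the function names are hypothetical, and it makes no claim about what value a particular GCC version reports under each of the flags above.

```c
#include <float.h>

/* Hypothetical helper. Interprets a FLT_EVAL_METHOD value:
 *   returns 0  if double expressions are evaluated as double (methods 0, 1),
 *   returns 1  if they are evaluated as a long double wider than double,
 *   returns -1 if the method does not allow a definite answer. */
int double_may_carry_excess(int method)
{
    if (method == 0 || method == 1)
        return 0;
    if (method == 2)
        return LDBL_MANT_DIG > DBL_MANT_DIG;
    return -1;
}

/* Applies the helper to the value this translation unit was built with. */
int current_build_may_carry_excess(void)
{
    return double_may_carry_excess(FLT_EVAL_METHOD);
}
```

Building the same file with and without the options above and comparing the result of current_build_may_carry_excess() gives a quick, if coarse, indication of what the compiler advertises.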
Comment 2 Ryan Lortie 2014-03-04 16:24:42 UTC
Why is this violation of standards treated in a special way?

Quoting from gcc's manpage:

       -ffast-math
           Sets -fno-math-errno, -funsafe-math-optimizations,
           -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans
           and -fcx-limited-range.

           This option causes the preprocessor macro "__FAST_MATH__"
           to be defined.

           This option is not turned on by any -O option besides
           -Ofast since it can result in incorrect output for programs
           that depend on an exact implementation of IEEE or ISO
           rules/specifications for math functions. It may, however,
           yield faster code for programs that do not require the
           guarantees of these specifications.

It seems that the logic about "since it can result in incorrect output for programs that depend on an exact implementation of IEEE or ISO rules/specifications" should be equally applied here.

i.e., the default should be to follow the standards, with a 'fast' mode perhaps enabled if the user gives -ffast-math.
Comment 3 Jakub Jelinek 2014-03-04 16:28:25 UTC
(In reply to Ryan Lortie from comment #2)
> Why is this violation of standards treated in a special way?

Because it slows things down way too much.  It is much better to just use -msse2 -mfpmath=sse if you really need 32-bit programs and have at least an SSE2-capable CPU; the i387 floating-point stack has tons of issues.
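
A source file can check at compile time whether it was built this way. The sketch below assumes that GCC predefines __SSE2_MATH__ when SSE2 scalar math is selected, as with -msse2 -mfpmath=sse; treat that as an assumption to verify against the compiler documentation. The function name built_with_sse2_math is hypothetical.

```c
/* Hypothetical helper: returns 1 if this translation unit was compiled
 * with SSE2 used for scalar floating-point math, 0 otherwise.
 * Assumption: the compiler predefines __SSE2_MATH__ in that case. */
int built_with_sse2_math(void)
{
#if defined(__SSE2_MATH__)
    return 1;
#else
    return 0;
#endif
}
```

A project that depends on double comparisons behaving as in the 64-bit build could use such a check to warn when a 32-bit build falls back to the i387 stack.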
Comment 4 Ryan Lortie 2014-03-04 16:43:41 UTC
It seems like a good solution to this problem might be to enable -mfpmath=sse by default on architectures where SSE is known to be supported and -fexcess-precision=standard otherwise.  If people want their binaries to be backwards compatible with machines older than the Pentium III then they can pay the price in performance; at least we would not be violating the standard.

This would be nicely mixed with an appeal to distributions to bring their default -march= flag a bit more up to date...
Comment 5 Dominique d'Humieres 2014-03-04 17:00:45 UTC
I think this is another duplicate of PR 323.