60181 – constant folding of complex number incorrect

Bug 60181 - constant folding of complex number incorrect

Summary: constant folding of complex number incorrect

Status:	NEW

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target (show other bugs)
Version:	4.9.0

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Duplicates (1):	63172 (view as bug list)
Depends on:
Blocks:

Reported:	2014-02-13 16:19 UTC by Andreas Krebbel
Modified:	2022-03-08 16:20 UTC (History)
CC List:	3 users (show)

See Also:
Host:
Target:	s390x-ibm-linux, powerpc--*
Build:
Known to work:
Known to fail:
Last reconfirmed:	2014-10-02 00:00:00

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Andreas Krebbel 2014-02-13 16:19:26 UTC

The following testcase fails on s390x and Power.  Constant folding and runtime execution of a division of complex numbers produce different results.

The testcase works fine on x86 so it looks like S/390 and Power do something different here.

It looks somewhat like:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30789

#include <complex.h>
#include <stdio.h>
#include <stdlib.h>

_Complex float __attribute__ ((noinline))
calc (_Complex float a, _Complex float b)
{
  return a / b;
}

int
main (int argc, char **argv)
{
  _Complex float a = calc (10 + 6 * I, 5 + 12 * I);
  _Complex float b = (10 + 6 * I) / (5 + 12 * I);

  printf ("%ap + %ap * i\n", creal (a), cimag (a));
  printf ("%ap + %ap * i\n", creal (b), cimag (b));

  if (a != b)
    abort ();

  return 0;
}

gcc -O0 t.c -o f

./f
0x1.719c08p-1p + -0x1.10a9a8p-1p * i
0x1.719c06p-1p + -0x1.10a9a8p-1p * i
Aborted

Comment 1 jsm-csl@polyomino.org.uk 2014-02-13 18:09:48 UTC

There are no specified accuracy requirements for complex multiplication / 
division, even under Annex G (parts of which - imaginary types in 
particular - are not implemented in GCC at present), beyond certain 
requirements for special cases and avoiding certain cases of overflow from 
the simplest formulas.  Constant folding is correctly rounding, runtime 
complex multiplication / division isn't (and given complaints about 
slowness at present, I don't think users would want it to be even slower, 
though there may well be a case for defining a standard library interface 
for correctly rounding complex multiplication / division).

x86 probably benefits from excess precision being used implicitly by GCC 
when compiling the implementations of complex multiplication and division.

Comment 2 Richard Biener 2014-02-14 09:37:52 UTC

I think that it needs to be decided on a case-by-case basis whether the
runtime complex division routine is "precise enough".  But yes, you
generally cannot expect constant folding and runtime execution to produce
the exact same result.  This is FP after all ...

(I would expect it for operations that are specified to be rounded correctly
to 0.5ulp precision though)

Note that the goal we have instead is that cross-compiling from all hosts
produces the same constant folding results for a target (code generation
doesn't depend on the host).

Comment 3 Andreas Krebbel 2014-02-14 11:59:52 UTC

I'll keep the bugreport open with low prio.  If I find the time I will at least try to understand what's going on before closing it.

The testcase is extracted from gcc/testsuite/go.test/test/ken/cplx2.go which fails due to this problem currently on s390.

Comment 4 Andrew Pinski 2014-02-14 15:34:06 UTC

Could this because of the use of fma for s390 and PPC inside the division code?

Comment 5 boger 2014-10-02 14:52:41 UTC

*** Bug 63172 has been marked as a duplicate of this bug. ***

Comment 6 boger 2014-10-02 15:38:17 UTC

If the last comment is true, does that mean the fold_const.c file in gcc should be built in a way so that it doesn't use the fma, like using some kind of option during the build of gcc at least for this file on the s390 and Power?

Comment 7 David Edelsohn 2014-10-02 15:47:14 UTC

confirmed.

Comment 8 Andrew Pinski 2014-10-02 16:10:52 UTC

(In reply to boger from comment #6)
> If the last comment is true, does that mean the fold_const.c file in gcc
> should be built in a way so that it doesn't use the fma, like using some
> kind of option during the build of gcc at least for this file on the s390
> and Power?

No it is fma usage in the runtime sources and not in the fold-const case.

Comment 9 Paul Zimmermann 2020-02-11 07:32:01 UTC

Here is another example:

$ cat t.c
#include <complex.h>
#include <stdio.h>
#include <stdlib.h>

_Complex double __attribute__ ((noinline))
calc_div (_Complex double a, _Complex double b)
{
  return a / b;
}

_Complex double __attribute__ ((noinline))
calc_mul (_Complex double a, _Complex double b)
{
  return a * b;
}


int
main (int argc, char **argv)
{
  double x, y, z, t;
  _Complex double a, b, c, d;

  x = -1.0689346237980500e+252; y = 7.8846096489686452e-145;
  z = 5.1152592576527620e-318;  t = 6.7327111521288819e+78;
  a = calc_div (x + y * I, z + t * I);
  b = (x + y * I) / (z + t * I);

  printf ("calc_div: %ap + %ap * i\n", creal (a), cimag (a));
  printf ("div:      %ap + %ap * i\n", creal (b), cimag (b));

  x = -4.8798814420289904e-22;  y = -6.4569205261627488e+209;
  z = -1.0918976936190148e+147; t = -8.2942999263695497e-85;
  c = calc_mul (x + y * I, z + t * I);
  d = (x + y * I) * (z + t * I);
  
  printf ("calc_mul: %ap + %ap * i\n", creal (c), cimag (c));
  printf ("mul:      %ap + %ap * i\n", creal (d), cimag (d));

  return 0;
}

With gcc version 9.2.1 20200123 (Debian 9.2.1-25) on x86_64 I get:

$ gcc -O0 t.c; ./a.out
calc_div: 0x1.5ac8471ecdabp-741p + 0x1.48aa457eda0eep+575p * i
div:      0x1.5ac8471ecdabp-741p + 0x1.48aa457eda0eep+575p * i
calc_mul: -0x1.07a6046b053p+410p + infp * i
mul:      -0x1.07a6046b053p+410p + infp * i

$ gcc -O2 t.c; ./a.out
calc_div: 0x1.5ac8471ecdabp-741p + 0x1.48aa457eda0eep+575p * i
div:      -0x1.4d35f7c831ef5p-746p + 0x1.48aa457eda0eep+575p * i
calc_mul: -0x1.07a6046b053p+410p + infp * i
mul:      -0x1.07a6046b053p+410p + infp * i

For division, the correctly rounded result is the one for "div" with -O2 (negative real part).

For multiplication, all results are incorrectly rounded, we should get -0x1.07a6046b05347p410 if I'm correct.

I understand there is no accuracy requirement, but I thought with -O2 gcc was using GNU MPC to get correct rounding, thus with -O2 both the "div" and "mul" results should be correctly rounded. Why isn't it the case for mul?

Comment 10 Marc Glisse 2020-02-11 08:18:38 UTC

Flags like -ftrapping-math can prevent gcc from folding at compile-time when the result is infinite (or maybe it always refuses to fold in that case). In your example, gcc generates a runtime call to __muldc3 in main. The behavior with -fno-trapping-math is a bit surprising. CCP folds the division but not the multiplication (not even with -ffast-math)). CPLXLOWER turns it into __muldc3, and DOM2 folds that to a (bad?) constant.

So in some sense, when gcc fails to fold to a constant, then we go to the runtime behavior, and gcc can still fold that one to a constant (not as good as what it could get in the first round).