Bug 36300 - [4.2 Regression] Incorrect type used for inlined expression
Summary: [4.2 Regression] Incorrect type used for inlined expression
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.1.2
: P2 major
Target Milestone: 4.3.1
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2008-05-22 14:39 UTC by Alexander Carmeli
Modified: 2009-03-31 15:39 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work: 2.95.4 4.3.1 4.4.0
Known to fail: 3.3.6 4.1.3 4.2.4 4.3.0 4.2.5
Last reconfirmed: 2008-05-22 16:11:36


Attachments
Wrong result if expression is inlined (863 bytes, text/plain)
2008-05-22 14:43 UTC, Alexander Carmeli
Details
Steps to reach numerically incorrect/impossible result (802 bytes, text/plain)
2008-05-28 11:21 UTC, Alexander Carmeli
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Alexander Carmeli 2008-05-22 14:39:37 UTC
System: Debian 2.6.22.8-mw017
GCC: gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

In the expression below:
- VALUE has type uint32_T
- Y gets a wrong result.
- If VALUE is assigned to an int32_T variable, and the variable is used instead of VALUE, Y gets the correct result.

See attached file missing_downcast.c for more details and reproduction steps

-----------------

#define VALUE ((int32_T)((int64_T)U1 * (int64_T)3 >> 2) + 2)
  Y = (int32_T)(   (int64_T)(VALUE * VALUE) 
                 * 
                   (int64_T)954437177
                 >> 
                   29
               );
Comment 1 Alexander Carmeli 2008-05-22 14:43:26 UTC
Created attachment 15671 [details]
Wrong result if expression is inlined

At the bottom of the file there are:
- Instructions on how to compile
- Version of OS and GCC on which it failed
Comment 2 Richard Biener 2008-05-22 16:10:38 UTC
  t = (int) ((long int) U1 * 3 >> 2) + 2;
  Y = (int) ((long int) (t * t) * 954437177 >> 29);

if combined is folded to

  Y = (int) (((long int) (int) ((long int) U1 * 3 >> 2) * 954437177 + 1908874354) * (long int) ((int) ((long int) U1 * 3 >> 2) + 2) >> 29);

We can see that for

int foo(int x)
{
  return (long)(((int)((long)x * 3) + 2) * (int)((long)x * 3)) * 2;
}

the conversion to long is distributed on the operands and the
multiplication by 2 is then re-associated.

I bet this is extract_muldiv at its work again.
Comment 3 Richard Biener 2008-05-22 16:11:36 UTC
I will try to have a look.
Comment 4 Richard Biener 2008-05-26 09:27:40 UTC
Reduced testcase, fails with 32bit and 64bit, works with 2.95.4.

extern void abort (void);

#define VALUE ((int)((long long)U1 * (long long)3) + 2)

int main(void)
{
  int U1;
  long long Y, Y2;
  int t;

  U1 = -2147483647-1;

  Y = ((long long)(VALUE * VALUE) * 3);

  t = VALUE;
  Y2 = ((long long)(t * t) * 3);

  if (Y != Y2)
    abort ();
  return 0;
}
Comment 5 Richard Biener 2008-05-26 10:12:26 UTC
We enter extract_muldiv with

(long long int) (((int) ((unsigned int) U1 * 3) + 2) * ((int) ((unsigned int) U1 * 3) + 2))

and 3 and go down the widening route with long long and then the same operand
of MULT_EXPR route applying a long long widened multiplication to

(int) ((unsigned int) U1 * 3) + 2

and 3 resulting in

(long long int) (int) ((unsigned int) U1 * 3) * 3 + 6


I would say this bug is invalid as it assumes we need to preserve truncations
of overflowed computations on signed integers, but with -fwrapv the same
error happens.

Testcase that works:

extern void abort (void);

#define VALUE (unsigned int)((int)((long long)U1 * (long long)3) + 2)

int main(void)
{
  int U1;
  long long Y, Y2;
  unsigned int t;

  U1 = -2147483647-1;

  Y = ((long long)(int)(VALUE * VALUE) * 3);

  t = VALUE;
  Y2 = ((long long)(int)(t * t) * 3);

  if (Y != Y2)
    abort ();
  return 0;
}
Comment 6 Richard Biener 2008-05-26 12:39:03 UTC
Subject: Bug 36300

Author: rguenth
Date: Mon May 26 12:38:19 2008
New Revision: 135913

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=135913
Log:
2008-05-26  Richard Guenther  <rguenther@suse.de>

	PR middle-end/36300
	* fold-const.c (extract_muldiv_1): Use TYPE_OVERFLOW_WRAPS,
	not TYPE_UNSIGNED.  Use TYPE_PRECISION instead of GET_MODE_SIZE.

	* gcc.dg/pr36300-1.c: New testcase.
	* gcc.dg/pr36300-2.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.dg/pr36300-1.c
    trunk/gcc/testsuite/gcc.dg/pr36300-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/testsuite/ChangeLog

Comment 7 Richard Biener 2008-05-26 12:39:22 UTC
-fwrapv case fixed for 4.4.0.
Comment 8 Alexander Carmeli 2008-05-27 12:50:49 UTC
Richard,

I truly appreciate your effort on this bug.

Can you please elaborate on your plan? Will this be fixed for -fwrapv only, or also for -fwrapv-free compilations?

I understand that downcasting an out-of-range value to a signed integer is not defined in the standard. However, many compilers offer an "expected" modulo-2^n behavior. This behavior helps avoid writing extra cumbersome "safe downcast" code. Having GCC offer it for non -fwrapv compilations is an important interoperability and readability feature.

Thank you for looking into it.

Alex.
Comment 9 rguenther@suse.de 2008-05-27 12:53:15 UTC
Subject: Re:  Incorrect type used for inlined expression

On Tue, 27 May 2008, acarmeli at mathworks dot com wrote:

> ------- Comment #8 from acarmeli at mathworks dot com  2008-05-27 12:50 -------
> Richard,
> 
> I truly appreciate your effort on this bug.
> 
> Can you please elaborate on your plan? Will this be fixed for -fwrapv only, or
> also for -fwrapv-free compilations?
> 
> I understand that downcasting an out-of-range value to a signed integer is not
> defined in the standard. However, many compilers offer an "expected" modulo-2^n
> behavior. This behavior helps avoid writing extra cumbersome "safe downcast"
> code. Having GCC offer it for non -fwrapv compilations is an important
> interoperability and readability feature.
> 
> Thank you for looking into it.

This is only fixed for -fwrapv (this is the "expected" modulo-2^n behavior 
switch).  It was never broken for the defined variant with the casts
to unsigned before the multiplication.

Richard.
Comment 10 Alexander Carmeli 2008-05-28 11:21:26 UTC
Created attachment 15691 [details]
Steps to reach numerically incorrect/impossible result

Compile: gcc bad_numeric.c

Builds up expression until it breaks numerically.
Comment 11 Alexander Carmeli 2008-05-28 11:30:31 UTC
I believe there is still a violation of the C standard when expression folding folds this.

The attached file demonstrates this.

In step 7, we perform 4*3 in 32-bit and expect to get a 12 in a 64-bit.
It should be possible to perform this without overflow and represent in 64-bit.
It should be possible to divide the result in 3 and get the original number: 4

However, the folded expression does not permit this to happen.
Comment 12 Richard Biener 2008-05-28 11:48:59 UTC
We are probably not honoring the implementation defined signed truncation(s).
But my head hurts looking at the testcase vs. the folding code ...
Comment 13 Richard Biener 2008-05-28 12:11:22 UTC
In step5 we have signed overflow in the multiplication of -2147483646
with itself.  Thus starting from that, the following results are unreliable.
Comment 14 Alexander Carmeli 2008-05-28 12:25:27 UTC
You are correct. But step 6 did not reveal that, and provided a correct s64 result. This result should have been carried over to step 7 to lead to a correct result.

Otherwise, it will not be possible to get the result through a folded expression.

It seems to me that adding more to the expression changes the optimization to inner expressions in a way that compromizes the numerical stability.

We also tried masking to eliminate unwanted bits. The masks seem to have no effect and we suspect they get optimized out. It was an unfavorable solution anyway, due to the cost of the mask.

In effect, there is no reliable way to grow a folded expression and consistently preserve the result of inner expressions. The only solution we have is temporary variables, which might get optimized out and folded as well.
Comment 15 rguenther@suse.de 2008-05-28 12:57:53 UTC
Subject: Re:  Incorrect type used for inlined expression

On Wed, 28 May 2008, acarmeli at mathworks dot com wrote:

> ------- Comment #14 from acarmeli at mathworks dot com  2008-05-28 12:25 -------
> You are correct. But step 6 did not reveal that, and provided a correct s64
> result. This result should have been carried over to step 7 to lead to a
> correct result.

Well, step 6 has a "correct" result by luck - the undefinedness doesn't
expose an optimization opportunity.

In step 7 the result from step 6 is not used, but instead its expression
is inserted and the intermediate result from step 6 is unused (and so
is combined with the added expression, exploiting the undefined behavior
for a transformation the folding code thinks is worthwhile).

> Otherwise, it will not be possible to get the result through a folded
> expression.

The code invokes undefined behavior if the expression in step 5 overflows.
Undefined behavior can be also strange and "inconsistent" behavior as
you see in your testcase.

> It seems to me that adding more to the expression changes the optimization to
> inner expressions in a way that compromizes the numerical stability.

Yep.  The folding code is run early and sees the whole C expression.
Obviously this exploits more "interesting" opportunities to mangle
the expressions.

> We also tried masking to eliminate unwanted bits. The masks seem to have no
> effect and we suspect they get optimized out. It was an unfavorable solution
> anyway, due to the cost of the mask.
> 
> In effect, there is no reliable way to grow a folded expression and
> consistently preserve the result of inner expressions. The only solution we
> have is temporary variables, which might get optimized out and folded as well.

True.  The only reliable way is to only write (partial) expressions that
do not invoke undefined behavior in any case ;)  This can be either
done by carefully doing operations that may overflow in unsigned
arithmetic or by providing -fwrapv as a compiler option.

Richard.
Comment 16 Richard Biener 2008-05-28 13:46:23 UTC
Fixed for 4.3.1.  Unassigning.
Comment 17 Richard Biener 2008-05-28 13:46:55 UTC
Subject: Bug 36300

Author: rguenth
Date: Wed May 28 13:45:47 2008
New Revision: 136085

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=136085
Log:
2008-05-28  Richard Guenther  <rguenther@suse.de>

	PR middle-end/36300
	* fold-const.c (extract_muldiv_1): Use TYPE_OVERFLOW_WRAPS,
	not TYPE_UNSIGNED.  Use TYPE_PRECISION instead of GET_MODE_SIZE.

	* gcc.dg/pr36300-1.c: New testcase.
	* gcc.dg/pr36300-2.c: Likewise.

Added:
    branches/gcc-4_3-branch/gcc/testsuite/gcc.dg/pr36300-1.c
    branches/gcc-4_3-branch/gcc/testsuite/gcc.dg/pr36300-2.c
Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/fold-const.c
    branches/gcc-4_3-branch/gcc/testsuite/ChangeLog

Comment 18 Joseph S. Myers 2008-07-04 23:03:18 UTC
Closing 4.1 branch.
Comment 19 Joseph S. Myers 2009-03-31 15:39:11 UTC
Closing 4.2 branch, fixed in 4.3.1 and 4.4.