Bug 55481

Summary: [4.8 regression] -O2 generates a wrong-code infinite loop in C++Benchmark's simple_types_constant_folding int8 xor test
Product: gcc Reporter: Matt Hargett <matt>
Component: middle-endAssignee: Richard Biener <rguenth>
Status: RESOLVED FIXED    
Severity: normal CC: rakdver
Priority: P1    
Version: 4.8.0   
Target Milestone: 4.7.3   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2012-11-27 00:00:00
Attachments: zip containing preprocessed source of reduced examples and multiple binaries. only gcc48-O2 exhibits the bug.
reduced testcase
A proposed patch

Description Matt Hargett 2012-11-27 01:08:37 UTC
With current trunk at -O2, the attached preprocessed code emits what appears to be an infinite loop (linux/x64 binaries also attached). GCC 4.7 at -O2 and GCC 4.8 ant -O[13] do not exhibit the problem. I tried adding individual flags to -O1 to narrow the problem but wasn't able to reproduce.
Comment 1 Matt Hargett 2012-11-27 01:09:28 UTC
Created attachment 28784 [details]
zip containing preprocessed source of reduced examples and multiple binaries. only gcc48-O2 exhibits the bug.
Comment 2 Andrew Pinski 2012-11-27 01:10:07 UTC
I bet there is an access out of bounds in the loop that produces the infinite loop.
Comment 3 Matt Hargett 2012-11-27 02:11:29 UTC
Actually, the same problem happens at -O3 with const int SIZE > 20. base_iterations can be very high; it's just SIZE that's the problem.
Comment 4 Richard Biener 2012-11-27 10:40:45 UTC
Infinite loops are a sign for out-of-bound array accesses, 4.8 now very
aggressively exploits undefined behavior that this triggers.

Btw, it also happens when I declare data8 as

int8_t data8[SIZE*2 ];

so it does look like genuine bug in GCC.  It loops in

#0  0x00000000004009cc in test_constant<signed char, custom_or_constants<signed char> > (count=13, label=0x4012da "int8_t or constants", 
    first=0x6020d0 <data8> '\001' <repeats 13 times>)
    at benchmark_shared_tests.h:661

btw.
Comment 5 Richard Biener 2012-11-27 10:41:34 UTC
Created attachment 28790 [details]
reduced testcase

Reduced testcase attached.
Comment 6 Richard Biener 2012-11-27 11:02:39 UTC
More reduced testcase, fails when compiled at -O2 with the C++ frontend,
passes compiled with the C frontend ...

typedef signed char int8_t;
extern void abort (void);

#define SIZE 13
double init_value = 1.0;
int8_t data8[SIZE];

static inline int8_t __attribute__((always_inline))
do_shift(int8_t input) { return ((int8_t)(23) | (int8_t)(10)); }

int main(int argc, char** argv)
{
  int8_t *first = data8;
  int8_t *last = data8+SIZE;
  while (first != last) *first++ = (int8_t)(init_value);
  first = data8;
  int i;
  for(i = 0; i < 1000; ++i)
    {
      int8_t result = 0;
      int n;
      for (n = 0; n < SIZE; ++n)
        result += do_shift( first[n] );
      int8_t temp = (int8_t)SIZE * do_shift((int8_t)init_value);
      if (result != temp)
        abort ();
    }
  return 0;
}
Comment 7 Richard Biener 2012-11-27 11:12:03 UTC
Note that the frontends warn with

lucky13x.c: In function 'int main(int, char**)':
lucky13x.c:24:64: warning: overflow in implicit constant conversion [-Woverflow]
       int8_t temp = (int8_t)SIZE * ((int8_t)(23) | (int8_t)(10));

when manually inlining do_shift at the checking point:

      int8_t temp = (int8_t)SIZE * ((int8_t)(23) | (int8_t)(10));
      if (result != temp)
        abort ();

this computes 13 * 31 which is 403 so the warning is correct.  Fixed with
an extra cast.  This also means that

      for (n = 0; n < SIZE; ++n)
        result += do_shift( first[n] );

triggers the bug about doing the addition in the wrong type and the bug
is fixed by rewriting this as

        result = result + do_shift (first[n]);

that is, self-modify is not correctly doing the addition in 'int'.  And thus
this is a dup of PR35634 I believe (-fwrapv also fixes this).
Comment 8 Richard Biener 2012-11-27 16:10:29 UTC
The following even more reduced testcase is not fixed by the patch pending
for PR35634.

typedef signed char int8_t;

#define SIZE 13

static inline int8_t do_shift() { return 31; }

int main(int argc, char** argv)
{
  int i;
  for(i = 0; i < 1000; ++i)
    {
      int8_t result = 0;
      int n;
      for (n = 0; n < SIZE; ++n)
        result += do_shift();
      int8_t temp = (int8_t)((int8_t)SIZE * ((int8_t)(23) | (int8_t)(10)));
      if (result != temp)
        __builtin_abort ();
    }
  return 0;
}


With the C++ frontend we correctly do

        D.2194 = do_shift ();
        D.2207 = (int) result;
        D.2208 = (int) D.2194;
        D.2209 = D.2207 + D.2208;
        result = (int8_t) D.2209;

anyway, but then IVOPTs comes around and changes it:

   <bb 4>:
   # result_17 = PHI <result_8(3), 0(7)>
-  # n_18 = PHI <n_9(3), 0(7)>
-  # ivtmp_12 = PHI <ivtmp_13(3), 13(7)>
   _6 = (int) result_17;
   _7 = _6 + 31;
-  result_8 = (int8_t) _7;
-  n_9 = n_18 + 1;
-  ivtmp_13 = ivtmp_12 - 1;
-  if (ivtmp_13 != 0)
+  _1 = result_17 + 31;
+  result_8 = _1;
+  if (result_8 != -109)
     goto <bb 3>;

performing the computation in int8_t again which causes VRP to "miscompile"
the test.
Comment 9 Richard Biener 2012-11-28 15:14:20 UTC
Testcase that fails (infinite loop) with both the C and the C++ frontend at -O2:

int main()
{
  signed char result = 0;
  int n;
  for (n = 0; n < 13; ++n)
    {
      int tem = result;
      tem = tem + 31;
      result = tem;
    }
  if (result != -109)
    __builtin_abort ();
  return 0;
}
Comment 10 Richard Biener 2012-11-28 15:16:00 UTC
Caused by

2012-06-27  Richard Guenther  <rguenther@suse.de>

        PR middle-end/53676
        * tree-chrec.c (chrec_convert_1): Represent truncation to
        a type with undefined overflow as truncation to an unsigned
        type converted to the type with undefined overflow.
        * tree-scalar-evolution.c (interpret_rhs_expr): For computing
        the scalar evolution of a truncated widened operation avoid
        looking at the non-existing evolution of the widened operation
        result.
Comment 11 Richard Biener 2012-11-28 15:16:51 UTC
Mine.
Comment 12 Richard Biener 2012-11-28 15:38:46 UTC
We have

@@ -15,22 +15,11 @@
   (scalar = result_11)
 (get_scalar_evolution 
   (scalar = result_11)
-  (scalar_evolution = result_11))
+  (scalar_evolution = (signed char) {0, +, 31}_1))

that isn't wrong.  But IVOPTs happily uses STRIP_NOPs which also strips
sign-conversions.  simple_iv returns true for result_11 in

  signed char result;

  <bb 3>:
  # result_11 = PHI <result_5(4), 0(2)>
  # ivtmp_10 = PHI <ivtmp_9(4), 13(2)>
  tem_3 = (int) result_11;
  tem_4 = tem_3 + 31;
  result_5 = (signed char) tem_4;
  ivtmp_9 = ivtmp_10 - 1;
  if (ivtmp_9 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 4>:
  goto <bb 3>;

now, but iv->no_overflow is false (and IVOPTs nowhere uses that flag ...).

I can fix this for example with

Index: tree-ssa-loop-ivopts.c
===================================================================
--- tree-ssa-loop-ivopts.c      (revision 193887)
+++ tree-ssa-loop-ivopts.c      (working copy)
@@ -982,6 +982,9 @@ determine_biv_step (gimple phi)
   if (!simple_iv (loop, loop, name, &iv, true))
     return NULL_TREE;
 
+  if (!iv.no_overflow)
+    return NULL_TREE;
+
   return integer_zerop (iv.step) ? NULL_TREE : iv.step;
 }
 
but I'm not sure what invariants should hold for BIVs and if the overflow
check should happen in a different place instead.  Zdenek?
Comment 13 rakdver 2012-11-28 16:19:11 UTC
> now, but iv->no_overflow is false (and IVOPTs nowhere uses that flag ...).
> 
> I can fix this for example with
> 
> Index: tree-ssa-loop-ivopts.c
> ===================================================================
> --- tree-ssa-loop-ivopts.c      (revision 193887)
> +++ tree-ssa-loop-ivopts.c      (working copy)
> @@ -982,6 +982,9 @@ determine_biv_step (gimple phi)
>    if (!simple_iv (loop, loop, name, &iv, true))
>      return NULL_TREE;
> 
> +  if (!iv.no_overflow)
> +    return NULL_TREE;
> +
>    return integer_zerop (iv.step) ? NULL_TREE : iv.step;
>  }
> 
> but I'm not sure what invariants should hold for BIVs and if the overflow
> check should happen in a different place instead.  Zdenek?

Ivopts transformations are intended to work even with wrapping induction variables;
so this does not seem to be the right place to fix the problem.  I will have
to have a closer look to see what goes wrong.
Comment 14 Zdenek Dvorak 2012-12-10 18:19:07 UTC
This is a problem in rewrite_use_nonlinear_expr, which should leave the statement defining the biv untouched (as suggested in the comment at its beginning) but does not.  Investigating further...
Comment 15 Zdenek Dvorak 2012-12-11 18:19:40 UTC
Created attachment 28928 [details]
A proposed patch

This patch fixes the error (and also makes the code more clearly match what is described in the comments).  Long story: we try to avoid rewriting computations that define bivs, as that is usually unnecessary.  However, there is a special case where rewriting is still needed; e.g.,

loop
{
  tmp = biv + 1;
  use (tmp);
  biv = tmp + 1;   (*)
}

Ivopts may decide to eliminate the temporary variable tmp by expressing its use in some other way (based on a different biv, etc.).  So, if we did not rewrite (*), it would refer to a variable that is no longer defined.

And in this case, we used a wrong way of expressing the biv computation that may cause signed overflows.  The patch makes us instead to fall back to the general rewriting case, which does not have this problem.  Untested (beyond the single testcase + C-only bootstrap).
Comment 16 Richard Biener 2012-12-12 09:43:18 UTC
Thanks Zdenek - I'll give the patch further testing.
Comment 17 Richard Biener 2012-12-12 13:07:26 UTC
Author: rguenth
Date: Wed Dec 12 13:07:19 2012
New Revision: 194444

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=194444
Log:
2012-12-12  Zdenek Dvorak  <ook@ucw.cz>

	PR tree-optimization/55481
	* tree-ssa-loop-ivopts.c (rewrite_use_nonlinear_expr): Fall
	back to general rewriting if we cannot leave an original biv
	definition alone.

	* gcc.dg/torture/pr55481.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr55481.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-loop-ivopts.c
Comment 18 Richard Biener 2012-12-12 13:08:06 UTC
Fixed.
Comment 19 Richard Biener 2013-03-01 12:23:57 UTC
*** Bug 56488 has been marked as a duplicate of this bug. ***
Comment 20 Richard Biener 2013-03-01 12:29:46 UTC
Author: rguenth
Date: Fri Mar  1 12:29:39 2013
New Revision: 196377

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=196377
Log:
2013-03-01  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/55481
	* gcc.dg/torture/pr56488.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr56488.c
Modified:
    trunk/gcc/testsuite/ChangeLog
Comment 21 Richard Biener 2013-03-01 13:55:15 UTC
Fixed for 4.7.3 as well.
Comment 22 Richard Biener 2013-03-01 13:55:19 UTC
Author: rguenth
Date: Fri Mar  1 13:55:11 2013
New Revision: 196379

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=196379
Log:
2013-03-01  Richard Biener  <rguenther@suse.de>

	Backport from mainline
	2012-12-12  Zdenek Dvorak  <ook@ucw.cz>

	PR tree-optimization/55481
	* tree-ssa-loop-ivopts.c (rewrite_use_nonlinear_expr): Fall
	back to general rewriting if we cannot leave an original biv
	definition alone.

	* gcc.dg/torture/pr55481.c: New testcase.
	* gcc.dg/torture/pr56488.c: Likewise.

Added:
    branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/torture/pr55481.c
    branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/torture/pr56488.c
Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_7-branch/gcc/tree-ssa-loop-ivopts.c