Bug 66375 - [4.9 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
Summary: [4.9 Regression] wrong code at -O2 and -O3 on x86_64-linux-gnu
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 6.0
: P2 normal
Target Milestone: 4.9.4
Assignee: Richard Biener
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2015-06-02 05:43 UTC by Zhendong Su
Modified: 2016-02-11 13:45 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work: 5.2.0, 6.0
Known to fail: 4.9.3, 5.1.0
Last reconfirmed: 2015-06-02 00:00:00


Attachments
patch (869 bytes, patch)
2015-06-02 14:34 UTC, Richard Biener
Details | Diff
Backport to gcc-4_9-branch/gcc-4.9.2: PR tree-optimization/66375 (wrong code fix!) (1012 bytes, patch)
2015-06-09 23:55 UTC, Bernhard Kaindl
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Zhendong Su 2015-06-02 05:43:36 UTC
The current gcc trunk miscompiles the following code on x86_64-linux at -O2 and -O3 in both 32-bit and 64-bit modes.  

It also affects gcc 4.8.x, 4.9.x and 5.1.x. This is a regression from 4.7.x.


$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-unknown-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-trunk/configure --prefix=/usr/local/gcc-trunk --enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
gcc version 6.0.0 20150601 (experimental) [trunk revision 223995] (GCC) 
$  
$ gcc-trunk -Os small.c; ./a.out
$ gcc-4.7 -O3 small.c; ./a.out
$ 
$ gcc-trunk -O2 small.c
$ ./a.out
Aborted (core dumped)
$ 


---------------------------------


int a, b, c, d[6];

int
main ()
{
  c = 0;
  for (; a < 14; a++)
    {
      int i = 0;
      for (; i < 6; i++)
        d[i] = 11;
      char e = b = c;
      c = e - d[0];
    }

  if (b != 113)
    __builtin_abort();

  return 0;
}
Comment 1 Andrew Pinski 2015-06-02 05:49:16 UTC
Works for me on aarch64-linux-gnu.
Comment 2 Jakub Jelinek 2015-06-02 06:16:03 UTC
Started with r185691.
Comment 3 Richard Biener 2015-06-02 08:50:14 UTC
I will have a looksee.
Comment 4 Richard Biener 2015-06-02 09:31:24 UTC
PRE doesn't seem to do sth wrong here.  The testcase can be fixed by for example disabling VRP2.  VRP2 needs either store-motion or complete unrolling to
make it trigger the miscompile.  Nicest IL before VRP2 is with -fno-ivopts -fno-tree-loop-im.  The transform VRP2 does is

Folding statement: if (a.4_17 <= 13)
Simplified relational if (a.4_17 <= 13)
 into if (a.4_17 != 14)

and

  Registering jump thread: (6, 7) incoming edge;  (7, 8) normal;
  Threaded jump 6 --> 7 to 12

  <bb 6>:
  # prephitmp_3 = PHI <prephitmp_22(5)>
  # prephitmp_39 = PHI <prephitmp_22(5)>

  <bb 7>:
  # prephitmp_26 = PHI <prephitmp_39(6), pretmp_23(3)>
  if (prephitmp_26 != 113)
    goto <bb 8>;
  else
    goto <bb 9>;

  <bb 8>:
  __builtin_abort ();

so it looks jump-threading related to me.  Somehow the equivalences I
see being recorded miss the backedge for prephitmp_22 but only record zero.
Comment 5 Richard Biener 2015-06-02 10:31:33 UTC
Hum.

  <bb 5>:
  # prephitmp_22 = PHI <0(4), c.2_15(10)>
...
  e_12 = (char) prephitmp_22;
  _13 = (int) e_12;
...
  c.2_15 = _13 + -11;

Simulating statement (from ssa_edges): prephitmp_22 = PHI <0(4), c.2_15(10)>

Visiting PHI node: prephitmp_22 = PHI <0(4), c.2_15(10)>
    Argument #0 (4 -> 5 executable)
        0: [0, 0]
    Argument #1 (10 -> 5 executable)
        c.2_15: [-22, -11]
Meeting
  [0, 0]
and
  [-22, -11]
to
  [-22, 0]
...
Found new range for prephitmp_22: [-2147483647, 0]

...

Visiting statement:
_13 = (int) e_12;
Intersecting
  [-128, 127]
and
  [-128, 127]
to
  [-128, 127]
Found new range for _13: [-128, 127]
marking stmt to be not simulated again

Simulating statement (from ssa_edges): c.2_15 = _13 + -11;

Visiting statement:
c.2_15 = _13 + -11;
Found new range for c.2_15: [-139, 116]
marking stmt to be not simulated again

Simulating statement (from ssa_edges): prephitmp_22 = PHI <0(4), c.2_15(10)>

Visiting PHI node: prephitmp_22 = PHI <0(4), c.2_15(10)>
    Argument #0 (4 -> 5 executable)
        0: [0, 0]
    Argument #1 (10 -> 5 executable)
        c.2_15: [-139, 116]
Meeting
  [0, 0]
and
  [-139, 116]
to
  [-139, 116]
marking stmt to be not simulated again

(note no "Found new range for prephitmp_22" here!)

Value ranges after VRP:

prephitmp_22: [-2147483647, 0]

oops.

This seems to be because we drop to [-2147483647, 2147483646] but then
adjust_range_with_scev computes {0, +, -11}_1 for _22  which is obviously
wrong.

It looks like the CHREC gets built from

static t_bool
follow_ssa_edge_binary (struct loop *loop, gimple at_stmt,
                        tree type, tree rhs0, enum tree_code code, tree rhs1,
                        gphi *halting_phi, tree *evolution_of_loop,
                        int limit)
{
...
  switch (code)
    {
    case POINTER_PLUS_EXPR:
    case PLUS_EXPR:
      if (TREE_CODE (rhs0) == SSA_NAME)
        {
          if (TREE_CODE (rhs1) == SSA_NAME)
            {
...
          else
            {
              /* Match an assignment under the form:
                 "a = b + ...".  */
              res = follow_ssa_edge
                (loop, SSA_NAME_DEF_STMT (rhs0), halting_phi,
                 evolution_of_loop, limit);
              if (res == t_true)
                *evolution_of_loop = add_to_evolution
                  (loop->num, chrec_convert (type, *evolution_of_loop,
                                             at_stmt),
                   code, rhs1, at_stmt);

and what goes wrong is follow_ssa_edge skipping the (unsigned char) truncation.

Still digging...
Comment 6 Richard Biener 2015-06-02 11:20:58 UTC
Indeed as we just feed the initial condition to chrec_convert it happily just
fold_convert()s the zero to signed char and then back to int ...

So

              res = follow_ssa_edge
                (loop, SSA_NAME_DEF_STMT (rhs0), halting_phi,
                 evolution_of_loop, limit);
              if (res == t_true)
                *evolution_of_loop = add_to_evolution
                  (loop->num, chrec_convert (type, *evolution_of_loop,
                                             at_stmt),
                   code, rhs1, at_stmt);

doesn't work at all (the follow_ssa_edge call just picking up the initial
value converted to the type of the add).

static tree
analyze_evolution_in_loop (gphi *loop_phi_node,
                           tree init_cond)
{
...
          /* Pass in the initial condition to the follow edge function.  */
          ev_fn = init_cond;
          res = follow_ssa_edge (loop, ssa_chain, loop_phi_node, &ev_fn, 0);

I suppose a trick would be to always pass a symbolic initial condition
but then this would likely pessimize code.  With that we get

(add_to_evolution
  (loop_nb = 1)
  (chrec_before = (int) (signed char) D.1876)
  (to_add = -11)
  (res = {(int) (signed char) D.1876, +, -11}_1))
  (evolution_function = {(int) (signed char) 0, +, -11}_1))

Hmm, but that isn't correct either.  We want (int)(signed char){(int)0, +, -11)_1


Sorter testcase pin-pointing the issue in VRP/SCEV:


int a;
extern void abort (void);
int main ()
{
  int c = 0;
  for (; a < 13; ++a)
    c = (signed char)c - 11;
  if (c != 113)
    abort ();
  return c;
}
Comment 7 Richard Biener 2015-06-02 11:56:25 UTC
Sebastian - it seems that we have to somehow tell the analysis phase that we
are analyzing an evolution, not just an initial value.  Like instead of

              /* Match an assignment under the form:
                 "a = b + ...".  */
              res = follow_ssa_edge
                (loop, SSA_NAME_DEF_STMT (rhs0), halting_phi,
                 evolution_of_loop, limit);
              if (res == t_true)
                *evolution_of_loop = add_to_evolution
                  (loop->num, chrec_convert (type, *evolution_of_loop,
                                             at_stmt),
                   code, rhs1, at_stmt);

doing the add_to_evolution first and only then following the SSA edge.
That seems to kind-of work, though the debug output is strange:

(analyze_scalar_evolution
  (loop_nb = 1)
  (scalar = c_1)
(get_scalar_evolution
  (scalar = c_1)
  (scalar_evolution = ))
(analyze_initial_condition
  (loop_phi_node =
c_1 = PHI <0(2), c_7(3)>
)
  (init_cond = 0))
(analyze_evolution_in_loop
  (loop_phi_node = c_1 = PHI <0(2), c_7(3)>
)
(add_to_evolution
  (loop_nb = 1)
  (chrec_before = 0)
  (to_add = -11)
  (res = {0, +, -11}_1))
... computing niter, from chrec_convert stuff I guess
  (evolution_function = (int) (signed char) {0, +, 245}_1))

(that 245 looks wrong)

but at least the scalar evolution ends up (correctly) as being not know...

That is with

Index: tree-scalar-evolution.c
===================================================================
--- tree-scalar-evolution.c     (revision 224013)
+++ tree-scalar-evolution.c     (working copy)
@@ -991,15 +991,15 @@ follow_ssa_edge_binary (struct loop *loo
            {
              /* Match an assignment under the form:
                 "a = b + ...".  */
+             *evolution_of_loop = add_to_evolution
+                 (loop->num, chrec_convert (type, *evolution_of_loop,
+                                            at_stmt),
+                  code, rhs1, at_stmt);
              res = follow_ssa_edge
                (loop, SSA_NAME_DEF_STMT (rhs0), halting_phi,
                 evolution_of_loop, limit);
              if (res == t_true)
-               *evolution_of_loop = add_to_evolution
-                 (loop->num, chrec_convert (type, *evolution_of_loop,
-                                            at_stmt),
-                  code, rhs1, at_stmt);
-
+               ;
              else if (res == t_dont_know)
                *evolution_of_loop = chrec_dont_know;
            }

but as said, I don't think this is correct ...?  (just thrown it to testing)
Comment 8 Richard Biener 2015-06-02 14:33:33 UTC
Bootstrapped / tested on x86_64-unknown-linux-gnu with no regressions...
testing a more complete fix (applying this to all cases in follow_ssa_edge_binary)
now.
Comment 9 Richard Biener 2015-06-02 14:34:05 UTC
Created attachment 35678 [details]
patch

What I am testing.
Comment 10 Richard Biener 2015-06-03 07:57:45 UTC
Author: rguenth
Date: Wed Jun  3 07:57:13 2015
New Revision: 224060

URL: https://gcc.gnu.org/viewcvs?rev=224060&root=gcc&view=rev
Log:
2015-06-03  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/66375
	* tree-scalar-evolution.c (follow_ssa_edge_binary): First
	add to the evolution before following SSA edges.

	* gcc.dg/torture/pr66375.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr66375.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-scalar-evolution.c
Comment 11 Bernhard Kaindl 2015-06-09 23:55:39 UTC
Created attachment 35735 [details]
Backport to gcc-4_9-branch/gcc-4.9.2: PR tree-optimization/66375 (wrong code fix!)

Backport from mainline to gcc-4_9-branch (also applies to gcc-4.9.2):

Note: Fixes wrong code for x86 and x86_64 targets
      (other targets apparently not affected)

URL: <this attachment>
Log:
2015-06-10  Bernhard Kaindl <bernhard.kaindl@thalesgroup.com>

       Backport from mainline
       2015-06-03  Richard Biener  <rguenther@suse.de>

       PR tree-optimization/66375

       * tree-scalar-evolution.c (follow_ssa_edge_binary): First
       add to the evolution before following SSA edges.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr66375.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-scalar-evolution.c

The context of the ChangeLogs is removed, so it applies using "patch -p1" without ever causing a conflict (other changes may be merged before this,
the ChangeLogs are automatically added on top without context matching),
but it won't apply with "git apply" this way.
Comment 12 Richard Biener 2015-06-10 07:01:25 UTC
Thanks, I am going to do another round of backporting today or tomorrow.
Comment 13 Richard Biener 2015-06-18 14:04:37 UTC
Author: rguenth
Date: Thu Jun 18 14:04:05 2015
New Revision: 224610

URL: https://gcc.gnu.org/viewcvs?rev=224610&root=gcc&view=rev
Log:
2015-06-18  Richard Biener  <rguenther@suse.de>

	Backport from mainline
	2015-06-03  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/66375
	* tree-scalar-evolution.c (follow_ssa_edge_binary): First
	add to the evolution before following SSA edges.

	* gcc.dg/torture/pr66375.c: New testcase.

Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr66375.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
    branches/gcc-5-branch/gcc/tree-scalar-evolution.c
Comment 14 Richard Biener 2015-06-23 08:19:08 UTC
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.
Comment 15 Jakub Jelinek 2015-06-26 19:55:12 UTC
GCC 4.9.3 has been released.
Comment 16 Richard Biener 2016-02-11 13:41:04 UTC
Author: rguenth
Date: Thu Feb 11 13:40:31 2016
New Revision: 233344

URL: https://gcc.gnu.org/viewcvs?rev=233344&root=gcc&view=rev
Log:
2016-02-11  Richard Biener  <rguenther@suse.de>

	Backport from mainline
	2015-02-18  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/62217
	* tree-ssa-dom.c (cprop_operand): Avoid propagating copies
	into BIVs.

	* gcc.dg/tree-ssa/cunroll-11.c: New testcase.

	2015-06-18  Richard Biener  <rguenther@suse.de>

	Backport from mainline
	2015-06-03  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/66375
	* tree-scalar-evolution.c (follow_ssa_edge_binary): First
	add to the evolution before following SSA edges.

	* gcc.dg/torture/pr66375.c: New testcase.

	2015-06-23  Richard Biener  <rguenther@suse.de>

	Backport from mainline
	2015-06-09  Richard Biener  <rguenther@suse.de>

	PR middle-end/66413
	* tree-inline.c (insert_init_debug_bind): Unshare value.

	* gcc.dg/torture/pr66413.c: New testcase.

	2015-07-08  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/66794
	* gimple-ssa-isolate-paths.c (gimple_ssa_isolate_erroneous_paths):
	Free post-dominators.

	* gcc.dg/torture/pr66794.c: New testcase.

	2015-07-10  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/66823
	* tree-if-conv.c (memrefs_read_or_written_unconditionally): Fix
	inverted predicate.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/torture/pr66375.c
    branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/torture/pr66413.c
    branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/torture/pr66794.c
    branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/tree-ssa/cunroll-11.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/gimple-ssa-isolate-paths.c
    branches/gcc-4_9-branch/gcc/gimple.c
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_9-branch/gcc/tree-if-conv.c
    branches/gcc-4_9-branch/gcc/tree-inline.c
    branches/gcc-4_9-branch/gcc/tree-scalar-evolution.c
    branches/gcc-4_9-branch/gcc/tree-ssa-dom.c
Comment 17 Richard Biener 2016-02-11 13:45:47 UTC
Fixed.