Bug 28545 - [4.1 Regression] Wrong code for hoisted multiplication
Summary: [4.1 Regression] Wrong code for hoisted multiplication
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.1.2
: P1 normal
Target Milestone: 4.1.2
Assignee: Andrew Pinski
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: patch, wrong-code
: 29740 30440 (view as bug list)
Depends on:
Blocks:
 
Reported: 2006-07-30 18:42 UTC by Ben Hutchings
Modified: 2007-01-14 06:07 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.2.0 4.0.3
Known to fail: 4.1.0 4.1.1
Last reconfirmed: 2006-10-14 14:48:56


Attachments
test case (8.16 KB, text/x-csrc)
2006-07-30 18:46 UTC, Ben Hutchings
Details
patch which I am testing (1.22 KB, patch)
2006-10-14 16:09 UTC, Andrew Pinski
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ben Hutchings 2006-07-30 18:42:45 UTC
I have some code that calls
    memset(p, cr+1, cr*cr);
inside a loop.  (Will attach this later.)

gcc 4.1.2 (snapshot from 20060715) appears to attempt to calculate cr*cr outside the loop and then load it when needed. However, the generated code uses the wrong register for one the multiplicands:

	movl	%edi, %ecx     /* edi contains cr */
	xorl	%eax, %eax
	imull	%ebx, %ecx     /* ebx contains another variable */
	leal	1(%edi), %ebx
Comment 1 Ben Hutchings 2006-07-30 18:46:29 UTC
Created attachment 11974 [details]
test case

Test case.
It should produce the output "Difficulty rating: Extreme (complex non-recursive techniques required)".
When compiled and run on i486-linux-gnu with -O2 it seg-faults in memset() due to a wrong length value.
Comment 2 Ben Hutchings 2006-07-30 19:02:11 UTC
This code appears to be compiled correctly with a 2006-07-21 snapshot. This doesn't necessarily mean the underlying bug has gone away, though.
Comment 3 Andrew Pinski 2006-07-30 19:05:39 UTC
This works on the mainline.  I can reproduce it on the 4.1 branch though.  I don't know if this is a regression or not.
Comment 4 Ben Hutchings 2006-07-30 20:11:33 UTC
Works on a 2006-02-18 snapshot of 4.2
Fails on a 2005-11-24 snapshot of 4.1
Comment 5 Andrew Pinski 2006-07-31 01:41:11 UTC
I wonder if this is a memset only problem.
Comment 6 Richard Biener 2006-07-31 08:23:30 UTC
4.0.3 works.  (also does gcc 4.1.0 + patches as shipped with suse 10.1)
Comment 7 Richard Biener 2006-10-14 09:26:49 UTC
It's a VRP bug :/  Before VRP we have

  D.6958_313 = cr_285 * cr_285;
  D.6959_314 = (size_t) D.6958_313;
  D.6960_315 = cr_285 + 1;
  memset (number_287, D.6960_315, D.6959_314);

and after we get introduced a spurious, uninitialized cr_2738

  D.6958_313 = cr_285 * cr_2738;
  D.6959_314 = (size_t) D.6958_313;
  D.6960_315 = cr_285 + 1;
  memset (number_287, D.6960_315, D.6959_314);

this get's generated by

cr_2738 = ASSERT_EXPR <cr_285, x_1 < cr_285>;
Comment 8 Richard Biener 2006-10-14 10:18:53 UTC
Confirmed.  Reduced testcase, -O -ftree-vrp

int solver_set(int);

void solver(int c, int r, char *number)
{
  int cr = c*r;
  int x, y, x2, y2, n, ret;

  for (x = 0; x < cr; x++)
    for (y = 0; y < r; y++)
      for (n = 1; n <= cr; n++)
        ;

  for (x = 0; x < cr; x += r)
    for (y = 0; y < r; y++)
      ret = solver_set(r*cr);

  for (y = 0; y < cr; y++)
    ret = solver_set(cr*cr);

  for (x = 0; x < cr; x++)
    ret = solver_set(cr);

  for (n = 1; n <= cr; n++)
    ret = solver_set(cr*cr);

  for (y = 0; y+1 < cr; y++)
    for (x = 0; x+1 < cr; x++)
      for (y2 = y+1; y2 < cr; y2++)
        for (x2 = x+1; x2 < cr; x2++)
          if (x/r == x2/r && y%r == y2%r)
            continue;

  for (y = 0; y < cr; y++)
    for (x = 0; x < cr; x++) {
      for (n = 1; n <= cr; n++)
        ;
      __builtin_memset(number, cr+1, cr*cr);
    }
}

shows the failure in the t36.vrp1 dump.
Comment 9 Richard Biener 2006-10-14 10:40:52 UTC
Much simpler testcase:

void foo(int);
void solver(int cr)
{
  int x, y, n;

  for (y = 0; y < cr; y++)
    for (x = 0; x < cr; x++) {
      for (n = 0; n < cr; n++)
        ;
      foo(cr*cr);
    }
}

t35.copyprop1:

;; Function solver (solver)

solver (cr)
{
  int n;
  int y;
  int x;
  int D.1293;

<bb 0>:
  goto <bb 6> (<L7>);

<L2>:;
  n_11 = n_3 + 1;

  # n_3 = PHI <0(4), n_11(1)>;
<L3>:;
  if (n_3 < cr_5) goto <L2>; else goto <L4>;

<L4>:;
  D.1293_9 = cr_5 * cr_5;
  foo (D.1293_9);
  x_10 = x_1 + 1;

  # x_1 = PHI <0(6), x_10(3)>;
<L5>:;
  if (x_1 < cr_5) goto <L3>; else goto <L6>;

<L6>:;
  y_7 = y_2 + 1;

  # y_2 = PHI <0(0), y_7(5)>;
<L7>:;
  if (y_2 < cr_5) goto <L5>; else goto <L8>;

<L8>:;
  return;

}

t36.vrp1:

solver (cr)
{
  int n;
  int y;
  int x;
  int D.1293;

<bb 0>:
  goto <bb 6> (<L7>);

<L2>:;
  n_11 = n_3 + 1;

  # n_3 = PHI <0(4), n_11(1)>;
<L3>:;
  if (n_3 < cr_5) goto <L2>; else goto <L4>;

<L4>:;
  D.1293_9 = cr_5 * cr_13;
  foo (D.1293_9);
  x_10 = x_1 + 1;

  # x_1 = PHI <0(6), x_10(3)>;
<L5>:;
  if (x_1 < cr_5) goto <L3>; else goto <L6>;

<L6>:;
  y_7 = y_2 + 1;

  # y_2 = PHI <0(0), y_7(5)>;
<L7>:;
  if (y_2 < cr_5) goto <L5>; else goto <L8>;

<L8>:;
  return;

}


(this smells like a tree sharing issue).
Comment 10 Richard Biener 2006-10-14 11:11:36 UTC
If we watch what happens to the multiplication after VRP, we see that it is possibly an immediate uses problem:

Hardware watchpoint 6: *(union tree_node **) 3082785444

Old value = (union tree_node *) 0xb7c793dc
New value = (union tree_node *) 0xb7c7971c
set_ssa_use_from_ptr (use=0xb7c7a0bc, val=0xb7c7971c)
    at /home/richard/src/gcc-4_1-branch/gcc/tree-flow-inline.h:330
330       link_imm_use (use, val);

#0  set_ssa_use_from_ptr (use=0xb7c7a0bc, val=0xb7c7971c)
    at /home/richard/src/gcc-4_1-branch/gcc/tree-flow-inline.h:330
#1  0x080d3ed2 in maybe_replace_use (use_p=0xb7c7a0bc)
    at /home/richard/src/gcc-4_1-branch/gcc/tree-into-ssa.c:1386
#2  0x080d3d05 in rewrite_update_stmt (walk_data=0xbfaa5f50, bb=0xb7bfa460, si=
      {tsi = {ptr = 0xb7bf8f78, container = 0xb7c78018}, bb = 0xb7bfa460})
    at /home/richard/src/gcc-4_1-branch/gcc/tree-into-ssa.c:1465
#3  0x080f1573 in walk_dominator_tree (walk_data=0xbfaa5f50, bb=0xb7bfa460)
    at /home/richard/src/gcc-4_1-branch/gcc/domwalk.c:196
#4  0x080f1616 in walk_dominator_tree (walk_data=0xbfaa5f50, bb=0xb7bfa410)
    at /home/richard/src/gcc-4_1-branch/gcc/domwalk.c:212
#5  0x080f1616 in walk_dominator_tree (walk_data=0xbfaa5f50, bb=0xb7bfa5f0)
    at /home/richard/src/gcc-4_1-branch/gcc/domwalk.c:212
#6  0x080f1616 in walk_dominator_tree (walk_data=0xbfaa5f50, bb=0xb7bfa4b0)
    at /home/richard/src/gcc-4_1-branch/gcc/domwalk.c:212
#7  0x080f1616 in walk_dominator_tree (walk_data=0xbfaa5f50, bb=0xb7bfa640)
    at /home/richard/src/gcc-4_1-branch/gcc/domwalk.c:212
#8  0x080f1616 in walk_dominator_tree (walk_data=0xbfaa5f50, bb=0xb7bfa550)
    at /home/richard/src/gcc-4_1-branch/gcc/domwalk.c:212
#9  0x080d42c4 in rewrite_blocks (entry=0xb7bfa550, what=REWRITE_UPDATE,
    blocks=0x86ab0d8)
    at /home/richard/src/gcc-4_1-branch/gcc/tree-into-ssa.c:1616
#10 0x080d6d10 in update_ssa (update_flags=256)
    at /home/richard/src/gcc-4_1-branch/gcc/tree-into-ssa.c:2798
#11 0x08403f38 in insert_range_assertions ()
    at /home/richard/src/gcc-4_1-branch/gcc/tree-vrp.c:2960
#12 0x08406ce7 in execute_vrp ()
    at /home/richard/src/gcc-4_1-branch/gcc/tree-vrp.c:4176

it changes a few times until finally it settles at the wrong solution.

It get's magically fixed if I disable swapping tree operands in 
get_expr_operands ()  (WTF!?)  -  so this seems to be an operand-scanner
vs. update_ssa interaction?

Index: tree-ssa-operands.c
===================================================================
*** tree-ssa-operands.c (revision 117726)
--- tree-ssa-operands.c (working copy)
*************** get_expr_operands (tree stmt, tree *expr
*** 1257,1295 ****
      case ASSERT_EXPR:
      do_binary:
        {
-       tree op0 = TREE_OPERAND (expr, 0);
-       tree op1 = TREE_OPERAND (expr, 1);
-
-       /* If it would be profitable to swap the operands, then do so to
-          canonicalize the statement, enabling better optimization.
-
-          By placing canonicalization of such expressions here we
-          transparently keep statements in canonical form, even
-          when the statement is modified.  */
-       if (tree_swap_operands_p (op0, op1, false))
-         {
-           /* For relationals we need to swap the operands
-              and change the code.  */
-           if (code == LT_EXPR
-               || code == GT_EXPR
-               || code == LE_EXPR
-               || code == GE_EXPR)
-             {
-               TREE_SET_CODE (expr, swap_tree_comparison (code));
-               swap_tree_operands (stmt,
-                                   &TREE_OPERAND (expr, 0),
-                                   &TREE_OPERAND (expr, 1));
-             }
-
-           /* For a commutative operator we can just swap the operands.  */
-           else if (commutative_tree_code (code))
-             {
-               swap_tree_operands (stmt,
-                                   &TREE_OPERAND (expr, 0),
-                                   &TREE_OPERAND (expr, 1));
-             }
-         }
-
        get_expr_operands (stmt, &TREE_OPERAND (expr, 0), flags);
        get_expr_operands (stmt, &TREE_OPERAND (expr, 1), flags);
        return;
--- 1257,1262 ----


note this patch seems to be in mainline already!?  My archology skills are
not good enough to figure out when it was gone.
Comment 11 patchapp@dberlin.org 2006-10-14 15:05:17 UTC
Subject: Bug number PR28545

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is http://gcc.gnu.org/ml/gcc-patches/2006-10/msg00754.html
Comment 12 Andrew Pinski 2006-10-14 15:13:44 UTC
Actually see 
http://gcc.gnu.org/ml/gcc-patches/2006-03/txt00090.txt
Comment 13 Andrew Pinski 2006-10-14 16:07:51 UTC
I have the correct work around patch which I am testing right now.  It is only local to tree-vrp.c also.
Comment 14 Andrew Pinski 2006-10-14 16:09:31 UTC
Created attachment 12428 [details]
patch which I am testing
Comment 15 Andrew Macleod 2006-10-16 14:36:24 UTC
Or we can just backport the fix from mainline to 4.1 which changes the immediate use iterators and eliminates this kind of problem.   Bug 26854 and a couple of small follow up patches on 4/28 and 5/2.  They've been in mainline since the beginning of may and should be stable enough...  Makes a speed difference too.

Comment 16 Richard Biener 2006-11-06 15:15:22 UTC
*** Bug 29740 has been marked as a duplicate of this bug. ***
Comment 17 Richard Biener 2006-11-06 15:17:28 UTC
pinskia: ping!  (whole distro-rebuild with that patch ok for {x86_64,i686,ppc,ppc64,ia64,s390,s390x}-linux)
Comment 18 Andrew Pinski 2006-11-07 19:36:43 UTC
(In reply to comment #17)
> pinskia: ping!  

Why ping me, I posted the patch and waiting for approval.
Comment 19 Andrew Pinski 2006-11-12 01:49:09 UTC
Subject: Bug 28545

Author: pinskia
Date: Sun Nov 12 01:48:55 2006
New Revision: 118717

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118717
Log:
2006-11-11  Andrew Pinski  <andrew_pinski@playstation.sony.com>

        PR tree-opt/28545
        * tree-vrp.c (replace_uses_by_vrp): New function.
        (remove_range_assertions): Use it.



Modified:
    branches/gcc-4_1-branch/gcc/ChangeLog
    branches/gcc-4_1-branch/gcc/tree-vrp.c

Comment 20 Andrew Pinski 2006-11-12 01:49:27 UTC
Fixed.
Comment 21 Andrew Pinski 2007-01-14 06:07:10 UTC
*** Bug 30440 has been marked as a duplicate of this bug. ***