29581 – Latent bug in 4.1/4.2/4.3 lambda-code.c

Bug 29581 - Latent bug in 4.1/4.2/4.3 lambda-code.c

Summary: Latent bug in 4.1/4.2/4.3 lambda-code.c

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization (show other bugs)
Version:	4.3.0

Importance:	P3 normal
Target Milestone:	4.1.2
Assignee:	Jakub Jelinek

URL:	http://gcc.gnu.org/ml/gcc-patches/200...
Keywords:	patch, wrong-code

Depends on:
Blocks:

Reported:	2006-10-24 15:21 UTC by Jakub Jelinek
Modified:	2007-01-10 19:15 UTC (History)
CC List:	3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:	2006-11-08 16:24:55

Attachments
gcc43-pr29581-tests.patch (642 bytes, patch) 2006-11-13 10:16 UTC, Jakub Jelinek	Details \| Diff
gcc43-pr29581.patch (2.95 KB, patch) 2006-11-13 16:56 UTC, Jakub Jelinek	Details \| Diff
Show Obsolete (1) View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Jakub Jelinek 2006-10-24 15:21:36 UTC

SUBROUTINE FOO (K)
      INTEGER I, J, K, A(5,5), B
      COMMON A
      A(1,1) = 1
 10   B = 0
      DO 30 I = 1, K
        DO 20 J = 1, K
          B = B + A(I,J)
 20     CONTINUE
        A(I,I) = A(I,I) * 2
 30   CONTINUE
      IF (B.GE.3) RETURN
      GO TO 10
      END SUBROUTINE

      PROGRAM BAR
        CALL FOO (2)
      END

is miscompiled on redhat/gcc-4_1-branch, with -O2 -ftree-loop-linear,
I believe on all arches, tested at least ppc -m32 and x86_64 -m64.
Unfortunately, I couldn't reproduce this on gcc-4_1-branch nor on the
trunk, because the statements coming from earlier passes
are slightly different, though IMHO valid in all 3 cases while what
-ftree-loop-linear creates is clearly invalid.  I believe it is a latent
bug in lambda-code.c (see below), although I don't have any testcases
that would prove it on non-vendor branches.

All vars used in foo are int4, the body in *.empty is on
redhat/gcc-4_1-branch:

<bb 0>:
  __BLNK__.a[0] = 1;
  D.684_14 = *k_13;
  goto <bb 2> (<L24>);
__label_000010:;
<L24>:;
  if (D.684_14 > 0) goto <L28>; else goto __label_000010;
<L28>:;
  # b_5 = PHI <0(3), b_37(8)>;
  # i_1 = PHI <1(3), i_28(8)>;
<L25>:;
  pretmp.29_7 = i_1 + -6;
  # j_9 = PHI <1(4), j_35(6)>;
  # b_6 = PHI <b_5(4), b_33(6)>;
<L3>:;
  D.697_29 = j_9 * 5;
  D.698_30 = pretmp.29_7;
  D.699_31 = pretmp.29_7 + D.697_29;
  D.700_32 = __BLNK__.a[D.699_31];
  b_33 = D.700_32 + b_6;
  j_35 = j_9 + 1;
  if (j_9 == D.684_14) goto <L8>; else goto <L30>;
<L30>:;
  goto <bb 5> (<L3>);
  # b_37 = PHI <b_33(5)>;
<L8>:;
  D.701_18 = i_1 * 5;
  D.702_19 = pretmp.29_7;
  D.703_20 = pretmp.29_7 + D.701_18;
  D.704_24 = __BLNK__.a[D.703_20];
  D.705_25 = D.704_24 * 2;
  __BLNK__.a[D.703_20] = D.705_25;
  if (i_1 == D.684_14) goto <L13>; else goto <L31>;
<L31>:;
  i_28 = i_1 + 1;
  goto <bb 4> (<L25>);
  # b_8 = PHI <b_37(7)>;
<L13>:;
  if (b_8 > 2) goto __return_foo; else goto __label_000010;
__return_foo:;
  return;

On gcc-4_1-branch the difference is just that pretmp.29_7 doesn't
exist, all users of that have i_1 + -6 instead.  On the trunk,
pretmp var is only used in the innermost loop (L3..L30) and after
L8 label it instead uses i_1 * 6 + -6.  Now, -ftree-loop-linear
on the above determines the L3..L30 loop isn't perfectly
nested in L25..L31 but can_convert_to_perfect_nest says it can
be converted to perfect nest, because the pretmp.9_7 = i_1 + -6;
satisfies:
                  /* If the IV is simple, it can be duplicated.  */
                  if (!automatically_generated_chrec_p (scev))
                    {
                      tree step = evolution_part_in_loop_num (scev, loop->num);
                      if (step && step != chrec_dont_know
                          && TREE_CODE (step) == INTEGER_CST)
                        continue;
                    }

-ftree-loop-linear then changes the loops, so that the b computation
is done using 2 nested loops and the *2 multiplication in a separate
loop.  Unfortunately, it doesn't compute the start of the loop
correctly, above, the first __BLKN__.a[] that is multiplied by 2 is
pretmp.29_7 + i_1 * 5 = i_1 * 6 - 6 where i_1 starts at 1,
but after the transformation it is perfectiv.32_12 * 5 + perfectiv.32_12:

...
  # perfectiv.32_12 = PHI <1(11), perfectiv.32_39(14)>;
<L34>:;
<bb 13>:
  uboundvar.33_40 = D.684_14 + 1;
  perfectiv.32_39 = perfectiv.32_12 + 1;
  D.701_18 = perfectiv.32_12 * 5;
  D.702_19 = perfectiv.32_12;
  D.703_20 = perfectiv.32_12 + D.701_18;
  D.704_24 = __BLNK__.a[D.703_20];
  D.705_25 = D.704_24 * 2;
  __BLNK__.a[D.703_20] = D.705_25;
  if (uboundvar.33_40 >= perfectiv.32_39) goto <L35>; else goto <L13>;
<L35>:;
  goto <bb 12> (<L34>);

can_convert_to_perfect_nest only looks at evolution part of the scev:

              if (TREE_CODE (stmt) == MODIFY_EXPR)
                {
                  use_operand_p use_a, use_b;
                  imm_use_iterator imm_iter;
                  ssa_op_iter op_iter, op_iter1;
                  tree op0 = TREE_OPERAND (stmt, 0);
                  tree scev = instantiate_parameters
                    (loop, analyze_scalar_evolution (loop, op0));

                  /* If the IV is simple, it can be duplicated.  */
                  if (!automatically_generated_chrec_p (scev))
                    {
                      tree step = evolution_part_in_loop_num (scev, loop->num);
                      if (step && step != chrec_dont_know
                          && TREE_CODE (step) == INTEGER_CST)
                        continue;
                    }

and replace_uses_equiv_to_x_with_y happily replaces vars with the perfectiv
if the evolution part of the scev is constant and equal to the loop step:

static void
replace_uses_equiv_to_x_with_y (struct loop *loop, tree stmt, tree x,
                                int xstep, tree y)
{
  ssa_op_iter iter;
  use_operand_p use_p;

  FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
    {
      tree use = USE_FROM_PTR (use_p);
      tree step = NULL_TREE;
      tree scev = instantiate_parameters (loop,
                                          analyze_scalar_evolution (loop, use));

      if (scev != NULL_TREE && scev != chrec_dont_know)
        step = evolution_part_in_loop_num (scev, loop->num);

      if ((step && step != chrec_dont_know
           && TREE_CODE (step) == INTEGER_CST
           && int_cst_value (step) == xstep)
          || USE_FROM_PTR (use_p) == x)
        SET_USE (use_p, y);
    }
}

For USE_FROM_PTR (use_p) == x that's fine and if
i1 = initial_condition_in_loop_num (scev, loop->num) is equal to
i2 = initial_condition_in_loop_num (instantiate_parameters (loop, analyze_scalar_evolution (loop, x)))
value of y that's fine too.  But if it is not, I believe we need to
replace use with y + i1 - i2.
In the testcase I posted, USE's initial_condition_in_loop_num is
-5 and X's initial_condition_in_loop_num is 1.  And, adding -5 - 1
to perfectiv (aka Y in the above routine) is exactly what would
cure this testcase.

Do you agree with this?

Not sure how hard is that though, other alternative would be to make
can_convert_to_perfect_nest more conservative.

Comment 1 Daniel Berlin 2006-11-08 16:24:55 UTC

(In reply to comment #0)

> In the testcase I posted, USE's initial_condition_in_loop_num is
> -5 and X's initial_condition_in_loop_num is 1.  And, adding -5 - 1
> to perfectiv (aka Y in the above routine) is exactly what would
> cure this testcase.
> 
> Do you agree with this?
> 

I agree completely with your analysis.  If you want to work up a fix to do that, i'm happy to approve it (It should suffice to generate the initial condition in the loop header and use that as the starting point for the new iv variable).


> Not sure how hard is that though, other alternative would be to make
> can_convert_to_perfect_nest more conservative.
>

Comment 2 Jakub Jelinek 2006-11-13 10:16:36 UTC

Created attachment 12601 [details]
gcc43-pr29581-tests.patch

Testcases that show this bug even on the trunk (taken from vect-8{5,6,7,8}.c,
removed vector related stuff from it, added -O2 -ftree-loop-linear).
With my preliminary fix for this PR all these seem to be fixed (on a release
only checking gcc), unfortunately with checking enabled I'm getting
tree checking failures like:
error: statement makes a memory store, but has no V_MAY_DEFS nor V_MUST_DEFS
perfecttmp.42 = perfectiv.40_5 + 16;
I called update_stmt, o not sure what I'm still missing.

BTW, with -ftree-loop-linear also fortran forall_1.f90 and scalarize.f90 are
miscompiled, but that problem isn't fixed by my patch, so perhaps there are other bugs.

Comment 3 Andrew Pinski 2006-11-13 14:09:32 UTC

> error: statement makes a memory store, but has no V_MAY_DEFS nor V_MUST_DEFS
> perfecttmp.42 = perfectiv.40_5 + 16;
> I called update_stmt, o not sure what I'm still missing.

You forgot to mark perfecttmp.42 for renaming (or manually name it yourself).

Comment 4 Jakub Jelinek 2006-11-13 16:56:27 UTC

Created attachment 12607 [details]
gcc43-pr29581.patch

Yeah, found that out too earlier today.
This is the fix which I have regtested, including RUNTESTFLAGS=--target_board=unix/-ftree-loop-linear.

Comment 5 Jakub Jelinek 2006-11-15 09:35:44 UTC

Subject: Bug 29581

Author: jakub
Date: Wed Nov 15 09:35:34 2006
New Revision: 118848

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118848
Log:
	PR tree-optimization/29581
	* lambda-code.c (replace_uses_equiv_to_x_with_y): Add YINIT,
	REPLACEMENTS, FIRSTBSI arguments.  If initial condition or
	type is different between Y and USE, create a temporary
	variable, initialize it at the beginning of the body bb
	and use it as replacement instead of Y.

	* gcc.dg/pr29581-1.c: New test.
	* gcc.dg/pr29581-2.c: New test.
	* gcc.dg/pr29581-3.c: New test.
	* gcc.dg/pr29581-4.c: New test.
	* gfortran.dg/pr29581.f90: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr29581-1.c
    trunk/gcc/testsuite/gcc.dg/pr29581-2.c
    trunk/gcc/testsuite/gcc.dg/pr29581-3.c
    trunk/gcc/testsuite/gcc.dg/pr29581-4.c
    trunk/gcc/testsuite/gfortran.dg/pr29581.f90
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lambda-code.c
    trunk/gcc/testsuite/ChangeLog

Comment 6 Jakub Jelinek 2006-11-15 09:37:02 UTC

Subject: Bug 29581

Author: jakub
Date: Wed Nov 15 09:36:51 2006
New Revision: 118849

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118849
Log:
	PR tree-optimization/29581
	* lambda-code.c (replace_uses_equiv_to_x_with_y): Add YINIT,
	REPLACEMENTS, FIRSTBSI arguments.  If initial condition or
	type is different between Y and USE, create a temporary
	variable, initialize it at the beginning of the body bb
	and use it as replacement instead of Y.

	* gcc.dg/pr29581-1.c: New test.
	* gcc.dg/pr29581-2.c: New test.
	* gcc.dg/pr29581-3.c: New test.
	* gcc.dg/pr29581-4.c: New test.
	* gfortran.dg/pr29581.f90: New test.

Added:
    branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/pr29581-1.c
    branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/pr29581-2.c
    branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/pr29581-3.c
    branches/gcc-4_2-branch/gcc/testsuite/gcc.dg/pr29581-4.c
    branches/gcc-4_2-branch/gcc/testsuite/gfortran.dg/pr29581.f90
Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/lambda-code.c
    branches/gcc-4_2-branch/gcc/testsuite/ChangeLog

Comment 7 Jakub Jelinek 2006-11-15 09:39:33 UTC

Subject: Bug 29581

Author: jakub
Date: Wed Nov 15 09:39:18 2006
New Revision: 118850

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118850
Log:
	PR tree-optimization/29581
	* lambda-code.c (replace_uses_equiv_to_x_with_y): Add YINIT,
	REPLACEMENTS, FIRSTBSI arguments.  If initial condition or
	type is different between Y and USE, create a temporary
	variable, initialize it at the beginning of the body bb
	and use it as replacement instead of Y.

	* gcc.dg/pr29581-1.c: New test.
	* gcc.dg/pr29581-2.c: New test.
	* gcc.dg/pr29581-3.c: New test.
	* gcc.dg/pr29581-4.c: New test.
	* gfortran.dg/pr29581.f90: New test.

Added:
    branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/pr29581-1.c
    branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/pr29581-2.c
    branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/pr29581-3.c
    branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/pr29581-4.c
    branches/gcc-4_1-branch/gcc/testsuite/gfortran.dg/pr29581.f90
Modified:
    branches/gcc-4_1-branch/gcc/ChangeLog
    branches/gcc-4_1-branch/gcc/lambda-code.c
    branches/gcc-4_1-branch/gcc/testsuite/ChangeLog

Comment 8 Jakub Jelinek 2006-11-15 09:55:15 UTC

Fixed.