Bug 41101 - [4.4 Regression] ICE in compute_antic, at tree-ssa-pre.c:2419
Summary: [4.4 Regression] ICE in compute_antic, at tree-ssa-pre.c:2419
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.4.2
: P1 normal
Target Milestone: 4.4.2
Assignee: Richard Biener
URL:
Keywords: compile-time-hog, ice-checking, ice-on-valid-code
: 41263 (view as bug list)
Depends on: 41316
Blocks: 40321
  Show dependency treegraph
 
Reported: 2009-08-18 10:47 UTC by Debian GCC Maintainers
Modified: 2009-09-16 11:57 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.4.1 4.4.2 4.5.0
Known to fail: 4.4.2
Last reconfirmed: 2009-08-18 11:23:14


Attachments
preprocessed source (56.39 KB, application/x-gzip)
2009-08-18 10:48 UTC, Debian GCC Maintainers
Details
preprocessed libstdc++ testcase (32bit) (45.59 KB, application/octet-stream)
2009-09-07 10:01 UTC, Richard Biener
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Debian GCC Maintainers 2009-08-18 10:47:54 UTC
[forwarded from http://bugs.debian.org/541816]

works with trunk and 4.4.1 release, fails with gcc-4.4-branch 20090708 (-O2, but works with -O0 and -O1), seen on at least ix86 and x86_64.

  Matthias

gcc-4.4 -S -O2 -Wall -fno-strict-aliasing gcc44bug.i
Comment 1 Debian GCC Maintainers 2009-08-18 10:48:52 UTC
Created attachment 18394 [details]
preprocessed source
Comment 2 Richard Biener 2009-08-18 11:17:54 UTC
Pfff...  and no, the fix for PR40321 didn't fix it.  Reducing.
Comment 3 Richard Biener 2009-08-18 11:21:49 UTC
Instead it likely broke it.
Comment 4 Richard Biener 2009-08-18 11:23:14 UTC
It did.
Comment 5 Richard Biener 2009-08-18 11:24:19 UTC
Trunk is broken as well at -O -ftree-pre.
Comment 6 Richard Biener 2009-08-18 11:47:32 UTC
One reduced testcase, requires both the typedef and the unprototyped foo
(thus, implicit return type int).

typedef long *GEN; 
int foo(GEN);
void int_elt_val(GEN nf, GEN x, GEN y, long N, int b)
{
  GEN tmp, a;
  int i;
  while (1)
    {
      for (i=1; i<=N; i++)
        {
          a = (GEN) (__SIZE_TYPE__) foo(((GEN *)x)[1]);
          ((GEN *)y)[i] = (GEN) (__SIZE_TYPE__) foo(a);
          if (b)
            return; 
        }
      tmp=x;
      x=y;
      y=tmp;
    }
}
Comment 7 Richard Biener 2009-08-18 13:18:07 UTC
Other one:

typedef long *GEN;
GEN dvmdii(GEN x);
GEN mulii(GEN x);
void int_elt_val(GEN x, GEN y, long N, int b, long k)
{
  long i;
  GEN r,a;
  while (1)
    {
      for (i=1; i<=N; i++)
        {
          a = mulii((((GEN*) (x))[1]));
          a = mulii((((GEN*) (x))[k]));
          (((GEN*) (y))[i]) = a;
          if (b)
            return;
        }
      r=x;
      x=y;
      y=r;
    }
}
Comment 8 Richard Biener 2009-09-04 17:50:49 UTC
*** Bug 41263 has been marked as a duplicate of this bug. ***
Comment 9 Richard Biener 2009-09-04 17:51:59 UTC
Testcase from PR41263:

int func(int);

int bug(int* x, int* y, int N)
{
  int i;
  int* t;

  while (1)
  {
    for (i=1; i<=N; i++)
    {
      y[i] = func(x[i] - x[1]);
      if (y[i]) return i;
    }
    t=x; x=y; y=t;
  }
}
Comment 10 Richard Biener 2009-09-04 18:15:08 UTC
Testcase with less IL:

int func(int);

void
bug(int* x, int* y, unsigned long int N)
{
  unsigned long int i;
  int* t;

  while (1)
    {
      for (i=1; i<=N; i++)
        {
          y[i] = func(x[i] - x[1]);
          if (y[i])
            return;
        }
      t=x; x=y; y=t;
    }
}
Comment 11 Richard Biener 2009-09-06 12:38:06 UTC
We again have ANTIC_IN/OUT oscillation

ANTIC_OUT[] := { x_2 (0001), y_3 (0002), {pointer_plus_expr,x_2,4} (0011) }
ANTIC_IN[] := { x_2 (0001), y_3 (0002), {pointer_plus_expr,x_2,4} (0011) }
ANTIC_OUT[] := { x_2 (0001), y_3 (0002), {pointer_plus_expr,y_3,4} (0025) }
ANTIC_IN[] := { x_2 (0001), y_3 (0002), {pointer_plus_expr,y_3,4} (0025) }

for blocks 3, 9 and 6 which form a conditionally infinite loop.

<bb 2>:

<bb 3>:
  # x_2 = PHI <x_4(D)(2), y_3(6)>
  # y_3 = PHI <y_5(D)(2), x_2(6)>
  if (N_7(D) != 0)
    goto <bb 8>;
  else
    goto <bb 9>;

<bb 9>:
  goto <bb 6>;

<bb 8>:

<bb 4>:
  # i_10 = PHI <i_20(10), 1(8)>
  D.1986_8 = i_10 * 4;
  D.1987_9 = y_3 + D.1986_8;
  D.1988_11 = x_2 + D.1986_8;
  D.1989_12 = *D.1988_11;
  D.1990_13 = x_2 + 4;
  D.1991_14 = *D.1990_13;
  D.1992_15 = D.1989_12 - D.1991_14;
  D.1993_16 = func (D.1992_15);
  *D.1987_9 = D.1993_16;
  if (D.1993_16 != 0)
    goto <bb 7>;
  else
    goto <bb 5>;

<bb 5>:
  i_20 = i_10 + 1;
  if (N_7(D) >= i_20)
    goto <bb 10>;
  else
    goto <bb 11>;

<bb 10>:
  goto <bb 4>;

<bb 11>:

<bb 6>:
  goto <bb 3>;

<bb 7>:
  return;

The maximal set has only {pointer_plus_expr,x_2,4} (0011),
{pointer_plus_expr,y_3,4} (0025) is the result of phi translating
{pointer_plus_expr,y_3,D.1986_8} (0007).

Injecting that pair into the cycle via BB3 -> BB8 yields to the oscillation
as from BB9 we get oscillating 11 which will mask out the other.

It seems to me that whenever we encounter such a cycle as 3 -> 9 -> 6 -> 3
we may not inject the maximal set there as it may not really be the
maximal set (which we can't really enlarge during phi translation, can we?
It does fix this case obviously though).
Comment 12 Richard Biener 2009-09-06 12:41:31 UTC
This is what I am thinking of:

Index: gcc/tree-ssa-pre.c
===================================================================
--- gcc/tree-ssa-pre.c	(revision 151458)
+++ gcc/tree-ssa-pre.c	(working copy)
@@ -1802,7 +1802,10 @@ phi_translate_set (bitmap_set_t dest, bi
 	phi_trans_add (expr, translated, pred);
 
       if (translated != NULL)
-	bitmap_value_insert_into_set (dest, translated);
+	{
+	  bitmap_value_insert_into_set (dest, translated);
+	  bitmap_value_insert_into_set (maximal_set, translated);
+	}
     }
   VEC_free (pre_expr, heap, exprs);
 }
Comment 13 Richard Biener 2009-09-06 12:50:50 UTC
Equivalent to that would be to clean ANTIC_IN using the maximal set - that is,
do not allow expressions in ANTIC_IN that are not in the maximal set.
Comment 14 Daniel Berlin 2009-09-06 14:15:05 UTC
Subject: Re:  [4.4/4.5 Regression] ICE in 
	compute_antic, at tree-ssa-pre.c:2419

On Sun, Sep 6, 2009 at 8:41 AM, rguenth at gcc dot gnu dot
org<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #12 from rguenth at gcc dot gnu dot org  2009-09-06 12:41 -------
> This is what I am thinking of:

Think harder ;)

The maximal set is the initialization vector for this problem.
If we were using sbitmaps, it would be the equivalent of all ones.

Adding more things to it in the middle of the problem can't possibly be right.
Comment 15 rguenther@suse.de 2009-09-06 14:17:33 UTC
Subject: Re:  [4.4/4.5 Regression] ICE in
 compute_antic, at tree-ssa-pre.c:2419

On Sun, 6 Sep 2009, dberlin at dberlin dot org wrote:

> 
> 
> ------- Comment #14 from dberlin at gcc dot gnu dot org  2009-09-06 14:15 -------
> Subject: Re:  [4.4/4.5 Regression] ICE in 
>         compute_antic, at tree-ssa-pre.c:2419
> 
> On Sun, Sep 6, 2009 at 8:41 AM, rguenth at gcc dot gnu dot
> org<gcc-bugzilla@gcc.gnu.org> wrote:
> >
> >
> > ------- Comment #12 from rguenth at gcc dot gnu dot org ?2009-09-06 12:41 -------
> > This is what I am thinking of:
> 
> Think harder ;)
> 
> The maximal set is the initialization vector for this problem.
> If we were using sbitmaps, it would be the equivalent of all ones.
> 
> Adding more things to it in the middle of the problem can't possibly be right.

Of course ;)  It was just to illustrate the problem that
maximal_set U ANTIC_IN != maximal_set, thus maximal_set isn't all ones
but there are zeros for some expressions in ANTIC_IN.

Richard.
Comment 16 Daniel Berlin 2009-09-06 14:19:06 UTC
Subject: Re:  [4.4/4.5 Regression] ICE in 
	compute_antic, at tree-ssa-pre.c:2419

On Sun, Sep 6, 2009 at 8:50 AM, rguenth at gcc dot gnu dot
org<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #13 from rguenth at gcc dot gnu dot org  2009-09-06 12:50 -------
> Equivalent to that would be to clean ANTIC_IN using the maximal set - that is,
> do not allow expressions in ANTIC_IN that are not in the maximal set.
>

It might be easier to go the other way around, and just special case
the maximal set everywhere again.
Comment 17 rguenther@suse.de 2009-09-06 14:20:47 UTC
Subject: Re:  [4.4/4.5 Regression] ICE in
 compute_antic, at tree-ssa-pre.c:2419

On Sun, 6 Sep 2009, dberlin at dberlin dot org wrote:

> ------- Comment #16 from dberlin at gcc dot gnu dot org  2009-09-06 14:19 -------
> Subject: Re:  [4.4/4.5 Regression] ICE in 
>         compute_antic, at tree-ssa-pre.c:2419
> 
> On Sun, Sep 6, 2009 at 8:50 AM, rguenth at gcc dot gnu dot
> org<gcc-bugzilla@gcc.gnu.org> wrote:
> >
> >
> > ------- Comment #13 from rguenth at gcc dot gnu dot org ?2009-09-06 12:50 -------
> > Equivalent to that would be to clean ANTIC_IN using the maximal set - that is,
> > do not allow expressions in ANTIC_IN that are not in the maximal set.
> >
> 
> It might be easier to go the other way around, and just special case
> the maximal set everywhere again.

Can you elaborate on that a bit?

Thanks,
Richard.
Comment 18 Richard Biener 2009-09-06 17:41:57 UTC
Do you mean sth like the following?  Simply assuming that the maximal-set is
all ones and obviously translating all ones also results in all ones.

Index: gcc/tree-ssa-pre.c
===================================================================
--- gcc/tree-ssa-pre.c	(revision 151459)
+++ gcc/tree-ssa-pre.c	(working copy)
@@ -2201,49 +2201,45 @@ compute_antic_aux (basic_block block, bo
     {
       VEC(basic_block, heap) * worklist;
       size_t i;
-      basic_block bprime, first;
+      basic_block bprime, first = NULL;
 
       worklist = VEC_alloc (basic_block, heap, EDGE_COUNT (block->succs));
       FOR_EACH_EDGE (e, ei, block->succs)
-	VEC_quick_push (basic_block, worklist, e->dest);
-      first = VEC_index (basic_block, worklist, 0);
-
-      if (phi_nodes (first))
 	{
-	  bitmap_set_t from = ANTIC_IN (first);
-
-	  if (!BB_VISITED (first))
-	    from = maximal_set;
-	  phi_translate_set (ANTIC_OUT, from, block, first);
+	  if (!first
+	      && BB_VISITED (e->dest))
+	    first = e->dest;
+	  else if (BB_VISITED (e->dest))
+	    VEC_quick_push (basic_block, worklist, e->dest);
+	}
+
+      /* Of multiple successors we have to have visited one already.  */
+      if (!first)
+	{
+	  SET_BIT (changed_blocks, block->index);
+	  BB_VISITED (block) = 0;
+	  BB_DEFERRED (block) = 1;
+	  changed = true;
+	  VEC_free (basic_block, heap, worklist);
+	  goto maybe_dump_sets;
 	}
+
+      if (phi_nodes (first))
+	phi_translate_set (ANTIC_OUT, ANTIC_IN (first), block, first);
       else
-	{
-	  if (!BB_VISITED (first))
-	    bitmap_set_copy (ANTIC_OUT, maximal_set);
-	  else
-	    bitmap_set_copy (ANTIC_OUT, ANTIC_IN (first));
-	}
+	bitmap_set_copy (ANTIC_OUT, ANTIC_IN (first));
 
-      for (i = 1; VEC_iterate (basic_block, worklist, i, bprime); i++)
+      for (i = 0; VEC_iterate (basic_block, worklist, i, bprime); i++)
 	{
 	  if (phi_nodes (bprime))
 	    {
 	      bitmap_set_t tmp = bitmap_set_new ();
-	      bitmap_set_t from = ANTIC_IN (bprime);
-
-	      if (!BB_VISITED (bprime))
-		from = maximal_set;
-	      phi_translate_set (tmp, from, block, bprime);
+	      phi_translate_set (tmp, ANTIC_IN (bprime), block, bprime);
 	      bitmap_set_and (ANTIC_OUT, tmp);
 	      bitmap_set_free (tmp);
 	    }
 	  else
-	    {
-	      if (!BB_VISITED (bprime))
-		bitmap_set_and (ANTIC_OUT, maximal_set);
-	      else
-		bitmap_set_and (ANTIC_OUT, ANTIC_IN (bprime));
-	    }
+	    bitmap_set_and (ANTIC_OUT, ANTIC_IN (bprime));
 	}
       VEC_free (basic_block, heap, worklist);
     }


Now obviously I'm not convinced we'll not defer blocks forever because
we can't seed one with the maximal set ...

Minimally tested with this testcase and tree-ssa.exp.
Comment 19 Richard Biener 2009-09-06 17:48:22 UTC
Err... I wonder how this works at all ;)
Comment 20 Richard Biener 2009-09-06 17:53:09 UTC
Ah, we start from the exit block with ANTIC_IN {} and go to its predecessors
getting just EXP_GEN - TMP_GEN there.  Not unsound.
Comment 21 Richard Biener 2009-09-06 19:51:31 UTC
Both patches bootstrap & regtest ok apart from

FAIL: 23_containers/forward_list/operations/6.cc execution test

hmm.
Comment 22 Richard Biener 2009-09-07 10:01:04 UTC
Created attachment 18526 [details]
preprocessed libstdc++ testcase (32bit)

Testcase that fails at -O2 with the patch.  The 2nd patch bootstraps and tests ok
on the 4.4 branch.
Comment 23 Richard Biener 2009-09-09 15:05:19 UTC
Subject: Bug 41101

Author: rguenth
Date: Wed Sep  9 15:04:27 2009
New Revision: 151561

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=151561
Log:
2009-09-09  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/41101
	* tree-ssa-pre.c (maximal_set): Remove.
	(compute_antic_aux): Treat the maximal set as implicitly all ones.
	Defer all blocks we didn't visit at least one successor.
	(add_to_exp_gen): Do not add to the maximal set.
	(make_values_for_phi): Likewise.
	(compute_avail): Likewise.
	(init_pre): Do not allocate the maximal set.
	(execute_pre): Do not dump it.

	* gcc.c-torture/compile/pr41101.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.c-torture/compile/pr41101.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-pre.c

Comment 24 Richard Biener 2009-09-09 15:05:27 UTC
Fixed on trunk.  Let's wait a bit to watch for fallout before backporting it.
Comment 25 Debian GCC Maintainers 2009-09-11 16:50:44 UTC
checked the backport of the 2nd chunk on the 4.4 branch without regressions on i386 and amd64.

  Matthias
Comment 26 Richard Biener 2009-09-16 11:56:48 UTC
Subject: Bug 41101

Author: rguenth
Date: Wed Sep 16 11:56:31 2009
New Revision: 151744

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=151744
Log:
2009-09-16  Richard Guenther  <rguenther@suse.de>

	Backport from mainline
	2009-09-09  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/41101
	* tree-ssa-pre.c (maximal_set): Remove.
	(compute_antic_aux): Treat the maximal set as implicitly all ones.
	Defer all blocks we didn't visit at least one successor.
	(add_to_exp_gen): Do not add to the maximal set.
	(make_values_for_phi): Likewise.
	(compute_avail): Likewise.
	(init_pre): Do not allocate the maximal set.
	(execute_pre): Do not dump it.

	* gcc.c-torture/compile/pr41101.c: New testcase.

Added:
    branches/gcc-4_4-branch/gcc/testsuite/gcc.c-torture/compile/pr41101.c
Modified:
    branches/gcc-4_4-branch/gcc/ChangeLog
    branches/gcc-4_4-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_4-branch/gcc/tree-ssa-pre.c

Comment 27 Richard Biener 2009-09-16 11:57:04 UTC
Fixed.