[forwarded from http://bugs.debian.org/541816] works with trunk and 4.4.1 release, fails with gcc-4.4-branch 20090708 (-O2, but works with -O0 and -O1), seen on at least ix86 and x86_64. Matthias gcc-4.4 -S -O2 -Wall -fno-strict-aliasing gcc44bug.i
Created attachment 18394 [details] preprocessed source
Pfff... and no, the fix for PR40321 didn't fix it. Reducing.
Instead it likely broke it.
It did.
Trunk is broken as well at -O -ftree-pre.
One reduced testcase, requires both the typedef and the unprototyped foo (thus, implicit return type int). typedef long *GEN; int foo(GEN); void int_elt_val(GEN nf, GEN x, GEN y, long N, int b) { GEN tmp, a; int i; while (1) { for (i=1; i<=N; i++) { a = (GEN) (__SIZE_TYPE__) foo(((GEN *)x)[1]); ((GEN *)y)[i] = (GEN) (__SIZE_TYPE__) foo(a); if (b) return; } tmp=x; x=y; y=tmp; } }
Other one: typedef long *GEN; GEN dvmdii(GEN x); GEN mulii(GEN x); void int_elt_val(GEN x, GEN y, long N, int b, long k) { long i; GEN r,a; while (1) { for (i=1; i<=N; i++) { a = mulii((((GEN*) (x))[1])); a = mulii((((GEN*) (x))[k])); (((GEN*) (y))[i]) = a; if (b) return; } r=x; x=y; y=r; } }
*** Bug 41263 has been marked as a duplicate of this bug. ***
Testcase from PR41263: int func(int); int bug(int* x, int* y, int N) { int i; int* t; while (1) { for (i=1; i<=N; i++) { y[i] = func(x[i] - x[1]); if (y[i]) return i; } t=x; x=y; y=t; } }
Testcase with less IL: int func(int); void bug(int* x, int* y, unsigned long int N) { unsigned long int i; int* t; while (1) { for (i=1; i<=N; i++) { y[i] = func(x[i] - x[1]); if (y[i]) return; } t=x; x=y; y=t; } }
We again have ANTIC_IN/OUT oscillation ANTIC_OUT[] := { x_2 (0001), y_3 (0002), {pointer_plus_expr,x_2,4} (0011) } ANTIC_IN[] := { x_2 (0001), y_3 (0002), {pointer_plus_expr,x_2,4} (0011) } ANTIC_OUT[] := { x_2 (0001), y_3 (0002), {pointer_plus_expr,y_3,4} (0025) } ANTIC_IN[] := { x_2 (0001), y_3 (0002), {pointer_plus_expr,y_3,4} (0025) } for blocks 3, 9 and 6 which form a conditionally infinite loop. <bb 2>: <bb 3>: # x_2 = PHI <x_4(D)(2), y_3(6)> # y_3 = PHI <y_5(D)(2), x_2(6)> if (N_7(D) != 0) goto <bb 8>; else goto <bb 9>; <bb 9>: goto <bb 6>; <bb 8>: <bb 4>: # i_10 = PHI <i_20(10), 1(8)> D.1986_8 = i_10 * 4; D.1987_9 = y_3 + D.1986_8; D.1988_11 = x_2 + D.1986_8; D.1989_12 = *D.1988_11; D.1990_13 = x_2 + 4; D.1991_14 = *D.1990_13; D.1992_15 = D.1989_12 - D.1991_14; D.1993_16 = func (D.1992_15); *D.1987_9 = D.1993_16; if (D.1993_16 != 0) goto <bb 7>; else goto <bb 5>; <bb 5>: i_20 = i_10 + 1; if (N_7(D) >= i_20) goto <bb 10>; else goto <bb 11>; <bb 10>: goto <bb 4>; <bb 11>: <bb 6>: goto <bb 3>; <bb 7>: return; The maximal set has only {pointer_plus_expr,x_2,4} (0011), {pointer_plus_expr,y_3,4} (0025) is the result of phi translating {pointer_plus_expr,y_3,D.1986_8} (0007). Injecting that pair into the cycle via BB3 -> BB8 yields to the oscillation as from BB9 we get oscillating 11 which will mask out the other. It seems to me that whenever we encounter such a cycle as 3 -> 9 -> 6 -> 3 we may not inject the maximal set there as it may not really be the maximal set (which we can't really enlarge during phi translation, can we? It does fix this case obviously though).
This is what I am thinking of: Index: gcc/tree-ssa-pre.c =================================================================== --- gcc/tree-ssa-pre.c (revision 151458) +++ gcc/tree-ssa-pre.c (working copy) @@ -1802,7 +1802,10 @@ phi_translate_set (bitmap_set_t dest, bi phi_trans_add (expr, translated, pred); if (translated != NULL) - bitmap_value_insert_into_set (dest, translated); + { + bitmap_value_insert_into_set (dest, translated); + bitmap_value_insert_into_set (maximal_set, translated); + } } VEC_free (pre_expr, heap, exprs); }
Equivalent to that would be to clean ANTIC_IN using the maximal set - that is, do not allow expressions in ANTIC_IN that are not in the maximal set.
Subject: Re: [4.4/4.5 Regression] ICE in compute_antic, at tree-ssa-pre.c:2419 On Sun, Sep 6, 2009 at 8:41 AM, rguenth at gcc dot gnu dot org<gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #12 from rguenth at gcc dot gnu dot org 2009-09-06 12:41 ------- > This is what I am thinking of: Think harder ;) The maximal set is the initialization vector for this problem. If we were using sbitmaps, it would be the equivalent of all ones. Adding more things to it in the middle of the problem can't possibly be right.
Subject: Re: [4.4/4.5 Regression] ICE in compute_antic, at tree-ssa-pre.c:2419 On Sun, 6 Sep 2009, dberlin at dberlin dot org wrote: > > > ------- Comment #14 from dberlin at gcc dot gnu dot org 2009-09-06 14:15 ------- > Subject: Re: [4.4/4.5 Regression] ICE in > compute_antic, at tree-ssa-pre.c:2419 > > On Sun, Sep 6, 2009 at 8:41 AM, rguenth at gcc dot gnu dot > org<gcc-bugzilla@gcc.gnu.org> wrote: > > > > > > ------- Comment #12 from rguenth at gcc dot gnu dot org ?2009-09-06 12:41 ------- > > This is what I am thinking of: > > Think harder ;) > > The maximal set is the initialization vector for this problem. > If we were using sbitmaps, it would be the equivalent of all ones. > > Adding more things to it in the middle of the problem can't possibly be right. Of course ;) It was just to illustrate the problem that maximal_set U ANTIC_IN != maximal_set, thus maximal_set isn't all ones but there are zeros for some expressions in ANTIC_IN. Richard.
Subject: Re: [4.4/4.5 Regression] ICE in compute_antic, at tree-ssa-pre.c:2419 On Sun, Sep 6, 2009 at 8:50 AM, rguenth at gcc dot gnu dot org<gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #13 from rguenth at gcc dot gnu dot org 2009-09-06 12:50 ------- > Equivalent to that would be to clean ANTIC_IN using the maximal set - that is, > do not allow expressions in ANTIC_IN that are not in the maximal set. > It might be easier to go the other way around, and just special case the maximal set everywhere again.
Subject: Re: [4.4/4.5 Regression] ICE in compute_antic, at tree-ssa-pre.c:2419 On Sun, 6 Sep 2009, dberlin at dberlin dot org wrote: > ------- Comment #16 from dberlin at gcc dot gnu dot org 2009-09-06 14:19 ------- > Subject: Re: [4.4/4.5 Regression] ICE in > compute_antic, at tree-ssa-pre.c:2419 > > On Sun, Sep 6, 2009 at 8:50 AM, rguenth at gcc dot gnu dot > org<gcc-bugzilla@gcc.gnu.org> wrote: > > > > > > ------- Comment #13 from rguenth at gcc dot gnu dot org ?2009-09-06 12:50 ------- > > Equivalent to that would be to clean ANTIC_IN using the maximal set - that is, > > do not allow expressions in ANTIC_IN that are not in the maximal set. > > > > It might be easier to go the other way around, and just special case > the maximal set everywhere again. Can you elaborate on that a bit? Thanks, Richard.
Do you mean sth like the following? Simply assuming that the maximal-set is all ones and obviously translating all ones also results in all ones. Index: gcc/tree-ssa-pre.c =================================================================== --- gcc/tree-ssa-pre.c (revision 151459) +++ gcc/tree-ssa-pre.c (working copy) @@ -2201,49 +2201,45 @@ compute_antic_aux (basic_block block, bo { VEC(basic_block, heap) * worklist; size_t i; - basic_block bprime, first; + basic_block bprime, first = NULL; worklist = VEC_alloc (basic_block, heap, EDGE_COUNT (block->succs)); FOR_EACH_EDGE (e, ei, block->succs) - VEC_quick_push (basic_block, worklist, e->dest); - first = VEC_index (basic_block, worklist, 0); - - if (phi_nodes (first)) { - bitmap_set_t from = ANTIC_IN (first); - - if (!BB_VISITED (first)) - from = maximal_set; - phi_translate_set (ANTIC_OUT, from, block, first); + if (!first + && BB_VISITED (e->dest)) + first = e->dest; + else if (BB_VISITED (e->dest)) + VEC_quick_push (basic_block, worklist, e->dest); + } + + /* Of multiple successors we have to have visited one already. */ + if (!first) + { + SET_BIT (changed_blocks, block->index); + BB_VISITED (block) = 0; + BB_DEFERRED (block) = 1; + changed = true; + VEC_free (basic_block, heap, worklist); + goto maybe_dump_sets; } + + if (phi_nodes (first)) + phi_translate_set (ANTIC_OUT, ANTIC_IN (first), block, first); else - { - if (!BB_VISITED (first)) - bitmap_set_copy (ANTIC_OUT, maximal_set); - else - bitmap_set_copy (ANTIC_OUT, ANTIC_IN (first)); - } + bitmap_set_copy (ANTIC_OUT, ANTIC_IN (first)); - for (i = 1; VEC_iterate (basic_block, worklist, i, bprime); i++) + for (i = 0; VEC_iterate (basic_block, worklist, i, bprime); i++) { if (phi_nodes (bprime)) { bitmap_set_t tmp = bitmap_set_new (); - bitmap_set_t from = ANTIC_IN (bprime); - - if (!BB_VISITED (bprime)) - from = maximal_set; - phi_translate_set (tmp, from, block, bprime); + phi_translate_set (tmp, ANTIC_IN (bprime), block, bprime); bitmap_set_and (ANTIC_OUT, tmp); bitmap_set_free (tmp); } else - { - if (!BB_VISITED (bprime)) - bitmap_set_and (ANTIC_OUT, maximal_set); - else - bitmap_set_and (ANTIC_OUT, ANTIC_IN (bprime)); - } + bitmap_set_and (ANTIC_OUT, ANTIC_IN (bprime)); } VEC_free (basic_block, heap, worklist); } Now obviously I'm not convinced we'll not defer blocks forever because we can't seed one with the maximal set ... Minimally tested with this testcase and tree-ssa.exp.
Err... I wonder how this works at all ;)
Ah, we start from the exit block with ANTIC_IN {} and go to its predecessors getting just EXP_GEN - TMP_GEN there. Not unsound.
Both patches bootstrap & regtest ok apart from FAIL: 23_containers/forward_list/operations/6.cc execution test hmm.
Created attachment 18526 [details] preprocessed libstdc++ testcase (32bit) Testcase that fails at -O2 with the patch. The 2nd patch bootstraps and tests ok on the 4.4 branch.
Subject: Bug 41101 Author: rguenth Date: Wed Sep 9 15:04:27 2009 New Revision: 151561 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=151561 Log: 2009-09-09 Richard Guenther <rguenther@suse.de> PR tree-optimization/41101 * tree-ssa-pre.c (maximal_set): Remove. (compute_antic_aux): Treat the maximal set as implicitly all ones. Defer all blocks we didn't visit at least one successor. (add_to_exp_gen): Do not add to the maximal set. (make_values_for_phi): Likewise. (compute_avail): Likewise. (init_pre): Do not allocate the maximal set. (execute_pre): Do not dump it. * gcc.c-torture/compile/pr41101.c: New testcase. Added: trunk/gcc/testsuite/gcc.c-torture/compile/pr41101.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-pre.c
Fixed on trunk. Let's wait a bit to watch for fallout before backporting it.
checked the backport of the 2nd chunk on the 4.4 branch without regressions on i386 and amd64. Matthias
Subject: Bug 41101 Author: rguenth Date: Wed Sep 16 11:56:31 2009 New Revision: 151744 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=151744 Log: 2009-09-16 Richard Guenther <rguenther@suse.de> Backport from mainline 2009-09-09 Richard Guenther <rguenther@suse.de> PR tree-optimization/41101 * tree-ssa-pre.c (maximal_set): Remove. (compute_antic_aux): Treat the maximal set as implicitly all ones. Defer all blocks we didn't visit at least one successor. (add_to_exp_gen): Do not add to the maximal set. (make_values_for_phi): Likewise. (compute_avail): Likewise. (init_pre): Do not allocate the maximal set. (execute_pre): Do not dump it. * gcc.c-torture/compile/pr41101.c: New testcase. Added: branches/gcc-4_4-branch/gcc/testsuite/gcc.c-torture/compile/pr41101.c Modified: branches/gcc-4_4-branch/gcc/ChangeLog branches/gcc-4_4-branch/gcc/testsuite/ChangeLog branches/gcc-4_4-branch/gcc/tree-ssa-pre.c
Fixed.