static const int e[4] = { 16, 16, 20, 20 }; extern void foo (unsigned long *); void baz (unsigned long *r) { unsigned long i; for (i = 0; i < 4; i++) if (e[i] == 16) break; if (i == 4) { foo (r); } } We have the following in the .vars tree dump: ;; Function baz (baz) baz (r) { long unsigned int i; <bb 0>: i = 0; <L0>:; if (MEM[symbol: e, index: (int *) i, step: 4B] == 16) goto <L5>; else goto <L1>; <L1>:; i = i + 1; if (i != 4) goto <L0>; else goto <L3>; <L3>:; if (i == 4) goto <L4>; else goto <L5>; <L4>:; foo (r); <L5>:; return; } The first jump pass on RTL immediately optimizes away the redundant jump. This kind of jump threading opportunity occurs very often. About half of the jump threadings on RTL in jump1 that I have looked at so far (several dozen) are of this form. I'm seeing this for AMD64 and i686, but I guess it happens everywhere.
I should have said, this is at "-O1 -fthread-jumps". I guess VRP catches this at -O2 and better.
Actually VRP doesn't catch it. Do: - if (e[i] == 16) + if (e[i] == 16) so that store-CCP doesn't load e[0] anymore to find that it is 16. With that, the .vrp dump at -O2 looks like this: baz (r) { long unsigned int i; int D.1638; long unsigned int i.0; <bb 0>: goto <bb 3> (<L2>); <L0>:; i_4 = i_1; D.1638_6 = e[i_1]; if (D.1638_6 == 17) goto <L3>; else goto <L1>; <L1>:; i_7 = i_1 + 1; # i_1 = PHI <0(0), i_7(2)>; <L2>:; if (i_1 <= 3) goto <L0>; else goto <L3>; <L3>:; if (i_1 == 4) goto <L4>; else goto <L5>; <L4>:; foo (r_3); <L5>:; return; } The tree loop optimizers turn "(i_1 <= 3)" into "(i_1 != 4)", but there are no passes after VRP that use the ASSERT_EXPRs to propagate "i_1 == 4" down along the false-edge going to the block starting with label L3. So even at -O2 we don't catch this jump threading opportunity at the tree level.
With a minor hack, we optimize the test case in dom3: Index: tree-ssa-dom.c =================================================================== --- tree-ssa-dom.c (revision 107822) +++ tree-ssa-dom.c (working copy) @@ -2776,7 +2776,9 @@ cprop_operand (tree stmt, use_operand_p /* If the operand has a known constant value or it is known to be a copy of some other variable, use the value or copy stored in CONST_AND_COPIES. */ - val = SSA_NAME_VALUE (op); + val = op; + while (TREE_CODE (val) == SSA_NAME && SSA_NAME_VALUE (val)) + val = SSA_NAME_VALUE (val); if (val && val != op && TREE_CODE (val) != VALUE_HANDLE) { tree op_type, val_type; Jeff, thoughts?
This looks very much related to PR 21829. Hmm, in 4.0.0 we got Before DOM (likewise for 4.1.0 and above): <L1>:; i_2 = i_10 + 1; if (i_2 != 4) goto <L0>; else goto <L3>; # i_7 = PHI <i_2(2)>; <L3>:; if (i_7 == 4) goto <L4>; else goto <L5>; but after: <L1>:; i_2 = i_10 + 1; if (i_2 != 4) goto <L0>; else goto <L3>; Invalid sum of incoming frequencies 2375, should be 0 # i_7 = PHI <i_2(2)>; <L3>:; foo (r_3); Which means that GCC got it right in 4.0.0
Actually, it's more related to Bug 21488. What happens is that we record a value for the left hand side of a single-argument PHI node (i.e. for "rhs=PHI(lhs)" we record an equivalence rhs==lhs), but the left hand side also already has a value (in this case, lhs==4). Later on we don't copy propagate lhs into the uses of rhs because lhs is defined in a loop and we don't copy propagate out of loops, so we never see that rhs==4 too. My hack in comment #3 propagates the value "4" by following SSA_NAME_VALUE links as deeply as possible. What we should probably do instead is look through SSA_NAME_VALUE chains in record_equivalences_from_phis.
Subject: Re: [4.1/4.2 Regression] Jump threading opportunity missed in tree-ssa but caught in jump1 On Sat, 2005-12-03 at 17:46 +0000, steven at gcc dot gnu dot org wrote: > > ------- Comment #5 from steven at gcc dot gnu dot org 2005-12-03 17:46 ------- > Actually, it's more related to Bug 21488. What happens is that we record a > value for the left hand side of a single-argument PHI node (i.e. for > "rhs=PHI(lhs)" we record an equivalence rhs==lhs), but the left hand side also > already has a value (in this case, lhs==4). > > Later on we don't copy propagate lhs into the uses of rhs because lhs is > defined in a loop and we don't copy propagate out of loops, so we never see > that rhs==4 too. > > My hack in comment #3 propagates the value "4" by following SSA_NAME_VALUE > links as deeply as possible. What we should probably do instead is look > through SSA_NAME_VALUE chains in record_equivalences_from_phis. The reason we don't generally walk through those chains is it is possible to get loops in the value chain. I've always considered this a semi-bug in DOM. Jeff
Created attachment 10410 [details] follow SSA_NAME_VALUE deep Hmmwell, the attached patch does bootstrap on i686,ia64, and x86-64, and it passes regression testing on those targets too. Jeff, do you have an example of when this could go into an infinite loop?
Subject: Re: [4.1/4.2 Regression] Jump threading opportunity missed in tree-ssa but caught in jump1 On Mon, 2005-12-05 at 18:05 +0000, steven at gcc dot gnu dot org wrote: > > ------- Comment #7 from steven at gcc dot gnu dot org 2005-12-05 18:05 ------- > Created an attachment (id=10410) > --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10410&action=view) > follow SSA_NAME_VALUE deep > > Hmmwell, the attached patch does bootstrap on i686,ia64, and x86-64, and it > passes regression testing on those targets too. Jeff, do you have an example > of when this could go into an infinite loop? Try putting it in lookup_avail_expr. ie, instead of following one chain back, make it a loop. I'd also really rather look at moving away from the equivalency chains, which would also address this problem. What's preventing that is things actually get slower when we fully propagate constants/copies as they are discovered. jeff
Fixed with the 2nd VRP pass Jeff added recently: 2006-02-07 Jeff Law <law at redhat dot com> * tree-vrp.c (find_conditional_asserts): Update comments. (simplify_stmt_for_jump_threading): New. (etc.)