/* { dg-do compile } */ /* { dg-options "-O2 -fdump-tree-sccp-details" } */ void bar(int); void foo(int i1, int j1) { int i, j; for (j=0; j<=j1; ++j) for (i=0; i<=i1; ++i) bar(j+1); } /* { dg-final { scan-tree-dump-not "set_nb_iterations_in_loop = scev_not_known" "sccp"} } */ /* { dg-final { cleanup-tree-dump "sccp" } } */ Compare to using j-1 in the inner loop, which makes # iterations analysis succeed.
Note this testcase requires the fix(es) for PR26900 to figure out the number of iterations for the inner loop. The failure to figure out the number of iterations of the outer loop can be reproduced with mainline. Basically we have (in the working case with using j-1): foo (i1, j1) { int pretmp.20; int j; int i; int D.1534; <bb 2>: if (j1_4 >= 0) goto <L14>; else goto <L5>; <L14>:; goto <bb 8> (<L2>); <L16>:; # i_14 = PHI <i_9(4), 0(9)>; <L1>:; D.1534_8 = pretmp.20_10; bar (pretmp.20_10); i_9 = i_14 + 1; if (i1_6 >= i_9) goto <L16>; else goto <L3>; <L3>:; j_7 = j_12 + 1; if (j1_4 >= j_7) goto <L18>; else goto <L5>; <L18>:; # j_12 = PHI <j_7(7), 0(3)>; <L2>:; if (i1_6 >= 0) goto <L20>; else goto <L3>; <L20>:; pretmp.20_10 = j_12 - 1; goto <bb 5> (<L1>); <L5>:; return; } while with using j+1 we get foo (i1, j1) { int prephitmp.21; int pretmp.20; int j; int i; int D.1534; <bb 2>: if (j1_4 >= 0) goto <L14>; else goto <L5>; <L14>:; goto <bb 9> (<L2>); <L16>:; # i_14 = PHI <i_9(4), 0(11)>; <L1>:; D.1534_8 = pretmp.20_13; bar (pretmp.20_13); i_9 = i_14 + 1; if (i1_6 >= i_9) goto <L16>; else goto <L17>; # D.1534_2 = PHI <pretmp.20_13(5)>; <L17>:; # prephitmp.21_11 = PHI <D.1534_2(6), pretmp.20_1(10)>; <L3>:; j_7 = prephitmp.21_11; if (j1_4 >= prephitmp.21_11) goto <L18>; else goto <L5>; <L18>:; # j_12 = PHI <prephitmp.21_11(8), 0(3)>; <L2>:; if (i1_6 >= 0) goto <L20>; else goto <L21>; <L21>:; pretmp.20_1 = j_12 + 1; goto <bb 7> (<L3>); <L20>:; pretmp.20_13 = j_12 + 1; goto <bb 5> (<L1>); <L5>:; return; }
You can see that PRE makes a mess out of it because of the copied loop header of the inner loop. So maybe Zdeneks patch to move the loop header copy outside of the first loop helps here. Though I'd prefer to prevent PRE from doing the transformation here. Danny, any ideas?
Created attachment 11164 [details] candidate patch The attached patch is maybe a fix - though a better way to detect what could be a loop header copy is appreciated ;)
(In reply to comment #2) > You can see that PRE makes a mess out of it because of the copied loop header > of the inner loop. So maybe Zdeneks patch to move the loop header copy outside > of the first loop helps here. Though I'd prefer to prevent PRE from doing the > transformation here. Danny, any ideas? > Errr, why can't we just improve SCEV to handle this case if we can detect it? :) Either that, or stop mucking up the inner loop in the first place by copying it's header (which, BTW, also screws up other transformations). Zdenek has said that loop header copying is there, in part, to increase the effectiveness of code motion. This is the increased code motion you get :).
Loop header copying for the inner loop is required for # of iterations analysis - though we should move that header copy out of the outer loop, too, if possible.
Subject: Re: PRE confuses loop number of iterations analysis On Thu, 2006-03-30 at 14:37 +0000, rguenth at gcc dot gnu dot org wrote: > > ------- Comment #5 from rguenth at gcc dot gnu dot org 2006-03-30 14:37 ------- > Loop header copying for the inner loop is required for # of iterations analysis > - though we should move that header copy out of the outer loop, too, if > possible. Again, you should spend your time improving SCEV, or improving the # of iterations analysis. Really. You are saying "we want to copy loop headers, but don't want PRE to take advantage of the copied loop headers". If we can copy the loop header and prove that we iterate at least once (or whatever condition this proves for # of iterations analysis), you should be able to do it without copying the loop header as well. Though, it's probably just better to teach SCEV to deal with the code that PRE has made, since improving SCEV improves a lot of things. > >
You are probably right about improving SCEV - I hope Sebastian can make it work for this and similar cases. Wrt the loop header it is that we convert the loop to a do-while style loop, which at least iterates once, but to make this transformation valid, we need to copy the loop header. The "bug" is of course that we later try to prove again that this is a do-while style loop, which we could better have remembered somehow. I.e. for (i = start; i < end; ++i) ; iterates end-start times, if start <= end. So we transform it to if (start < end) for (i = start; i < end; ++i) ; and later we "prove" that this new loop runs at least once by taking the loop exit condition and trying to simplify that based on dominating conditions from the loop header.
Just for the record, the attached patch bootstrapped and regtested on x86_64-unknown-linux-gnu, with the following fallout: FAIL: gcc.dg/tree-ssa/loadpre4.c scan-tree-dump-times Eliminated: 1 1 FAIL: gcc.dg/tree-ssa/loadpre6.c scan-tree-dump-times Eliminated: 2 1
Won't this get fixed by the patch which patches loop header copy for PR 23855?
Yes, probably - though that patch doesn't apply anymore and wasn't reviewed either.
Note that the patch for 23855 will only help for invariant conditions in the loop header, while the problem exists also for non-invariant ones. So, as Danny notes, SCEV should be improved to maybe handle this case.
Created attachment 11184 [details] fix This patch fixes this problem. I'll bootntestncommit.
Subject: Bug 26939 Author: spop Date: Sun Apr 2 14:08:02 2006 New Revision: 112623 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112623 Log: PR tree-optimization/26939 * tree-chrec.c (chrec_merge): Use eq_evolutions_p. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-chrec.c
With the patch we now can no longer figure out the number of iterations of the inner loop, because we have an exit condition of i1D.1522_6 != 2147483647 now!? Which we cannot simplify against i1D.1522_6 >= 0 (I will extend my compare_conditions patch to handle this case - but I have to think about it some more).
Hmm, this cannot be simplified.
No news for almost three years. Where are we with this one today?
The situation is still worse than originally reported. Even without PRE we have Analyzing # of iterations of loop 2 exit condition [1, + , 1] <= i1_6(D) bounds on difference of bases: -1 ... 2147483646 result: under assumptions i1_6(D) != 2147483647 # of iterations (unsigned int) i1_6(D), bounded by 2147483647 (set_nb_iterations_in_loop = scev_not_known)) for the inner loop which is just <bb 4>: <bb 5>: # i_13 = PHI <i_8(4), 0(9)> D.1244_7 = j_10 + 1; bar (D.1244_7); i_8 = i_13 + 1; if (i1_6(D) >= i_8) goto <bb 4>; else goto <bb 6>; ... <bb 8>: # j_10 = PHI <j_9(7), 0(3)> if (i1_6(D) >= 0) goto <bb 9>; else goto <bb 6>; <bb 9>: goto <bb 5>; which means we cannot prove that with i_8 = {1, +, 1}_2 the loop runs at least once if only i1_6 >= 0 is known (remember its the number of latch executions that are counted). Smaller testcase: void bar(); void foo(int i1) { int i; for (i=0; i<=i1; ++i) bar(); } It "works" once you change the loop exit condition to i < i1. Same effects with unsigned variables (adjust the lower bound to sth like 2 to avoid ill effects). I changed the Summary to what is more appropriate.
> It "works" once you change the loop exit condition to i < i1. Same effects > with unsigned variables (adjust the lower bound to sth like 2 to avoid ill > effects). There is nothing to fix if unsigned variables are used, as the inner loop may be infinite in that case. For the signed case, we can use the fact that i cannot wrap (with flag_strict_overflow), but there are some complications (see PR25985).
> Smaller testcase: > > void bar(); > void foo(int i1) > { > int i; > > for (i=0; i<=i1; ++i) > bar(); > } What the # of iterations analysis does is the following: -- the number of iterations is i1, unless i1==MAXINT, in which case it is infinity -- we cannot prove that i1 is not MAXINT (*) -- therefore, we must record the assumption i1!=MAXINT There is not much that can be done about this in general. We could make # of iterations analysis to be more specific, e.g. return the assumption i1==MAXINT as a condition for the number of iterations to be infinite (similarly as the rtl level analysis does), however, I don't think any of the tree level optimization passes can use that information. For some optimizations (final value replacement is the only one that I can think about now), we could use the fact that we are only interested in the number of iterations under the assumption that the given exit is taken (# of iterations analysis already supports that, by the only_exit flag). I would suggest changing the summary to something more appropriate, as the # of iterations analysis seems to work just fine for this testcase :-) (*) it might seem that we can use the fact that the induction variable i does not wrap; however, the program might terminate in the function bar before the induction variable i would wrap, regardless of whether i1==MAXINT
Subject: Re: loop number of iterations analysis not working If the program terminates before i would wrap, then the number of iterations was not MAXINT. And since it can't wrap, it is not infinite in any case. I agree you can't prove the number of iterations (since bar could exit), but the requiring the assumption i != MAXINT still seems useless. On Tue, Feb 17, 2009 at 7:50 PM, rakdver at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #19 from rakdver at gcc dot gnu dot org 2009-02-18 00:50 ------- >> Smaller testcase: >> >> void bar(); >> void foo(int i1) >> { >> int i; >> >> for (i=0; i<=i1; ++i) >> bar(); >> } > > What the # of iterations analysis does is the following: > > -- the number of iterations is i1, unless i1==MAXINT, in which case it is > infinity > -- we cannot prove that i1 is not MAXINT (*) > -- therefore, we must record the assumption i1!=MAXINT > > There is not much that can be done about this in general. We could make # of > iterations analysis to be more specific, e.g. return the assumption i1==MAXINT > as a condition for the number of iterations to be infinite (similarly as the > rtl level analysis does), however, I don't think any of the tree level > optimization passes can use that information. > > For some optimizations (final value replacement is the only one that I can > think about now), we could use the fact that we are only interested in the > number of iterations under the assumption that the given exit is taken (# of > iterations analysis already supports that, by the only_exit flag). > > I would suggest changing the summary to something more appropriate, as the # of > iterations analysis seems to work just fine for this testcase :-) > > (*) it might seem that we can use the fact that the induction variable i does > not wrap; however, the program might terminate in the function bar before the > induction variable i would wrap, regardless of whether i1==MAXINT > > > -- > > rakdver at gcc dot gnu dot org changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > AssignedTo|rakdver at gcc dot gnu dot |unassigned at gcc dot gnu > |org |dot org > Status|ASSIGNED |NEW > > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26939 > > ------- You are receiving this mail because: ------- > You are on the CC list for the bug, or are watching someone who is. >
Subject: Re: loop number of iterations analysis not working > If the program terminates before i would wrap, then the number of > iterations was not MAXINT. > And since it can't wrap, it is not infinite in any case. > > I agree you can't prove the number of iterations (since bar could > exit), but the requiring the assumption i != MAXINT still seems > useless. What do you propose that the number of iterations analysis should return, then?
(In reply to comment #21) > Subject: Re: loop number of iterations analysis not working > > > If the program terminates before i would wrap, then the number of > > iterations was not MAXINT. > > And since it can't wrap, it is not infinite in any case. > > > > I agree you can't prove the number of iterations (since bar could > > exit), but the requiring the assumption i != MAXINT still seems > > useless. > > What do you propose that the number of iterations analysis should > return, then? actually, scratch that. You can redefine the semantics of what number of iterations analysis returns to make i1 a correct answer, while still keeping it safe to use for optimizations: "number_of_iterations_exit (EXIT) returns an expression N such that you don't change the semantics of the program by replacing the condition for taking EXIT with [0,+,1] == N" Now I just need to figure out how to make it work this way without completely rewriting the whole analysis.
Subject: Re: loop number of iterations analysis not working On Wed, 18 Feb 2009, rakdver at kam dot mff dot cuni dot cz wrote: > > > ------- Comment #21 from rakdver at kam dot mff dot cuni dot cz 2009-02-18 04:11 ------- > Subject: Re: loop number of iterations analysis not working > > > If the program terminates before i would wrap, then the number of > > iterations was not MAXINT. > > And since it can't wrap, it is not infinite in any case. > > > > I agree you can't prove the number of iterations (since bar could > > exit), but the requiring the assumption i != MAXINT still seems > > useless. > > What do you propose that the number of iterations analysis should > return, then? Note that the function call is artificial in the testcase (just to make the loop non-empty). A poor choice admittedly ;) But yes, I expected that i != MAXINT follows from i's signedness. Richard.
is this the same problem? -- 'i*2 < 35' can't overflow void foo(char *ptr, unsigned size) { unsigned i; for(i=0; i*2 < size && i*2 < 35; i++ ) { *ptr++ = 0; } } # gcc -Wunsafe-loop-optimizations -O3 -c unsafe_loop.c unsafe_loop.c: In function ‘foo’: unsafe_loop.c:5:5: warning: cannot optimize loop, the loop counter may overflow [-Wunsafe-loop-optimizations] gcc/x86/[trunk revision 162274]
In GCC 4.7 (and above), we get: Analyzing # of iterations of loop 2 exit condition [1, + , 1](no_overflow) <= i1_6(D) bounds on difference of bases: -1 ... 2147483646 result: under assumptions i1_6(D) != 2147483647 # of iterations (unsigned int) i1_6(D), bounded by 2147483647 Analyzing # of iterations of loop 1 exit condition [1, + , 1](no_overflow) <= j1_4(D) bounds on difference of bases: -1 ... 2147483646 result: under assumptions j1_4(D) != 2147483647 # of iterations (unsigned int) j1_4(D), bounded by 2147483647 Is there anything to fix in this bug left? (In reply to Dmitry G. Dyachenko from comment #24) > is this the same problem? -- 'i*2 < 35' can't overflow The warning for this one is gone in GCC 4.6.0+