Bug 26939 - loop number of iterations analysis not working
Summary: loop number of iterations analysis not working
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.2.0
: P3 minor
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: 26944
  Show dependency treegraph
 
Reported: 2006-03-30 11:06 UTC by Richard Biener
Modified: 2021-11-28 04:22 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2009-02-09 00:54:06


Attachments
candidate patch (535 bytes, patch)
2006-03-30 13:34 UTC, Richard Biener
Details | Diff
fix (199 bytes, patch)
2006-04-02 08:12 UTC, sebastian.pop@cri.ensmp.fr
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Richard Biener 2006-03-30 11:06:41 UTC
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-sccp-details" } */

void bar(int);
void foo(int i1, int j1)
{
  int i, j;

  for (j=0; j<=j1; ++j)
    for (i=0; i<=i1; ++i)
      bar(j+1);
}

/* { dg-final { scan-tree-dump-not "set_nb_iterations_in_loop = scev_not_known" "sccp"} } */
/* { dg-final { cleanup-tree-dump "sccp" } } */


Compare to using j-1 in the inner loop, which makes # iterations analysis
succeed.
Comment 1 Richard Biener 2006-03-30 12:22:45 UTC
Note this testcase requires the fix(es) for PR26900 to figure out the number of
iterations for the inner loop.  The failure to figure out the number of iterations of the outer loop can be reproduced with mainline.  Basically we have (in the working case with using j-1):


foo (i1, j1)
{
  int pretmp.20;
  int j;
  int i;
  int D.1534;

<bb 2>:
  if (j1_4 >= 0) goto <L14>; else goto <L5>;

<L14>:;
  goto <bb 8> (<L2>);

<L16>:;

  # i_14 = PHI <i_9(4), 0(9)>;
<L1>:;
  D.1534_8 = pretmp.20_10;
  bar (pretmp.20_10);
  i_9 = i_14 + 1;
  if (i1_6 >= i_9) goto <L16>; else goto <L3>;

<L3>:;
  j_7 = j_12 + 1;
  if (j1_4 >= j_7) goto <L18>; else goto <L5>;

<L18>:;
  
  # j_12 = PHI <j_7(7), 0(3)>;
<L2>:;
  if (i1_6 >= 0) goto <L20>; else goto <L3>;

<L20>:;
  pretmp.20_10 = j_12 - 1;
  goto <bb 5> (<L1>);

<L5>:;
  return;

}

while with using j+1 we get

foo (i1, j1)
{
  int prephitmp.21;
  int pretmp.20;
  int j;
  int i;
  int D.1534;

<bb 2>:
  if (j1_4 >= 0) goto <L14>; else goto <L5>;

<L14>:;
  goto <bb 9> (<L2>);

<L16>:;

  # i_14 = PHI <i_9(4), 0(11)>;
<L1>:;
  D.1534_8 = pretmp.20_13;
  bar (pretmp.20_13);
  i_9 = i_14 + 1;
  if (i1_6 >= i_9) goto <L16>; else goto <L17>;

  # D.1534_2 = PHI <pretmp.20_13(5)>;
<L17>:;

  # prephitmp.21_11 = PHI <D.1534_2(6), pretmp.20_1(10)>;
<L3>:;
  j_7 = prephitmp.21_11;
  if (j1_4 >= prephitmp.21_11) goto <L18>; else goto <L5>;

<L18>:;

  # j_12 = PHI <prephitmp.21_11(8), 0(3)>;
<L2>:;
  if (i1_6 >= 0) goto <L20>; else goto <L21>;

<L21>:;
  pretmp.20_1 = j_12 + 1;
  goto <bb 7> (<L3>);

<L20>:;
  pretmp.20_13 = j_12 + 1;
  goto <bb 5> (<L1>);
  
<L5>:;
  return;
  
}

Comment 2 Richard Biener 2006-03-30 12:31:53 UTC
You can see that PRE makes a mess out of it because of the copied loop header of the inner loop.  So maybe Zdeneks patch to move the loop header copy outside of the first loop helps here.  Though I'd prefer to prevent PRE from doing the
transformation here.  Danny, any ideas?
Comment 3 Richard Biener 2006-03-30 13:34:52 UTC
Created attachment 11164 [details]
candidate patch

The attached patch is maybe a fix - though a better way to detect what could be
a loop header copy is appreciated ;)
Comment 4 Daniel Berlin 2006-03-30 14:04:07 UTC
(In reply to comment #2)
> You can see that PRE makes a mess out of it because of the copied loop header
> of the inner loop.  So maybe Zdeneks patch to move the loop header copy outside
> of the first loop helps here.  Though I'd prefer to prevent PRE from doing the
> transformation here.  Danny, any ideas?
> 

Errr, why can't we just improve SCEV to handle this case if we can detect it?
:)

Either that, or stop mucking up the inner loop in the first place by copying it's header (which, BTW, also screws up other transformations).

Zdenek has said that loop header copying is there, in part, to increase the effectiveness of code motion.
This is the increased code motion you get :).
Comment 5 Richard Biener 2006-03-30 14:37:23 UTC
Loop header copying for the inner loop is required for # of iterations analysis - though we should move that header copy out of the outer loop, too, if possible.
Comment 6 Daniel Berlin 2006-03-30 14:43:09 UTC
Subject: Re:  PRE confuses loop number of
	iterations analysis

On Thu, 2006-03-30 at 14:37 +0000, rguenth at gcc dot gnu dot org wrote:
> 
> ------- Comment #5 from rguenth at gcc dot gnu dot org  2006-03-30 14:37 -------
> Loop header copying for the inner loop is required for # of iterations analysis
> - though we should move that header copy out of the outer loop, too, if
> possible.

Again, you should spend your time improving SCEV, or improving the # of
iterations analysis.  Really.   You are saying "we want to copy loop
headers, but don't want PRE to take advantage of the copied loop
headers".

If we can copy the loop header and prove that we iterate at least once
(or whatever condition this proves for # of iterations analysis), you
should be able to do it without copying the loop header as well.

Though, it's probably just better to teach SCEV to deal with the code
that PRE has made, since improving SCEV improves a lot of things.


> 
> 

Comment 7 Richard Biener 2006-03-30 15:11:32 UTC
You are probably right about improving SCEV - I hope Sebastian can make it work
for this and similar cases.  Wrt the loop header it is that we convert the loop
to a do-while style loop, which at least iterates once, but to make this transformation valid, we need to copy the loop header.  The "bug" is of course that we later try to prove again that this is a do-while style loop, which we could better have remembered somehow.  I.e.

  for (i = start; i < end; ++i)
    ;

iterates end-start times, if start <= end.  So we transform it to

  if (start < end)
    for (i = start; i < end; ++i)
      ;

and later we "prove" that this new loop runs at least once by taking
the loop exit condition and trying to simplify that based on dominating
conditions from the loop header.
Comment 8 Richard Biener 2006-03-30 15:29:06 UTC
Just for the record, the attached patch bootstrapped and regtested on x86_64-unknown-linux-gnu, with the following fallout:

FAIL: gcc.dg/tree-ssa/loadpre4.c scan-tree-dump-times Eliminated: 1 1
FAIL: gcc.dg/tree-ssa/loadpre6.c scan-tree-dump-times Eliminated: 2 1
Comment 9 Andrew Pinski 2006-03-30 22:16:48 UTC
Won't this get fixed by the patch which patches loop header copy for PR 23855?
Comment 10 Richard Biener 2006-03-30 22:24:06 UTC
Yes, probably - though that patch doesn't apply anymore and wasn't reviewed either.
Comment 11 Richard Biener 2006-03-30 22:25:33 UTC
Note that the patch for 23855 will only help for invariant conditions in the loop header, while the problem exists also for non-invariant ones.  So, as Danny notes, SCEV should be improved to maybe handle this case.
Comment 12 sebastian.pop@cri.ensmp.fr 2006-04-02 08:12:07 UTC
Created attachment 11184 [details]
fix

This patch fixes this problem.  I'll bootntestncommit.
Comment 13 Sebastian Pop 2006-04-02 14:08:05 UTC
Subject: Bug 26939

Author: spop
Date: Sun Apr  2 14:08:02 2006
New Revision: 112623

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112623
Log:
	PR tree-optimization/26939
	* tree-chrec.c (chrec_merge): Use eq_evolutions_p.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-chrec.c

Comment 14 Richard Biener 2006-04-03 09:12:11 UTC
With the patch we now can no longer figure out the number of iterations of the inner loop, because we have an exit condition of i1D.1522_6 != 2147483647 now!?
Which we cannot simplify against i1D.1522_6 >= 0 (I will extend my compare_conditions patch to handle this case - but I have to think about it some
more).
Comment 15 Richard Biener 2006-04-03 09:19:09 UTC
Hmm, this cannot be simplified.
Comment 16 Steven Bosscher 2009-02-08 14:17:25 UTC
No news for almost three years.  Where are we with this one today?
Comment 17 Richard Biener 2009-02-08 14:45:07 UTC
The situation is still worse than originally reported.  Even without PRE we have

Analyzing # of iterations of loop 2
  exit condition [1, + , 1] <= i1_6(D)
  bounds on difference of bases: -1 ... 2147483646
  result:
    under assumptions i1_6(D) != 2147483647
    # of iterations (unsigned int) i1_6(D), bounded by 2147483647
  (set_nb_iterations_in_loop = scev_not_known))

for the inner loop which is just

<bb 4>:

<bb 5>:
  # i_13 = PHI <i_8(4), 0(9)>
  D.1244_7 = j_10 + 1;
  bar (D.1244_7);
  i_8 = i_13 + 1;
  if (i1_6(D) >= i_8)
    goto <bb 4>;
  else
    goto <bb 6>;

...
<bb 8>:
  # j_10 = PHI <j_9(7), 0(3)>
  if (i1_6(D) >= 0)
    goto <bb 9>;
  else
    goto <bb 6>;

<bb 9>:
  goto <bb 5>;


which means we cannot prove that with i_8 = {1, +, 1}_2 the loop
runs at least once if only i1_6 >= 0 is known (remember its the number
of latch executions that are counted).

Smaller testcase:

void bar();
void foo(int i1)
{
  int i;

  for (i=0; i<=i1; ++i)
    bar();
}

It "works" once you change the loop exit condition to i < i1.  Same effects
with unsigned variables (adjust the lower bound to sth like 2 to avoid ill
effects).

I changed the Summary to what is more appropriate.
Comment 18 Zdenek Dvorak 2009-02-12 19:58:55 UTC
> It "works" once you change the loop exit condition to i < i1.  Same effects
> with unsigned variables (adjust the lower bound to sth like 2 to avoid ill
> effects).

There is nothing to fix if unsigned variables are used, as the inner loop may be infinite in that case.  For the signed case, we can use the fact that i cannot wrap (with flag_strict_overflow), but there are some complications (see PR25985).
Comment 19 Zdenek Dvorak 2009-02-18 00:50:06 UTC
> Smaller testcase:
>
> void bar();
> void foo(int i1)
> {
>   int i;
>
>   for (i=0; i<=i1; ++i)
>     bar();
> }

What the # of iterations analysis does is the following:

-- the number of iterations is i1, unless i1==MAXINT, in which case it is infinity
-- we cannot prove that i1 is not MAXINT (*)
-- therefore, we must record the assumption i1!=MAXINT

There is not much that can be done about this in general.  We could make # of iterations analysis to be more specific, e.g. return the assumption i1==MAXINT as a condition for the number of iterations to be infinite (similarly as the rtl level analysis does), however, I don't think any of the tree level optimization passes can use that information.

For some optimizations (final value replacement is the only one that I can think about now), we could use the fact that we are only interested in the number of iterations under the assumption that the given exit is taken (# of iterations analysis already supports that, by the only_exit flag).

I would suggest changing the summary to something more appropriate, as the # of iterations analysis seems to work just fine for this testcase :-)

(*) it might seem that we can use the fact that the induction variable i does not wrap; however, the program might terminate in the function bar before the induction variable i would wrap, regardless of whether i1==MAXINT
Comment 20 Daniel Berlin 2009-02-18 02:54:00 UTC
Subject: Re:  loop number of iterations analysis 
	not working

If the program terminates before i would wrap, then the number of
iterations was not MAXINT.
And since it can't wrap, it is not infinite in any case.

I agree you can't prove the number of iterations (since bar could
exit), but the requiring the assumption i != MAXINT still seems
useless.



On Tue, Feb 17, 2009 at 7:50 PM, rakdver at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #19 from rakdver at gcc dot gnu dot org  2009-02-18 00:50 -------
>> Smaller testcase:
>>
>> void bar();
>> void foo(int i1)
>> {
>>   int i;
>>
>>   for (i=0; i<=i1; ++i)
>>     bar();
>> }
>
> What the # of iterations analysis does is the following:
>
> -- the number of iterations is i1, unless i1==MAXINT, in which case it is
> infinity
> -- we cannot prove that i1 is not MAXINT (*)
> -- therefore, we must record the assumption i1!=MAXINT
>
> There is not much that can be done about this in general.  We could make # of
> iterations analysis to be more specific, e.g. return the assumption i1==MAXINT
> as a condition for the number of iterations to be infinite (similarly as the
> rtl level analysis does), however, I don't think any of the tree level
> optimization passes can use that information.
>
> For some optimizations (final value replacement is the only one that I can
> think about now), we could use the fact that we are only interested in the
> number of iterations under the assumption that the given exit is taken (# of
> iterations analysis already supports that, by the only_exit flag).
>
> I would suggest changing the summary to something more appropriate, as the # of
> iterations analysis seems to work just fine for this testcase :-)
>
> (*) it might seem that we can use the fact that the induction variable i does
> not wrap; however, the program might terminate in the function bar before the
> induction variable i would wrap, regardless of whether i1==MAXINT
>
>
> --
>
> rakdver at gcc dot gnu dot org changed:
>
>           What    |Removed                     |Added
> ----------------------------------------------------------------------------
>         AssignedTo|rakdver at gcc dot gnu dot  |unassigned at gcc dot gnu
>                   |org                         |dot org
>             Status|ASSIGNED                    |NEW
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26939
>
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
>
Comment 21 rakdver@kam.mff.cuni.cz 2009-02-18 04:11:27 UTC
Subject: Re:  loop number of iterations analysis not working

> If the program terminates before i would wrap, then the number of
> iterations was not MAXINT.
> And since it can't wrap, it is not infinite in any case.
> 
> I agree you can't prove the number of iterations (since bar could
> exit), but the requiring the assumption i != MAXINT still seems
> useless.

What do you propose that the number of iterations analysis should
return, then? 
Comment 22 Zdenek Dvorak 2009-02-18 04:47:45 UTC
(In reply to comment #21)
> Subject: Re:  loop number of iterations analysis not working
> 
> > If the program terminates before i would wrap, then the number of
> > iterations was not MAXINT.
> > And since it can't wrap, it is not infinite in any case.
> > 
> > I agree you can't prove the number of iterations (since bar could
> > exit), but the requiring the assumption i != MAXINT still seems
> > useless.
> 
> What do you propose that the number of iterations analysis should
> return, then? 

actually, scratch that.  You can redefine the semantics of what number of iterations analysis returns to make i1 a correct answer, while still keeping it safe to use for optimizations: "number_of_iterations_exit (EXIT) returns an expression N such that you don't change the semantics of the program by replacing the condition for taking EXIT with [0,+,1] == N"
Now I just need to figure out how to make it work this way without completely rewriting the whole analysis.
Comment 23 rguenther@suse.de 2009-02-18 09:34:31 UTC
Subject: Re:  loop number of iterations analysis
 not working

On Wed, 18 Feb 2009, rakdver at kam dot mff dot cuni dot cz wrote:

> 
> 
> ------- Comment #21 from rakdver at kam dot mff dot cuni dot cz  2009-02-18 04:11 -------
> Subject: Re:  loop number of iterations analysis not working
> 
> > If the program terminates before i would wrap, then the number of
> > iterations was not MAXINT.
> > And since it can't wrap, it is not infinite in any case.
> > 
> > I agree you can't prove the number of iterations (since bar could
> > exit), but the requiring the assumption i != MAXINT still seems
> > useless.
> 
> What do you propose that the number of iterations analysis should
> return, then? 

Note that the function call is artificial in the testcase (just to
make the loop non-empty).  A poor choice admittedly ;)

But yes, I expected that i != MAXINT follows from i's signedness.

Richard.
Comment 24 Dmitry G. Dyachenko 2010-07-18 12:56:04 UTC
is this the same problem? -- 'i*2 < 35' can't overflow

void
foo(char *ptr, unsigned size)
{
    unsigned i;
    for(i=0; i*2 < size && i*2 < 35; i++ ) {
        *ptr++ = 0;
    }
}

# gcc -Wunsafe-loop-optimizations  -O3 -c unsafe_loop.c
unsafe_loop.c: In function ‘foo’:
unsafe_loop.c:5:5: warning: cannot optimize loop, the loop counter may overflow [-Wunsafe-loop-optimizations]

gcc/x86/[trunk revision 162274]

Comment 25 Andrew Pinski 2021-11-28 04:22:29 UTC
In GCC 4.7 (and above), we get:

Analyzing # of iterations of loop 2
  exit condition [1, + , 1](no_overflow) <= i1_6(D)
  bounds on difference of bases: -1 ... 2147483646
  result:
    under assumptions i1_6(D) != 2147483647
    # of iterations (unsigned int) i1_6(D), bounded by 2147483647
Analyzing # of iterations of loop 1
  exit condition [1, + , 1](no_overflow) <= j1_4(D)
  bounds on difference of bases: -1 ... 2147483646
  result:
    under assumptions j1_4(D) != 2147483647
    # of iterations (unsigned int) j1_4(D), bounded by 2147483647

Is there anything to fix in this bug left?


(In reply to Dmitry G. Dyachenko from comment #24)
> is this the same problem? -- 'i*2 < 35' can't overflow
The warning for this one is gone in GCC 4.6.0+