Bug 35341 - Early exit loop with short known trip count not unrolled
Summary: Early exit loop with short known trip count not unrolled
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.3.0
: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2008-02-24 04:20 UTC by davidxl
Modified: 2022-01-10 11:06 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description davidxl 2008-02-24 04:20:54 UTC
Gcc fully unrolls short trip counted (known) loop if the unrolled loop body size d oes not exceed a threshold. However, if  the loop has early exit, this is not done -- leading to missing scalar opt later on.

Example:

int a[100];
int b[100];
int foo(void)
{
     int i, j;

     for (i = 0; i < 5; i ++)
     {
          a[2*i] += a[i];
          if (a[2*i] == 10) break;
     }

     return 0;
}
Comment 1 Bernhard Reutner-Fischer 2015-03-24 00:55:16 UTC
Did you forget to specify -funroll-loops?

gcc-4.2 -O2 -funroll-loops and gcc-4.4 as well as 5.0 with these options basically generate this optimized dump, which IIUC is what you want.

;; Function foo (foo)

Analyzing Edge Insertions.
foo ()
{
  int temp.34;
  int temp.31;
  int temp.29;
  int temp.26;
  int temp.25;

<bb 2>:
  temp.25 = a[0];
  temp.26 = temp.25 + temp.25;
  a[0] = temp.26;
  if (temp.26 == 10)
    goto <bb 7>;
  else
    goto <bb 3>;

<bb 3>:
  temp.29 = a[2] + a[1];
  a[2] = temp.29;
  if (temp.29 == 10)
    goto <bb 7>;
  else
    goto <bb 4>;

<bb 4>:
  temp.31 = temp.29 + a[4];
  a[4] = temp.31;
  if (temp.31 == 10)
    goto <bb 7>;
  else
    goto <bb 5>;

<bb 5>:
  temp.34 = a[6] + a[3];
  a[6] = temp.34;
  if (temp.34 == 10)
    goto <bb 7>;
  else
    goto <bb 6>;

<bb 6>:
  a[8] = [plus_expr] a[8] + a[4];

<bb 7>:
  return 0;

}
Comment 2 Andrew Pinski 2022-01-10 11:06:24 UTC
Currently we get at -O2:

Estimating sizes for loop 1
 BB: 3, after_exit: 0
  size:   1 _1 = i_13 * 2;
   Induction variable computation will be folded away.
  size:   1 _2 = a[_1];
  size:   1 _3 = a[i_13];
  size:   1 _4 = _2 + _3;
  size:   1 a[_1] = _4;
  size:   2 if (_4 == 10)
 BB: 6, after_exit: 1
 BB: 4, after_exit: 0
  size:   1 i_10 = i_13 + 1;
   Induction variable computation will be folded away.
  size:   1 ivtmp_7 = ivtmp_12 - 1;
   Induction variable computation will be folded away.
  size:   2 if (ivtmp_7 != 0)
   Exit condition will be eliminated in peeled copies.
   Exit condition will be eliminated in last copy.
   Constant conditional.
size: 11-5, last_iteration: 11-5
  Loop size: 11
  Estimated size after unrolling: 20
Not unrolling loop 1: size would grow.

We do unroll at -O3.