This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Variable Expansion Optimization


On Sunday 15 August 2004 12:20, Revital Eres wrote:
> Hello,
>
> I intend to implement variable expansion
> in the body of the unrolled loop.
>
> For example, consider the following loop
> which contains an accumulator:
>
> for (i = 0 ; i < n; i++)
>     sum += a[i];
>
> After the unrolling we get the following code:
>
> sum += a[i];
> ....
> i = i+1;
> sum += a[i];
> ....
> i = i+1
> sum += a[i];
> ....
>
> The variable expansion optimization will expand the variable
> sum into n separate copies, one for each copy of the loop body
> and combine them at the loop exit:
>
> sum += a[i]
> ....
> i = i+1;
> sum1 += a[i]
> ....
> i = i+1
> sum2 += a[i];
> ....
>
> This transformation decreases the number of dependences
> in the loop, thus making instruction scheduling
> more effective when applied on it.

Hmm, I guess you're going to have carefully analyze what the
effects of such a transformation are on register pressure.

Compare the code for the following cases on amd64 for example,
at -O3 -funroll-loops.  foo1 is unrolled 8 times so I manually
unrolled it for foo2 and manually applied the transformation
you want to implement.

--------------------------------------------
#define n 256

int a[n];

int foo1 (void)
{
  int i, sum;
  sum = 0;
  for (i = 0 ; i < n; i++)
    sum += a[i];
  return sum;
}

int foo2 (void)
{
  int i, sum;
  int i1, i2, i3, i4, i5, i6, i7, i8;
  int sum1, sum2, sum3, sum4, sum5, sum6, sum7, sum8;

  sum1 = sum2 = sum3 = sum4 = sum5 = sum6 = sum7 = sum8 = 0;
  for (i = 0 ; i < n / 8; i++)
    {
      i1 = i;
      sum1 += a[i1];

      i2 = i + 1;
      sum2 += a[i2];

      i3 = i + 2;
      sum3 += a[i3];

      i4 = i + 3;
      sum4 += a[i4];

      i5 = i + 4;
      sum5 += a[i5];

      i6 = i + 5;
      sum6 += a[i6];

      i7 = i + 6;
      sum7 += a[i7];

      i8 = i + 7;
      sum8 += a[i8];
    }

  sum = sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7 + sum8;
  return sum;
}
--------------------------------------------

The code for foo2 is really bad.  I guess you need to work with
the unroller to figure out when you can profitably apply this
transformation.  Perhaps you even want to be able to partially
reroll a loop?

Gr.
Steven



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]