This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: rtlopt loop unroller question


(Again, on behalf of Yossi)
> as I suspected, my favourite piece of cse strikes again.  With the
> patch below, the code produced is much better

First, thanks for your patch; the code is indeed much better!
However, if we complicate the example a little bit:

{
  int A[N];
  int B[N];
  int C[N];
  int i;

  for (i=0; i<N; i++)
    A[i]=B[i]+C[i];
  return A;
}

we still get the inefficient addressing calculations.
Using static variables instead of local ones yields
much better code with or without your patch.
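
For reference, the static variant mentioned above would look roughly like
the sketch below. The value of N and the function name sum_static are
assumptions made here for illustration; the example above shows only the
function body.

```c
#define N 16  /* assumed size; the message does not say what N is */

/* Static arrays are addressed through their symbols rather than
   through offsets from the stack frame. */
static int A[N];
static int B[N];
static int C[N];

/* Hypothetical function name; same loop as in the local-array case. */
int *sum_static(void)
{
  int i;

  for (i = 0; i < N; i++)
    A[i] = B[i] + C[i];
  return A;
}
```

The loop body is unchanged; only the storage class of the arrays differs,
so any difference in the generated addressing code comes from how the
array base addresses are formed.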

Second, the redundant address calculations are only removed by the
simple cse pass after reload (postreload), so the unroller still gets
warped code.
Can it be fixed earlier?

>            addi r4,r12,4
>            addi r2,r12,8
>            addi r29,r12,12
>            addi r28,r12,16
>            addi r27,r12,20
>            addi r26,r12,24
>            addi r25,r12,28

Finally, we still get redundant adds (compared to old-unroll); can
something be done to fix them too?

Yossi

