This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Unrolling addressing optimization
Hello,
> > CSE then optimizes this into the same code you describe below:
>
> > while (...)
> > {
> > i1 = i0 + 1
> > load a[i0 + 1]
> > ....
> > i2 = i0 + 2
> > load a[i0 + 2]
> > ....
> > i3 = i0 + 3
> > load a[i0 + 3]
> > ....
> > i0 = i0 + k
> > load a[i0 + k]
> > }
>
>
> It is not what our optimization does. In some architectures
> (like PowerPC), load a[i0 + k] requires two assembly instructions,
> so the above transformation is not useful.
Right, you also do strength reduction at the same time; i.e., the problem
here again is not that we would miss this optimization, but that
strength reduction should do its job.
Note that it may indeed sometimes be better to use the two
instructions necessary for the load rather than doing the strength
reduction, since the latter increases register pressure (which is probably
not much of a problem on PowerPC, but doing this on i686 as well might be
a disaster, especially since we do not need two instructions there).
So either we must leave this to strength reduction, which already
considers these issues, or take register pressure into account
in the optimization pass you propose.
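
To make the trade-off concrete, here is a rough source-level sketch in C
(my own illustration, not code from the patch; the function names and the
unroll factor of 4 are assumptions). The first form recomputes each
address from the index; the second is the strength-reduced form, which
keeps a running pointer live across the loop and addresses the loads
with constant displacements from it. GCC makes this decision on its
internal representation, so a compiler is free to turn either form into
the other.

```c
#include <stddef.h>

/* Indexed form: every load address is derived from a and i.
   On a target without a suitable addressing mode this may cost an
   extra instruction per load.  n is assumed to be a multiple of 4. */
long sum_indexed(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    return s;
}

/* Strength-reduced form: the pointer p replaces the multiply/add on i,
   and each load uses a constant displacement from p.  The pointer is
   one more value to keep in a register.  n is assumed to be a multiple
   of 4, otherwise the loop would not terminate at end. */
long sum_pointer(const int *a, size_t n)
{
    long s = 0;
    const int *end = a + n;
    for (const int *p = a; p != end; p += 4) {
        s += p[0];
        s += p[1];
        s += p[2];
        s += p[3];
    }
    return s;
}
```

Both functions compute the same sum; which one is cheaper depends on the
addressing modes and the number of free registers on the target.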
> This is probably the reason for which such cse is not done on PowerPC.
No, it is not -- it is purely due to the stupidity of cse. It has a somewhat weird
notion that keeping the increments separate is beneficial on machines
that have autoincrement addressing modes, so that our autoinc pass is
able to use them. That might be correct for the particular test we
consider, but in general it just spoils the code. The proper fix would
of course be to rewrite the autoinc optimization pass to behave more
sanely, but I have not had time for it so far.
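
As a source-level analogy only (the names below are hypothetical, and
the real decision is made by cse and the autoinc pass, not in the
source), the two shapes look roughly like this in C. In the first, each
load is followed by its own increment, which is the shape an
autoincrement addressing mode can absorb. In the second, the increments
are folded into constant offsets from one base and the base is bumped
once per iteration.

```c
#include <stddef.h>

/* Increments kept separate: each access depends on the previous bump.
   n is assumed to be a multiple of 4. */
long sum_chained(const int *p, size_t n)
{
    long s = 0;
    const int *end = p + n;
    while (p != end) {
        s += *p++;
        s += *p++;
        s += *p++;
        s += *p++;
    }
    return s;
}

/* Increments folded: constant offsets from one base pointer, and a
   single increment at the end of the iteration.
   n is assumed to be a multiple of 4. */
long sum_folded(const int *p, size_t n)
{
    long s = 0;
    const int *end = p + n;
    while (p != end) {
        s += p[0];
        s += p[1];
        s += p[2];
        s += p[3];
        p += 4;
    }
    return s;
}
```

The results are identical; the difference is only in how many pointer
updates appear and whether the accesses are independent of one another.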
Zdenek