This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PATCH] Unrolling addressing optimization

From: Zdenek Dvorak <rakdver at atrey dot karlin dot mff dot cuni dot cz>
To: Revital Eres <ERES at il dot ibm dot com>
Cc: gcc-patches at gcc dot gnu dot org
Date: Sun, 11 Apr 2004 12:43:06 +0200
Subject: Re: [PATCH] Unrolling addressing optimization
References: <20040408090445.GA6518@atrey.karlin.mff.cuni.cz> <OF12B04997.4551D753-ONC2256E73.0026D551-C2256E73.002E8E2C@il.ibm.com>

Hello,

> > CSE then optimizes this into the same code you describe below:
> 
> > while (...)
> > {
> >   i1 = i0 + 1
> >   load a[i0 + 1]
> >   ....
> >   i2 = i0 + 2
> >   load a[i0 + 2]
> >   ....
> >   i3 = i0 + 3
> >   load a[i0 + 3]
> >   ....
> >   i0 = i0 + k
> >   load a[i0 + k]
> > }
> 
> 
> It is not what our optimization does. In some architectures 
> (like PowerPC), load a[i0 + k] requires two assembly instructions, 
> so the above transformation is not useful. 

Right, you also do strength reduction at the same time; i.e., the
problem here, again, is not that we would miss this optimization, but
that strength reduction should do its job.

Note that it may indeed sometimes be better to issue the two
instructions needed for the load than to do the strength reduction,
since the latter increases register pressure (which is probably not
much of a problem, but doing this on i686 as well might be a disaster,
especially since we do not need two instructions there).

So either we must leave this to strength reduction, which already
considers these issues, or take register pressure into account in the
optimization pass you propose.
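
A minimal sketch in C of the two shapes being compared, for a loop
unrolled by 4. The function names are hypothetical and the code is
source-level only; what the compiler actually emits depends on the
target's addressing modes and on how many registers are free.

```c
#include <stddef.h>

/* Indexed form (hypothetical name): every access is a[i + c], i.e. the
   address is recomputed from the base and the index each time.  On a
   target without a suitable base+index+offset addressing mode this may
   cost an extra instruction per load. */
long sum_indexed(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)          /* leftover iterations */
        s += a[i];
    return s;
}

/* Strength-reduced form (hypothetical name): the index arithmetic is
   replaced by a pointer bumped once per iteration, so each load uses a
   constant offset from that pointer.  Each access stream reduced this
   way needs its own pointer kept live in a register. */
long sum_reduced(const int *a, size_t n)
{
    long s = 0;
    const int *p = a;
    const int *end = a + (n - n % 4);

    for (; p != end; p += 4) {
        s += p[0];
        s += p[1];
        s += p[2];
        s += p[3];
    }
    for (end = a + n; p != end; p++)   /* leftover iterations */
        s += *p;
    return s;
}
```

Both functions compute the same sum; they differ only in how the
addresses are formed, which is the choice a cost model has to make.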

> This is probably the reason for which such cse is not done on PowerPC.

No, it is not; it is purely due to the stupidity of cse.  It has the
somewhat weird idea that keeping the increments separate is good on
machines that have autoincrement addressing modes, so that our autoinc
pass is able to use them.  That might be correct for the particular
test we consider, but in general it just spoils the code.  The proper
fix would of course be to rewrite the autoinc optimization pass to
behave more sanely, but I have not had time for it so far.
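
For illustration, a rough C sketch of "separate" versus combined
increments in a loop unrolled by 4. The function names are
hypothetical, and this only mirrors the source-level shape; it is not
the RTL that cse or the autoinc pass actually sees.

```c
#include <stddef.h>

/* Chained form (hypothetical name): each index is derived from the
   previous one, so the increments stay separate.  This is the shape
   that can map onto autoincrement addressing. */
long sum_chained(const int *a, size_t n)
{
    long s = 0;
    size_t i0 = 0;

    while (n - i0 >= 4) {
        size_t i1 = i0 + 1;
        size_t i2 = i1 + 1;
        size_t i3 = i2 + 1;
        s += a[i0];
        s += a[i1];
        s += a[i2];
        s += a[i3];
        i0 = i3 + 1;
    }
    while (i0 < n)              /* leftover iterations */
        s += a[i0++];
    return s;
}

/* Folded form (hypothetical name): every index is a constant offset
   from i0, and i0 is updated once per iteration.  This corresponds to
   the code quoted at the top of this message. */
long sum_folded(const int *a, size_t n)
{
    long s = 0;
    size_t i0 = 0;

    while (n - i0 >= 4) {
        s += a[i0];
        s += a[i0 + 1];
        s += a[i0 + 2];
        s += a[i0 + 3];
        i0 += 4;
    }
    while (i0 < n)              /* leftover iterations */
        s += a[i0++];
    return s;
}
```

In the chained form every access depends on the preceding increment;
in the folded form the accesses depend only on i0.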

Zdenek

References:
- Re: [PATCH] Unrolling addressing optimization
  - From: Zdenek Dvorak
- Re: [PATCH] Unrolling addressing optimization
  - From: Revital Eres
