This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: RFA: autoincrement patches for gcc 4 - updated patch

From: Joern RENNECKE <joern dot rennecke at st dot com>
To: law at redhat dot com
Cc: Paul Brook <paul at codesourcery dot com>, gcc-patches at gcc dot gnu dot org, Ian Lance Taylor <iant at google dot com>, Richard Henderson <rth at redhat dot com>, Bernd Schmidt <bernd dot schmidt at analog dot com>
Date: Mon, 17 Jul 2006 15:30:02 +0100
Subject: Re: RFA: autoincrement patches for gcc 4 - updated patch
References: <421F4698.1050809@st.com> <44996F60.8000108@st.com> <m3r70nmz0x.fsf@localhost.localdomain> <200607151602.32788.paul@codesourcery.com> <1153124452.2709.863.camel@fuel98.slc.redhat.com>

Jeffrey Law wrote:

On Sat, 2006-07-15 at 16:02 +0100, Paul Brook wrote:
When I look at this patch, it seems to me that it turns
pseudo-assembly code like this:
   r1 = r0 + 8
   r2 = (r1)
   r3 = r0 + 12
   r4 = (r3)
...
However, it seems to me that we can view this as two separate
optimizations.  The first one is to change the first code above into
this:
   r1 = r0 + 8
   r2 = (r1)
   r1 = r1 + 4
   r4 = (r1)
On machines without register offset addressing and with relatively few registers, this is a useful optimization because it decreases register pressure.
This can also be useful on machines with limited immediate ranges (e.g. most RISC machines). Typically it occurs with large structures, i.e. when "+ 12" requires multiple instructions.

We've seen evidence that this transformation would help Thumb code on CSiBE.
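As an illustration only, the rewrite from the first sequence to the second can be sketched in a few lines of Python. This is a toy model, not code from the patch or from GCC: the tuple encoding of instructions and the function name `chain_address_adds` are hypothetical, and it assumes every address register is only used by loads that appear before the next address computation from the same base.

```python
def chain_address_adds(insns, base):
    """Turn independent `dst = base + imm` computations into increments
    of a single address register (toy model, hypothetical encoding).

    insns: list of ("add", dst, src, imm) or ("load", dst, addr_reg).
    Assumes each address register is dead once the next add is reached.
    """
    out = []
    chain_reg = None   # register that carries the running address
    chain_off = 0      # current offset of chain_reg relative to base
    rename = {}        # original address register -> chain_reg
    for insn in insns:
        if insn[0] == "add" and insn[2] == base:
            _, dst, _, imm = insn
            if chain_reg is None:
                # First address: keep the computation from the base.
                chain_reg = dst
                out.append(("add", dst, base, imm))
            else:
                # Later addresses: step the same register by the delta.
                out.append(("add", chain_reg, chain_reg, imm - chain_off))
            chain_off = imm
            rename[dst] = chain_reg
        elif insn[0] == "load":
            _, dst, addr = insn
            out.append(("load", dst, rename.get(addr, addr)))
        else:
            out.append(insn)
    return out


before = [
    ("add", "r1", "r0", 8),
    ("load", "r2", "r1"),
    ("add", "r3", "r0", 12),
    ("load", "r4", "r3"),
]
after = chain_address_adds(before, "r0")
```

In this model r3 is no longer needed after the rewrite, which is where the reduction in register pressure comes from, and the second add only needs the small delta 4 rather than the full offset.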
If you look at PRE+strength reduction, that's exactly what it will do. It treats r0 + 8 and r0 + 12 as equivalent and thus removes the evaluation of the r0 + 12 expression as redundant.

combine wants r1 and r3 separate so that it can generate r2 = (r0+8) and r4 = (r0+12) on processors where this is possible. So will you be running this PRE+strength reduction pass between combine and flow (or its replacement)? And what are you going to do about the increased scheduling rigidity? What are you going to do when r0+8 is used in more than one place, but separated by another r0 use? Applying PRE naively will increase the instruction count.

References:
- Re: RFA: autoincrement patches for gcc 4 - updated patch
  - From: Ian Lance Taylor
- Re: RFA: autoincrement patches for gcc 4 - updated patch
  - From: Paul Brook
- Re: RFA: autoincrement patches for gcc 4 - updated patch
  - From: Jeffrey Law
