This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: paired register loads and stores


On Fri, Sep 29, 2006 at 05:27:10AM +0000, Erich Plondke wrote:
> rs6000 and Sparc ports seem to use a peephole2 to get the ldd or lfq
> instructions (respectively), but it looks like there's no reason for
> the register allocator to allocate registers together.  The peephole2
> just picks up loads to adjacent memory locations if the allocator
> happens to choose adjacent registers (is that correct?) or the
> variables are specified as living in hard registers with the help
> of an asm.
> 
> Several other architectures have paired loads: some ARM targets have ldrd
> which can be cheaper than a ldm, and ia64 has a pair load.
> 
> It seems like GCC does a good job of knowing how to modify register-
> sized subregs of two- or four-register larger modes.  So if I could
> tell GCC to turn:
> 
>        [(set (reg:SI X) (mem:SI (addr)))
>         (set (reg:SI Y) (mem:SI (addr+4)))]
> 
> (where addr is aligned to DI) into something like:
>        [(set (reg:DI T) (mem:DI (addr)))
>         (set (reg:SI X) (subreg:SI (reg:DI T) 0))
>         (set (reg:SI Y) (subreg:SI (reg:DI T) 4))]
> 
> and I could do so early enough, GCC would know to access the subregs
> directly in instruction(s) using the loaded values, and I would end up
> loading the register pair and using the individual elements.  But it has to
> be done early on; after register allocation even if I could get a
> DI temporary I'd probably have the two SI moves and that's probably
> not a win.

   You may have success using the combine pass to do this. The difficulty is
that combine only tries to combine instructions when the LOG_LINKS field is
set up. I think this only happens for plain SET insns when subregs are
involved, e.g.

	(set (subreg:SI (reg:DI T) 0) (mem:SI addr))
	(set (subreg:SI (reg:DI T) 4) (mem:SI addr+4))

   For example, I don't know how to make this work with adjacent structure
fields. You could try to extend the optimization that GCC already does for
loading adjacent structure fields smaller than a word; the one enabled by
SLOW_BYTE_ACCESS.
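
   To make the structure field case concrete, here is a minimal C sketch of
source that would typically produce two SImode loads from addr and addr+4
with addr aligned to DI. The names struct pair and pair_sum are made up for
illustration, and whether any given port actually merges the two loads is
exactly the open question above; this only shows the input shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example type: two 32-bit fields at offsets 0 and 4,
   with the whole struct aligned to 8 bytes (the "aligned to DI" case). */
struct pair {
    _Alignas(8) int32_t lo;   /* offset 0 */
    int32_t hi;               /* offset 4 */
};

/* Compile-time checks that the layout matches the assumption above. */
_Static_assert(offsetof(struct pair, hi) == offsetof(struct pair, lo) + 4,
               "fields must be adjacent");
_Static_assert(_Alignof(struct pair) == 8, "pair must be DI-aligned");

/* Reads both fields.  On a 32-bit target this is the kind of code that
   usually becomes two word loads from adjacent addresses, i.e. a
   candidate for a single paired load into a register pair. */
int64_t pair_sum(const struct pair *p)
{
    assert(p != NULL);
    return (int64_t)p->lo + (int64_t)p->hi;
}
```

   Comparing the assembler output for pair_sum with and without the
alignment specifier is one way to see whether a port picks up the pair.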

-- 
Rask Ingemann Lambertsen

