This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Does GCC generate LDRD/STRD (Register) forms?


On 07 Jul 2015, at 13:52, Bin.Cheng <amker.cheng@gmail.com> wrote:

> On Tue, Jul 7, 2015 at 10:05 AM, Anmol Paralkar (anmparal)
> <anmparal@cisco.com> wrote:
>> Hello,
>> 
>> Does GCC generate LDRD/STRD (Register) forms [A8.8.74/A8.8.211 per ARMv7-A
>> & ARMv7-R ARM]?
>> 
>> Based on various attempts to write code to get GCC to generate a sample
>> form, and subsequently inspecting the code I see in
>> config/arm/arm.c/output_move_double () & arm.md [GCC 4.9.2], I think that
>> these register based forms of LDRD/STRD are
>> not generated, but I thought it might be a good idea to ask on the list,
>> just in case.
> Register-based LDRD is harder than the immediate version.  ARM doesn't
> support a [base + reg + offset] addressing mode, so the address computation
> of the second memory reference is scattered both inside and outside the
> memory reference.  To identify such opportunities, one needs to trace the
> registers in the address expression of the memory access instruction and do
> some kind of value computation and re-association.

Basically, this is what we're trying to do with AMS.  For each mem access it tries to trace the reg values and figure out the effective address expression.  For now we've limited it to the form 'base_reg + index_reg*scale + const_displacement'.  Then we try to see how to fit the address expressions to the available address modes.
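
To make the "fit the address expression to the available address modes" step concrete, here is a minimal sketch in C.  It is not AMS code: the struct, the enum, the function name and the toy target rules (displacement limit, index scale of 1 only) are all assumptions made up for illustration.

```c
/* Hypothetical model of an effective address of the form
   base_reg + index_reg * scale + disp.
   A register number of -1 means "no such register".  */
struct addr_expr
{
  int base_reg;
  int index_reg;
  int scale;
  long disp;
};

/* Address modes of an imaginary target (an assumption, not a real ISA).  */
enum addr_mode
{
  MODE_REG,        /* @reg */
  MODE_REG_DISP,   /* @(disp,reg) */
  MODE_REG_INDEX,  /* @(index,reg), scale 1 only */
  MODE_NONE        /* does not fit, the address must be split up */
};

/* Pick the address mode that can encode A directly, if any.
   Toy rules: a displacement must be a positive multiple of ACCESS_SIZE
   and not exceed MAX_DISP; an index register cannot be combined with a
   displacement or with a scale other than 1.  */
enum addr_mode
classify_addr (const struct addr_expr *a, int access_size, long max_disp)
{
  if (a->base_reg < 0)
    return MODE_NONE;

  if (a->index_reg < 0)
    {
      if (a->disp == 0)
        return MODE_REG;
      if (a->disp > 0 && a->disp % access_size == 0 && a->disp <= max_disp)
        return MODE_REG_DISP;
      return MODE_NONE;
    }

  if (a->scale == 1 && a->disp == 0)
    return MODE_REG_INDEX;

  /* base + index + disp in one go is not available on this toy target,
     similar to the [base + reg + offset] case mentioned above.  */
  return MODE_NONE;
}
```

An expression that classifies as MODE_NONE is the interesting case: part of it has to be computed into a register first, and choosing which part is where the real work is.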

It's still work in progress but already shows some improvements.
A classic SH4 example:

float fun (float* x)
{
  return x[0] + x[1] + x[2] + x[3];
}

no AMS:
	mov	r4,r1
	add	#4,r1
	fmov.s	@r4,fr0
	fmov.s	@r1,fr1
	mov	r4,r1
	add	#8,r1
	fadd	fr1,fr0
	fmov.s	@r1,fr1
	add	#12,r4
	fadd	fr1,fr0
	fmov.s	@r4,fr1
	rts	
	fadd	fr1,fr0

AMS:
	fmov.s	@r4+,fr0
	fmov.s	@r4+,fr1
	fadd	fr1,fr0
	fmov.s	@r4+,fr1
	fadd	fr1,fr0
	fmov.s	@r4,fr1
	rts	
	fadd	fr1,fr0
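
The difference between the two sequences can be modelled with a small C sketch.  This is only an illustration under simplifying assumptions (one base register, accesses already in program order, a single access size), and the names post_inc_plan and plan_post_inc are hypothetical, not part of AMS.

```c
#include <stddef.h>

/* Result of the toy planning step.  */
struct post_inc_plan
{
  size_t post_inc;  /* accesses that can use @reg+ */
  size_t adjusts;   /* explicit add insns needed on the pointer */
};

/* DISP holds the N displacements of the accesses relative to the
   original value of the base register, in program order.  The pointer
   starts at offset 0.  An access uses post-increment when the next
   access is exactly ACCESS_SIZE bytes further on; whenever the pointer
   is not already at the wanted offset, one explicit adjustment is
   counted.  */
struct post_inc_plan
plan_post_inc (const long *disp, size_t n, long access_size)
{
  struct post_inc_plan plan = { 0, 0 };
  long cur = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (cur != disp[i])
        {
          plan.adjusts++;
          cur = disp[i];
        }

      if (i + 1 < n && disp[i + 1] == disp[i] + access_size)
        {
          plan.post_inc++;
          cur += access_size;
        }
    }

  return plan;
}
```

For the displacements 0, 4, 8, 12 with an access size of 4, this model yields three post-increment accesses and no explicit adds, which corresponds to the three @r4+ loads followed by a plain @r4 load in the AMS output.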

If I understand correctly, ARM's LDRD/STRD are similar to SH's FPU 2x32 pair loads/stores.  Both need the mem access insns for adjacent addresses to be adjacent in the insn stream.  We'll try to do some mem access reordering in AMS, mainly to improve post/pre inc/dec address mode utilization.  Afterwards, adjacent mem accesses can be fused together in a separate RTL pass, or in an AMS sub-pass to avoid re-discovering the mem access sequence information that AMS already has.
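
A rough C sketch of the fusing condition, under heavy simplification: the mem_access record and both function names are hypothetical, and real constraints such as alignment or which destination register pairs an instruction accepts are ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one memory access, as an analysis pass might
   keep it after tracing the address back to base_reg + disp.  */
struct mem_access
{
  int base_reg;
  long disp;
  int size;      /* access size in bytes */
  bool is_load;
};

/* Two accesses can form a pair if they are the same kind of access and
   B's address directly follows A's address.  */
bool
can_fuse_pair (const struct mem_access *a, const struct mem_access *b)
{
  return a->base_reg == b->base_reg
         && a->is_load == b->is_load
         && a->size == b->size
         && b->disp == a->disp + a->size;
}

/* Walk the accesses in insn order and count non-overlapping pairs of
   neighbouring insns that could be fused.  */
size_t
count_fusable_pairs (const struct mem_access *acc, size_t n)
{
  size_t pairs = 0;
  size_t i = 0;

  while (i + 1 < n)
    {
      if (can_fuse_pair (&acc[i], &acc[i + 1]))
        {
          pairs++;
          i += 2;
        }
      else
        i++;
    }

  return pairs;
}
```

With four loads from the same base at displacements 0, 4, 8, 12 the sketch finds two pairs.  With the same loads in the order 0, 8, 4, 12 it finds none, which is why the reordering has to happen before the fusing.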

Cheers,
Oleg
