Richard Earnshaw (lists)
Mon Feb 3 17:10:00 GMT 2020

On 03/02/2020 15:49, Jeff Law wrote:
> On Mon, 2020-02-03 at 15:50 +0100, Henri Cloetens wrote:
>> Hello all,
>> I have a question on the peephole2 optimizer.
>> - My target has a "load double" instruction:
>>      - It does an indexed load of a 64-bit operand to two 32-bit registers.
>>      - The requirement is that the registers are adjacent
>>        (Ri and Ri+1), and that the offset for the second load is 4 bytes more
>>        than for the first load.
>> - I cannot find a way to describe this in GCC. I tried
>>     "load_multiple", and this is OK, but gcc only calls that for stack
>> pushing.
>>     I tried the vector facility, but this does not work either.
>> - I tried to write a peephole2 optimizer, and this works out OK, it
>>     manages to recognize the sequence, ... but the peephole2 optimizer is
>>     run AFTER register allocation, and the optimization needs to be done
>>     BEFORE, as there are constraints on the 2 registers, Ri and Ri+1.
>> Any suggestions? Is there any way to run peephole2 BEFORE register
>> allocation?
> I suggest looking at ldp/stp support in the aarch64 backend.
> jeff

A closer match would be the ldrd/strd support used when generating code
for Arm (not Thumb); that has a similar restriction on register pairs
being adjacent.

In summary: it's hard, and GCC's infrastructure does not support it
particularly well.
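
To make the pairing rule from the question concrete, here is a minimal C sketch of the test that any pass would have to apply before fusing two word loads into one "load double". It is only an illustration: struct word_load and can_pair_loads are hypothetical names, not GCC internals, and the check that the first load does not overwrite the base register is an added assumption about what makes the fusion safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one 32-bit load: dest = mem[base + offset].
   This is an illustration only, not a GCC data structure. */
struct word_load {
    unsigned dest_reg;   /* hard register number receiving the value */
    unsigned base_reg;   /* hard register holding the base address */
    long offset;         /* byte offset from the base */
};

/* Return true if the two loads satisfy the rule stated in the question:
   same base, destinations Ri and Ri+1, second offset 4 bytes higher. */
bool can_pair_loads(const struct word_load *first,
                    const struct word_load *second)
{
    assert(first != NULL && second != NULL);

    /* Assumption: if the first load overwrites the base register, the
       second load no longer reads from the original address, so the
       pair is not equivalent to a single double-word load. */
    if (first->dest_reg == first->base_reg)
        return false;

    return first->base_reg == second->base_reg
        && second->dest_reg == first->dest_reg + 1
        && second->offset == first->offset + 4;
}
```

Once registers have been allocated, dest_reg is fixed, so a check like this succeeds only when the allocator happened to choose adjacent registers; that is the limitation of the peephole2 approach described in the question.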


