This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [Patch,AVR]: PR50447: Tweak addhi3


2011/10/18 Georg-Johann Lay <avr@gjlay.de>:
> Denis Chertykov wrote:
>> 2011/10/18 Georg-Johann Lay <avr@gjlay.de>:
>>> Denis Chertykov wrote:
>>>> 2011/10/18 Georg-Johann Lay <avr@gjlay.de>:
>>>>> This patch makes some tweaks to addhi3, such as adding a QI scratch register.
>>>>>
>>>>> The original *addhi3 insn is still there and located prior to the new
>>>>> addhi3_clobber insn because addhi3 is special to reload (thanks Denis for this
>>>>> note), so that there is a version with and a version without scratch register.
>>>>>
>>>>> Patch passes without regressions.
>>>>>
>>>> Which improvements does this patch add?
>>>>
>>>> Denis.
>>> If the addhi3 is expanded early, the addition happens with a QI scratch, which
>>> avoids a reload of the constant if the target register is in NO_LD. It also
>>> reduces register pressure, as only a QI register is needed instead of reloading
>>> the constant into an HI register.
>>>
>>> Otherwise, there might be sequences like
>>>
>>> ldi r31, 2    ; *reload_inhi
>>> mov r12, r31
>>> clr r13
>>>
>>> add r14, r12  ; *addhi3
>>> adc r15, r13
>>>
>>> which now will be
>>>
>>> ldi r31, 2    ; addhi3_clobber
>>> add r14, r31
>>> adc r15, __zero_reg__
>>>
>>> The same applies if the reload of the constant goes to LD regs:
>>>
>>> ldi r30, 2    ; *movhi
>>> clr r31
>>>
>>> add r14, r30  ; *addhi3
>>> adc r15, r31
>>>
>>> will become
>>>
>>> ldi r30, 2    ; addhi3_clobber
>>> add r14, r30
>>> adc r15, __zero_reg__
>>>
>>> For *addhi3 insns the register pressure is not reduced but the insn sequence
>>> might be smarter if peep2 comes up with a QI scratch or if it detects a
>>> *reload_inhi insn just prior to the addition (and the reg that holds the
>>> reloaded constant dies after the addition).
>>>
>>> As *addhi3 is special to reload, there is still an "ordinary" addhi insn
>>> without scratch. This is easier because, e.g., prologue and epilogue generation
>>> generate add insns (not by means of the addhi3 expander but by explicit
>>> gen_rtx_PLUS). Yet the addhi3 expander factors out the situations when an
>>> addhi3 insn is to be generated via the addhi3 expander late in the compilation
>>> process.
>>
>> Please provide any real world example.
>>
>> Denis.
>
> Consider avr-libc (under the assumption that it is "real world" code):
>
> In avr-libc's build directory, and with the patch integrated:
>
> $ cd avr/lib/avr4
> $ make clean && make CFLAGS='-save-temps -dp -Os'
> $ grep -A 2 'addhi3_clobber\/2' *.s > out-nopeep2.txt (see attachment)
> $ grep 'addhi3_clobber\/2' *.s | wc -l
> 33
>
> This shows that the insns are already there before peep2 and thus no reload of
> 16-bit constant is needed; an 8-bit scratch is sufficient.
>
> Alternatively, the implementation could omit the expansion to addhi3_clobber in
> addhi3 expander and instead rely completely on peep2. However, that does not
> reduce register pressure because a 16-bit register will be allocated and the
> peep2 just prints things smarter and needs just a QI scratch to call
> avr_out_plus_clobber.
>
> For +/-1, the addition with SEC/ADD/ADC or SEC/SBC/SBC leaves cc0 in a mess.
> As most loops use +/-1 on the counter variable, LDI/SUB/SBC is not shorter but
> better because it sets cc0.
>
> So do you like this patch?
> Or do you prefer a patch that is neutral with respect to the register allocator
> and just uses peep2 to print things smarter?

I'm interested in code improvements.
What is the difference in size of avr-libc?

Denis.
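
For readers following the quoted sequences, the LDI/ADD/ADC pattern with
__zero_reg__ can be modelled in C roughly as below. This is only a sketch:
the function name add16_qi_scratch and the variable names are hypothetical
and do not come from the patch or the GCC sources. It illustrates why a
single 8-bit scratch is enough to add an 8-bit constant to a 16-bit value.

```c
#include <stdint.h>

/* Hypothetical model (not from the patch) of the sequence
 *     ldi  scratch, k
 *     add  lo, scratch
 *     adc  hi, __zero_reg__
 * It adds an 8-bit constant k to a 16-bit value using only one
 * 8-bit scratch plus the always-zero register. */
static uint16_t add16_qi_scratch(uint16_t value, uint8_t k)
{
    uint8_t lo = (uint8_t)(value & 0xFFu);
    uint8_t hi = (uint8_t)(value >> 8);
    const uint8_t zero_reg = 0;      /* stands in for __zero_reg__ */
    uint8_t scratch = k;             /* ldi scratch, k */

    /* add lo, scratch: keep the low 8 bits, remember the carry */
    uint16_t sum = (uint16_t)((uint16_t)lo + (uint16_t)scratch);
    uint8_t carry = (uint8_t)(sum >> 8);
    lo = (uint8_t)(sum & 0xFFu);

    /* adc hi, __zero_reg__: high byte only absorbs the carry */
    hi = (uint8_t)(hi + zero_reg + carry);

    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

Calling add16_qi_scratch(0x00FF, 2) yields 0x0101, the same result as a
plain 16-bit addition. The high byte of the constant never has to be
materialized in a register, which is the register-pressure argument made
in the quoted text.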

