[Patch,AVR]: PR49687 (better widening 32-bit mul)

Georg-Johann Lay avr@gjlay.de
Wed Jul 27 14:08:00 GMT 2011


http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02113.html

Weddington, Eric wrote:
> 
>> -----Original Message-----
>> From: Georg-Johann Lay
>>
>> This means that a pure __mulsi3 will have 30+30+20 = 80 bytes (+18).
>>
>> If all functions are used they occupy 116 bytes (-4), so they actually
>> save a little space if they are used all with the benefit that they also
>> can one-extend, extend 32 = 16*32 as well as 32=16*16 and work for
>> small (17 bit signed) constants.
>>
>> __umulhisi3 reads:
>>
>> DEFUN __umulhisi3
>>     mul     A0, B0
>>     movw    C0, r0
>>     mul     A1, B1
>>     movw    C2, r0
>>     mul     A0, B1
>>     add     C1, r0
>>     adc     C2, r1
>>     clr     __zero_reg__
>>     adc     C3, __zero_reg__
>>     mul     A1, B0
>>     add     C1, r0
>>     adc     C2, r1
>>     clr     __zero_reg__
>>     adc     C3, __zero_reg__
>>     ret
>> ENDF __umulhisi3
>>
>> It could be compressed to the following sequence, i.e.
>> 24 bytes instead of 30, but I think that's taking the
>> squeezing of the last byte out of the code too far:
>>
>> DEFUN __umulhisi3
>>     mul     A0, B0
>>     movw    C0, r0
>>     mul     A1, B1
>>     movw    C2, r0
>>     mul     A0, B1
>>     rcall   1f
>>     mul     A1, B0
>> 1:  add     C1, r0
>>     adc     C2, r1
>>     clr     __zero_reg__
>>     adc     C3, __zero_reg__
>>     ret
>> ENDF __umulhisi3
>>
>>
>> In the absence of real-world code that uses 32-bit arithmetic I trust
>> my intuition that code size will decrease in general ;-)
>>
> 
> Hi Johann,
> 
> I would agree with you that it seems that overall code size will decrease in general.
> 
> However, I also like your creative compression in the second sequence above, and I think that it would be best to implement that sequence and try to find others like that where possible.
> 
> Remember that to AVR users, code size is *everything*. Even saving 6 bytes here or there has a positive effect.
> 
> I'll let Richard (or Denis if he's back from vacation) do the actual approval of the patch, as they are a lot more technically competent in this area. But I'm ok with the general tactic of code reuse, together with looking at further ways to reduce code size like the example above.
> 
> Eric Weddington


This is a revised patch for review with the change proposed by Eric,
i.e. __umulhisi3 now calls its own tail.
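
For readers who prefer C over AVR assembly, the effect of calling the
tail can be sketched as below. This is only a model under assumptions,
not the libgcc code: the names add_cross_product, umulhisi3_model and
umulhisi3_model_check are made up for illustration. The helper plays the
role of the code at label 1 (add the r1:r0 product into C1..C3), and it
runs twice, once via rcall and once by falling through.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the shared tail at label 1: add the 16-bit
   partial product x*y into bytes 1..3 of the 32-bit accumulator
   (add C1,r0 / adc C2,r1 / adc C3,__zero_reg__). */
static uint32_t add_cross_product(uint32_t c, uint8_t x, uint8_t y)
{
    uint32_t p = (uint32_t)x * y;        /* what mul leaves in r1:r0 */
    return c + (p << 8);
}

/* Hypothetical model of the 32 = 16 * 16 unsigned widening multiply,
   built from four 8x8 partial products like the assembly sequence. */
static uint32_t umulhisi3_model(uint16_t a, uint16_t b)
{
    uint8_t a0 = a & 0xff, a1 = a >> 8;
    uint8_t b0 = b & 0xff, b1 = b >> 8;

    uint32_t c = (uint32_t)a0 * b0;      /* mul A0,B0 ; movw C0,r0    */
    c |= ((uint32_t)a1 * b1) << 16;      /* mul A1,B1 ; movw C2,r0    */
    c = add_cross_product(c, a0, b1);    /* mul A0,B1 ; rcall 1f      */
    c = add_cross_product(c, a1, b0);    /* mul A1,B0 ; fall through  */
    return c;
}

/* Compare the model against a plain 32-bit multiply on a grid of inputs. */
static void umulhisi3_model_check(void)
{
    for (uint32_t a = 0; a <= 0xffff; a += 257)
        for (uint32_t b = 0; b <= 0xffff; b += 251)
            assert(umulhisi3_model((uint16_t)a, (uint16_t)b) == a * b);
}
```

Calling umulhisi3_model_check() exercises the model; the final carry out
of the top byte never occurs because a 16x16 product always fits in 32 bits.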

A pure __mulsi3 will now cost 30+24+20 = 74 bytes (+12).

Using all functions will cost 110 bytes (-10).

__mulsi3 was missing its final ENDF __mulsi3; I added it.

The rest of the patch is just technical:

* Emitting the implicit library call is postponed from expand to split1,
  i.e. after the combiner but prior to reload, of course.

* The patch covers QI->SI extensions where such extensions are
  done in two steps: first an explicit QI->HI extension expanded
  inline, and second the implicit HI->SI extension as performed by,
  e.g., __muluhisi3 (32 = 16 * 32).

* There are many possible HI/QI combinations. They are handled
  with the help of code iterators; the cross product covers all 16
  cases: QI->SI or HI->SI, as signed or unsigned extension, for
  each of operand 1 and operand 2.

* extendhisi2 need not early-clobber the output because an HI value
  will always start in an even register.
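
The two-step QI->SI extension from the second item can be sketched in C
as follows. This is a rough model under assumptions, not the patch
itself: muluhisi3_model and mulshisi3_model are hypothetical stand-ins
for the 32 = 16 * 32 library routines, and mul_u8_by_si / mul_s8_by_si
are made-up names for the combined operation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32 = 16 * 32 multiply that zero-extends
   its 16-bit operand (the implicit HI->SI extension). */
static uint32_t muluhisi3_model(uint16_t a, uint32_t b)
{
    return (uint32_t)a * b;
}

/* Hypothetical stand-in for a 32 = 16 * 32 multiply that sign-extends
   its 16-bit operand. The result is taken modulo 2^32. */
static uint32_t mulshisi3_model(int16_t a, uint32_t b)
{
    return (uint32_t)(int32_t)a * b;
}

/* Unsigned 8-bit operand: step 1 is an explicit QI->HI zero extension,
   step 2 is the HI->SI extension implied by the 16 * 32 multiply. */
static uint32_t mul_u8_by_si(uint8_t q, uint32_t b)
{
    uint16_t h = q;                 /* explicit QI->HI, done inline */
    return muluhisi3_model(h, b);   /* implicit HI->SI              */
}

/* Signed 8-bit operand: same two steps with sign extension. */
static uint32_t mul_s8_by_si(int8_t q, uint32_t b)
{
    int16_t h = q;                  /* explicit QI->HI, done inline */
    return mulshisi3_model(h, b);   /* implicit HI->SI              */
}
```

For instance, mul_s8_by_si(-2, 100000) yields the 32-bit two's-complement
encoding of -200000, while mul_u8_by_si(0xFE, 100000) yields 25400000.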

Tested without regressions.

Ok to install?

Johann

	PR target/49687
	* config/avr/t-avr (LIB1ASMFUNCS): Remove _xmulhisi3_exit.
	Add _muluhisi3, _mulshisi3, _usmulhisi3.
	* config/avr/libgcc.S (__mulsi3): Rewrite.
	(__mulhisi3): Rewrite.
	(__umulhisi3): Rewrite.
	(__usmulhisi3): New.
	(__muluhisi3): New.
	(__mulshisi3): New.
	(__mulohisi3): New.
	(__mulqi3, __mulqihi3, __umulqihi3, __mulhi3): Use DEFUN/ENDF to
	declare.
	* config/avr/predicates.md (pseudo_register_operand): Rewrite.
	(pseudo_register_or_const_int_operand): New.
	(combine_pseudo_register_operand): New.
	(u16_operand): New.
	(s16_operand): New.
	(o16_operand): New.
	* config/avr/avr.c (avr_rtx_costs): Handle costs for mult:SI.
	* config/avr/avr.md (QIHI, QIHI2): New mode iterators.
	(any_extend, any_extend2): New code iterators.
	(extend_prefix): New code attribute.
	(mulsi3): Rewrite. Turn insn to expander.
	(mulhisi3): Ditto.
	(umulhisi3): Ditto.
	(usmulhisi3): New expander.
	(*mulsi3): New insn-and-split.
	(mulu<mode>si3): New insn-and-split.
	(muls<mode>si3): New insn-and-split.
	(mulohisi3): New insn-and-split.
	(*uumulqihisi3, *uumulhiqisi3, *uumulhihisi3, *uumulqiqisi3,
	*usmulqihisi3, *usmulhiqisi3, *usmulhihisi3, *usmulqiqisi3,
	*sumulqihisi3, *sumulhiqisi3, *sumulhihisi3, *sumulqiqisi3,
	*ssmulqihisi3, *ssmulhiqisi3, *ssmulhihisi3, *ssmulqiqisi3): New
	insn-and-split.
	(*mulsi3_call): Rewrite.
	(*mulhisi3_call): Rewrite.
	(*umulhisi3_call): Rewrite.
	(*usmulhisi3_call): New insn.
	(*muluhisi3_call): New insn.
	(*mulshisi3_call): New insn.
	(*mulohisi3_call): New insn.
	(extendqihi2): Use combine_pseudo_register_operand as predicate
	for operand 1.
	(extendqisi2): Ditto.
	(zero_extendqihi2): Ditto.
	(zero_extendqisi2): Ditto.
	(zero_extendhisi2): Ditto.
	(extendhisi2): Ditto. Don't early-clobber operand 0.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: opt-mul32.diff
Type: text/x-patch
Size: 31959 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20110727/0b7dff0c/attachment.bin>

