This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [Patch,AVR]: ad PR49313: 64-bit division


2011/11/20 Georg-Johann Lay <avr@gjlay.de>:
> This implements an assembler drop-in replacement for 64-bit division/remainder.
>
> The original libgcc implementation is extremely resource-hungry because it
> uses inlining in several places, and DImode is resource-hungry anyway.
>
> With the patch, the sizes in bytes (accumulated over all modules with the
> same name) are:
>
> Col #2 = Original libgcc (C-Code)
> Col #3 = New implementation in Asm
> Col #4 = Change (relative)
> Col #5 = Change (absolute)
>
> _udivmod64.o                :      0   1362 100000.0%    1362
> _negdi2.o                   :   1296    288    -77.8%   -1008
> _umoddi3.o                  :  22326      0   -100.0%  -22326
> _udivdi3.o                  :  26878    246    -99.1%  -26632
> _moddi3.o                   :  28986      0   -100.0%  -28986
> _divdi3.o                   :  33076    840    -97.5%  -32236
> :::::: Total :::::::        : 362188 252362    -30.3% -109826
>
> The detailed size report:
>
> avr3/libgcc.a!_udivmod64.o  :      0    174 100000.0%     174
> avr31/libgcc.a!_udivmod64.o :      0    174 100000.0%     174
> avr6/libgcc.a!_udivmod64.o  :      0    162 100000.0%     162
> avr5/libgcc.a!_udivmod64.o  :      0    162 100000.0%     162
> avr35/libgcc.a!_udivmod64.o :      0    162 100000.0%     162
> avr51/libgcc.a!_udivmod64.o :      0    162 100000.0%     162
> avr25/libgcc.a!_udivmod64.o :      0    126 100000.0%     126
> avr4/libgcc.a!_udivmod64.o  :      0    126 100000.0%     126
> libgcc.a!_udivmod64.o       :      0    114 100000.0%     114
> avr4/libgcc.a!_negdi2.o     :    140     32    -77.1%    -108
> avr25/libgcc.a!_negdi2.o    :    140     32    -77.1%    -108
> avr5/libgcc.a!_negdi2.o     :    144     32    -77.8%    -112
> avr6/libgcc.a!_negdi2.o     :    144     32    -77.8%    -112
> libgcc.a!_negdi2.o          :    144     32    -77.8%    -112
> avr35/libgcc.a!_negdi2.o    :    144     32    -77.8%    -112
> avr51/libgcc.a!_negdi2.o    :    144     32    -77.8%    -112
> avr31/libgcc.a!_negdi2.o    :    148     32    -78.4%    -116
> avr3/libgcc.a!_negdi2.o     :    148     32    -78.4%    -116
> avr4/libgcc.a!_umoddi3.o    :   2304      0   -100.0%   -2304
> avr6/libgcc.a!_umoddi3.o    :   2360      0   -100.0%   -2360
> avr51/libgcc.a!_umoddi3.o   :   2360      0   -100.0%   -2360
> avr5/libgcc.a!_umoddi3.o    :   2360      0   -100.0%   -2360
> avr25/libgcc.a!_umoddi3.o   :   2364      0   -100.0%   -2364
> avr35/libgcc.a!_umoddi3.o   :   2420      0   -100.0%   -2420
> libgcc.a!_umoddi3.o         :   2682      0   -100.0%   -2682
> avr3/libgcc.a!_umoddi3.o    :   2738      0   -100.0%   -2738
> avr31/libgcc.a!_umoddi3.o   :   2738      0   -100.0%   -2738
> avr4/libgcc.a!_udivdi3.o    :   2784     26    -99.1%   -2758
> avr25/libgcc.a!_udivdi3.o   :   2828     26    -99.1%   -2802
> avr5/libgcc.a!_udivdi3.o    :   2852     28    -99.0%   -2824
> avr6/libgcc.a!_udivdi3.o    :   2852     28    -99.0%   -2824
> avr51/libgcc.a!_udivdi3.o   :   2852     28    -99.0%   -2824
> avr35/libgcc.a!_udivdi3.o   :   2896     28    -99.0%   -2868
> avr4/libgcc.a!_moddi3.o     :   3016      0   -100.0%   -3016
> avr5/libgcc.a!_moddi3.o     :   3072      0   -100.0%   -3072
> avr6/libgcc.a!_moddi3.o     :   3072      0   -100.0%   -3072
> avr51/libgcc.a!_moddi3.o    :   3072      0   -100.0%   -3072
> avr25/libgcc.a!_moddi3.o    :   3124      0   -100.0%   -3124
> avr35/libgcc.a!_moddi3.o    :   3180      0   -100.0%   -3180
> libgcc.a!_udivdi3.o         :   3226     26    -99.2%   -3200
> avr31/libgcc.a!_udivdi3.o   :   3294     28    -99.1%   -3266
> avr3/libgcc.a!_udivdi3.o    :   3294     28    -99.1%   -3266
> avr4/libgcc.a!_divdi3.o     :   3396     86    -97.5%   -3310
> avr5/libgcc.a!_divdi3.o     :   3464     98    -97.2%   -3366
> avr51/libgcc.a!_divdi3.o    :   3464     98    -97.2%   -3366
> avr6/libgcc.a!_divdi3.o     :   3464     98    -97.2%   -3366
> libgcc.a!_moddi3.o          :   3446      0   -100.0%   -3446
> avr25/libgcc.a!_divdi3.o    :   3578     86    -97.6%   -3492
> avr31/libgcc.a!_moddi3.o    :   3502      0   -100.0%   -3502
> avr3/libgcc.a!_moddi3.o     :   3502      0   -100.0%   -3502
> avr35/libgcc.a!_divdi3.o    :   3646     98    -97.3%   -3548
> libgcc.a!_divdi3.o          :   3976     78    -98.0%   -3898
> avr31/libgcc.a!_divdi3.o    :   4044    100    -97.5%   -3944
> avr3/libgcc.a!_divdi3.o     :   4044     98    -97.6%   -3946
> :::::: Total :::::::        : 362188 252362    -30.3% -109826
>
> The implementation is basically the same as the division/modulo already present
> in lib1funcs.S. However, the algorithm does not compute the complement by
> recycling the carry bit; instead, it expands directly to the quotient. That way,
> n instructions can be saved when dealing with n-byte values.
>
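For readers following along, the bit-serial scheme described above can be
sketched in portable C. This is only an illustration, assuming a plain
restoring (shift-and-subtract) loop; it is not the assembler from the patch,
and the names udivmod64_sketch and udivmod64_result are made up for this
example. Each quotient bit is written directly in the step that produces it,
so no final complement of the quotient is needed.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch: quotient and remainder. */
typedef struct {
    uint64_t quot;
    uint64_t rem;
} udivmod64_result;

/* Restoring (shift-and-subtract) division, one quotient bit per step. */
static udivmod64_result udivmod64_sketch(uint64_t num, uint64_t den)
{
    udivmod64_result r = { 0, 0 };

    for (int i = 63; i >= 0; i--) {
        /* Shift the next dividend bit (MSB first) into the remainder.
           The remainder holds at most 63 bits here, so no bit is lost. */
        r.rem = (r.rem << 1) | ((num >> i) & 1);

        /* If the divisor fits, subtract it and set the quotient bit
           directly. */
        if (r.rem >= den) {
            r.rem -= den;
            r.quot |= UINT64_C(1) << i;
        }
    }
    return r;
}
```

With a zero divisor this loop returns an all-ones quotient and the dividend
as remainder. That is a property of the sketch only, not a statement about
what the patch does in that case.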
> The implementation provides a speed-up of the algorithm for the case when
> there is enough flash. The assumption is that speed matters for arithmetic.
>
> As you can see above, the size of __udivmod64 varies from 114 to 174 bytes:
>
> 114 = small devices without MOVW (no speed-up: SPEED_DIV = 0)
> 126 = small devices with MOVW (small speed-up: SPEED_DIV = 16)
> 162, 174 = devices >= 16k (best speed-up: SPEED_DIV = 8)
>
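The message does not spell out what the speed-up does. One plausible reading,
and purely an assumption here, is that leading zero bytes (or 16-bit words) of
the dividend are shifted out before the bit loop starts, at the cost of a few
extra instructions. The sketch below illustrates that idea in C; the function
udiv64_skip_sketch and its chunk parameter (standing in for a SPEED_DIV-like
setting of 0, 8 or 16) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical sketch: `chunk` is 0 (no speed-up), 8 or 16 (skip leading
   zero chunks of that width). Intended for a nonzero divisor. */
static uint64_t udiv64_skip_sketch(uint64_t num, uint64_t den,
                                   unsigned chunk, uint64_t *rem_out)
{
    unsigned bits = 64;   /* dividend bits still to be processed */

    if (chunk == 8 || chunk == 16) {
        /* Mask selecting the topmost `chunk` bits of the dividend. */
        uint64_t top = ((UINT64_C(1) << chunk) - 1) << (64 - chunk);

        /* Leading zero chunks produce only zero quotient bits when the
           divisor is nonzero, so they can be shifted out up front. */
        while (bits > chunk && (num & top) == 0) {
            num <<= chunk;
            bits -= chunk;
        }
    }

    uint64_t quot = 0, rem = 0;
    for (unsigned i = 0; i < bits; i++) {
        rem = (rem << 1) | (num >> 63);   /* next dividend bit, MSB first */
        num <<= 1;
        quot <<= 1;
        if (rem >= den) {
            rem -= den;
            quot |= 1;
        }
    }
    *rem_out = rem;
    return quot;
}
```

For a small dividend such as 100, a chunk of 8 leaves only 8 loop iterations
instead of 64, which is the kind of trade of code size for speed the list
above describes.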
> Passed without regressions.
>
> Moreover, the algorithm is individually tested against the old implementation.
> The only difference I observed was for divisor = 0.
>
> Ok for trunk?
>
> Johann
>
> libgcc/
>         PR target/49313
>         * config/avr/t-avr (LIB2FUNCS_EXCLUDE): Add _moddi3, _umoddi3.
>         (LIB1ASMFUNCS): Add _divdi3, _udivdi3, _udivmod64, _negdi2.
>
>         * config/avr/lib1funcs.S (wmov): New assembler macro.
>         (__umoddi3, __udivdi3, __udivdi3_umoddi3): New functions.
>         (__moddi3, __divdi3, __divdi3_moddi3): New functions.
>         (__udivmod64): New function.
>         (__negdi2): New function.
>
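The ChangeLog lists signed entry points next to __udivmod64 and __negdi2 but
does not describe how they fit together. A common construction, shown here
only as an assumption about the general technique and not as the patch's code,
is to divide the magnitudes with an unsigned routine and fix up the signs with
negation. The names neg64_sketch and divmod_s64_sketch are hypothetical, and
the native unsigned / and % operators stand in for the unsigned routine.

```c
#include <stdint.h>

/* Two's-complement negation on the unsigned type, which avoids signed
   overflow for the most negative value. */
static uint64_t neg64_sketch(uint64_t x)
{
    return ~x + 1;
}

/* Signed division and remainder built on unsigned division.
   The divisor must be nonzero. */
static void divmod_s64_sketch(int64_t a, int64_t b,
                              int64_t *quot, int64_t *rem)
{
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;
    int neg_a = a < 0;
    int neg_b = b < 0;

    /* Work on magnitudes. */
    if (neg_a) ua = neg64_sketch(ua);
    if (neg_b) ub = neg64_sketch(ub);

    uint64_t uq = ua / ub;
    uint64_t ur = ua % ub;

    /* Truncating division: the quotient is negative when the signs
       differ, and the remainder takes the sign of the dividend. */
    if (neg_a != neg_b) uq = neg64_sketch(uq);
    if (neg_a) ur = neg64_sketch(ur);

    /* Relies on the two's-complement unsigned-to-signed conversion that
       GCC implements. */
    *quot = (int64_t)uq;
    *rem = (int64_t)ur;
}
```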

Approved.

Denis.
