This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [Patch,AVR]: PR50447: Tweak addhi3


Denis Chertykov wrote:
> 2011/10/18 Georg-Johann Lay <avr@gjlay.de>:
>> Denis Chertykov wrote:
>>> 2011/10/18 Georg-Johann Lay <avr@gjlay.de>:
>>>> Denis Chertykov wrote:
>>>>> 2011/10/18 Georg-Johann Lay <avr@gjlay.de>:
>>>>>> This patch makes some tweaks to addhi3, such as adding a QI scratch register.
>>>>>>
>>>>>> The original *addhi3 insn is still there and is located before the new
>>>>>> addhi3_clobber insn because addhi3 is special to reload (thanks Denis for this
>>>>>> note), so there is one version with and one version without a scratch register.
>>>>>>
>>>>>> Patch passes without regressions.
>>>>>>
>>>>> Which improvements does this patch add?
>>>>>
>>>>> Denis.
>>>> If the addhi3 is expanded early, the addition happens with a QI scratch, which
>>>> avoids a reload of the constant if the target register is in NO_LD. It also
>>>> reduces register pressure because only a QI register is needed instead of a
>>>> reload of the constant into an HI register.
>>>>
>>>> Otherwise, there might be sequences like
>>>>
>>>> ldi r31, 2    ; *reload_inhi
>>>> mov r12, r31
>>>> clr r13
>>>>
>>>> add r14, r12  ; *addhi3
>>>> adc r15, r13
>>>>
>>>> which now will be
>>>>
>>>> ldi r31, 2    ; addhi3_clobber
>>>> add r14, r31
>>>> adc r15, __zero_reg__
>>>>
>>>> The same applies if the constant is reloaded into LD regs:
>>>>
>>>> ldi r30, 2    ; *movhi
>>>> clr r31
>>>>
>>>> add r14, r30  ; *addhi3
>>>> adc r15, r31
>>>>
>>>> will become
>>>>
>>>> ldi r30, 2    ; addhi3_clobber
>>>> add r14, r30
>>>> adc r15, __zero_reg__
>>>>
>>>> For *addhi3 insns the register pressure is not reduced, but the insn sequence
>>>> might be smarter if peep2 comes up with a QI scratch or if it detects a
>>>> *reload_inhi insn just prior to the addition (and the reg that holds the
>>>> reloaded constant dies after the addition).
>>>>
>>>> As *addhi3 is special to reload, there is still an "ordinary" addhi insn
>>>> without scratch. This is easier because, e.g., prologue and epilogue generation
>>>> generate add insns (not by means of the addhi3 expander but by explicit
>>>> gen_rtx_PLUS). Yet the addhi3 expander factors out the situations when an
>>>> addhi3 insn is to be generated via the addhi3 expander late in the compilation
>>>> process.
>>> Please provide a real-world example.
>>>
>>> Denis.
>> Consider avr-libc (under the assumption that it is "real world" code):
>>
>> In avr-libc's build directory, and with the patch integrated:
>>
>> $ cd avr/lib/avr4
>> $ make clean && make CFLAGS='-save-temps -dp -Os'
>> $ grep -A 2 'addhi3_clobber\/2' *.s > out-nopeep2.txt (see attachment)
>> $ grep 'addhi3_clobber\/2' *.s | wc -l
>> 33
>>
>> This shows that the insns are already there before peep2 and thus no reload of
>> a 16-bit constant is needed; an 8-bit scratch is sufficient.
>>
>> Alternatively, the implementation could omit the expansion to addhi3_clobber in
>> the addhi3 expander and instead rely completely on peep2. However, that does not
>> reduce register pressure because a 16-bit register will still be allocated; the
>> peep2 merely prints things smarter and needs just a QI scratch to call
>> avr_out_plus_clobber.
>>
>> For +/-1, the addition with SEC/ADD/ADC or SEC/SBC/SBC leaves cc0 in a mess.
>> As most loops use +/-1 on the counter variable, LDI/SUB/SBC is not shorter but
>> better because it sets cc0.
>>
>> So do you like this patch?
>> Or do you prefer a patch that is neutral with respect to the register allocator
>> and just uses peep2 to print things smarter?
> 
> I'm interested in code improvements.
> What is the difference in size of avr-libc?
> 
> Denis.

I have no tool for smart size analysis, so here is just a diff:

After rebuilding avr-libc with the respective compiler version (original and
patched), I ran, respectively:

$ find . -name 'lib[mc].a' -exec avr-size {} ';' > size-orig.txt
$ find . -name 'lib[mc].a' -exec avr-size {} ';' > size-patch.txt

and then

$ diff -U 0 size-orig.txt size-patch.txt > size.diff

As far as I can see, the gain is not big, and almost no object increases in
size (only realloc.o grows, by 2 bytes, for avr2, avr3 and avr31).

For some files like ./avr/lib/avr2/libc.a:dtoa_prf.o the size gain is about 3%.
For ./avr/lib/avr4/libc.a:vfprintf_std.o it's 1.7%, and for others it's just one
instruction better.
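
The percentages above can be recomputed from size.diff with a few lines of
Python. This is only a minimal sketch and not part of the patch: the names
text_size_changes and percent_saved are made up for illustration, and it
assumes the avr-size column layout seen in the attached diff (text, data, bss,
dec, hex, filename).

```python
import re

# One avr-size line inside a unified diff, for example:
# "-    750\t      0\t      0\t    750\t    2ee\tdtoa_prf.o (ex ./avr/lib/avr51/libc.a)"
SIZE_LINE = re.compile(
    r"^([-+])\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+([0-9a-f]+)\s+(\S+) \(ex (\S+)\)$"
)


def text_size_changes(diff_text):
    """Return {(archive, object): (old_text, new_text)} for a size diff.

    Lines that are not avr-size rows ('---', '+++', '@@ ...') are skipped.
    """
    sizes = {}
    for line in diff_text.splitlines():
        match = SIZE_LINE.match(line)
        if match is None:
            continue
        sign = match.group(1)
        text_bytes = int(match.group(2))
        key = (match.group(8), match.group(7))  # (archive, object file)
        side = "old" if sign == "-" else "new"
        sizes.setdefault(key, {})[side] = text_bytes
    # Keep only objects that appear on both sides of the diff.
    return {
        key: (entry["old"], entry["new"])
        for key, entry in sizes.items()
        if "old" in entry and "new" in entry
    }


def percent_saved(old, new):
    """Size reduction in percent; negative if the object grew."""
    return 100.0 * (old - new) / old
```

Calling text_size_changes(open("size.diff").read()) and passing each
(old, new) pair to percent_saved gives one figure per object; a negative value
marks an object that grew.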

Johann
--- size-orig.txt	2011-10-18 19:59:52.000000000 +0200
+++ size-patch.txt	2011-10-18 19:50:59.000000000 +0200
@@ -7 +7 @@
-    750	      0	      0	    750	    2ee	dtoa_prf.o (ex ./avr/lib/avr51/libc.a)
+    724	      0	      0	    724	    2d4	dtoa_prf.o (ex ./avr/lib/avr51/libc.a)
@@ -11 +11 @@
-    722	      6	      0	    728	    2d8	malloc.o (ex ./avr/lib/avr51/libc.a)
+    720	      6	      0	    726	    2d6	malloc.o (ex ./avr/lib/avr51/libc.a)
@@ -15,2 +15,2 @@
-    510	      0	      0	    510	    1fe	realloc.o (ex ./avr/lib/avr51/libc.a)
-    747	      0	      0	    747	    2eb	strtod.o (ex ./avr/lib/avr51/libc.a)
+    506	      0	      0	    506	    1fa	realloc.o (ex ./avr/lib/avr51/libc.a)
+    739	      0	      0	    739	    2e3	strtod.o (ex ./avr/lib/avr51/libc.a)
@@ -18 +18 @@
-    536	      0	      0	    536	    218	strtoul.o (ex ./avr/lib/avr51/libc.a)
+    530	      0	      0	    530	    212	strtoul.o (ex ./avr/lib/avr51/libc.a)
@@ -246,2 +246,2 @@
-   1042	      0	      0	   1042	    412	vfprintf_std.o (ex ./avr/lib/avr51/libc.a)
-   1490	      0	      0	   1490	    5d2	vfscanf_std.o (ex ./avr/lib/avr51/libc.a)
+   1026	      0	      0	   1026	    402	vfprintf_std.o (ex ./avr/lib/avr51/libc.a)
+   1488	      0	      0	   1488	    5d0	vfscanf_std.o (ex ./avr/lib/avr51/libc.a)
@@ -423 +423 @@
-    688	      0	      0	    688	    2b0	dtoa_prf.o (ex ./avr/lib/avr35/libc.a)
+    670	      0	      0	    670	    29e	dtoa_prf.o (ex ./avr/lib/avr35/libc.a)
@@ -427 +427 @@
-    708	      6	      0	    714	    2ca	malloc.o (ex ./avr/lib/avr35/libc.a)
+    706	      6	      0	    712	    2c8	malloc.o (ex ./avr/lib/avr35/libc.a)
@@ -431,3 +431,3 @@
-    440	      0	      0	    440	    1b8	realloc.o (ex ./avr/lib/avr35/libc.a)
-    733	      0	      0	    733	    2dd	strtod.o (ex ./avr/lib/avr35/libc.a)
-    564	      0	      0	    564	    234	strtol.o (ex ./avr/lib/avr35/libc.a)
+    436	      0	      0	    436	    1b4	realloc.o (ex ./avr/lib/avr35/libc.a)
+    725	      0	      0	    725	    2d5	strtod.o (ex ./avr/lib/avr35/libc.a)
+    562	      0	      0	    562	    232	strtol.o (ex ./avr/lib/avr35/libc.a)
@@ -662,2 +662,2 @@
-    964	      0	      0	    964	    3c4	vfprintf_std.o (ex ./avr/lib/avr35/libc.a)
-   1352	      0	      0	   1352	    548	vfscanf_std.o (ex ./avr/lib/avr35/libc.a)
+    948	      0	      0	    948	    3b4	vfprintf_std.o (ex ./avr/lib/avr35/libc.a)
+   1350	      0	      0	   1350	    546	vfscanf_std.o (ex ./avr/lib/avr35/libc.a)
@@ -815 +815 @@
-    682	      0	      0	    682	    2aa	dtoa_prf.o (ex ./avr/lib/avr25/libc.a)
+    664	      0	      0	    664	    298	dtoa_prf.o (ex ./avr/lib/avr25/libc.a)
@@ -819 +819 @@
-    704	      6	      0	    710	    2c6	malloc.o (ex ./avr/lib/avr25/libc.a)
+    702	      6	      0	    708	    2c4	malloc.o (ex ./avr/lib/avr25/libc.a)
@@ -823,3 +823,3 @@
-    426	      0	      0	    426	    1aa	realloc.o (ex ./avr/lib/avr25/libc.a)
-    713	      0	      0	    713	    2c9	strtod.o (ex ./avr/lib/avr25/libc.a)
-    554	      0	      0	    554	    22a	strtol.o (ex ./avr/lib/avr25/libc.a)
+    422	      0	      0	    422	    1a6	realloc.o (ex ./avr/lib/avr25/libc.a)
+    705	      0	      0	    705	    2c1	strtod.o (ex ./avr/lib/avr25/libc.a)
+    552	      0	      0	    552	    228	strtol.o (ex ./avr/lib/avr25/libc.a)
@@ -1054,2 +1054,2 @@
-    930	      0	      0	    930	    3a2	vfprintf_std.o (ex ./avr/lib/avr25/libc.a)
-   1286	      0	      0	   1286	    506	vfscanf_std.o (ex ./avr/lib/avr25/libc.a)
+    914	      0	      0	    914	    392	vfprintf_std.o (ex ./avr/lib/avr25/libc.a)
+   1284	      0	      0	   1284	    504	vfscanf_std.o (ex ./avr/lib/avr25/libc.a)
@@ -1447 +1447 @@
-    758	      0	      0	    758	    2f6	dtoa_prf.o (ex ./avr/lib/avr31/libc.a)
+    734	      0	      0	    734	    2de	dtoa_prf.o (ex ./avr/lib/avr31/libc.a)
@@ -1451 +1451 @@
-    752	      6	      0	    758	    2f6	malloc.o (ex ./avr/lib/avr31/libc.a)
+    750	      6	      0	    756	    2f4	malloc.o (ex ./avr/lib/avr31/libc.a)
@@ -1455,4 +1455,4 @@
-    464	      0	      0	    464	    1d0	realloc.o (ex ./avr/lib/avr31/libc.a)
-    811	      0	      0	    811	    32b	strtod.o (ex ./avr/lib/avr31/libc.a)
-    634	      0	      0	    634	    27a	strtol.o (ex ./avr/lib/avr31/libc.a)
-    616	      0	      0	    616	    268	strtoul.o (ex ./avr/lib/avr31/libc.a)
+    466	      0	      0	    466	    1d2	realloc.o (ex ./avr/lib/avr31/libc.a)
+    809	      0	      0	    809	    329	strtod.o (ex ./avr/lib/avr31/libc.a)
+    630	      0	      0	    630	    276	strtol.o (ex ./avr/lib/avr31/libc.a)
+    614	      0	      0	    614	    266	strtoul.o (ex ./avr/lib/avr31/libc.a)
@@ -1686,2 +1686,2 @@
-   1064	      0	      0	   1064	    428	vfprintf_std.o (ex ./avr/lib/avr31/libc.a)
-   1582	      0	      0	   1582	    62e	vfscanf_std.o (ex ./avr/lib/avr31/libc.a)
+   1046	      0	      0	   1046	    416	vfprintf_std.o (ex ./avr/lib/avr31/libc.a)
+   1580	      0	      0	   1580	    62c	vfscanf_std.o (ex ./avr/lib/avr31/libc.a)
@@ -1791 +1791 @@
-    750	      0	      0	    750	    2ee	dtoa_prf.o (ex ./avr/lib/avr6/libc.a)
+    724	      0	      0	    724	    2d4	dtoa_prf.o (ex ./avr/lib/avr6/libc.a)
@@ -1795 +1795 @@
-    722	      6	      0	    728	    2d8	malloc.o (ex ./avr/lib/avr6/libc.a)
+    720	      6	      0	    726	    2d6	malloc.o (ex ./avr/lib/avr6/libc.a)
@@ -1799,2 +1799,2 @@
-    508	      0	      0	    508	    1fc	realloc.o (ex ./avr/lib/avr6/libc.a)
-    747	      0	      0	    747	    2eb	strtod.o (ex ./avr/lib/avr6/libc.a)
+    504	      0	      0	    504	    1f8	realloc.o (ex ./avr/lib/avr6/libc.a)
+    739	      0	      0	    739	    2e3	strtod.o (ex ./avr/lib/avr6/libc.a)
@@ -1802 +1802 @@
-    536	      0	      0	    536	    218	strtoul.o (ex ./avr/lib/avr6/libc.a)
+    530	      0	      0	    530	    212	strtoul.o (ex ./avr/lib/avr6/libc.a)
@@ -2030,2 +2030,2 @@
-   1042	      0	      0	   1042	    412	vfprintf_std.o (ex ./avr/lib/avr6/libc.a)
-   1490	      0	      0	   1490	    5d2	vfscanf_std.o (ex ./avr/lib/avr6/libc.a)
+   1026	      0	      0	   1026	    402	vfprintf_std.o (ex ./avr/lib/avr6/libc.a)
+   1488	      0	      0	   1488	    5d0	vfscanf_std.o (ex ./avr/lib/avr6/libc.a)
@@ -2135 +2135 @@
-    758	      0	      0	    758	    2f6	dtoa_prf.o (ex ./avr/lib/avr3/libc.a)
+    734	      0	      0	    734	    2de	dtoa_prf.o (ex ./avr/lib/avr3/libc.a)
@@ -2139 +2139 @@
-    752	      6	      0	    758	    2f6	malloc.o (ex ./avr/lib/avr3/libc.a)
+    750	      6	      0	    756	    2f4	malloc.o (ex ./avr/lib/avr3/libc.a)
@@ -2143,4 +2143,4 @@
-    464	      0	      0	    464	    1d0	realloc.o (ex ./avr/lib/avr3/libc.a)
-    811	      0	      0	    811	    32b	strtod.o (ex ./avr/lib/avr3/libc.a)
-    634	      0	      0	    634	    27a	strtol.o (ex ./avr/lib/avr3/libc.a)
-    616	      0	      0	    616	    268	strtoul.o (ex ./avr/lib/avr3/libc.a)
+    466	      0	      0	    466	    1d2	realloc.o (ex ./avr/lib/avr3/libc.a)
+    809	      0	      0	    809	    329	strtod.o (ex ./avr/lib/avr3/libc.a)
+    630	      0	      0	    630	    276	strtol.o (ex ./avr/lib/avr3/libc.a)
+    614	      0	      0	    614	    266	strtoul.o (ex ./avr/lib/avr3/libc.a)
@@ -2374,2 +2374,2 @@
-   1064	      0	      0	   1064	    428	vfprintf_std.o (ex ./avr/lib/avr3/libc.a)
-   1582	      0	      0	   1582	    62e	vfscanf_std.o (ex ./avr/lib/avr3/libc.a)
+   1046	      0	      0	   1046	    416	vfprintf_std.o (ex ./avr/lib/avr3/libc.a)
+   1580	      0	      0	   1580	    62c	vfscanf_std.o (ex ./avr/lib/avr3/libc.a)
@@ -2527 +2527 @@
-    688	      0	      0	    688	    2b0	dtoa_prf.o (ex ./avr/lib/avr5/libc.a)
+    670	      0	      0	    670	    29e	dtoa_prf.o (ex ./avr/lib/avr5/libc.a)
@@ -2531 +2531 @@
-    708	      6	      0	    714	    2ca	malloc.o (ex ./avr/lib/avr5/libc.a)
+    706	      6	      0	    712	    2c8	malloc.o (ex ./avr/lib/avr5/libc.a)
@@ -2535,2 +2535,2 @@
-    440	      0	      0	    440	    1b8	realloc.o (ex ./avr/lib/avr5/libc.a)
-    719	      0	      0	    719	    2cf	strtod.o (ex ./avr/lib/avr5/libc.a)
+    436	      0	      0	    436	    1b4	realloc.o (ex ./avr/lib/avr5/libc.a)
+    711	      0	      0	    711	    2c7	strtod.o (ex ./avr/lib/avr5/libc.a)
@@ -2538 +2538 @@
-    492	      0	      0	    492	    1ec	strtoul.o (ex ./avr/lib/avr5/libc.a)
+    486	      0	      0	    486	    1e6	strtoul.o (ex ./avr/lib/avr5/libc.a)
@@ -2766,2 +2766,2 @@
-    960	      0	      0	    960	    3c0	vfprintf_std.o (ex ./avr/lib/avr5/libc.a)
-   1352	      0	      0	   1352	    548	vfscanf_std.o (ex ./avr/lib/avr5/libc.a)
+    944	      0	      0	    944	    3b0	vfprintf_std.o (ex ./avr/lib/avr5/libc.a)
+   1350	      0	      0	   1350	    546	vfscanf_std.o (ex ./avr/lib/avr5/libc.a)
@@ -3855 +3855 @@
-    682	      0	      0	    682	    2aa	dtoa_prf.o (ex ./avr/lib/avr4/libc.a)
+    664	      0	      0	    664	    298	dtoa_prf.o (ex ./avr/lib/avr4/libc.a)
@@ -3859 +3859 @@
-    704	      6	      0	    710	    2c6	malloc.o (ex ./avr/lib/avr4/libc.a)
+    702	      6	      0	    708	    2c4	malloc.o (ex ./avr/lib/avr4/libc.a)
@@ -3863,2 +3863,2 @@
-    426	      0	      0	    426	    1aa	realloc.o (ex ./avr/lib/avr4/libc.a)
-    697	      0	      0	    697	    2b9	strtod.o (ex ./avr/lib/avr4/libc.a)
+    422	      0	      0	    422	    1a6	realloc.o (ex ./avr/lib/avr4/libc.a)
+    689	      0	      0	    689	    2b1	strtod.o (ex ./avr/lib/avr4/libc.a)
@@ -3866 +3866 @@
-    482	      0	      0	    482	    1e2	strtoul.o (ex ./avr/lib/avr4/libc.a)
+    476	      0	      0	    476	    1dc	strtoul.o (ex ./avr/lib/avr4/libc.a)
@@ -4094,2 +4094,2 @@
-    930	      0	      0	    930	    3a2	vfprintf_std.o (ex ./avr/lib/avr4/libc.a)
-   1286	      0	      0	   1286	    506	vfscanf_std.o (ex ./avr/lib/avr4/libc.a)
+    914	      0	      0	    914	    392	vfprintf_std.o (ex ./avr/lib/avr4/libc.a)
+   1284	      0	      0	   1284	    504	vfscanf_std.o (ex ./avr/lib/avr4/libc.a)
@@ -4379 +4379 @@
-    752	      0	      0	    752	    2f0	dtoa_prf.o (ex ./avr/lib/avr2/libc.a)
+    728	      0	      0	    728	    2d8	dtoa_prf.o (ex ./avr/lib/avr2/libc.a)
@@ -4383 +4383 @@
-    748	      6	      0	    754	    2f2	malloc.o (ex ./avr/lib/avr2/libc.a)
+    746	      6	      0	    752	    2f0	malloc.o (ex ./avr/lib/avr2/libc.a)
@@ -4387,4 +4387,4 @@
-    450	      0	      0	    450	    1c2	realloc.o (ex ./avr/lib/avr2/libc.a)
-    791	      0	      0	    791	    317	strtod.o (ex ./avr/lib/avr2/libc.a)
-    624	      0	      0	    624	    270	strtol.o (ex ./avr/lib/avr2/libc.a)
-    606	      0	      0	    606	    25e	strtoul.o (ex ./avr/lib/avr2/libc.a)
+    452	      0	      0	    452	    1c4	realloc.o (ex ./avr/lib/avr2/libc.a)
+    789	      0	      0	    789	    315	strtod.o (ex ./avr/lib/avr2/libc.a)
+    620	      0	      0	    620	    26c	strtol.o (ex ./avr/lib/avr2/libc.a)
+    604	      0	      0	    604	    25c	strtoul.o (ex ./avr/lib/avr2/libc.a)
@@ -4618,2 +4618,2 @@
-   1030	      0	      0	   1030	    406	vfprintf_std.o (ex ./avr/lib/avr2/libc.a)
-   1516	      0	      0	   1516	    5ec	vfscanf_std.o (ex ./avr/lib/avr2/libc.a)
+   1012	      0	      0	   1012	    3f4	vfprintf_std.o (ex ./avr/lib/avr2/libc.a)
+   1514	      0	      0	   1514	    5ea	vfscanf_std.o (ex ./avr/lib/avr2/libc.a)
