Bug 53595 - Code size increase of +10% between two 4.7.1 snapshot
Summary: Code size increase of +10% between two 4.7.1 snapshot
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: other (show other bugs)
Version: 4.7.0
: P3 normal
Target Milestone: 4.7.2
Assignee: Not yet assigned to anyone
URL:
Keywords: ra
Depends on:
Blocks:
 
Reported: 2012-06-06 19:52 UTC by Georg-Johann Lay
Modified: 2012-06-28 14:02 UTC (History)
2 users (show)

See Also:
Host:
Target: avr
Build:
Known to work:
Known to fail:
Last reconfirmed: 2012-06-07 00:00:00


Attachments
C source (1.26 KB, text/plain)
2012-06-06 19:52 UTC, Georg-Johann Lay
Details
ira dump of older version (good) (9.17 KB, text/plain)
2012-06-06 19:54 UTC, Georg-Johann Lay
Details
ira dump of newer version (+10%) (9.17 KB, text/plain)
2012-06-06 19:55 UTC, Georg-Johann Lay
Details
reload dump of older version (good) (5.98 KB, text/plain)
2012-06-06 19:56 UTC, Georg-Johann Lay
Details
reload dump of newer version (+10%) (6.40 KB, text/plain)
2012-06-06 19:58 UTC, Georg-Johann Lay
Details
pr53595.diff (903 bytes, patch)
2012-06-07 13:27 UTC, Georg-Johann Lay
Details | Diff
pr53595.diff: tentative patch (903 bytes, patch)
2012-06-07 13:31 UTC, Georg-Johann Lay
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Georg-Johann Lay 2012-06-06 19:52:53 UTC
Created attachment 27568 [details]
C source

This problem report is about a +10% of code size increase between the following two 4.7.1 versions of gcc:

V1: SVN 185693 gcc-4_7-branch from 2012-03-22, 4.7.1 (prerelease)
V2: SVN 188257 gcc-4_7-branch from 2012-06-06, 4.7.1 (prerelease)

The attached test case is a reduced test case out of many other sources that suffer from the same problem, so that the overall performance degradation is unpleasant.

The RTL dumps show that the first pass that differs is IRA.

== Command line ==

$ avr-gcc -mmcu=atmega168 -c -std=gnu99 -Os -mstrict-X -ffixed-2 -ffixed-3 -dp -fdump-rtl-ira-details -fdump-rtl-reload-details -fdump-rtl-postreload-details bresenham-i.c -o bresenham-i-1.o -save-temps=obj

resp. with  -o bresenham-i-2.o for the second 4.7.1 snapshot.

== GCC configure ==

Target: avr
Configured with: ../../gcc.gnu.org/gcc-4_7-branch/configure --target=avr --prefix=/local/gnu/install/gcc-4.7-mingw32 --host=i386-mingw32 --build=i686-linux-gnu --enable-languages=c,c++ --disable-nls --disable-shared --with-dwarf2
Thread model: single
gcc version 4.7.1 20120606 (prerelease) (GCC) 

GNU C (GCC) version 4.7.1 20120606 (prerelease) (avr)
	compiled by GNU C version 3.4.5 (mingw-vista special r2), GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.2

It's a canadian cross, but that surely does not matter here...
Comment 1 Georg-Johann Lay 2012-06-06 19:54:34 UTC
Created attachment 27570 [details]
ira dump of older version (good)
Comment 2 Georg-Johann Lay 2012-06-06 19:55:49 UTC
Created attachment 27571 [details]
ira dump of newer version (+10%)
Comment 3 Georg-Johann Lay 2012-06-06 19:56:57 UTC
Created attachment 27572 [details]
reload dump of older version (good)
Comment 4 Georg-Johann Lay 2012-06-06 19:58:10 UTC
Created attachment 27573 [details]
reload dump of newer version (+10%)
Comment 5 Jakub Jelinek 2012-06-06 20:01:06 UTC
That's not very narrow interval.  Please bisect what affected it (or what affected it most).
Comment 6 Georg-Johann Lay 2012-06-06 20:05:02 UTC
And here is the first part of the diff of the reload dumps that shows that something weird is going on:

--- bresenham-i-1.198r.reload	Wed Jun  6 19:59:24 2012
+++ bresenham-i-2.198r.reload	Wed Jun  6 19:59:26 2012
@@ -273,123 +273,378 @@
 changing reg in insn 251
 changing reg in insn 249
 changing reg in insn 252
+Spilling for insn 40.
+Using reg 30 for reload 0
+Spilling for insn 44.
+Using reg 26 for reload 0
+Spilling for insn 45.
+Using reg 28 for reload 0
 Spilling for insn 65.
+Using reg 28 for reload 0
 Spilling for insn 68.
+Using reg 28 for reload 0
 Spilling for insn 71.
+Using reg 28 for reload 0
 Spilling for insn 75.
+Using reg 28 for reload 0
 Spilling for insn 76.
+Using reg 28 for reload 0
 Spilling for insn 78.
+Using reg 26 for reload 0
+Spilling for insn 248.
+Using reg 30 for reload 0
+Spilling for insn 249.
+Using reg 30 for reload 0
 Spilling for insn 202.
+Using reg 30 for reload 0
 Spilling for insn 203.
+Using reg 30 for reload 0
+Spilling for insn 96.
+Using reg 30 for reload 0
+Spilling for insn 204.
+Using reg 30 for reload 0
+Spilling for insn 205.
+Using reg 30 for reload 0
+Spilling for insn 108.
+Using reg 28 for reload 0
+Spilling for insn 112.
+Using reg 28 for reload 0
+Spilling for insn 114.
+Using reg 28 for reload 0
 Spilling for insn 142.
+Using reg 28 for reload 0
 Spilling for insn 143.
+Using reg 30 for reload 0
 Spilling for insn 144.
+Using reg 30 for reload 0
 Spilling for insn 145.
+Using reg 30 for reload 0
 Spilling for insn 146.
+Using reg 30 for reload 0
+Spilling for insn 154.
+Using reg 30 for reload 0
+Spilling for insn 159.
+Using reg 30 for reload 0
+Spilling for insn 165.
+Using reg 30 for reload 0
+Spilling for insn 170.
+Using reg 30 for reload 0
 Spilling for insn 176.
+Using reg 30 for reload 0
 Spilling for insn 179.
+Using reg 30 for reload 0
+Spilling for insn 40.
+Using reg 30 for reload 0
+Spilling for insn 44.
+Using reg 26 for reload 0
+Spilling for insn 45.
+Using reg 28 for reload 0
+Spilling for insn 65.
+Using reg 28 for reload 0
+Spilling for insn 68.
+Using reg 28 for reload 0
+Spilling for insn 71.
+Using reg 28 for reload 0
+Spilling for insn 75.
+Using reg 28 for reload 0
+Spilling for insn 76.
+Using reg 28 for reload 0
+Spilling for insn 78.
+Using reg 26 for reload 0
+Spilling for insn 248.
+Using reg 30 for reload 0
+Spilling for insn 249.
+Using reg 30 for reload 0
+Spilling for insn 202.
+Using reg 30 for reload 0
+Spilling for insn 203.
+Using reg 30 for reload 0
+Spilling for insn 96.
+Using reg 30 for reload 0
+Spilling for insn 204.
+Using reg 30 for reload 0
+Spilling for insn 205.
+Using reg 30 for reload 0
+Spilling for insn 108.
+Using reg 28 for reload 0
+Spilling for insn 112.
+Using reg 28 for reload 0
+Spilling for insn 114.
+Using reg 28 for reload 0
+Spilling for insn 142.
+Using reg 28 for reload 0
+Spilling for insn 143.
+Using reg 30 for reload 0
+Spilling for insn 144.
+Using reg 30 for reload 0
+Spilling for insn 145.
+Using reg 30 for reload 0
+Spilling for insn 146.
+Using reg 30 for reload 0
+Spilling for insn 154.
+Using reg 30 for reload 0
+Spilling for insn 159.
+Using reg 30 for reload 0
+Spilling for insn 165.
+Using reg 30 for reload 0
+Spilling for insn 170.
+Using reg 30 for reload 0
+Spilling for insn 176.
+Using reg 30 for reload 0
+Spilling for insn 179.
+Using reg 30 for reload 0
Comment 7 Georg-Johann Lay 2012-06-06 20:12:19 UTC
(In reply to comment #5)
> That's not very narrow interval.  Please bisect what affected it (or what
> affected it most).

The only changes to 4_7-branch in the IRA/reload area was for PR52804.
Will take until friday to before I can look deeper into it.
Comment 8 Georg-Johann Lay 2012-06-07 13:24:51 UTC
(In reply to comment #5)
> That's not very narrow interval.  Please bisect what affected it (or what
> affected it most).

CCing Vladimir

It is caused by fix for PR53065 in

http://gcc.gnu.org/viewcvs?view=revision&revision=186770

--- branches/gcc-4_7-branch/gcc/config/avr/avr.h	2012/04/24 15:23:22	186769
+++ branches/gcc-4_7-branch/gcc/config/avr/avr.h	2012/04/24 15:23:46	186770
@@ -394,6 +394,11 @@
 
 #define REGNO_OK_FOR_INDEX_P(NUM) 0
 
+#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE)                    \
+  (((REGNO) < 18 && (REGNO) + GET_MODE_SIZE (MODE) > 18)               \
+   || ((REGNO) < REG_Y && (REGNO) + GET_MODE_SIZE (MODE) > REG_Y)      \
+   || ((REGNO) < REG_Z && (REGNO) + GET_MODE_SIZE (MODE) > REG_Z))
+

What's really confusing to me is that this macro is called with invalid registers, i.e with registers that HARD_REGNO_MODE_OK will reject.

If the macro returns false for the invalid registers, the performance regression goes away.

Vladimir, is it save to return 0 for invalid hard registers?

What's the point with generating different code when invalid registers are treated differently?  As far as I can see the hard register is not about SUBREGs, it contains a pointer.
Comment 9 Georg-Johann Lay 2012-06-07 13:27:33 UTC
Created attachment 27576 [details]
pr53595.diff

And here is a tentative patch:

Let HARD_REGNO_CALL_PART_CLOBBERED return false if HARD_REGNO_MODE_OK is false.
Comment 10 Georg-Johann Lay 2012-06-07 13:31:32 UTC
Created attachment 27577 [details]
pr53595.diff: tentative patch

Better return 0 as false...
Comment 11 Georg-Johann Lay 2012-06-08 17:04:41 UTC
FYI, the patch attachment 27568 [details] also resolves the strange error described in PR53615, at least for the test case given there.
Comment 12 Georg-Johann Lay 2012-06-28 13:53:59 UTC
Author: gjl
Date: Thu Jun 28 13:53:51 2012
New Revision: 189049

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189049
Log:
	PR 53595
	* config/avr/avr.c (avr_hard_regno_call_part_clobbered): New.
	* config/avr/avr-protos.h (avr_hard_regno_call_part_clobbered): New.
	* config/avr/avr.h (HARD_REGNO_CALL_PART_CLOBBERED): Forward to
	avr_hard_regno_call_part_clobbered.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr-protos.h
    trunk/gcc/config/avr/avr.c
    trunk/gcc/config/avr/avr.h
Comment 13 Georg-Johann Lay 2012-06-28 13:58:37 UTC
Author: gjl
Date: Thu Jun 28 13:58:32 2012
New Revision: 189050

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189050
Log:
	Backport from 2012-06-28 mainline r189049
	PR 53595
	* config/avr/avr.c (avr_hard_regno_call_part_clobbered): New.
	* config/avr/avr-protos.h (avr_hard_regno_call_part_clobbered): New.
	* config/avr/avr.h (HARD_REGNO_CALL_PART_CLOBBERED): Forward to
	avr_hard_regno_call_part_clobbered.


Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/config/avr/avr-protos.h
    branches/gcc-4_7-branch/gcc/config/avr/avr.c
    branches/gcc-4_7-branch/gcc/config/avr/avr.h
Comment 14 Georg-Johann Lay 2012-06-28 14:02:39 UTC
Rather worked around than a proper fix of the call site, e.g.
ira-costs.c:ira_tune_allocno_costs