Bug 52066 - IRA and reginfo initialization too expensive
Summary: IRA and reginfo initialization too expensive
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.7.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-01-31 13:01 UTC by Jakub Jelinek
Modified: 2012-10-09 20:46 UTC
CC List: 2 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
patch (782 bytes, patch)
2012-01-31 13:09 UTC, Jakub Jelinek
i386 patch (805 bytes, patch)
2012-01-31 14:54 UTC, Jakub Jelinek

Description Jakub Jelinek 2012-01-31 13:01:42 UTC
On compilation of an empty file, we spend almost half of the runtime on
IRA and reginfo cost initialization.

Inlining reg_class_subset_p and reg_classes_intersect_p gives us back at least a tiny part of the compile time (about 2.4% of the average real time in the runs below), at the cost of some small code growth (0x580 bytes of .text).
For --enable-checking=release x86_64-linux cc1:

~/timing --drop 20 -c 2500 ./cc1.vanilla empty.c -quiet
Strip out best and worst 500 realtime results
minimum: 0.006243475 sec real / 0.000018634 sec CPU
maximum: 0.032220578 sec real / 0.000203440 sec CPU
average: 0.006542800 sec real / 0.000036352 sec CPU
stdev  : 0.000050402 sec real / 0.000004406 sec CPU
~/timing --drop 20 -c 2500 ./cc1 empty.c -quiet
Strip out best and worst 500 realtime results
minimum: 0.006115241 sec real / 0.000017569 sec CPU
maximum: 0.033226603 sec real / 0.000197238 sec CPU
average: 0.006384943 sec real / 0.000036349 sec CPU
stdev  : 0.000049180 sec real / 0.000004284 sec CPU

readelf -WS cc1{.vanilla,} | grep '\text'
  [12] .text             PROGBITS        00000000004b21a0 0b21a0 78e7b4 00  AX  0   0 16
  [12] .text             PROGBITS        00000000004b21a0 0b21a0 78ed34 00  AX  0   0 16
Comment 1 Jakub Jelinek 2012-01-31 13:09:39 UTC
Created attachment 26531
patch

Patch to inline them.  Quite ugly, but I can't use extern inline __attribute__((gnu_inline)), because it uses static inline functions, and unfortunately some callers of these functions can't include hard-reg-set.h (targhooks.c).
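
For readers unfamiliar with these predicates, the following is a minimal sketch of what inlined subset and intersection tests over register-class bitmasks could look like. It is not the attached patch: the `toy_` enum, the 32-bit mask table, and the function names are all hypothetical stand-ins, and GCC's real HARD_REG_SET can be wider than one word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy register classes; NOT GCC's real enum reg_class.  */
enum toy_class
{
  TOY_NO_REGS,
  TOY_AREG,
  TOY_GENERAL_REGS,
  TOY_ALL_REGS,
  TOY_N_CLASSES
};

/* Assumed layout: one bit per hard register, one mask per class.  */
static const uint32_t toy_class_contents[TOY_N_CLASSES] = {
  0x00000000u,  /* TOY_NO_REGS */
  0x00000001u,  /* TOY_AREG */
  0x000000ffu,  /* TOY_GENERAL_REGS */
  0x0000ffffu   /* TOY_ALL_REGS */
};

/* True if every register in C1 is also in C2.  Being static inline,
   the mask test can be expanded at each call site instead of costing
   a function call.  */
static inline bool
toy_class_subset_p (enum toy_class c1, enum toy_class c2)
{
  return (toy_class_contents[c1] & ~toy_class_contents[c2]) == 0;
}

/* True if C1 and C2 share at least one register.  */
static inline bool
toy_classes_intersect_p (enum toy_class c1, enum toy_class c2)
{
  return (toy_class_contents[c1] & toy_class_contents[c2]) != 0;
}
```

In this sketch any caller that wants the inline versions has to see the mask table, which is the same kind of header dependency the comment above describes for hard-reg-set.h.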
Comment 2 Richard Biener 2012-01-31 13:21:23 UTC
PR44440?
Comment 3 Jakub Jelinek 2012-01-31 14:54:32 UTC
Created attachment 26535
i386 patch

Incremental patch to speed up the i386 *CLASS_P macros.  I agree it isn't as clean
as defining them using the subsets (now used just in rtl checking during initialization), but it seems to be faster.  Timing with both patches:

~/timing --drop 20 -c 2500 ./cc1 empty.c -quiet
Strip out best and worst 500 realtime results
minimum: 0.006076317 sec real / 0.000018264 sec CPU
maximum: 0.034520375 sec real / 0.000205298 sec CPU
average: 0.006284155 sec real / 0.000036616 sec CPU
stdev  : 0.000048897 sec real / 0.000004400 sec CPU

  [12] .text             PROGBITS        00000000004b21a0 0b21a0 78e774 00  AX  0   0 16

(i.e. it completely undoes all the code growth from the first patch: .text is now 0x78e774 bytes, slightly below the vanilla 0x78e7b4).
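
As an illustration of the trade-off described in this comment, here is a rough sketch of a class predicate written as a direct enum comparison, with the subset-based definition kept only as a consistency check run at initialization. It does not reproduce the attached i386 patch: the `demo_` classes, their ordering, and the masks are assumptions made for the example, and the fast form is only valid because of the enum order chosen here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classes, ordered so that everything up to and including
   DEMO_GENERAL_REGS contains only integer registers (an assumption of
   this sketch, not a property of the real i386 enum).  */
enum demo_class
{
  DEMO_NO_REGS,
  DEMO_AREG,
  DEMO_GENERAL_REGS,
  DEMO_SSE_REGS,
  DEMO_ALL_REGS,
  DEMO_N_CLASSES
};

/* Assumed layout: one bit per hard register, one mask per class.  */
static const uint32_t demo_class_contents[DEMO_N_CLASSES] = {
  0x0000u,  /* DEMO_NO_REGS */
  0x0001u,  /* DEMO_AREG */
  0x00ffu,  /* DEMO_GENERAL_REGS */
  0xff00u,  /* DEMO_SSE_REGS */
  0xffffu   /* DEMO_ALL_REGS */
};

/* Clean definition: a class is an integer class if it is a subset of
   DEMO_GENERAL_REGS.  */
static bool
demo_integer_class_slow_p (enum demo_class c)
{
  return (demo_class_contents[c]
          & ~demo_class_contents[DEMO_GENERAL_REGS]) == 0;
}

/* Fast definition: a single comparison that depends on the enum order.  */
#define DEMO_INTEGER_CLASS_P(C) ((C) <= DEMO_GENERAL_REGS)

/* Run once at initialization: verify that the fast macro agrees with
   the subset-based definition for every class.  */
static void
demo_check_class_predicates (void)
{
  for (int c = 0; c < DEMO_N_CLASSES; c++)
    assert (DEMO_INTEGER_CLASS_P ((enum demo_class) c)
            == demo_integer_class_slow_p ((enum demo_class) c));
}
```

The check costs something only when it is run, so a build could restrict it to checking configurations while the hot paths use the comparison.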
Comment 4 Steven Bosscher 2012-10-09 20:46:39 UTC
(In reply to comment #1)
> Patch to inline them.  Quite ugly, but I can't use extern inline
> __attribute__((gnu_inline)), because it uses static inline functions, and
> unfortunately some callers of these functions can't include hard-reg-set.h
> (targhooks.c).

Huh, if it's ok to know about reg-classes in targhooks, why is it not also OK to know about HARD_REG_SET?