This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: gcc compile-time performance


I am not so sure divides are what kill it. -Will

Jan Hubicka wrote:

>>   From: Stan Shebs <shebs@apple.com>
>>   Date: Fri, 17 May 2002 09:25:32 -0700
>>   
>>   That's my personal suspicion too, but no, I don't have any real
>>   evidence.  The lack of hot spots in profiling is a strong hint.
>>   One oddball idea I've thought about is to functionize all the
>>   tree and rtl macros, and run a profile on that to see what are
>>   the most used/abused macros.
>>   
>>I know that the subreg-byte changes added a lot of overhead
>>particularly via the subreg_regno_offset() function (which was
>>an inline macro in my original diffs).
>>
> 
> Do you have some data?  Perhaps we can replace the division by simple lookup
> table...
> 
> Honza
> 
>>The divisions are what kill it.  That overhead could be eliminated
>>if all the mode sizes were powers of 2 and we had some
>>GET_MODE_SIZE_LOG2() interface.  Then we just transform all the
>>divides there into shifts.
>>
>>   Then there's the extreme approach of having maintainers only
>>   accept patches that either remove code or make the compiler run
>>   faster... :-)
>>
>>There is a better way, have maintainers work on approval of such
>>changes faster than approval of other changes :-)
>>
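
The two remedies floated in the quoted text, a GET_MODE_SIZE_LOG2()-style shift when every mode size is a power of 2 and a lookup table otherwise, can be sketched roughly as follows. This is only an illustration of the arithmetic in Python, not GCC code: the mode names and sizes in MODE_SIZE and TABLE_SIZES, the MAX_OFFSET bound, and the helpers offset_by_divide, offset_by_shift and offset_by_table are all hypothetical.

```python
# Hypothetical mode sizes in bytes (assumption: all powers of 2).
MODE_SIZE = {"QI": 1, "HI": 2, "SI": 4, "DI": 8}

# log2 of each size; bit_length() - 1 equals log2 only for powers of 2.
MODE_SIZE_LOG2 = {mode: size.bit_length() - 1 for mode, size in MODE_SIZE.items()}


def offset_by_divide(byte_offset, mode):
    """Baseline: one integer division per call."""
    return byte_offset // MODE_SIZE[mode]


def offset_by_shift(byte_offset, mode):
    """Same quotient via a right shift by log2(size)."""
    return byte_offset >> MODE_SIZE_LOG2[mode]


# Lookup-table variant: quotients are precomputed once, so it also
# covers a size that is not a power of 2 (the 12-byte entry is an
# assumption made for illustration).
TABLE_SIZES = dict(MODE_SIZE, XF=12)
MAX_OFFSET = 64  # hypothetical upper bound on byte offsets
QUOTIENT_TABLE = {
    mode: [offset // size for offset in range(MAX_OFFSET)]
    for mode, size in TABLE_SIZES.items()
}


def offset_by_table(byte_offset, mode):
    """Quotient read from the precomputed table, no division at call time."""
    return QUOTIENT_TABLE[mode][byte_offset]


if __name__ == "__main__":
    for mode in MODE_SIZE:
        for offset in range(MAX_OFFSET):
            assert offset_by_shift(offset, mode) == offset_by_divide(offset, mode)
    print("shift and divide agree for all power-of-2 sizes")
```

The shift form is only valid while every size stays a power of 2; the table form trades that restriction for memory and a bound on the offset.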

Given the comment that divides were taking a significant amount of
time, I decided to get some data to see how much of a problem divides
were. I checked out the gcc_3_1_release from gcc.gnu.org and configured
it to build a native compiler on Red Hat Linux 7.2, running on an
Inspiron 4100 with a 1GHz mobile Pentium III processor, 256MB DRAM, and
a 40GB hard drive. I built the tool, then went into the build/gcc
directory and ran "make clean; time make bootstrap" while oprofile took
measurements using CPU_CLK_UNHALTED (counter 0) and DIV (counter 1).

Result of the "time make bootstrap" command:
real	25m10.672s
user	23m51.240s
sys	0m32.500s

The real time of 25m10s is about 1510 seconds. There were 32108
samples for divides on /home/wcohen/gcc31/native/gcc/stage1/cc1, with
each sample representing 4980 divide operations (both floating point
and integer). Since those samples are 43.39% of the total, there were
about 73,991 samples in all, or 368x10^6 divides on the entire system.
Assuming the worst-case figure of 70 cycles per divide from the
Pentium 4 software optimization manual and a 1GHz clock, this would be
25.8 seconds of runtime, about 1.7% of the run time for the entire
system.  The 1.7% is a pretty pessimistic estimate: 70 cycles is the
latency of the idiv, while its throughput on the Pentium 4 is 23
cycles.
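
The estimate above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in this message; the constant and function names are made up for illustration, and the 1GHz clock and per-divide cycle counts are the same assumptions the text makes.

```python
SAMPLES_STAGE1_CC1 = 32108     # DIV samples attributed to stage1/cc1
SHARE_STAGE1_CC1 = 0.433945    # stage1/cc1 share of all DIV samples (43.3945%)
DIVIDES_PER_SAMPLE = 4980      # divide operations represented by one sample
CLOCK_HZ = 1e9                 # assumption: 1GHz, as stated in the text
WALL_SECONDS = 25 * 60 + 10.672  # "real" time reported by make bootstrap


def divide_estimate(cycles_per_divide):
    """Return (total samples, total divides, seconds, fraction of wall time)."""
    total_samples = SAMPLES_STAGE1_CC1 / SHARE_STAGE1_CC1
    total_divides = total_samples * DIVIDES_PER_SAMPLE
    seconds = total_divides * cycles_per_divide / CLOCK_HZ
    return total_samples, total_divides, seconds, seconds / WALL_SECONDS


if __name__ == "__main__":
    # 70 = quoted idiv latency, 23 = quoted idiv throughput (Pentium 4 figures)
    for label, cycles in (("latency", 70), ("throughput", 23)):
        samples, divides, seconds, fraction = divide_estimate(cycles)
        print(f"{label}: {samples:.0f} samples, {divides:.3g} divides, "
              f"{seconds:.1f} s, {fraction:.2%} of wall time")
```

With the 23-cycle throughput figure in place of the 70-cycle latency, the same arithmetic gives roughly 8.5 seconds, or about 0.6% of the wall-clock time.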

Of course this data assumes that the bootstrap process represents
typical program behavior and is exercising the same parts of gcc that
other programs are.

-Will

[root@litespeed root]# op_time -r -c 1|more /* DIV 4980 per sample */
32108     43.3945 0.0000 /home/wcohen/gcc31/native/gcc/stage1/cc1
20852     28.1818 0.0000 /home/wcohen/gcc31/native/gcc/stage2/cc1
10893     14.7221 0.0000 /home/wcohen/gcc31/native.install/lib/gcc-lib/i686-pc-linux-gnu/3.1/cc1
5886       7.9550 0.0000 /usr/bin/as
2366       3.1977 0.0000 /lib/modules/2.4.9-21custom/build/vmlinux
731        0.9880 0.0000 /usr/bin/ld
259        0.3500 0.0000 /lib/ext3.o
233        0.3149 0.0000 /usr/bin/make
123        0.1662 0.0000 /home/wcohen/gcc31/native/gcc/cc1
...

[root@litespeed root]# op_time -r -c 0 |more /* CPU_CLK_UNHALTED 498000 per sample */
1287676   51.8822 0.0000 /home/wcohen/gcc31/native/gcc/stage1/cc1
680858    27.4327 0.0000 /home/wcohen/gcc31/native/gcc/stage2/cc1
239265     9.6403 0.0000 /home/wcohen/gcc31/native.install/lib/gcc-lib/i686-pc-linux-gnu/3.1/cc1
92502      3.7270 0.0000 /usr/bin/as
63091      2.5420 0.0000 /lib/modules/2.4.9-21custom/build/vmlinux
34034      1.3713 0.0000 /home/wcohen/gcc31/native/gcc/genattrtab
28366      1.1429 0.0000 /home/wcohen/gcc31/native/gcc/fixinc/fixincl
9777       0.3939 0.0000 /usr/bin/ld
9127       0.3677 0.0000 /home/wcohen/gcc31/native/gcc/cc1
7339       0.2957 0.0000 /usr/lib/mozilla/mozilla-bin
....
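
The two listings can also be combined per binary, which gives a cross-check that does not depend on the wall-clock time. The sketch below is an editorial illustration only: the sample counts are copied from the listings above, the helper name divide_cycle_share is hypothetical, and the 70-cycle cost is the same worst-case assumption used in the message.

```python
DIVIDES_PER_DIV_SAMPLE = 4980    # from the DIV listing header
CYCLES_PER_CLK_SAMPLE = 498000   # from the CPU_CLK_UNHALTED listing header

# (DIV samples, CPU_CLK_UNHALTED samples) for binaries present in both listings
COUNTS = {
    "stage1/cc1": (32108, 1287676),
    "stage2/cc1": (20852, 680858),
    "installed cc1": (10893, 239265),
    "/usr/bin/as": (5886, 92502),
}


def divide_cycle_share(div_samples, clk_samples, cycles_per_divide=70):
    """Fraction of a binary's unhalted cycles spent in divides,
    assuming every divide costs cycles_per_divide cycles."""
    divides = div_samples * DIVIDES_PER_DIV_SAMPLE
    cycles = clk_samples * CYCLES_PER_CLK_SAMPLE
    return divides * cycles_per_divide / cycles


if __name__ == "__main__":
    for name, (div_samples, clk_samples) in COUNTS.items():
        share = divide_cycle_share(div_samples, clk_samples)
        print(f"{name}: {share:.2%} of cycles at 70 cycles per divide")
```

For stage1/cc1 this comes to roughly 1.7% of that binary's cycles under the 70-cycle worst case, in line with the system-wide estimate.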
