This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Faster compilation speed
- From: Mike Stump <mrs at apple dot com>
- To: Nix <nix at esperi dot demon dot co dot uk>
- Cc: Noel Yap <yap_noel at yahoo dot com>, Neil Booth <neil at daikokuya dot co dot uk>, gcc at gcc dot gnu dot org
- Date: Mon, 12 Aug 2002 15:08:05 -0700
- Subject: Re: Faster compilation speed
On Saturday, August 10, 2002, at 05:49 PM, Nix wrote:
Within our system, builds on Windows are magnitudes
faster since we're able to take advantage of
precompiled headers.
Already solved problem for me.
Example, with GCC-3.1, with a `hello world' iostreams-using program...
The code:
#include <iostream>

int main (void)
{
  std::cout << "Hello world";
  return 0;
}
[ ... ]
Complete run, with optimization:
nix@loki 66 /tmp% c++ -O2 -ftime-report -o hello hello.C
Execution times (seconds)
 garbage collection :   1.10 (11%) usr   0.11 ( 9%) sys   1.74 (11%) wall
 cfg cleanup        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 life analysis      :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 preprocessing      :   1.12 (11%) usr   0.22 (18%) sys   2.04 (13%) wall
 lexical analysis   :   0.98 (10%) usr   0.22 (18%) sys   1.93 (12%) wall
 parser             :   6.46 (65%) usr   0.63 (53%) sys   9.98 (62%) wall
 expand             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 varconst           :   0.08 ( 1%) usr   0.00 ( 0%) sys   0.12 ( 1%) wall
 CSE                :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 CSE 2              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 regmove            :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall
 global alloc       :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall
 flow 2             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 rename registers   :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall
 scheduling 2       :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.02 ( 0%) wall
 final              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
 TOTAL              :   9.96             1.20            16.16
Now obviously with a less toy example the time consumed optimizing would
rise; but that doesn't affect my point, that the lion's share of time
spent in C++ header files is parsing time, and that speeding up the
preprocessor will have limited effect now (thanks to Zack and Neil
speeding it up so much already :) ).
With PFE:
bash-2.05a$ time g++ -O2 -ftime-report --load-pch foo t1.cc
Execution times (seconds)
 lexical analysis   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 (25%) wall
 parser             :   0.01 ( 8%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 varconst           :   0.08 (62%) usr   0.00 ( 0%) sys   0.06 (50%) wall
 CSE                :   0.01 ( 8%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 TOTAL              :   0.13             0.00             0.12
real 0m0.397s
user 0m0.070s
sys 0m0.260s
Without PFE:
bash-2.05a$ time g++ -O2 -ftime-report -include t1.h t1.cc
Execution times (seconds)
 preprocessing      :   0.39 (19%) usr   0.00 ( 0%) sys   0.19 ( 9%) wall
 lexical analysis   :   0.67 (32%) usr   0.00 ( 0%) sys   0.72 (34%) wall
 parser             :   0.90 (43%) usr   0.00 ( 0%) sys   1.09 (51%) wall
 varconst           :   0.04 ( 2%) usr   0.00 ( 0%) sys   0.03 ( 1%) wall
 CSE                :   0.02 ( 1%) usr   0.00 ( 0%) sys   0.03 ( 1%) wall
 CSE 2              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 1%) wall
 regmove            :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 symout             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 TOTAL              :   2.07             0.00             2.12
real 0m2.378s
user 0m1.180s
sys 0m1.110s
Notice that preprocessing disappears from the report entirely and lexical
analysis drops to 0.00 usr. Sure, the parse still takes time, but just 8%
of a 0.13 second total. varconst is the cxx_finish_file pagefaulter, which
is more expensive than it really needs to be. The speedup this shows
(about 6x in real time, 2.378s down to 0.397s) is typical for larger
applications.