This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: More on compile performance of Linux kernels in mainline gcc


> 
> This is an addendum for the numbers for linux kernel compiling
> on x86-64 I posted some days ago. gcc tested is the same (041029)
> on the same machine with the same kernel tree/configuration.
> 
> I tracked down why the 4.0 compiled kernels didn't boot. One issue
> was a missing -fno-strict-aliasing for one file (now fixed), 
> the other is a miscompilation of a loop in a function in the Linux
> radix tree library (PR18241). The miscompilation can be worked around
> by compiling the affected file with -O0.
> 
> There are a lot of new warnings. In particular,
> "pointer targets in passing argument 2 of `foo' differ in signedness"
> is extremely common.
> 
> I was asked to retry with a mainline gcc built using
> make profiledbootstrap.
> 
> This improves the 4.0 numbers somewhat.
> 
> gcc 3.3-hammer (profiledbootstrap) 
> 210.32user 31.62system 3:57.66elapsed
> 
> 4.0 snapshot with normal bootstrap:
> 262.71user 30.50system 4:48.46elapsed
> 
> 4.0 snapshot with profiledbootstrap:
> 248.01user 30.25system 4:33.66elapsed 
> 
> Still considerably slower than 3.3-hammer though.
> 
> Also Jan asked for oprofile output. Here are all symbols over 0.3%
> for a full kernel compile done with the profiledbootstrap compiler.
> 
> Looks like the likely/unlikely split is not very effective,
> there are a lot of hot unlikely hits. 
> 
> Some hash table lookup(s?) seem to be very hot, perhaps it needs
> a better hash function or a larger table?
> 
> 1.7% memset is somewhat worrying, that's a lot of clearing.
> 1.5% garbage collector accounting looks like a bug if that function
>      isn't misnamed.

Actually both of these are common in the profiles I've seen -
basically all our cache misses land in these two places, so even though
the functions themselves are not that expensive, they show up as
expensive. I use -minline-all-stringops when profiling GCC to see where
the memset calls really come from.

> 
> Standard GLOBAL_POWER_EVENTS:
> 95020    4.0626  cc1                      yyparse.unlikely_section
> 438612    2.5638  cc1                      ht_lookup_with_hash
> 298462    1.7446  libc.so.6                memset
> 288277    1.6851  cc1                      _cpp_lex_direct
> 265789    1.5536  cc1                      ggc_alloc_stat.unlikely_section

I have to look into it - ggc_alloc_stat is very definitely not an
unlikely function...  What enable-languages setting did you use?
This might explain why the speedup you are seeing is only roughly 5%
instead of the 10% I used to see (though I didn't test the Linux kernel
tree).
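
For reference, the "roughly 5%" figure can be recomputed from the
timings quoted above. This is only a minimal sketch in Python; the
helper names (to_seconds, percent_faster) are made up for illustration
and do not come from any tool mentioned in this thread.

```python
def to_seconds(elapsed):
    """Convert a time(1)-style 'M:SS.ss' elapsed string to seconds."""
    minutes, seconds = elapsed.split(":")
    return int(minutes) * 60 + float(seconds)


def percent_faster(baseline, candidate):
    """Percentage of time saved by `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline * 100.0


# (user seconds, elapsed) copied from the quoted message.
runs = {
    "3.3-hammer (profiledbootstrap)": (210.32, "3:57.66"),
    "4.0 (normal bootstrap)": (262.71, "4:48.46"),
    "4.0 (profiledbootstrap)": (248.01, "4:33.66"),
}

normal_user, normal_elapsed = runs["4.0 (normal bootstrap)"]
prof_user, prof_elapsed = runs["4.0 (profiledbootstrap)"]
hammer_user, _ = runs["3.3-hammer (profiledbootstrap)"]

# Gain from profiledbootstrap on the 4.0 snapshot.
print("user time saved:    %.1f%%" % percent_faster(normal_user, prof_user))
print("elapsed time saved: %.1f%%" % percent_faster(
    to_seconds(normal_elapsed), to_seconds(prof_elapsed)))

# Remaining gap between profiled 4.0 and 3.3-hammer (user time).
print("4.0 vs 3.3-hammer:  %.1f%% slower"
      % ((prof_user / hammer_user - 1.0) * 100.0))
```

On the user times this gives a gain of about 5.6% from
profiledbootstrap, with the profiled 4.0 compiler still about 18%
slower than 3.3-hammer.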

Honza

