This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: big slowdown in egcs-1.1.2->gcc-2.95 on alpha


> That's the price we pay for doing register spilling on a per-instruction
> basis, and calling compute_use_by_pseudos twice for every instruction
> and reload pass.
> 
> I think we could fix this by using pseudo register birth / death lists
> instead of complete register sets.
> 
I made a mistake, sorry.  I didn't pass -O1 -fPIC to cc1 when I did
the earlier timings, so reload is not the problem.  Mea culpa.

Here are the timings for the various stages of egcs-1.1.2 and gcc-2.95
with -O1 -fPIC.

popov-2% /usr/lib/gcc-lib/alpha-redhat-linux/egcs-2.91.66/cc1 -fPIC -O1 g0-1.i
 __copysignf copysignf __copysign copysign __fabsf fabsf __fabs fabs __floorf __floor floorf floor __fdimf fdimf __fdim fdim ___H__20_g0_2d_1 ___init_proc ____20_g0_2d_1
time in parse: 10.843360
time in integration: 0.007808
time in jump: 7.701616
time in cse: 8.022720
time in loop: 0.046848
time in flow: 2.352160
time in combine: 7.864608
time in local-alloc: 2.923120
time in global-alloc: 4.692608
time in shorten-branch: 0.361120
time in final: 2.023248

popov-4% /export/u10/gcc-2.95/lib/gcc-lib/alphaev6-unknown-linux-gnu/2.95/cc1 -fPIC -O1 g0-1.i
 __copysignf copysignf __copysign copysign __fabsf fabsf __fabs fabs __floorf __floor floorf floor __fdimf fdimf __fdim fdim ___H__20_g0_2d_1 ___init_proc ____20_g0_2d_1
time in parse: 10.939984
time in integration: 0.000976
time in jump: 8.168144
time in cse: 5.181584
time in loop: 0.039040
time in flow: 2.695712
time in combine: 8.443376
time in local-alloc: 3.038288
time in global-alloc: 119.387248
time in flow2: 2.311168
time in shorten-branch: 0.383568
time in final: 2.120848

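You can see there is a big difference in global-alloc: it goes from about
4.7 seconds in egcs-1.1.2 to about 119.4 seconds in gcc-2.95, roughly 25
times slower.  No other pass changes by more than a few seconds.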

Here is the beginning of the flat composite profile:

Flat profile:

Each sample counts as 0.000976562 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 36.61     57.68    57.68       16  3604.80  3605.63  prune_preferences
 12.47     77.32    19.64 391960848     0.00     0.00  bitmap_bit_p
  7.29     88.80    11.48   202789     0.06     0.06  record_one_conflict
  7.22    100.18    11.38       24   474.20  1233.79  build_insn_chain
  1.56    102.64     2.46        8   306.88 19628.50  yyparse
  1.35    104.77     2.13 27806760     0.00     0.00  count_pseudo
  1.32    106.85     2.08    15315     0.14     0.44  order_regs_for_reload
  1.05    108.50     1.65     1436     1.15     1.15  find_reg
  0.83    109.82     1.31  2455661     0.00     0.00  yylex

The biggest time sink seems to be the quadratic algorithm in prune_preferences
in global.c, which accounts for 36.61% of the profiled time (57.68 seconds of
self time over 16 calls).
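
To illustrate why a pairwise scan of this kind gets expensive, here is a
minimal sketch in C.  It is not the code from global.c; the names
conflict_matrix, cm_init, cm_add, cm_test and pairwise_scan are hypothetical,
and the 64-node cap is only there to keep the example short.  It counts how
many conflict-bit tests a "compare every node with every other node" loop
performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NODES 64  /* hypothetical cap so one row fits in a 64-bit word */

/* Hypothetical dense conflict matrix: bit j of row[i] is set when
   nodes i and j conflict.  */
typedef struct {
    size_t n;
    unsigned long long row[MAX_NODES];
} conflict_matrix;

static void cm_init(conflict_matrix *m, size_t n)
{
    assert(n <= MAX_NODES);
    m->n = n;
    memset(m->row, 0, sizeof m->row);
}

/* Record a symmetric conflict between nodes i and j.  */
static void cm_add(conflict_matrix *m, size_t i, size_t j)
{
    assert(i < m->n && j < m->n);
    m->row[i] |= 1ULL << j;
    m->row[j] |= 1ULL << i;
}

static int cm_test(const conflict_matrix *m, size_t i, size_t j)
{
    return (int) ((m->row[i] >> j) & 1ULL);
}

/* Visit every ordered pair (i, j) with i != j.  Returns the number of
   bit tests performed, which is n * (n - 1), and stores the number of
   non-conflicting ordered pairs in *nonconflicting.  */
static unsigned long pairwise_scan(const conflict_matrix *m,
                                   unsigned long *nonconflicting)
{
    unsigned long tests = 0;
    *nonconflicting = 0;
    for (size_t i = 0; i < m->n; i++)
        for (size_t j = 0; j < m->n; j++) {
            if (i == j)
                continue;
            tests++;
            if (!cm_test(m, i, j))
                (*nonconflicting)++;
        }
    return tests;
}
```

With 4 nodes the loop makes 12 tests and with 8 nodes it makes 56, so
doubling the input more than quadruples the work.  A function with many
pseudo registers would pay that cost on every call to such a routine.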

Again, the complete profile summary is at:

http://www.math.purdue.edu/~lucier/gmon.summary.gz

Brad Lucier     lucier@math.purdue.edu

