This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: optimization/6007: cfg cleanup tremendous performance hog with -O1


> 
> > [RE: crossjumping]
> > 
> > > I will try to check whether I can squeeze out some more cycles or
> > > find a way to limit this.
> > 
> > My code uses a lot of computed gotos, with many labels.  Is crossjumping
> > possibly a win in this case?  Can it ignore these jumps?
> > 
> > Brad
> Hi, here is a patch I made as a test.  It simply disables crossjumping
> if there are more than 100 outgoing edges.  Unfortunately I can't benchmark
> your testcase, as my machine runs out of space before getting there.  Can you
> check whether this solves your problem?  If so, I will prepare a more polished
> version of this patch.
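
For readers following the thread, the heuristic described in the quoted
message can be sketched roughly as below.  This is only an illustration, not
the actual patch: `struct block`, `n_succ`, `MAX_CROSSJUMP_EDGES` and
`should_try_crossjump` are hypothetical names, and the message does not say
where in cfg cleanup the check was placed.  The only detail taken from the
message is the limit of 100 outgoing edges.

```c
#include <stddef.h>

/* Hypothetical stand-in for a basic block; this is NOT GCC's own
   basic_block structure.  */
struct block {
    size_t n_succ;   /* number of outgoing edges of the block */
};

/* Limit quoted in the message: crossjumping is disabled when a block has
   more than 100 outgoing edges.  */
#define MAX_CROSSJUMP_EDGES 100

/* Return 1 if crossjumping should still be attempted for BB, and 0 if the
   block has too many outgoing edges (e.g. a large computed-goto dispatch).  */
static int should_try_crossjump(const struct block *bb)
{
    return bb->n_succ <= MAX_CROSSJUMP_EDGES;
}
```

Under these assumptions, a block with 2 successors would still be considered,
while a dispatch block with 500 successors would be skipped.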

First of all, there were no regressions in the test suite on 
sparcv9-sun-solaris2.8 with your patch.

I made a slightly smaller test case that requires only 1200 MB to compile on
sparcv9.  (Perhaps it would require less on regular sparc with 32-bit pointers.)
I did not build a profiled version of cc1 yet.

The results are not so good with your patch: cfg cleanup still accounts for
87% of user time (3156.77 of 3641.89 seconds).

banach-725% ~/programs/gcc/gcc-3.1/objdir-sparcv9/gcc/stage2/cc1 -fpreprocessed denoise3.i -mptr64 -mstack-bias -mno-v8plus -dumpbase denoise3.c -m64 -mcpu=ultrasparc -mtune=ultrasparc -O1 -Wall -W -Wno-unused -version -fPIC -fschedule-insns2 -fno-math-errno -fno-strict-aliasing -o denoise3.s
GNU CPP version 3.1 20020328 (prerelease) (cpplib) (sparc ELF)
GNU C version 3.1 20020328 (prerelease) (sparcv9-sun-solaris2.8)
        compiled by GNU C version 3.1 20020328 (prerelease).
options passed:  -fpreprocessed -mptr64 -mstack-bias -mno-v8plus -m64
 -mcpu=ultrasparc -mtune=ultrasparc -O1 -Wall -W -Wno-unused -fPIC
 -fschedule-insns2 -fno-math-errno -fno-strict-aliasing
options enabled:  -fdefer-pop -fomit-frame-pointer -fthread-jumps
 -fpeephole -ffunction-cse -fkeep-static-consts -freg-struct-return
 -fdelayed-branch -fgcse-lm -fgcse-sm -fschedule-insns2 -fsched-interblock
 -fsched-spec -fbranch-count-reg -fPIC -fcprop-registers -fcommon
 -fgnu-linker -fargument-alias -fmerge-constants -fident
 -fguess-branch-probability -ftrapping-math -mepilogue -mptr64 -m64
 -mstack-bias -mcpu=ultrasparc -mtune=ultrasparc
 ___H__20_denoise3 {GC 72981k -> 25818k} {GC 33692k -> 25034k} {GC 49841k -> 28604k} {GC 42905k -> 28468k} {GC 42063k -> 33617k} {GC 55871k -> 36690k} ___init_proc ____20_denoise3
Execution times (seconds)
 garbage collection    :   4.31 ( 0%) usr   0.02 ( 0%) sys   6.00 ( 0%) wall
 cfg construction      :  66.65 ( 2%) usr  22.01 (51%) sys  89.00 ( 2%) wall
 cfg cleanup           :3156.77 (87%) usr   0.04 ( 0%) sys3193.00 (86%) wall
 life analysis         :  79.31 ( 2%) usr   0.00 ( 0%) sys  79.00 ( 2%) wall
 life info update      :   0.80 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 preprocessing         :   0.41 ( 0%) usr   1.81 ( 4%) sys   3.00 ( 0%) wall
 lexical analysis      :   0.45 ( 0%) usr   3.63 ( 8%) sys   5.00 ( 0%) wall
 parser                :   4.22 ( 0%) usr   2.46 ( 6%) sys   5.00 ( 0%) wall
 expand                :   2.00 ( 0%) usr   0.26 ( 1%) sys   3.00 ( 0%) wall
 varconst              :   0.65 ( 0%) usr   0.03 ( 0%) sys   0.00 ( 0%) wall
 integration           :   0.93 ( 0%) usr   0.04 ( 0%) sys   1.00 ( 0%) wall
 jump                  :   0.69 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall
 CSE                   :   4.66 ( 0%) usr   0.00 ( 0%) sys   5.00 ( 0%) wall
 loop analysis         :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 flow analysis         : 138.86 ( 4%) usr  13.01 (30%) sys 152.00 ( 4%) wall
 combiner              :   8.42 ( 0%) usr   0.00 ( 0%) sys   9.00 ( 0%) wall
 if-conversion         :  11.72 ( 0%) usr   0.01 ( 0%) sys  11.00 ( 0%) wall
 local alloc           :   2.63 ( 0%) usr   0.00 ( 0%) sys   2.00 ( 0%) wall
 global alloc          :  25.46 ( 1%) usr   0.00 ( 0%) sys  26.00 ( 1%) wall
 reload CSE regs       : 106.46 ( 3%) usr   0.00 ( 0%) sys 106.00 ( 3%) wall
 flow 2                :   4.41 ( 0%) usr   0.00 ( 0%) sys   5.00 ( 0%) wall
 if-conversion 2       :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 rename registers      :   8.73 ( 0%) usr   0.00 ( 0%) sys   9.00 ( 0%) wall
 scheduling 2          :   4.88 ( 0%) usr   0.01 ( 0%) sys   5.00 ( 0%) wall
 delay branch sched    :   4.05 ( 0%) usr   0.00 ( 0%) sys   3.00 ( 0%) wall
 shorten branches      :   0.50 ( 0%) usr   0.00 ( 0%) sys   2.00 ( 0%) wall
 final                 :   1.17 ( 0%) usr   0.03 ( 0%) sys   0.00 ( 0%) wall
 rest of compilation   :   2.50 ( 0%) usr   0.02 ( 0%) sys   3.00 ( 0%) wall
 TOTAL                 :3641.89            43.42          3722.00

The file denoise3.i is at

http://www.math.purdue.edu/~lucier/GNATS/GNATS-4/denoise3.i.gz

Do you want me to build a profiled version of cc1 with your patch?

Brad

