This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: [tree-ssa] Merge status


> > For SPEC2000int, the branch is 2% behind mainline on x86.  For
> > SPEC2000fp, the branch is 0.4% ahead.  I will post x86-64 results soon.
> 
> Are the performance problems on SPEC2000int (and the improvements in
> SPEC2000fp) concentrated in a single test?
> 
> > Currently, bootstrap times on the branch are 14% slower than mainline
> > [...] Work is underway to remove some RTL passes
> 
> It may be interesting to try without ADDRESSOF; I cannot test it fully but I
> made some experiments; I simply tweaked function.c to not generate it, but Jan
> has posted a polished patch which AFAIK is still waiting for review.  I
> successfully bootstrapped mainline without all three of ADDRESSOF generation,
> CSE1 (oh yeah) and the small purge_addressof pass.  This helps bootstrapping
> time, but somebody should do SPEC tests to see if it causes performance
> problems for the other merge criterion -- with Jan's patch, the necessary
> modification is as easy as emptying rest_of_handle_cse and
> rest_of_handle_addressof.

The problem with the ADDRESSOF patch is that it causes serious regressions
on C++ code (like Gerald's testcase).  For C++ code we commonly produce
constructions like
a=*(const char *)&char_variable
as a result of C++ casts.  The RTL backend is able to use a subreg to do
the cast in a register, but at the tree level we can't.
AFAIK Andrew and Richard have some plans to fix the C++ frontend to produce
friendlier casts, but I am not sure what the status of this task is.
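
As a minimal sketch of the construction above (the function names are
hypothetical and this is not Gerald's testcase), the cast takes the address
of a char object and reads it back through the resulting pointer.  Both
functions return the same value; the only difference is that the first one
makes its variable address-taken, which is the case that matters here.

```c
/* Hypothetical illustration: reads the value through a pointer cast,
   mirroring a=*(const char *)&char_variable from the text.  Taking the
   address is what makes char_variable address-taken.  */
static char read_via_cast(char char_variable)
{
    return *(const char *)&char_variable;
}

/* Same result without taking the address of the variable.  */
static char read_direct(char char_variable)
{
    return char_variable;
}
```

Comparing the code generated for the two functions, with and without
ADDRESSOF generation, would be one way to check whether the cast form
still ends up in a register.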
> 
> Another low-hanging candidate in this area is the extended basic block stuff
> of CSE.  -fno-cse-follow-jumps -fno-cse-skip-blocks (or whatever they're
> named) should give a first idea of the changes in SPEC numbers, but probably
> more can be gained by touching cse.c itself to remove unnecessary code and
> tests.

I did some SPEC testing and the regressions were serious, though
unfortunately I can't dig out the numbers anymore :(
> 
> Finally, in rest_of_handle_gcse, with -fexpensive-optimizations CSE is run
> repeatedly after GCSE until no jumps change.  Is this really helpful,
> especially with EBB CSE disabled?
> 
> I also have a patch to remove CONSTANT_P_RTX; performance tests made on
> mainline when purge_builtin_constant_p was introduced, showed that it took
> about 1% of bootstrap time.  However, purge_builtin_constant_p should actually
> be unused on the branch since builtin_constant_p is lowered well before the
> RTL expander.  There still may be a very small improvement (0.5% maybe),
> because it would simplify the CONSTANT_P predicate in rtl.h: a third of the 5%
> improvement gained by my RTX classes patch was due to simplifying CONSTANT_P.
> If you are interested in the patch I can polish it and send it on Monday or
> Tuesday.

I think eliminating CONSTANT_P_RTX makes a lot of sense now.
> 
> IIRC, jump bypassing takes about 2% of compile time.  Actually when jump
> bypassing was introduced, it sped up bootstrap because GCC's enormous
> conditionals are well suited to jump bypassing; but now tree-ssa-dom should
> have made it almost obsolete on the branch, shouldn't it?  Again, SPEC testing
> is the only possible guidance.

Jeff has some numbers showing that the RTL jump bypass pass makes very few
matches on a GCC bootstrap.  I would be very much in favour of removing it
or making it run just once instead of three times.

I can run the SPECs for these.

Honza
> 
> Hope this helps,
> 
> Paolo
> 
> 
> 

