This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: [tree-ssa] Merge status
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Paolo Bonzini <bonzini at gnu dot org>
- Cc: GCC Development <gcc at gcc dot gnu dot org>, dnovillo at redhat dot com
- Date: Sat, 13 Mar 2004 16:58:47 +0100
- Subject: Re: [tree-ssa] Merge status
- References: <000e01c4090b$4b9f5540$5adf1d97@philo>
> > For SPEC2000int, the branch is 2% behind mainline on x86. For
> > SPEC2000fp, the branch is 0.4% ahead. I will post x86-64 results soon.
>
> Are the performance problems on SPEC2000int (and the improvements in
> SPEC2000fp) concentrated in a single test?
>
> > Currently, bootstrap times on the branch are 14% slower than mainline
> > [...] Work is underway to remove some RTL passes
>
> It may be interesting to try without ADDRESSOF; I cannot test it fully but I
> made some experiments; I simply tweaked function.c to not generate it, but Jan
> has posted a polished patch which AFAIK is still waiting for review. I
> successfully bootstrapped mainline without all three of ADDRESSOF generation,
> CSE1 (oh yeah) and the small purge_addressof pass. This helps bootstrapping
> time, but somebody should do SPEC tests to see if it causes performance
> problems for the other merge criterion -- with Jan's patch, the necessary
> modification is as easy as emptying rest_of_handle_cse and
> rest_of_handle_addressof.
The problem with the ADDRESSOF patch is that it causes serious
regressions on C++ code (like Gerald's testcase). For C++ code we
commonly produce constructions like
a=*(const char *)&char_variable
as a result of C++ casts. The RTL backend is able to use a subreg to do
the cast in a register, but at the tree level we can't.
AFAIK Andrew and Richard have some plans to fix the C++ frontend to
produce friendlier casts, but I am not sure what the status of this
task is.
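
For readers unfamiliar with the construct, here is a minimal sketch of
source code that can give rise to it. The function names are
hypothetical, and whether a given compiler version really lowers the
cast this way is an assumption; the language only defines a reference
cast of this form as equivalent to dereferencing the cast address.

```cpp
#include <cassert>

// Hypothetical helper, for illustration only.
// The reference cast (const char &)char_variable is defined by the
// language as *(const char *)&char_variable, so the address of the
// local is taken even though only its value is wanted.
static char read_through_cast(char char_variable)
{
    const char a = (const char &)char_variable;
    return a;
}

// Small self-check: the cast must not change the value that is read.
static void check_read_through_cast()
{
    assert(read_through_cast('x') == 'x');
}
```

The value never needs to live in memory here, which is why keeping it
in a register matters for the generated code.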
>
> Another low-hanging candidate in this area is the extended basic block stuff
> of CSE. -fno-cse-follow-jumps -fno-cse-skip-blocks (or whatever they're
> named) should give a first idea of the changes in SPEC numbers, but probably
> more can be gained by touching cse.c itself to remove unnecessary code and
> tests.
I did some SPEC testing and the regressions were serious, though I can't
dig out the numbers anymore, unfortunately :(
>
> Finally, in rest_of_handle_gcse, with -fexpensive-optimizations CSE is run
> repeatedly after GCSE until no jumps change. Is this really helpful,
> especially with EBB CSE disabled?
>
> I also have a patch to remove CONSTANT_P_RTX; performance tests made on
> mainline when purge_builtin_constant_p was introduced, showed that it took
> about 1% of bootstrap time. However, purge_builtin_constant_p should actually
> be unused on the branch since builtin_constant_p is lowered well before the
> RTL expander. There still may be a very small improvement (0.5% maybe),
> because it would simplify the CONSTANT_P predicate in rtl.h: a third of the 5%
> improvement gained by my RTX classes patch was due to simplifying CONSTANT_P.
> If you are interested in the patch I can polish it and send it on Monday or
> Tuesday.
I think eliminating CONSTANT_P_RTX makes a lot of sense now.
>
> IIRC, jump bypassing takes about 2% of compile time. Actually when jump
> bypassing was introduced, it sped up bootstrap because GCC's enormous
> conditionals are well suited to jump bypassing; but now tree-ssa-dom should
> have made it almost obsolete on the branch, shouldn't it? Again, SPEC testing
> is the only possible guidance.
Jeff has some numbers showing that the RTL jump bypass pass makes very
few matches on a GCC bootstrap. I would be strongly in favour of
removing it or making it run just once instead of three times.
I can run the SPECs for these.
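
As a rough source-level analogy of what jump bypassing achieves (the
real pass works on RTL, and both function names below are hypothetical),
consider a conditional whose outcome is already known on every incoming
edge:

```cpp
// Before: the second test of `flag` is redundant, because each
// predecessor has just assigned it a known constant.
static int before_bypass(int x)
{
    int flag;
    if (x > 0)
        flag = 1;
    else
        flag = 0;
    if (flag)           // outcome already known on each incoming edge
        return x * 2;
    return -x;
}

// After: each incoming edge is redirected straight to the destination
// the second test would have chosen, so that test disappears.
static int after_bypass(int x)
{
    if (x > 0)
        return x * 2;
    return -x;
}
```

If an earlier tree-level pass already performs this kind of
simplification, the RTL pass would have little left to match.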
Honza
>
> Hope this helps,
>
> Paolo
>
>
>