This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [tree-ssa] A pass to remove a large number of casts in C++ code
- From: Diego Novillo <dnovillo at redhat dot com>
- To: Andrew Pinski <pinskia at physics dot uc dot edu>
- Cc: "gcc-patches at gcc dot gnu dot org Patches" <gcc-patches at gcc dot gnu dot org>, Kazu Hirata <kazu at cs dot umass dot edu>, steven at gcc dot gnu dot org
- Date: Mon, 05 Apr 2004 00:30:14 -0400
- Subject: Re: [tree-ssa] A pass to remove a large number of casts in C++ code
- Organization: Red Hat Canada
- References: <13D4C492-8529-11D8-BD72-000393A6D2F2@physics.uc.edu>
Andrew,
Good stuff. Thanks. A few things that I'd like to address before
we commit to this:
1. The patch is poorly formatted and contains several typos and
grammar problems. Please format everything so that it fits in
80 columns. Functions need documentation for each of their
arguments, and you'll need to add some spacing to make things
clearer.
2. Why did you implement it as a separate pass? The
transformations use almost no data flow information. Wouldn't
it be better to implement these routines as subroutines of
fold_stmt()? I want to understand what led you to choose this
route. It doesn't seem to take long, but it does require a full
IL scan, and since all the transformations are related to
"folding", perhaps they belong there?
3. If we decide to have it as a separate pass, it should be
documented in passes.texi (I think there are other tree-ssa
passes missing from passes.texi that we will need to add before
the merge).
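
To make the question in item 2 concrete, here is a toy sketch of a cast
simplification that needs no data flow information: it looks only at the
expression in front of it, which is the property that makes it a candidate
for a folding routine. This is not GCC's fold_stmt() and not one of the
transformations in the patch; Var, Cast and fold_cast are hypothetical
names, and types are modeled as plain strings.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    """A variable with a (hypothetical) static type name."""
    name: str
    type: str


@dataclass(frozen=True)
class Cast:
    """A conversion of `operand` to `type`."""
    type: str
    operand: "Expr"


Expr = Union[Var, Cast]


def fold_cast(expr: Expr) -> Expr:
    """Drop casts whose operand already has the target type.

    Works bottom-up on a single expression and consults nothing else,
    so no def-use chains or other data flow information are required.
    Casts that change the type are kept, since removing them could
    change the value (for example a narrowing conversion).
    """
    if isinstance(expr, Var):
        return expr
    inner = fold_cast(expr.operand)
    if inner.type == expr.type:
        return inner  # the cast is a no-op
    return Cast(expr.type, inner)
```

For example, with x = Var("x", "int"), fold_cast(Cast("int", x)) gives
back x itself, while fold_cast(Cast("long", x)) is left unchanged.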
I did some tests over the weekend and it looks pretty decent,
particularly for some C++ codes:
* For DLV, code size was reduced by 1.8% and compile time reduced
by 2.7%.
* For cc1-i-files, there was a 0.2% code reduction and almost no
reduction in compile time (less than a second).
* For tramp3d-v3.cpp (compiled with -O2) I noticed no change in
compile time, code size was reduced by 0.4% and run time was
reduced by 1.5% (from 6.8s/it to 6.7s/it).
SPEC2000 results are within the usual values: gzip, perlbmk and twolf
are the best performers, but we lose some in crafty and eon. Overall,
the scores are very similar, though.
                                 Estimated                     Estimated
                  Base      Base      Base      Peak      Peak      Peak
Benchmarks    Ref Time  Run Time     Ratio  Ref Time  Run Time     Ratio
------------  --------  --------  --------  --------  --------  --------
164.gzip          1400       220      637*      1400       218      643*
175.vpr           1400       336      417*      1400       332      422*
176.gcc                                  X                             X
181.mcf           1800       412      437*      1800       429      435*
186.crafty        1000       153      652*      1000       155      645*
197.parser        1800       314      573*      1800       319      564*
252.eon           1300       228      569*      1300       242      537*
253.perlbmk       1800       242      743*      1800       231      778*
254.gap           1100       152      725*      1100       149      737*
255.vortex        1900       223      852*      1900       224      850*
256.bzip2         1500       284      528*      1500       284      528*
300.twolf         3000       561      534*      3000       555      540*

Est. SPECint_base2000                 593
Est. SPECint2000                                                    593
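
For anyone who wants to check the summary lines, the sketch below
recomputes them from the ratio columns above. It assumes the usual SPEC
CPU2000 conventions: a ratio is 100 * reference time / run time, and the
overall score is the geometric mean of the per-benchmark ratios. 176.gcc
has no ratio in the table, so it is left out here. The function names are
mine, not part of any SPEC tool.

```python
import math

# Ratios copied from the table above, in table order, without 176.gcc.
base_ratios = [637, 417, 437, 652, 573, 569, 743, 725, 852, 528, 534]
peak_ratios = [643, 422, 435, 645, 564, 537, 778, 737, 850, 528, 540]


def spec_ratio(ref_time: float, run_time: float) -> float:
    """SPEC-style ratio: reference time over measured time, times 100."""
    return 100.0 * ref_time / run_time


def geometric_mean(values) -> float:
    """Geometric mean computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    print(round(geometric_mean(base_ratios)))
    print(round(geometric_mean(peak_ratios)))
```

Both means round to 593, matching the two summary lines. Ratios
recomputed with spec_ratio() from the run times shown can be off by a
point or so, since the run times in the table are rounded to whole
seconds.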
I also tested the patch on ia64, alpha, ia32e and x86-64. No problems
on any arch, modulo the Fortran regressions, which I think we should
address by removing support for MINUS_EXPR in is_gimple_min_invariant.
Diego.