This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [tree-ssa] A pass to remove a large number of casts in C++ code


Andrew,

Good stuff.  Thanks.  A couple of things that I'd like to address before
we commit to this:

     1. The patch is poorly formatted and contains several typos and
        grammar problems.  Please format everything so that it fits in
        80 columns.  Functions need documentation for each of their
        arguments, and you'll need to add some spacing to make things
        clearer.
     2. Why did you implement it as a separate pass?  The
        transformations use almost no data flow information.  Wouldn't
        it be better to implement these routines as subroutines of
        fold_stmt()?  I want to understand what led you to choose this
        route.  It doesn't seem to take long, but it does require a full
        IL scan, and since all the transformations are related to
        "folding", perhaps they belong there?
     3. If we decide to have it as a separate pass, it should be
        documented in passes.texi (I think there are other tree-ssa
        passes missing from passes.texi that we will need to add before
        the merge).
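For what it's worth, the kind of transformation at issue in point 2 can
be sketched on a toy IR (hypothetical names and structures, not GCC's
actual tree/GIMPLE representation): a fold-style helper that strips a
conversion whose operand already has the target type, which is the sort
of local rewrite that could live inside fold_stmt() rather than in a
separate IL-scanning pass:

```c
#include <assert.h>
#include <stddef.h>

/* Toy IR, purely illustrative -- not GCC's tree representation.  */
typedef enum { INT_CST, NOP_CAST } code_t;

typedef struct expr
{
  code_t code;
  int type;		/* toy type identifier */
  struct expr *op;	/* operand, for NOP_CAST */
  int value;		/* constant value, for INT_CST */
} expr;

/* Fold-style helper: peel off casts that convert a value to the type
   it already has, e.g. collapse (T)(T)x down to x.  A real version in
   fold_stmt() would of course use useless_type_conversion-style checks
   instead of this toy type-id comparison.  */
expr *
fold_redundant_cast (expr *e)
{
  while (e->code == NOP_CAST && e->op->type == e->type)
    e = e->op;
  return e;
}
```

The point being that such a rewrite needs only the expression itself,
no data flow information, so it folds naturally at statement-folding
time instead of requiring its own walk over the IL.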

I did some tests over the weekend and it looks pretty decent,
particularly for some C++ codes:

      * For DLV, code size was reduced by 1.8% and compile time reduced
        by 2.7%.
      * For cc1-i-files, there was a 0.2% code reduction and almost no
        reduction in compile time (less than a second).
      * For tramp3d-v3.cpp (compiled with -O2) I noticed no change in
        compile time, code size was reduced by 0.4% and run time was
        reduced by 1.5% (from 6.8s/it to 6.7s/it).

SPEC2000 results are within the usual values: gzip, perlbmk and twolf
are the best performers, but we lose some in crafty and eon.  Overall,
the scores are very similar, though.

                                     Estimated                     Estimated
                   Base      Base      Base      Peak      Peak      Peak
   Benchmarks    Ref Time  Run Time   Ratio    Ref Time  Run Time   Ratio
   ------------  --------  --------  --------  --------  --------  --------
   164.gzip          1400       220       637*     1400       218       643*
   175.vpr           1400       336       417*     1400       332       422*
   176.gcc                                   X                             X
   181.mcf           1800       412       437*     1800       429       435*
   186.crafty        1000       153       652*     1000       155       645*
   197.parser        1800       314       573*     1800       319       564*
   252.eon           1300       228       569*     1300       242       537*
   253.perlbmk       1800       242       743*     1800       231       778*
   254.gap           1100       152       725*     1100       149       737*
   255.vortex        1900       223       852*     1900       224       850*
   256.bzip2         1500       284       528*     1500       284       528*
   300.twolf         3000       561       534*     3000       555       540*
   Est. SPECint_base2000                  593
   Est. SPECint2000                                                     593

I also tested the patch on ia64, alpha, ia32e and x86-64.  No problems
on any arch, modulo the Fortran regressions, which I think we should
address by removing support for MINUS_EXPR in is_gimple_min_invariant.


Diego.

