This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[patch] Disable old loop optimizer


Hello,

this patch disables the old loop optimizer by default and enables the
invariant motion pass of the new rtl loop optimizer in its place.  This
is the first step towards complete removal of the old loop optimizer,
which I would very much like to see happen in 4.1, since it would enable
some pretty significant cleanups of the compiler, such as removal of
loop notes, removal of the rtl level branch prediction pass, and maybe
removal of libcall notes.

Bootstrapped & regtested on i686, x86_64 and ppc.

Below are the SPEC2000 benchmark results on i686 (base without the
patch, peak with the patch).  Both specint and specfp improve overall.

I have also tested the patch on the Nullstone microbenchmarks, without
seeing any significant regressions directly caused by disabling the old
loop optimizer.  There are only a few regressions (the worst by 12%),
due to basically random changes in the code.

The following patches are needed to get the results mentioned above:

http://gcc.gnu.org/ml/gcc-patches/2005-03/msg02878.html
-- TARGET_MEM_REF patch, to enable ivopts to more precisely control the
   addressing modes used for the memory references
http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01611.html
-- merging of equivalent invariants in rtl level invariant motion
http://gcc.gnu.org/ml/gcc-patches/2005-03/msg02577.html
-- reenables the loop header predictor.  The reason why we have
   regressions without this patch is a bit complicated:  the old loop
   optimizer removes the cfg and thus clears the edge probabilities.
   Cfg cleanup then sometimes produces REG_BR_PROB notes with value 0
   on insns.  Rtl branch prediction respects these notes, thus setting
   the probabilities of the edges coming out of the insns to 0% and 100%.
   
   When the old loop optimizer is disabled, the predictions from the
   tree level are preserved instead.  However, it sometimes happens
   that the bogus 0%/100% probabilities are by luck more precise than
   the "correct" predictions.  In the gap benchmark, edges going into
   a loop on the hot path got correctly predicted by the "bogus"
   prediction.  Without the LOOP_HEADER predictor, however, they get
   predicted incorrectly when the old loop optimizer is disabled,
   thus causing some regressions due to wrong bb ordering.
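
   To make the 0%/100% effect described above concrete, here is a small
   standalone sketch (not code from GCC or from this patch) of how a
   REG_BR_PROB note value maps to the probabilities of the two edges
   leaving a conditional jump.  REG_BR_PROB_BASE is GCC's scaling
   constant; the struct and the function name are made up for this
   illustration only.

```c
#include <assert.h>

/* GCC expresses branch probabilities as integers scaled by
   REG_BR_PROB_BASE; 10000 matches GCC's definition.  */
#define REG_BR_PROB_BASE 10000

/* Hypothetical stand-in for the two outgoing edges of a conditional
   jump; not a GCC data structure.  */
struct edge_probs
{
  int taken;     /* probability of the branch edge */
  int fallthru;  /* probability of the fall-through edge */
};

/* Derive both edge probabilities from the value stored in a
   REG_BR_PROB note (illustrative helper, assumed name).  */
struct edge_probs
probs_from_note (int note_value)
{
  struct edge_probs p;

  assert (note_value >= 0 && note_value <= REG_BR_PROB_BASE);
  p.taken = note_value;
  p.fallthru = REG_BR_PROB_BASE - note_value;
  return p;
}
```

   Calling probs_from_note (0) gives taken == 0 and fallthru == 10000,
   i.e. the 0% and 100% edges mentioned above, regardless of what the
   tree level predictors had computed for that branch.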

                     Base   Base     Base          Peak   Peak     Peak
   Benchmark         Ref    Run      Ratio         Ref    Run      Ratio
   164.gzip          1400   201       695    *     1400   197       711    *
   175.vpr           1400   392       357    *     1400   391       358    *
   176.gcc                                   X                             X
   181.mcf           1800   733       246    *     1800   731       246    *
   186.crafty        1000   118       848    *     1000   118       849    *
   197.parser        1800   396       454    *     1800   397       454    *
   252.eon           1300   144       902    *     1300   145       899    *
   253.perlbmk       1800   201       895    *     1800   207       869    *
   254.gap           1100   172       639    *     1100   172       639    *
   255.vortex        1900   232       820    *     1900   231       824    *
   256.bzip2         1500   343       438    *     1500   326       459    *
   300.twolf         3000   788       381    *     3000   787       381    *
   Est. SPECint_base2000              559    
   Est. SPECint2000                                                 561    

   168.wupwise       1600   287       557    *     1600   294       544    *
   171.swim          3100   689       450    *     3100   695       446    *
   172.mgrid         1800   640       281    *     1800   492       366    *
   173.applu         2100   574       366    *     2100   554       379    *
   177.mesa          1400   201       695    *     1400   201       696    *
   178.galgel                                X                             X
   179.art           2600  1467       177    *     2600  1466       177    *
   183.equake        1300   290       448    *     1300   293       444    *
   187.facerec       1900   474       400    *     1900   487       390    *
   188.ammp          2200   624       353    *     2200   626       351    *
   189.lucas         2000   430       465    *     2000   430       465    *
   191.fma3d         2100   398       528    *     2100   403       521    *
   200.sixtrack      1100   238       462    *     1100   236       466    *
   301.apsi          2600   697       373    *     2600   728       357    *
   Est. SPECfp_base2000               407    
   Est. SPECfp2000                                                  413    

Zdenek

	* common.opt (floop-optimize2): Removed.
	(fmove-loop-invariants): Enable by default.
	* opts.c (decode_options): Disable old loop optimizer
	by default.
	* passes.c (rest_of_handle_loop2, rest_of_compilation):
	Do not use flag_loop_optimize2.
	* toplev.c (process_options): Ditto.
	* doc/invoke.texi (-floop-optimize2): Removed.
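
The resulting gating of the new loop optimizer pass can be modelled
roughly as below.  This is only a sketch: the struct, the function name
and the target_has_doloop parameter are invented for illustration (GCC
uses separate global flag variables and the HAVE_doloop_end macro), and
the inclusion of the invariant motion flag in the test is an assumption
based on the list of flags removed from toplev.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bundle of the flags named in the patch; in GCC these
   are separate global variables.  */
struct loop2_flags
{
  bool move_loop_invariants;
  bool unswitch_loops;
  bool peel_loops;
  bool unroll_loops;
  bool branch_on_count_reg;
};

/* Return true if the new loop optimizer pass has something to do.
   TARGET_HAS_DOLOOP stands in for a target that provides a doloop_end
   pattern (an assumption modelling HAVE_doloop_end).  */
bool
loop2_has_work (int optimize, const struct loop2_flags *f,
                bool target_has_doloop)
{
  assert (f);

  /* The pass is no longer guarded by a separate -floop-optimize2
     flag, only by the optimization level.  */
  if (optimize <= 0)
    return false;

  return (f->move_loop_invariants
          || f->unswitch_loops
          || f->peel_loops
          || f->unroll_loops
          || (f->branch_on_count_reg && target_has_doloop));
}
```

In this model, -fbranch-count-reg alone keeps the pass alive only on
targets with a doloop_end pattern, while invariant motion (now on by
default) keeps it alive everywhere.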

Index: common.opt
===================================================================
RCS file: /cvs/gcc/gcc/gcc/common.opt,v
retrieving revision 1.66
diff -c -3 -p -r1.66 common.opt
*** common.opt	28 Mar 2005 08:04:38 -0000	1.66
--- common.opt	31 Mar 2005 20:39:32 -0000
*************** floop-optimize
*** 496,505 ****
  Common Report Var(flag_loop_optimize)
  Perform loop optimizations
  
- floop-optimize2
- Common Report Var(flag_loop_optimize2)
- Perform loop optimizations using the new loop optimizer
- 
  fmath-errno
  Common Report Var(flag_errno_math) Init(1)
  Set errno after built-in math functions
--- 496,501 ----
*************** Common Report Var(flag_modulo_sched)
*** 528,534 ****
  Perform SMS based modulo scheduling before the first scheduling pass
  
  fmove-loop-invariants
! Common Report Var(flag_move_loop_invariants)
  Move loop invariant computations out of loops
  
  fmudflap
--- 524,530 ----
  Perform SMS based modulo scheduling before the first scheduling pass
  
  fmove-loop-invariants
! Common Report Var(flag_move_loop_invariants) Init(1)
  Move loop invariant computations out of loops
  
  fmudflap

Index: opts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/opts.c,v
retrieving revision 1.99
diff -c -3 -p -r1.99 opts.c
*** opts.c	31 Mar 2005 14:59:51 -0000	1.99
--- opts.c	31 Mar 2005 20:39:32 -0000
*************** decode_options (unsigned int argc, const
*** 512,518 ****
  #endif
        flag_guess_branch_prob = 1;
        flag_cprop_registers = 1;
-       flag_loop_optimize = 1;
        flag_if_conversion = 1;
        flag_if_conversion2 = 1;
        flag_tree_ccp = 1;
--- 512,517 ----
Index: passes.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/passes.c,v
retrieving revision 2.78
diff -c -3 -p -r2.78 passes.c
*** passes.c	31 Mar 2005 14:59:51 -0000	2.78
--- passes.c	31 Mar 2005 20:39:32 -0000
*************** rest_of_handle_loop2 (void)
*** 1143,1149 ****
        && !flag_unswitch_loops
        && !flag_peel_loops
        && !flag_unroll_loops
!       && !flag_branch_on_count_reg)
      return;
  
    timevar_push (TV_LOOP);
--- 1143,1152 ----
        && !flag_unswitch_loops
        && !flag_peel_loops
        && !flag_unroll_loops
! #ifdef HAVE_doloop_end
!       && (!flag_branch_on_count_reg || !HAVE_doloop_end)
! #endif
!       )
      return;
  
    timevar_push (TV_LOOP);
*************** rest_of_compilation (void)
*** 1612,1619 ****
    if (optimize > 0 && flag_tracer)
      rest_of_handle_tracer ();
  
!   if (optimize > 0
!       && flag_loop_optimize2)
      rest_of_handle_loop2 ();
  
    if (optimize > 0 && flag_web)
--- 1615,1621 ----
    if (optimize > 0 && flag_tracer)
      rest_of_handle_tracer ();
  
!   if (optimize > 0)
      rest_of_handle_loop2 ();
  
    if (optimize > 0 && flag_web)
Index: toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.950
diff -c -3 -p -r1.950 toplev.c
*** toplev.c	30 Mar 2005 22:27:42 -0000	1.950
--- toplev.c	31 Mar 2005 20:39:32 -0000
*************** process_options (void)
*** 1728,1746 ****
    if (flag_unroll_loops || flag_peel_loops)
      flag_rerun_cse_after_loop = 1;
  
-   /* If explicitly asked to run new loop optimizer, switch off the old
-      one.  */
-   if (flag_loop_optimize2)
-     flag_loop_optimize = 0;
- 
-   /* Enable new loop optimizer pass if any of its optimizations is called.  */
-   if (flag_move_loop_invariants
-       || flag_unswitch_loops
-       || flag_peel_loops
-       || flag_unroll_loops
-       || flag_branch_on_count_reg)
-     flag_loop_optimize2 = 1;
- 
    if (flag_non_call_exceptions)
      flag_asynchronous_unwind_tables = 1;
    if (flag_asynchronous_unwind_tables)
--- 1728,1733 ----
Index: doc/invoke.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.595
diff -c -3 -p -r1.595 invoke.texi
*** doc/invoke.texi	31 Mar 2005 14:21:14 -0000	1.595
--- doc/invoke.texi	31 Mar 2005 20:45:22 -0000
*************** Objective-C and Objective-C++ Dialects}.
*** 297,303 ****
  -finline-functions  -finline-limit=@var{n}  -fkeep-inline-functions @gol
  -fkeep-static-consts  -fmerge-constants  -fmerge-all-constants @gol
  -fmodulo-sched -fno-branch-count-reg @gol
! -fno-default-inline  -fno-defer-pop -floop-optimize2 -fmove-loop-invariants @gol
  -fno-function-cse  -fno-guess-branch-probability @gol
  -fno-inline  -fno-math-errno  -fno-peephole  -fno-peephole2 @gol
  -funsafe-math-optimizations  -ffinite-math-only @gol
--- 297,303 ----
  -finline-functions  -finline-limit=@var{n}  -fkeep-inline-functions @gol
  -fkeep-static-consts  -fmerge-constants  -fmerge-all-constants @gol
  -fmodulo-sched -fno-branch-count-reg @gol
! -fno-default-inline  -fno-defer-pop -fmove-loop-invariants @gol
  -fno-function-cse  -fno-guess-branch-probability @gol
  -fno-inline  -fno-math-errno  -fno-peephole  -fno-peephole2 @gol
  -funsafe-math-optimizations  -ffinite-math-only @gol
*************** exit test conditions and optionally do s
*** 4561,4572 ****
  
  Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}.
  
- @item -floop-optimize2
- @opindex floop-optimize2
- Perform loop optimizations using the new loop optimizer.  The optimizations
- (loop unrolling, peeling and unswitching, loop invariant motion) are enabled
- by separate flags.
- 
  @item -fcrossjumping
  @opindex crossjumping
  Perform cross-jumping transformation.  This transformation unifies equivalent code and save code size.  The
--- 4561,4566 ----

