[PATCH][RFC] Allow tuning down early loop unrolling

Richard Guenther rguenther@suse.de
Fri Oct 9 19:35:00 GMT 2009


This is an attempt to give users, and possibly targets or other passes
such as Graphite, the ability to control the early loop unrolling pass.

I'd appreciate feedback from the people who requested this, in
particular on whether it would fix their problems.

Thanks,
Richard.

2009-10-09  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/41647
	* common.opt (fearly-complete-loop-unrolling): New option.
	* params.def (PARAM_EARLY_UNROLL_SIZE_INCREASE): New param.
	* doc/invoke.texi (fearly-complete-loop-unrolling): Document.
	(early-unroll-size-increase): Likewise.
	* opts.c (decode_options): Enable flag_early_complete_loop_unrolling
	at -O2 and above.
	* tree-ssa-loop.c (tree_complete_unroll_inner): Adjust when
	we allow size increase.
	(gate_tree_complete_unroll_inner): Gate on
	flag_early_complete_loop_unrolling.

Index: gcc/common.opt
===================================================================
*** gcc/common.opt	(revision 152595)
--- gcc/common.opt	(working copy)
*************** fearly-inlining
*** 482,487 ****
--- 482,491 ----
  Common Report Var(flag_early_inlining) Init(1) Optimization
  Perform early inlining
  
+ fearly-complete-loop-unrolling
+ Common Report Var(flag_early_complete_loop_unrolling) Optimization
+ Perform early complete loop unrolling
+ 
  feliminate-dwarf2-dups
  Common Report Var(flag_eliminate_dwarf2_dups)
  Perform DWARF2 duplicate elimination
Index: gcc/doc/invoke.texi
===================================================================
*** gcc/doc/invoke.texi	(revision 152595)
--- gcc/doc/invoke.texi	(working copy)
*************** Objective-C and Objective-C++ Dialects}.
*** 337,342 ****
--- 337,343 ----
  -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules -fcx-limited-range @gol
  -fdata-sections -fdce -fdce @gol
  -fdelayed-branch -fdelete-null-pointer-checks -fdse -fdse @gol
+ -fearly-complete-loop-unrolling @gol
  -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol
  -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol
  -fforward-propagate -ffunction-sections @gol
*************** also turns on the following optimization
*** 5694,5699 ****
--- 5695,5701 ----
  -fcrossjumping @gol
  -fcse-follow-jumps  -fcse-skip-blocks @gol
  -fdelete-null-pointer-checks @gol
+ -fearly-complete-loop-unrolling @gol
  -fexpensive-optimizations @gol
  -fgcse  -fgcse-lm  @gol
  -finline-small-functions @gol
*************** having large chains of nested wrapper fu
*** 5859,5864 ****
--- 5861,5875 ----
  
  Enabled by default.
  
+ @item -fearly-complete-loop-unrolling
+ @opindex fearly-complete-loop-unrolling
+ Perform early complete loop unrolling.  This exposes cross-iteration
+ optimization opportunities to scalar optimization passes.  When combined
+ with @option{-funroll-loops}, @option{-fpeel-loops} or optimization level
+ @option{-O3}, the transformation is applied even if it increases code size.
+ 
+ Enabled at levels @option{-O2}, @option{-O3} and @option{-Os}.
+ 
  @item -fipa-sra
  @opindex fipa-sra
  Perform interprocedural scalar replacement of aggregates, removal of
*************** The maximum number of insns of a complet
*** 8028,8033 ****
--- 8039,8049 ----
  @item max-completely-peel-times
  The maximum number of iterations of a loop to be suitable for complete peeling.
  
+ @item early-unroll-size-increase
+ Whether early complete loop unrolling may increase size.  A value of
+ one leaves the decision to @option{-funroll-loops}, @option{-fpeel-loops}
+ and optimization level @option{-O3}.  A value of zero disables size increase.
+ 
  @item max-unswitch-insns
  The maximum number of insns of an unswitched loop.
  
Index: gcc/opts.c
===================================================================
*** gcc/opts.c	(revision 152595)
--- gcc/opts.c	(working copy)
*************** decode_options (unsigned int argc, const
*** 923,928 ****
--- 923,929 ----
    flag_tree_switch_conversion = 1;
    flag_ipa_cp = opt2;
    flag_ipa_sra = opt2;
+   flag_early_complete_loop_unrolling = opt2;
  
    /* Track fields in field-sensitive alias analysis.  */
    set_param_value ("max-fields-for-field-sensitive",
Index: gcc/tree-ssa-loop.c
===================================================================
*** gcc/tree-ssa-loop.c	(revision 152595)
--- gcc/tree-ssa-loop.c	(working copy)
*************** tree_complete_unroll_inner (void)
*** 520,527 ****
  		       | LOOPS_HAVE_RECORDED_EXITS);
    if (number_of_loops () > 1)
      {
        scev_initialize ();
!       ret = tree_unroll_loops_completely (optimize >= 3, false);
        free_numbers_of_iterations_estimates ();
        scev_finalize ();
      }
--- 520,532 ----
  		       | LOOPS_HAVE_RECORDED_EXITS);
    if (number_of_loops () > 1)
      {
+       bool may_increase_size_p;
        scev_initialize ();
!       may_increase_size_p = (PARAM_VALUE (PARAM_EARLY_UNROLL_SIZE_INCREASE)
! 			     && (flag_unroll_loops
! 				 || flag_peel_loops
! 				 || optimize >= 3));
!       ret = tree_unroll_loops_completely (may_increase_size_p, false);
        free_numbers_of_iterations_estimates ();
        scev_finalize ();
      }
*************** tree_complete_unroll_inner (void)
*** 533,539 ****
  static bool
  gate_tree_complete_unroll_inner (void)
  {
!   return optimize >= 2;
  }
  
  struct gimple_opt_pass pass_complete_unrolli =
--- 538,544 ----
  static bool
  gate_tree_complete_unroll_inner (void)
  {
!   return flag_early_complete_loop_unrolling;
  }
  
  struct gimple_opt_pass pass_complete_unrolli =
Index: gcc/params.def
===================================================================
*** gcc/params.def	(revision 152595)
--- gcc/params.def	(working copy)
*************** DEFPARAM(PARAM_MAX_COMPLETELY_PEEL_TIMES
*** 261,266 ****
--- 261,270 ----
  	"max-completely-peel-times",
  	"The maximum number of peelings of a single loop that is peeled completely",
  	16, 0, 0)
+ DEFPARAM(PARAM_EARLY_UNROLL_SIZE_INCREASE,
+ 	 "early-unroll-size-increase",
+ 	 "A flag controlling whether early unrolling may increase size",
+ 	 1, 0, 1)
  /* The maximum number of insns of a peeled loop that rolls only once.  */
  DEFPARAM(PARAM_MAX_ONCE_PEELED_INSNS,
  	"max-once-peeled-insns",



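For readers skimming the tree-ssa-loop.c hunk, the new size-increase
decision can be modelled as a stand-alone predicate.  This is only a
sketch: the function name and its parameter names are made up for
illustration and are not part of the patch.  In GCC the inputs come from
PARAM_VALUE (PARAM_EARLY_UNROLL_SIZE_INCREASE), flag_unroll_loops,
flag_peel_loops and optimize.

```c
#include <stdbool.h>

/* Hypothetical model of the may_increase_size_p computation in
   tree_complete_unroll_inner.  Early complete unrolling may grow the
   code only if the param is non-zero AND one of -funroll-loops,
   -fpeel-loops or -O3 (or higher) is in effect.  */
static bool
may_increase_size (int early_unroll_size_increase, /* param value, 0 or 1 */
                   bool unroll_loops,              /* -funroll-loops */
                   bool peel_loops,                /* -fpeel-loops */
                   int optimize_level)             /* -O<n> */
{
  return early_unroll_size_increase != 0
         && (unroll_loops || peel_loops || optimize_level >= 3);
}
```

With the default param value of 1, the model allows size increase at -O3,
and at -O2 when -funroll-loops or -fpeel-loops is given; plain -O2 does
not allow it.  Setting the param to 0 forbids size increase regardless of
the other inputs.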