This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[Patchv2 3/4] Control SRA and IPA-SRA by a param rather than MOVE_RATIO


Hi,

After hookizing MOVE_BY_PIECES_P and migrating tree-inline.c, we are
left with only one user of MOVE_RATIO - deciding the maximum size of
aggregate for SRA.

Past discussions have made it clear [1] that keeping this use of
MOVE_RATIO is undesirable. Clearly it is now also misnamed.

The previous iteration of this patch was rejected as too complicated. I
went off and tried simplifying it to use MOVE_RATIO, but if we do that we
end up breaking some interface boundaries between the driver and the
backend.

This patch partially hookizes MOVE_RATIO under the new name
TARGET_MAX_SCALARIZATION_SIZE and uses it to set default values for two
new parameters:

  sra-max-scalarization-size-Ospeed - The maximum size of aggregate
  to consider when compiling for speed
  sra-max-scalarization-size-Osize - The maximum size of aggregate
  to consider when compiling for size.

We then modify SRA to use these parameters rather than MOVE_RATIO.
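
To illustrate the intended selection logic outside of GCC, here is a minimal standalone sketch. It is not the code in the patch: every `sketch_`/`SKETCH_` name is hypothetical, the value 8 for the MOVE_MAX_PIECES stand-in is an assumption (the real value is target-specific), and the 15/3 ratios are simply the defaults that the patch removes from defaults.h. A parameter value of zero is treated as "not set by the user", in which case the target default applies.

```c
#include <stdbool.h>

/* Hypothetical stand-in for MOVE_MAX_PIECES; the real value is
   target-specific.  8 is assumed here purely for illustration.  */
#define SKETCH_MOVE_MAX_PIECES 8

/* Mirrors the generic MOVE_RATIO default from defaults.h for targets
   without a movmem pattern: 15 when optimizing for speed, 3 for size.  */
static unsigned int
sketch_move_ratio (bool speed_p)
{
  return speed_p ? 15 : 3;
}

/* Sketch of the default target hook: the move ratio scaled by the
   largest piece moved at once, giving a size in storage units.  */
static unsigned int
sketch_target_default (bool speed_p)
{
  return sketch_move_ratio (speed_p) * SKETCH_MOVE_MAX_PIECES;
}

/* Sketch of the lookup SRA performs: pick the parameter matching the
   optimization goal; a value of zero means "unset", so fall back to
   the target default.  */
static unsigned int
sketch_max_scalarization_size (unsigned int param_speed,
                               unsigned int param_size,
                               bool speed_p)
{
  unsigned int param = speed_p ? param_speed : param_size;
  return param ? param : sketch_target_default (speed_p);
}
```

Under these assumptions, leaving both parameters at zero gives 120 storage units when compiling for speed and 24 when compiling for size, while setting the speed parameter to 32 changes only the speed case.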

Bootstrapped and regression tested for x86, arm and aarch64 with no
issues.

OK for trunk?

[1]: https://gcc.gnu.org/ml/gcc-patches/2014-08/msg01997.html

---
gcc/

2014-09-25  James Greenhalgh  <james.greenhalgh@arm.com>

	* doc/invoke.texi (sra-max-scalarization-size-Ospeed): Document.
	(sra-max-scalarization-size-Osize): Likewise.
	* doc/tm.texi.in
	(MOVE_RATIO): Reduce documentation to a stub, deprecate.
	(TARGET_MAX_SCALARIZATION_SIZE): Add hook.
	* doc/tm.texi: Regenerate.
	* defaults.h (MOVE_RATIO): Remove default implementation.
	(SET_RATIO): Add a default implementation if MOVE_RATIO
	is not defined.
	* params.def (sra-max-scalarization-size-Ospeed): New.
	(sra-max-scalarization-size-Osize): Likewise.
	* target.def (max_scalarization_size): New.
	* targhooks.c (default_max_scalarization_size): New.
	* targhooks.h (default_max_scalarization_size): New.
	* tree-sra.c (get_max_scalarization_size): New.
	(analyze_all_variable_accesses): Use it.
diff --git a/gcc/defaults.h b/gcc/defaults.h
index c1776b0..f723e2c 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1191,18 +1191,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define BRANCH_COST(speed_p, predictable_p) 1
 #endif
 
-/* If a memory-to-memory move would take MOVE_RATIO or more simple
-   move-instruction sequences, we will do a movmem or libcall instead.  */
-
-#ifndef MOVE_RATIO
-#if defined (HAVE_movmemqi) || defined (HAVE_movmemhi) || defined (HAVE_movmemsi) || defined (HAVE_movmemdi) || defined (HAVE_movmemti)
-#define MOVE_RATIO(speed) 2
-#else
-/* If we are optimizing for space (-Os), cut down the default move ratio.  */
-#define MOVE_RATIO(speed) ((speed) ? 15 : 3)
-#endif
-#endif
-
 /* If a clear memory operation would take CLEAR_RATIO or more simple
    move-instruction sequences, we will do a setmem or libcall instead.  */
 
@@ -1219,7 +1207,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    SET_RATIO or more simple move-instruction sequences, we will do a movmem
    or libcall instead.  */
 #ifndef SET_RATIO
+#ifdef MOVE_RATIO
 #define SET_RATIO(speed) MOVE_RATIO (speed)
+#elif defined (HAVE_movmemqi) || defined (HAVE_movmemhi) || defined (HAVE_movmemsi) || defined (HAVE_movmemdi) || defined (HAVE_movmemti)
+#define SET_RATIO(speed) 2
+#else
+/* If we are optimizing for space (-Os), cut down the default move ratio.  */
+#define SET_RATIO(speed) ((speed) ? 15 : 3)
+#endif
 #endif
 
 /* Supply a default definition for FUNCTION_ARG_PADDING:
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index eae4ab1..c3e6eaa 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10301,6 +10301,16 @@ parameters only when their cumulative size is less or equal to
 @option{ipa-sra-ptr-growth-factor} times the size of the original
 pointer parameter.
 
+@item sra-max-scalarization-size-Ospeed
+@item sra-max-scalarization-size-Osize
+The two Scalar Replacement of Aggregates passes (SRA and IPA-SRA) aim to
+replace scalar parts of aggregates with uses of independent scalar
+variables.  These parameters control the maximum size, in storage units,
+of aggregate which will be considered for replacement when compiling for
+speed
+(@option{sra-max-scalarization-size-Ospeed}) or size
+(@option{sra-max-scalarization-size-Osize}) respectively.
+
 @item tm-max-aggregate-size
 When making copies of thread-local variables in a transaction, this
 parameter specifies the size in bytes after which variables are
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f59641a..b4061eb 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6098,20 +6098,25 @@ this macro is defined, it should produce a nonzero value when
 @end defmac
 
 @defmac MOVE_RATIO (@var{speed})
-The threshold of number of scalar memory-to-memory move insns, @emph{below}
-which a sequence of insns should be generated instead of a
-string move insn or a library call.  Increasing the value will always
-make code faster, but eventually incurs high cost in increased code size.
+This macro is deprecated and is only used to guide the default behaviours
+of @code{TARGET_MOVE_BY_PIECES_PROFITABLE_P} and
+@code{TARGET_MAX_SCALARIZATION_SIZE}.  New ports should implement
+those hooks in preference to this macro.
+@end defmac
 
-Note that on machines where the corresponding move insn is a
-@code{define_expand} that emits a sequence of insns, this macro counts
-the number of such sequences.
+@deftypefn {Target Hook} {unsigned int} TARGET_MAX_SCALARIZATION_SIZE (bool @var{speed_p})
+This target hook is used by the Scalar Replacement of Aggregates passes
+(SRA and IPA-SRA).  This hook gives the maximum size, in storage units,
+of aggregate to consider for replacement.  @var{speed_p} is true if we are
+currently compiling for speed.
 
-The parameter @var{speed} is true if the code is currently being
-optimized for speed rather than size.
+By default, the maximum scalarization size is determined by MOVE_RATIO,
+if it is defined.  Otherwise, a sensible default is chosen.
 
-If you don't define this, a reasonable default is used.
-@end defmac
+Note that a user may choose to override this target hook with the
+parameters @code{sra-max-scalarization-size-Ospeed} and
+@code{sra-max-scalarization-size-Osize}.
+@end deftypefn
 
 @defmac MOVE_BY_PIECES_P (@var{size}, @var{alignment})
 A C expression used to implement the default behaviour of
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index d2a4386..bdd1ec4 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4581,21 +4581,14 @@ this macro is defined, it should produce a nonzero value when
 @end defmac
 
 @defmac MOVE_RATIO (@var{speed})
-The threshold of number of scalar memory-to-memory move insns, @emph{below}
-which a sequence of insns should be generated instead of a
-string move insn or a library call.  Increasing the value will always
-make code faster, but eventually incurs high cost in increased code size.
-
-Note that on machines where the corresponding move insn is a
-@code{define_expand} that emits a sequence of insns, this macro counts
-the number of such sequences.
-
-The parameter @var{speed} is true if the code is currently being
-optimized for speed rather than size.
-
-If you don't define this, a reasonable default is used.
+This macro is deprecated and is only used to guide the default behaviours
+of @code{TARGET_MOVE_BY_PIECES_PROFITABLE_P} and
+@code{TARGET_MAX_SCALARIZATION_SIZE}.  New ports should implement
+those hooks in preference to this macro.
 @end defmac
 
+@hook TARGET_MAX_SCALARIZATION_SIZE
+
 @defmac MOVE_BY_PIECES_P (@var{size}, @var{alignment})
 A C expression used to implement the default behaviour of
 @code{TARGET_MOVE_BY_PIECES_PROFITABLE_P}.  New ports should implement
diff --git a/gcc/params.def b/gcc/params.def
index aefdd07..7b6c7e2 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -942,6 +942,18 @@ DEFPARAM (PARAM_TM_MAX_AGGREGATE_SIZE,
 	  "pairs",
 	  9, 0, 0)
 
+DEFPARAM (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED,
+	  "sra-max-scalarization-size-Ospeed",
+	  "Maximum size, in storage units, of an aggregate which should be "
+	  "considered for scalarization when compiling for speed",
+	  0, 0, 0)
+
+DEFPARAM (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE,
+	  "sra-max-scalarization-size-Osize",
+	  "Maximum size, in storage units, of an aggregate which should be "
+	  "considered for scalarization when compiling for size",
+	  0, 0, 0)
+
 DEFPARAM (PARAM_IPA_CP_VALUE_LIST_SIZE,
 	  "ipa-cp-value-list-size",
 	  "Maximum size of a list of values associated with each parameter for "
diff --git a/gcc/target.def b/gcc/target.def
index 10f3b2e..4e19845 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -3049,6 +3049,24 @@ are the same as to this target hook.",
  int, (enum machine_mode mode, reg_class_t rclass, bool in),
  default_memory_move_cost)
 
+/* Return the maximum size, in storage units, of aggregate which will be
+   considered for replacement by SRA/IPA-SRA.  */
+DEFHOOK
+(max_scalarization_size,
+ "This target hook is used by the Scalar Replacement of Aggregates passes\n\
+(SRA and IPA-SRA).  This hook gives the maximum size, in storage units,\n\
+of aggregate to consider for replacement.  @var{speed_p} is true if we are\n\
+currently compiling for speed.\n\
+\n\
+By default, the maximum scalarization size is determined by MOVE_RATIO,\n\
+if it is defined.  Otherwise, a sensible default is chosen.\n\
+\n\
+Note that a user may choose to override this target hook with the\n\
+parameters @code{sra-max-scalarization-size-Ospeed} and\n\
+@code{sra-max-scalarization-size-Osize}.",
+ unsigned int, (bool speed_p),
+ default_max_scalarization_size)
+
 DEFHOOK
 (move_by_pieces_profitable_p,
  "GCC will attempt several strategies when asked to copy between\n\
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index eb0a4cd..abc94ff 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1421,6 +1421,15 @@ get_move_ratio (bool speed_p ATTRIBUTE_UNUSED)
   return move_ratio;
 }
 
+/* Return the maximum size, in storage units, of aggregate
+   which will be considered for replacement by SRA/IPA-SRA.  */
+
+unsigned int
+default_max_scalarization_size (bool speed_p)
+{
+  return get_move_ratio (speed_p) * MOVE_MAX_PIECES;
+}
+
 /* The threshold of move insns below which the movmem optab is expanded or a
    call to memcpy is emitted.  */
 
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index f76ad31..35467f8 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -181,6 +181,7 @@ extern int default_memory_move_cost (enum machine_mode, reg_class_t, bool);
 extern int default_register_move_cost (enum machine_mode, reg_class_t,
 				       reg_class_t);
 
+extern unsigned int default_max_scalarization_size (bool speed_p);
 extern bool default_move_by_pieces_profitable_p (unsigned int,
 						 unsigned int, bool);
 extern unsigned int default_estimate_block_copy_ninsns (HOST_WIDE_INT, bool);
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 8259dba..c611d29 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2482,6 +2482,25 @@ propagate_all_subaccesses (void)
     }
 }
 
+/* Return the appropriate parameter value giving the maximum size of
+   aggregate (in storage units) to be considered for scalarization.
+   SPEED_P is true if we are currently optimizing for speed
+   rather than size.  */
+
+unsigned int
+get_max_scalarization_size (bool speed_p)
+{
+  unsigned param_max_scalarization_size
+    = speed_p
+      ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED)
+      : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE);
+
+  if (!param_max_scalarization_size)
+    return targetm.max_scalarization_size (speed_p);
+
+  return param_max_scalarization_size;
+}
+
 /* Go through all accesses collected throughout the (intraprocedural) analysis
    stage, exclude overlapping ones, identify representatives and build trees
    out of them, making decisions about scalarization on the way.  Return true
@@ -2493,10 +2512,10 @@ analyze_all_variable_accesses (void)
   int res = 0;
   bitmap tmp = BITMAP_ALLOC (NULL);
   bitmap_iterator bi;
-  unsigned i, max_total_scalarization_size;
-
-  max_total_scalarization_size = UNITS_PER_WORD * BITS_PER_UNIT
-    * MOVE_RATIO (optimize_function_for_speed_p (cfun));
+  unsigned i;
+  unsigned int max_scalarization_size
+    = get_max_scalarization_size (optimize_function_for_speed_p (cfun))
+      * BITS_PER_UNIT;
 
   EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi)
     if (bitmap_bit_p (should_scalarize_away_bitmap, i)
@@ -2508,7 +2527,7 @@ analyze_all_variable_accesses (void)
 	    && type_consists_of_records_p (TREE_TYPE (var)))
 	  {
 	    if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var)))
-		<= max_total_scalarization_size)
+		<= max_scalarization_size)
 	      {
 		completely_scalarize_var (var);
 		if (dump_file && (dump_flags & TDF_DETAILS))
