This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

From: Andrew Stubbs <ams at codesourcery dot com>
To: Richard Guenther <richard dot guenther at gmail dot com>
Cc: Michael Matz <matz at suse dot de>, gcc-patches at gcc dot gnu dot org, patches at linaro dot org
Date: Mon, 04 Jul 2011 15:23:13 +0100
Subject: Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
References: <4E034EF2.3070503@codesourcery.com> <4E03504B.9060305@codesourcery.com> <BANLkTi=a8B-DZZdG2bmOWQ8A+-pCSEsAbQ@mail.gmail.com> <4E044559.5000105@linaro.org> <BANLkTimfKyrARS+eRy+MWYVO2gLqp85JsQ@mail.gmail.com> <1A77B5B39081C241A68E6CF16983025F020906F6@EU1-MAIL.mgc.mentorg.com> <BANLkTi=1TRM7uHWxLD3Se=S-ibe0C13T-Q@mail.gmail.com> <4E09B142.4020402@codesourcery.com> <BANLkTim0wLcSTgAL9isO+dVXxgq-U8B4Sw@mail.gmail.com> <Pine.LNX.4.64.1106281741170.17115@wotan.suse.de> <4E09FDEA.3000004@gmail.com> <Pine.LNX.4.64.1106281827230.17115@wotan.suse.de> <1A77B5B39081C241A68E6CF16983025F0209071D@EU1-MAIL.mgc.mentorg.com> <BANLkTimJob8C2L8kYkd-aU6FbM5nckc8Yg@mail.gmail.com>

On 01/07/11 13:25, Richard Guenther wrote:

Well - some operations work the same on both signedness if you
just care about the twos-complement result.  This includes
multiplication (but not for example division).  For this special
case I suggest to not bother trying to invent a generic predicate
but do something local in tree-ssa-math-opts.c.

OK, here's my updated patch.

I've taken the view that we *know* what size and signedness the result of the multiplication is, and we know what size the input to the addition must be, so all the check has to do is make sure it does that same conversion, even if by a roundabout means.

What I hadn't grasped before is that when extending a value it's the source type that is significant, not the destination, so the checks are not as complex as I had thought.

So, this patch adds a test to ensure that:

1. the type is not truncated so far that we lose any information; and

2. the type is only ever extended in the proper signedness.

Also, just to be absolutely sure, I've also added a little bit of logic to permit extends that are then undone by a truncate. I'm really not sure what guarantees there are about what sort of cast sequences can exist? Is this necessary? I haven't managed to coax it to generated any examples of extends followed by truncates myself, but in any case, it's hardly any code and it'll make sure it's future proofed.

OK?

Andrew

2011-06-28  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* tree-ssa-math-opts.c (valid_types_for_madd_p): New function.
	(convert_plusminus_to_widen): Use valid_types_for_madd_p to
	identify optimization candidates.

	gcc/testsuite/
	* gcc.target/arm/wmul-5.c: New file.
	* gcc.target/arm/no-wmla-1.c: New file.

---
 .../gcc/testsuite/gcc.target/arm/no-wmla-1.c       |   11 ++
 .../gcc/testsuite/gcc.target/arm/wmul-5.c          |   10 ++
 src/gcc-mainline/gcc/tree-ssa-math-opts.c          |  112 ++++++++++++++++++--
 3 files changed, 123 insertions(+), 10 deletions(-)
 create mode 100644 src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c
 create mode 100644 src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c

diff --git a/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c
new file mode 100644
index 0000000..17f7427
--- /dev/null
+++ b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/no-wmla-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+int
+foo (int a, short b, short c)
+{
+     int bc = b * c;
+        return a + (short)bc;
+}
+
+/* { dg-final { scan-assembler "mul" } } */
diff --git a/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c
new file mode 100644
index 0000000..65c43e3
--- /dev/null
+++ b/src/gcc-mainline/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
diff --git a/src/gcc-mainline/gcc/tree-ssa-math-opts.c b/src/gcc-mainline/gcc/tree-ssa-math-opts.c
index d55ba57..5ef7bb4 100644
--- a/src/gcc-mainline/gcc/tree-ssa-math-opts.c
+++ b/src/gcc-mainline/gcc/tree-ssa-math-opts.c
@@ -2085,6 +2085,78 @@ convert_mult_to_widen (gimple stmt)
   return true;
 }
 
+/* Check the input types, TYPE1 and TYPE2 to a widening multiply,
+   and then the convertions between the output of the multiply, and
+   the input to an addition EXPR, to ensure that they are compatible with
+   a widening multiply-and-accumulate.
+
+   This function assumes that expr is a valid string of conversion expressions
+   terminated by a multiplication.
+
+   This function tries NOT to make any (fragile) assumptions about what
+   sequence of conversions can exist in the input.  */
+
+static bool
+valid_types_for_madd_p (tree type1, tree type2, tree expr)
+{
+  gimple stmt, prev_stmt;
+  enum tree_code code, prev_code;
+  tree prev_expr, type, prev_type;
+  int bitsize, prev_bitsize, initial_bitsize, min_bitsize;
+  bool initial_unsigned;
+
+  initial_bitsize = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
+  initial_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
+
+  stmt = SSA_NAME_DEF_STMT (expr);
+  code = gimple_assign_rhs_code (stmt);
+  type = TREE_TYPE (expr);
+  bitsize = TYPE_PRECISION (type);
+  min_bitsize = bitsize;
+
+  if (code == MULT_EXPR || code == WIDEN_MULT_EXPR)
+    return true;
+
+  if (!INTEGRAL_TYPE_P (type)
+      || TYPE_PRECISION (type) < initial_bitsize)
+    return false;
+
+  /* Step through the conversions backwards.  */
+  while (true)
+    {
+      prev_expr = gimple_assign_rhs1 (stmt);
+      prev_stmt = SSA_NAME_DEF_STMT (prev_expr);
+      prev_code = gimple_assign_rhs_code (prev_stmt);
+      prev_type = TREE_TYPE (prev_expr);
+      prev_bitsize = TYPE_PRECISION (prev_type);
+
+      if (prev_code == MULT_EXPR || prev_code == WIDEN_MULT_EXPR)
+	break;
+
+      /* If it's an unsuitable type or a truncate that damages the
+	 original value, then were done.  */
+      if (!INTEGRAL_TYPE_P (prev_type)
+	  || TYPE_PRECISION (prev_type) < initial_bitsize)
+	return false;
+
+      /* If we have the wrong sort of extend for the value, then it
+	 could still be ok if we already saw a truncate that reverses
+	 the effect.  */
+      if (bitsize > prev_bitsize
+	  && TYPE_UNSIGNED (prev_type) != initial_unsigned
+	  && min_bitsize > prev_bitsize)
+	return false;
+
+      stmt = prev_stmt;
+      code = prev_code;
+      type = prev_type;
+      bitsize = prev_bitsize;
+      min_bitsize = bitsize < min_bitsize ? bitsize : min_bitsize;
+    }
+
+  return true;
+}
+
 /* Process a single gimple statement STMT, which is found at the
    iterator GSI and has a either a PLUS_EXPR or a MINUS_EXPR as its
    rhs (given by CODE), and try to convert it into a
@@ -2098,6 +2170,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
   tree type, type1, type2;
   tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
+  tree tmp, mult_rhs;
   enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
   optab this_optab;
   enum tree_code wmult_code;
@@ -2117,22 +2190,32 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   rhs1 = gimple_assign_rhs1 (stmt);
   rhs2 = gimple_assign_rhs2 (stmt);
 
-  if (TREE_CODE (rhs1) == SSA_NAME)
+  for (tmp = rhs1, rhs1_code = ERROR_MARK;
+       TREE_CODE (tmp) == SSA_NAME
+       && (CONVERT_EXPR_CODE_P (rhs1_code) || rhs1_code == ERROR_MARK);
+       tmp = gimple_assign_rhs1 (rhs1_stmt))
     {
-      rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
-      if (is_gimple_assign (rhs1_stmt))
-	rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+      rhs1_stmt = SSA_NAME_DEF_STMT (tmp);
+      if (!is_gimple_assign (rhs1_stmt))
+	break;
+      rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
     }
-  else
+
+  if (TREE_CODE (tmp) != SSA_NAME)
     return false;
 
-  if (TREE_CODE (rhs2) == SSA_NAME)
+  for (tmp = rhs2, rhs2_code = ERROR_MARK;
+       TREE_CODE (tmp) == SSA_NAME
+       && (CONVERT_EXPR_CODE_P (rhs2_code) || rhs2_code == ERROR_MARK);
+       tmp = gimple_assign_rhs1 (rhs2_stmt))
     {
-      rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
-      if (is_gimple_assign (rhs2_stmt))
-	rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+      rhs2_stmt = SSA_NAME_DEF_STMT (tmp);
+      if (!is_gimple_assign (rhs2_stmt))
+	break;
+      rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
     }
-  else
+
+  if (TREE_CODE (tmp) != SSA_NAME)
     return false;
 
   if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
@@ -2140,6 +2223,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
       if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
 			       &type2, &mult_rhs2))
 	return false;
+      mult_rhs = rhs1;
       add_rhs = rhs2;
     }
   else if (rhs2_code == MULT_EXPR)
@@ -2147,6 +2231,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
       if (!is_widening_mult_p (rhs2_stmt, &type1, &mult_rhs1,
 			       &type2, &mult_rhs2))
 	return false;
+      mult_rhs = rhs2;
       add_rhs = rhs1;
     }
   else if (code == PLUS_EXPR && rhs1_code == WIDEN_MULT_EXPR)
@@ -2155,6 +2240,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
       mult_rhs2 = gimple_assign_rhs2 (rhs1_stmt);
       type1 = TREE_TYPE (mult_rhs1);
       type2 = TREE_TYPE (mult_rhs2);
+      mult_rhs = rhs1;
       add_rhs = rhs2;
     }
   else if (rhs2_code == WIDEN_MULT_EXPR)
@@ -2163,6 +2249,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
       mult_rhs2 = gimple_assign_rhs2 (rhs2_stmt);
       type1 = TREE_TYPE (mult_rhs1);
       type2 = TREE_TYPE (mult_rhs2);
+      mult_rhs = rhs2;
       add_rhs = rhs1;
     }
   else
@@ -2171,6 +2258,11 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
     return false;
 
+  /* Verify that the convertions between the mult and the add doesn't do
+     anything unexpected.  */
+  if (!valid_types_for_madd_p (type1, type2, mult_rhs))
+    return false;
+
   /* Verify that the machine can perform a widening multiply
      accumulate in this mode/signedness combination, otherwise
      this transformation is likely to pessimize code.  */

Follow-Ups:
- Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
  - From: Richard Guenther

References:
- Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
  - From: Stubbs, Andrew
- Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
  - From: Richard Guenther

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]