PR43902 patch: Widening multiply-accumulate

Wed Jun 23 09:54:00 GMT 2010

On Wed, Jun 23, 2010 at 1:25 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> Here's a patch to fix most of PR43902, which is about missing support
> for multiply-accumulate instructions on MIPS.  Jim Wilson did most of
> the work on this patch, adding a new optimization in the
> optimize_widening_multiply pass; I've slightly modified it to add
> support for ternary gimple statements, as well as adding ARM bits.
> There's some history and discussion in the PR.
>
> Most passes probably don't need to handle ternary gimple statements
> (tree-ssa-math-opts runs quite late), so I've provided some wrappers
> around frequently used functions so that passes can for now continue to
> use the simpler interface.
>
> I've tried for a while to convert DOT_PROD_EXPR to use this new
> infrastructure, but it took me rather far down into the vectorizer and I
> gave up.  It's probably something the vectorizer maintainers should look
> into.
>
> Bootstrapped and regression tested on i686-linux.  Ok?

+/* Widening multiply-accumulate.
+   The first two arguments are of type t1.
+   The third argument and the result are of type t2, such that t2 is at least
+   twice the size of t1.  This is equivalent to a WIDEN_MULT_EXPR operation
+   followed by an add or subtract.  */
+DEFTREECODE (WIDEN_MULT_PLUS_EXPR, "widen_mult_plus_expr", tcc_expression, 3)
+/* This is like the above, except in the final expression the multiply result
+   is subtracted from t3.  */
+DEFTREECODE (WIDEN_MULT_MINUS_EXPR, "widen_mult_minus_expr", tcc_expression, 3)

So it computes (op0 * op1) +- op2?  Please adjust the comment
to say which operands are multiplied and which is added/subtracted.
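
For reference, here is how I read the intended semantics, written as
plain C. This is only a sketch: the function names are made up for
illustration, t1 is assumed to be a 32-bit integer and t2 a 64-bit
integer, and the subtract form follows the tree.def comment (the product
is subtracted from the third operand).

```c
#include <stdint.h>

/* Illustrative model of WIDEN_MULT_PLUS_EXPR with t1 = int32_t and
   t2 = int64_t: widen both factors, multiply, then add op2.
   The function names are hypothetical, not part of the patch.  */
int64_t
widen_mult_plus (int32_t op0, int32_t op1, int64_t op2)
{
  return (int64_t) op0 * (int64_t) op1 + op2;
}

/* Illustrative model of WIDEN_MULT_MINUS_EXPR: the widened product
   is subtracted from op2, as the tree.def comment describes.  */
int64_t
widen_mult_minus (int32_t op0, int32_t op1, int64_t op2)
{
  return op2 - (int64_t) op0 * (int64_t) op1;
}
```

The point of the widening is that the product is formed in the wide
type, so 65536 * 65536 does not overflow even though both factors are
32-bit.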

+    case WIDEN_MULT_PLUS_EXPR:
+    case WIDEN_MULT_MINUS_EXPR:
+      if ((!INTEGRAL_TYPE_P (rhs1_type)
+	   && !FIXED_POINT_TYPE_P (rhs1_type)
+	   && !(TREE_CODE (rhs1_type) == VECTOR_TYPE
+		&& INTEGRAL_TYPE_P (TREE_TYPE (rhs1_type))))
+	  || !useless_type_conversion_p (rhs1_type, rhs2_type)
+	  || !useless_type_conversion_p (lhs_type, rhs3_type)
+	  || 2 * TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (lhs_type)
+	  || TYPE_PRECISION (rhs1_type) != TYPE_PRECISION (rhs2_type))

So this restricts the operands to integral or fixed-point types (or
vectors of integral types).  Can you document it as such in the comment
in tree.def?
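
To spell out what the condition accepts, here is a small standalone
model in C. It is a sketch only: the struct, enum and function names are
invented for illustration, and useless_type_conversion_p is approximated
by exact equality of kind and precision, which is stricter and simpler
than the real predicate. Note that, as I read the check, the result must
be exactly twice as wide as the factors, not "at least" twice as the
comment says.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a GCC type.  */
enum type_kind
{
  KIND_INTEGRAL,
  KIND_FIXED_POINT,
  KIND_INTEGRAL_VECTOR,
  KIND_FLOAT
};

struct type_model
{
  enum type_kind kind;
  unsigned precision;
};

/* Assumption: model useless_type_conversion_p as exact equality.  */
static bool
types_compatible (struct type_model a, struct type_model b)
{
  return a.kind == b.kind && a.precision == b.precision;
}

/* Return true if the operand types would pass the check quoted above.  */
bool
widen_mult_acc_types_ok (struct type_model lhs, struct type_model rhs1,
                         struct type_model rhs2, struct type_model rhs3)
{
  /* Only integral, fixed-point, or integral-vector factors.  */
  if (rhs1.kind != KIND_INTEGRAL
      && rhs1.kind != KIND_FIXED_POINT
      && rhs1.kind != KIND_INTEGRAL_VECTOR)
    return false;
  /* The two factors must agree, as must accumulator and result.  */
  if (!types_compatible (rhs1, rhs2) || !types_compatible (lhs, rhs3))
    return false;
  /* The result must be exactly twice as wide as the factors.  */
  return 2 * rhs1.precision == lhs.precision
         && rhs1.precision == rhs2.precision;
}
```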

Your support for ternary gimple is far from complete - I'm not sure
we want to have this half-supported state (though I guess I don't
care too much, and I definitely like that we are starting on it rather
than using more single RHSs).

I am going to work on FP MAC detection at some point, which
will happen before the vectorizer, so I guess I can fix up some more
places then.

Can you adjust gimple.texi for the new RHS type?

Thanks,
Richard.