This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PATCH - [RFC] regrouping of expression tree for single multiplyadd.

On Sat, 20 Mar 2004, Fariborz Jahanian wrote:
> I would like to know if this is an appropriate place for this kind of
> optimization, which may not necessarily benefit all architectures.
> Routine fold(...) seems to be used for similar transformations
> extensively.

For the form of reassociation you're attempting with this patch, fold
is currently the most appropriate place.  There are local reassociations
during RTL simplification to ensure that fmadd gets used, but for
rearranging large expressions to enable the use of multiple multiply-adds,
fold is appropriate.  Tree-ssa/gimple may/will complicate things however.

> Test Case:
> double bad(int i, double b, double c, double d, double e, double **w) {
>          return w[i][0] += b*c + d*e;
> }
> ...
> /* Convert a + (b + (c*d)) into (a + (c*d)) + b */

The problem is that this rearrangement as written, just permutes the
order of additions and may increase or decrease the availability of
multiply-adds.  Also, because there is no clear improvement, this
transformation will repermute the same expression each time fold is

The critical aspect for your test case is that there are two additions
and two multiplications that are poorly ordered.  You need to check
that "b" is a multiplication, and that "a" isn't.

You can also use reorder_operands_p instead of testing both operands
for side-effects, this should catch even more cases.  For example, in
C/C++ you can reorder these expressions even if they have side-effects!

i.e. ((a*b)+(c*d)) + e => ((a*b)+e)+(c*d)
// if flag_unsafe_math_optimizations
      && reorder_operands( c*d, e )
      && TREE_CODE (e) != MULT_EXPR
     a + ((b*c)+(d*e)) => (b*c)+(a+(d*e))
// if flag_unsafe_math_optimizations
      && reorder_operands( a, b*c )
      && TREE_CODE (a) != MULT_EXPR

Also if you move your patch one brace earlier, you won't have to
explicitly test for FLOAT_TYPE_P.  The PLUS_EXPR case in fold is
split into the integer optimizations, followed by the FP optimizations.
You should insert your changes in the "FP" section, rather than at
the very end where you have to check this again.

I do like this transformation though.  If you could make the
above changes, and report where you've bootstrapped and regression
tested your patch, there shoud be no problem getting it approved.

Even on hardware without fused-multiply-add instructions, interleaving
the multiplications and additions will often be a big win, so I'm not
concerned that "it may not necessarily benefit all architectures".


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]