This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

PATCH - [tree-ssa] regrouping of expression tree for single multiply add.


This is the revised patch; incorporating Roger's comments. This patch
has been bootstrapped, dejagnu tested on ppc-darwin. It has also been
tested against SPEC2000 with IMA on ppc-darwin. I am proposing this patch for
tree-ssa because it is an 'improvement' patch and I believe mainline is
frozen for improvement patches.


OK for tree-ssa?

- Fariborz Jahanian (fjahanian@apple.com)

ChangeLog.tree-ssa

2004-03-17 Fariborz Jahanian <fjahanian@apple.com>

        * fold-const.c (fold): reassociate multiply expression
        with an adjacent non-multiply expression to use
        architecture's multiply-add instruction.



Attachment: tree-ssa-fmadd-patch.txt
Description: Text document





On Mar 20, 2004, at 11:25 AM, Roger Sayle wrote:


On Sat, 20 Mar 2004, Fariborz Jahanian wrote:
I would like to know if this is an appropriate place for this kind of
optimization, which may not necessarily benefit all architectures.
Routine fold(...) seems to be used for similar transformations
extensively.

For the form of reassociation you're attempting with this patch, fold
is currently the most appropriate place. There are local reassociations
during RTL simplification to ensure that fmadd gets used, but for
rearranging large expressions to enable the use of multiple multiply-adds,
fold is appropriate. Tree-ssa/gimple may/will complicate things however.



Test Case:
double bad(int i, double b, double c, double d, double e, double **w) {
return w[i][0] += b*c + d*e;
}
...
/* Convert a + (b + (c*d)) into (a + (c*d)) + b */

The problem is that this rearrangement as written, just permutes the order of additions and may increase or decrease the availability of multiply-adds. Also, because there is no clear improvement, this transformation will repermute the same expression each time fold is called.

The critical aspect for your test case is that there are two additions
and two multiplications that are poorly ordered.  You need to check
that "b" is a multiplication, and that "a" isn't.

You can also use reorder_operands_p instead of testing both operands
for side-effects, this should catch even more cases.  For example, in
C/C++ you can reorder these expressions even if they have side-effects!


i.e. ((a*b)+(c*d)) + e => ((a*b)+e)+(c*d) // if flag_unsafe_math_optimizations && reorder_operands( c*d, e ) && TREE_CODE (e) != MULT_EXPR and a + ((b*c)+(d*e)) => (b*c)+(a+(d*e)) // if flag_unsafe_math_optimizations && reorder_operands( a, b*c ) && TREE_CODE (a) != MULT_EXPR


Also if you move your patch one brace earlier, you won't have to explicitly test for FLOAT_TYPE_P. The PLUS_EXPR case in fold is split into the integer optimizations, followed by the FP optimizations. You should insert your changes in the "FP" section, rather than at the very end where you have to check this again.



I do like this transformation though.  If you could make the
above changes, and report where you've bootstrapped and regression
tested your patch, there shoud be no problem getting it approved.

Even on hardware without fused-multiply-add instructions, interleaving
the multiplications and additions will often be a big win, so I'm not
concerned that "it may not necessarily benefit all architectures".

Roger
--



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]