This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH]: Add -fno-tree-reassoc to "fix" PR target/27855


Hello!

The problem with the current reassociation pass is that it produces significantly slower code for various tight matrix multiplication loops. One example is taken from PR target/27855; I tested the code that current gcc mainline produces for it.

The tests were performed on a core2 in 64-bit mode, using the flags '-DREPS=10000 -O3 -msse3 -march=core2 -ffast-math', with and without the newly introduced -fno-tree-reassoc flag.

The results were _interesting_, showing extreme differences in run times:

w/o -fno-tree-reassoc:

BUILD          ALGORITHM     NB   REPS        TIME      MFLOPS
=============  =========  =====  =====  ==========  ==========
-DTYPE=float   atlasmm       60  10000       2.000     2159.87
-DTYPE=double  atlasmm       60  10000       2.500     1727.89


w/ -fno-tree-reassoc:

BUILD          ALGORITHM     NB   REPS        TIME      MFLOPS
=============  =========  =====  =====  ==========  ==========
-DTYPE=float   atlasmm       60  10000       0.932     4634.90
-DTYPE=double  atlasmm       60  10000       1.520     2841.93

That is, with reassociation enabled, throughput drops by more than 50% for floats and by about 40% for doubles. This is simply unacceptable, and it doesn't actually matter whether it is an RA (register allocation) failure or not. As stated in PR comment #9, increased register lifetimes and register pressure should be addressed in the out-of-ssa pass, but this functionality (if implemented at all) is not effective in this particular case.

To work around the problems introduced by the reassociation pass, I propose a new compile flag that disables this optimization. IMO, gcc shouldn't degrade matrix handling code this much on a fairly new x86 target.

2007-07-09 Uros Bizjak <ubizjak@gmail.com>

       PR target/27855
       * doc/extend.texi: Add ftree-reassoc flag.
       * common.opt (ftree-reassoc): New flag.
       * tree-ssa-reassoc.c (gate_tree_ssa_reassoc): New static function.
       (struct tree_opt_pass pass_reassoc): Use gate_tree_ssa_reassoc.

The patch was bootstrapped on x86_64-linux-gnu. OK for mainline?

Uros.

Index: common.opt
===================================================================
--- common.opt	(revision 126488)
+++ common.opt	(working copy)
@@ -1063,6 +1063,10 @@ ftree-pre
 Common Report Var(flag_tree_pre) Optimization
 Enable SSA-PRE optimization on trees
 
+ftree-reassoc
+Common Report Var(flag_tree_reassoc) Init(1) Optimization
+Enable reassociation on tree level
+
 ftree-salias
 Common Report Var(flag_tree_salias) Optimization
 Perform structural alias analysis
Index: tree-ssa-reassoc.c
===================================================================
--- tree-ssa-reassoc.c	(revision 126488)
+++ tree-ssa-reassoc.c	(working copy)
@@ -1476,15 +1476,21 @@ execute_reassoc (void)
   return 0;
 }
 
+static bool
+gate_tree_ssa_reassoc (void)
+{
+  return flag_tree_reassoc != 0;
+}
+
 struct tree_opt_pass pass_reassoc =
 {
   "reassoc",				/* name */
-  NULL,				/* gate */
-  execute_reassoc,				/* execute */
+  gate_tree_ssa_reassoc,		/* gate */
+  execute_reassoc,			/* execute */
   NULL,					/* sub */
   NULL,					/* next */
   0,					/* static_pass_number */
-  TV_TREE_REASSOC,				/* tv_id */
+  TV_TREE_REASSOC,			/* tv_id */
   PROP_cfg | PROP_ssa | PROP_alias,	/* properties_required */
   0,					/* properties_provided */
   0,					/* properties_destroyed */
