This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[PATCH]: Add -fno-tree-reassoc to "fix" PR target/27855
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 09 Jul 2007 20:12:13 +0200
- Subject: [PATCH]: Add -fno-tree-reassoc to "fix" PR target/27855
Hello!
The problem with the current reassociation pass is that it produces
significantly slower code for various tight matrix multiplication loops.
One example is taken from PR target/27855, and I tested the code
produced for it by current gcc mainline.
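For readers unfamiliar with the transformation: reassociation regroups chains of associative operations. The sketch below is only an illustration of what "regrouping" means for floating point, not the code from the PR and not necessarily the grouping GCC picks. The function names are made up for this example. It also shows why such regrouping of floating-point sums is normally tied to flags like -ffast-math: the two groupings can give different results.

```c
#include <assert.h>
#include <stdio.h>

/* Evaluate the chain strictly left to right: ((a + b) + c) + d. */
double sum_in_source_order(double a, double b, double c, double d)
{
    return ((a + b) + c) + d;
}

/* One regrouping a reassociation pass might choose: (a + b) + (c + d). */
double sum_regrouped(double a, double b, double c, double d)
{
    return (a + b) + (c + d);
}

/* Hypothetical demo: returns 1 when the two groupings disagree. */
int groupings_differ(void)
{
    /* volatile keeps the compiler from folding the sums at compile time. */
    volatile double a = 1.0, b = 1e20, c = -1e20, d = 1.0;

    /* 1.0 is far below the rounding granularity of 1e20, so it is absorbed. */
    assert(a + b == b);

    double x = sum_in_source_order(a, b, c, d);
    double y = sum_regrouped(a, b, c, d);
    printf("source order: %g, regrouped: %g\n", x, y);
    return x != y;
}
```

Calling groupings_differ() from a small main() built without -ffast-math should report that the two groupings disagree for these inputs.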
The tests were performed on core2 in 64-bit mode, using the flags
'-DREPS=10000 -O3 -msse3 -march=core2 -ffast-math', with and without
the newly introduced -fno-tree-reassoc flag.
The results were _interesting_, showing extreme differences in the run
times:
w/o -fno-tree-reassoc:
                 ALGORITHM    NB   REPS        TIME      MFLOPS
                 =========  =====  =====  ==========  ==========
-DTYPE=float:      atlasmm     60  10000       2.000     2159.87
-DTYPE=double:     atlasmm     60  10000       2.500     1727.89
w/ -fno-tree-reassoc:
                 ALGORITHM    NB   REPS        TIME      MFLOPS
                 =========  =====  =====  ==========  ==========
-DTYPE=float:      atlasmm     60  10000       0.932     4634.90
-DTYPE=double:     atlasmm     60  10000       1.520     2841.93
That is, a performance hit of more than 50% for floats and about 40% for
doubles. This is simply unacceptable, and it actually doesn't matter
whether it is an RA failure or not. As stated in PR Comment #9, increased
register lifetimes and register pressure should be addressed in the
out-of-ssa pass, but this functionality (if implemented) is not effective
in this particular case.
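The percentages quoted above can be rechecked from the MFLOPS columns. This is a small sketch with a helper name invented for the purpose. It takes the loss as 1 - slow/fast, which is one reasonable reading of "performance hit".

```c
#include <assert.h>
#include <stdio.h>

/* Relative throughput loss, in percent, of `slow_mflops` versus `fast_mflops`. */
double percent_drop(double slow_mflops, double fast_mflops)
{
    assert(fast_mflops > 0.0);
    return 100.0 * (1.0 - slow_mflops / fast_mflops);
}

/* Print the loss for the float and double rows of the two tables. */
void print_drops(void)
{
    printf("float:  %.1f%%\n", percent_drop(2159.87, 4634.90));
    printf("double: %.1f%%\n", percent_drop(1727.89, 2841.93));
}
```

With the figures from the tables this works out to roughly 53% for float and 39% for double, which matches the rounded numbers in the text.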
To work around the problems introduced by the reassociation pass, I
propose a new compile flag that disables this optimization. IMO, gcc
shouldn't degrade matrix handling code so much on a fairly new x86 target.
2007-07-09 Uros Bizjak <ubizjak@gmail.com>
PR target/27855
* doc/extend.texi: Add ftree-reassoc flag.
* common.opt (ftree-reassoc): New flag.
* tree-ssa-reassoc.c (gate_tree_ssa_reassoc): New static function.
(struct tree_opt_pass pass_reassoc): Use gate_tree_ssa_reassoc.
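As a rough illustration of the gate mechanism the ChangeLog refers to, here is a self-contained toy model. None of the toy_* names exist in GCC, and the real struct tree_opt_pass and pass manager have more fields and logic. The sketch only assumes that a pass with a NULL gate always runs and that a non-NULL gate is consulted before execute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the variable that common.opt would create. */
int toy_flag_reassoc = 1;

/* Counts how often the toy pass body actually ran. */
int toy_runs = 0;

/* Toy pass descriptor; the real struct tree_opt_pass has more fields. */
struct toy_pass {
    const char *name;
    bool (*gate)(void);             /* NULL means "always run" */
    unsigned int (*execute)(void);
};

/* Gate in the style of the patch: run only when the flag is nonzero. */
static bool toy_gate(void)
{
    return toy_flag_reassoc != 0;
}

/* Placeholder pass body; it only records that it was invoked. */
static unsigned int toy_execute(void)
{
    toy_runs++;
    return 0;
}

struct toy_pass toy_reassoc = { "reassoc", toy_gate, toy_execute };

/* Run the pass only if it has no gate or its gate returns true. */
bool toy_run_pass(const struct toy_pass *pass)
{
    assert(pass != NULL && pass->execute != NULL);
    if (pass->gate != NULL && !pass->gate())
        return false;
    pass->execute();
    return true;
}
```

Setting toy_flag_reassoc to 0 before calling toy_run_pass(&toy_reassoc) plays the role of passing -fno-tree-reassoc: the gate returns false and the pass body is skipped.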
The patch was bootstrapped on x86_64-linux-gnu. OK for mainline?
Uros.
Index: common.opt
===================================================================
--- common.opt (revision 126488)
+++ common.opt (working copy)
@@ -1063,6 +1063,10 @@ ftree-pre
 Common Report Var(flag_tree_pre) Optimization
 Enable SSA-PRE optimization on trees
 
+ftree-reassoc
+Common Report Var(flag_tree_reassoc) Init(1) Optimization
+Enable reassociation on tree level
+
 ftree-salias
 Common Report Var(flag_tree_salias) Optimization
 Perform structural alias analysis
Index: tree-ssa-reassoc.c
===================================================================
--- tree-ssa-reassoc.c (revision 126488)
+++ tree-ssa-reassoc.c (working copy)
@@ -1476,15 +1476,21 @@ execute_reassoc (void)
   return 0;
 }
 
+static bool
+gate_tree_ssa_reassoc (void)
+{
+  return flag_tree_reassoc != 0;
+}
+
 struct tree_opt_pass pass_reassoc =
 {
   "reassoc", /* name */
-  NULL, /* gate */
-  execute_reassoc, /* execute */
+  gate_tree_ssa_reassoc, /* gate */
+  execute_reassoc, /* execute */
   NULL, /* sub */
   NULL, /* next */
   0, /* static_pass_number */
-  TV_TREE_REASSOC, /* tv_id */
+  TV_TREE_REASSOC, /* tv_id */
   PROP_cfg | PROP_ssa | PROP_alias, /* properties_required */
   0, /* properties_provided */
   0, /* properties_destroyed */