This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
PR 36599
- From: Jack Howarth <howarth at bromo dot msbb dot uc dot edu>
- To: rguenther at suse dot de
- Cc: gcc at gcc dot gnu dot org
- Date: Sun, 22 Jun 2008 20:53:33 -0400
- Subject: PR 36599
Richard,
There is a regression in the induct polyhedron benchmark execution
when gfortran compiled with -ffast-math -O3 introduced with gcc 4.3
that isn't present in gcc 4.2.4.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36599
According to the SuSe polyhedron benchmark servers, the bottleneck
in the induct benchmark was eliminated between 2008-04-27 and
2008-04-28 in gcc trunk. My guess is that your changes...
r134730 | rguenth | 2008-04-27 12:27:08 -0400 (Sun, 27 Apr 2008) | 42 lines
2008-04-27 Richard Guenther <rguenther@suse.de>
PR tree-optimization/18754
PR tree-optimization/34223
* tree-pass.h (pass_complete_unrolli): Declare.
* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Print
loop size before and after unconditionally of UL_NO_GROWTH in effect.
Rewrite loop into loop closed SSA form if it is not already.
(tree_unroll_loops_completely): Re-structure to iterate over
innermost loops with intermediate CFG cleanups.
Unroll outermost loops only if requested or the code does not grow
doing so.
* tree-ssa-loop.c (gate_tree_vectorize): Don't shortcut if no
loops are available.
(tree_vectorize): Instead do so here.
(tree_complete_unroll): Also unroll outermost loops.
(tree_complete_unroll_inner): New function.
(gate_tree_complete_unroll_inner): Likewise.
(pass_complete_unrolli): New pass.
* tree-ssa-loop-manip.c (find_uses_to_rename_use): Only record
uses outside of the loop.
(tree_duplicate_loop_to_header_edge): Only verify loop-closed SSA
form if it is available.
* tree-flow.h (tree_unroll_loops_completely): Add extra parameter.
* passes.c (init_optimization_passes): Schedule complete inner
loop unrolling pass before the first CCP pass after final inlining.
* gcc.dg/tree-ssa/loop-36.c: New testcase.
* gcc.dg/tree-ssa/loop-37.c: Likewise.
* gcc.dg/vect/vect-118.c: Likewise.
* gcc.dg/Wunreachable-8.c: XFAIL bogus warning.
* gcc.dg/vect/vect-66.c: Increase loop trip count.
* gcc.dg/vect/no-section-anchors-vect-66.c: Likewise.
* gcc.dg/vect/no-section-anchors-vect-69.c: Likewise.
* gcc.dg/vect/vect-76.c: Likewise.
* gcc.dg/vect/vect-outer-6.c: Likewise.
* gcc.dg/vect/vect-outer-1.c: Likewise.
* gcc.dg/vect/vect-outer-1a.c: Likewise.
* gcc.dg/vect/vect-11a.c: Likewise.
* gcc.dg/vect/vect-shift-1.c: Likewise.
* gcc.target/i386/vectorize1.c: Likewise.
...solved this performance regression. Is it possible that these changes could
be backported to gcc 4.3.2 to eliminate the performance regression in gcc 4.3
compared to gcc 4.2?
Jack