This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] Fix PR42108
- From: Richard Biener <rguenther at suse dot de>
- To: gcc-patches at gcc dot gnu dot org
- Cc: fortran at gcc dot gnu dot org
- Date: Thu, 11 Dec 2014 11:05:13 +0100 (CET)
- Subject: [PATCH] Fix PR42108
- Authentication-results: sourceware.org; auth=none
The following patch fixes the performance regression in PR42108
by allowing PRE and LIM to see the division (to - from) / step
in translating do loops executed unconditionally. This makes
them not care for the fact that step might be zero and thus
the division might trap.
This makes the runtime of the testcase improve from 10.7s to
8s (same as gfortran 4.3).
The caveat is that iff the loop is not executed (to < from
for positive step for example) then there will be an additional
executed division computing the unused countm1.
Bootstrap and regtest running on x86_64-unknown-linux-gnu, ok
for trunk?
Thanks,
Richard.
2014-12-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/42108
* trans-stmt.c (gfc_trans_do): Execute the division computing
countm1 before the loop entry check.
* gfortran.dg/pr42108.f90: Amend.
Index: gcc/fortran/trans-stmt.c
===================================================================
--- gcc/fortran/trans-stmt.c (revision 218515)
+++ gcc/fortran/trans-stmt.c (working copy)
@@ -1645,15 +1645,15 @@ gfc_trans_do (gfc_code * code, tree exit
This code is executed before we enter the loop body. We generate:
if (step > 0)
{
+ countm1 = (to - from) / step;
if (to < from)
goto exit_label;
- countm1 = (to - from) / step;
}
else
{
+ countm1 = (from - to) / -step;
if (to > from)
goto exit_label;
- countm1 = (from - to) / -step;
}
*/
@@ -1675,11 +1675,12 @@ gfc_trans_do (gfc_code * code, tree exit
fold_build2_loc (loc, MINUS_EXPR, utype,
tou, fromu),
stepu);
- pos = fold_build3_loc (loc, COND_EXPR, void_type_node, tmp,
- fold_build1_loc (loc, GOTO_EXPR, void_type_node,
- exit_label),
- fold_build2 (MODIFY_EXPR, void_type_node,
- countm1, tmp2));
+ pos = build2 (COMPOUND_EXPR, void_type_node,
+ fold_build2 (MODIFY_EXPR, void_type_node,
+ countm1, tmp2),
+ build3_loc (loc, COND_EXPR, void_type_node, tmp,
+ build1_loc (loc, GOTO_EXPR, void_type_node,
+ exit_label), NULL_TREE));
/* For a negative step, when to > from, exit, otherwise compute
countm1 = ((unsigned)from - (unsigned)to) / -(unsigned)step */
@@ -1688,11 +1689,12 @@ gfc_trans_do (gfc_code * code, tree exit
fold_build2_loc (loc, MINUS_EXPR, utype,
fromu, tou),
fold_build1_loc (loc, NEGATE_EXPR, utype, stepu));
- neg = fold_build3_loc (loc, COND_EXPR, void_type_node, tmp,
- fold_build1_loc (loc, GOTO_EXPR, void_type_node,
- exit_label),
- fold_build2 (MODIFY_EXPR, void_type_node,
- countm1, tmp2));
+ neg = build2 (COMPOUND_EXPR, void_type_node,
+ fold_build2 (MODIFY_EXPR, void_type_node,
+ countm1, tmp2),
+ build3_loc (loc, COND_EXPR, void_type_node, tmp,
+ build1_loc (loc, GOTO_EXPR, void_type_node,
+ exit_label), NULL_TREE));
tmp = fold_build2_loc (loc, LT_EXPR, boolean_type_node, step,
build_int_cst (TREE_TYPE (step), 0));
Index: gcc/testsuite/gfortran.dg/pr42108.f90
===================================================================
--- gcc/testsuite/gfortran.dg/pr42108.f90 (revision 218584)
+++ gcc/testsuite/gfortran.dg/pr42108.f90 (working copy)
@@ -1,5 +1,5 @@
! { dg-do compile }
-! { dg-options "-O2 -fdump-tree-fre1" }
+! { dg-options "-O2 -fdump-tree-fre1 -fdump-tree-pre-details" }
subroutine eval(foo1,foo2,foo3,foo4,x,n,nnd)
implicit real*8 (a-h,o-z)
@@ -21,7 +21,9 @@ subroutine eval(foo1,foo2,foo3,foo4,x,n
end do
end subroutine eval
+! We should have hoisted the division
+! { dg-final { scan-tree-dump "in all uses of countm1\[^\n\]* / " "pre" } }
! There should be only one load from n left
-
! { dg-final { scan-tree-dump-times "\\*n_" 1 "fre1" } }
! { dg-final { cleanup-tree-dump "fre1" } }
+! { dg-final { cleanup-tree-dump "pre" } }