This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[patch] tree-optimize.c: Move pass_merge_phi before the firstpass_dominator.
- From: Kazu Hirata <kazu at cs dot umass dot edu>
- To: gcc-patches at gcc dot gnu dot org
- Cc: dnovillo at redhat dot com, law at redhat dot com
- Date: Thu, 12 May 2005 23:22:45 -0400 (EDT)
- Subject: [patch] tree-optimize.c: Move pass_merge_phi before the firstpass_dominator.
Hi,
Attached is a patch to move pass_merge_phi before the first
pass_dominator.
DOM attempts to thread each edge going into a basic block BB that ends
with a COND_EXPR or a SWITCH_EXPR. That means that the more incoming
edges BB has, the more edges we can thread. There are cases where DOM
cannot thread jumps even with its iterations if we do not run
merge_phi. For example, consider
a_1 = PHI <0, 1> a_2 = PHI = <0, 1>
\ /
\ /
a_3 = PHI <a_1, a_2>
:
:
if (a_3 != 0) goto here; else goto there;
Suppose we are considering threading jumps through the basic block
with a_3 = PHI <...>. In this case, we cannot thread any of the
incoming edges because a_1 and a_2 are not constants. However, if we
run merge_phi in advance, we transform the above CFG into
a_3 = PHI <0, 1, 0, 1>
:
:
if (a_3 != 0) goto here; else goto there;
in which case DOM will thread every incoming edge in its first
iteration (subject to pruning in tree-ssa-threadupdate.c).
Running merge_phi is also helpful to get most out of the first
iteration of DOM, especially if we might later consider disabling its
iterations to improve compile-time performance. For example, consider
a_1 = PHI <0, 1> ...
\ /
\ /
a_3 = PHI <a_1, 1>
:
:
if (a_3 != 0) goto here; else goto there;
DOM will thread the incoming edge from "..." in its first iteration,
but it does not thread the other incoming edge because a_1 is not
constant. However, if we run merge_phi in advance, we transform the
above CFG into
a_3 = PHI <0, 1, 1>
:
:
if (a_3 != 0) goto here; else goto there;
in which case DOM will thread every incoming edge in its first
iteration (subject to pruning in tree-ssa-threadupdate.c).
I've seen both of these cases many times in tree dumps from real world
code although I don't have exact numbers.
Here are some numbers while compiling cc1-i files.
original patched diff
--------------------------------
DOM1 27774 29937 +7.787%
DOM2 1501 1098 -26.848%
DOM3 2785 2461 -11.633%
BYPASS 1593 1590 -0.188%
Total 33653 35086 +4.258%
Time 237.977 236.800 -0.494% (today)
Time 235.881 236.298 +0.176% (yesterday)
DOM1, DOM2, DOM3, and BYPASS are the number of jumps that are threaded
in corresponding passes. Total are the total number of jumps that are
threaded. Time is the time it took to compile cc1-i files. The
compile time seems to be unaffected. When I tried this patch
yesterday, I got a slight slowdown of 0.176%. Today, I got a slight
speed up of 0.494%. They might be well within errors.
Here are some numbers for MICO files.
original patched diff
--------------------------------
DOM1 7091 9023 +27.245%
DOM2 389 398 +2.313%
DOM3 270 227 -15.925%
BYPASS 70 69 -1.428%
Total 7820 9717 +24.258%
Time 274.196 274.778 +0.212% (real)
Time 263.794 266.897 +0.039% (user)
Tested on i686-pc-linux-gnu. OK to apply?
Kazu Hirata
2005-05-12 Kazu Hirata <kazu@cs.umass.edu>
* tree-optimize.c (init_tree_optimization_passes): Move
pass_merge_phi before the first pass_dominator.
Index: tree-optimize.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-optimize.c,v
retrieving revision 2.90
diff -u -d -p -r2.90 tree-optimize.c
--- tree-optimize.c 11 May 2005 02:24:42 -0000 2.90
+++ tree-optimize.c 11 May 2005 04:58:38 -0000
@@ -373,9 +373,9 @@ init_tree_optimization_passes (void)
NEXT_PASS (pass_vrp);
NEXT_PASS (pass_copy_prop);
NEXT_PASS (pass_dce);
+ NEXT_PASS (pass_merge_phi);
NEXT_PASS (pass_dominator);
- NEXT_PASS (pass_merge_phi);
NEXT_PASS (pass_phiopt);
NEXT_PASS (pass_may_alias);
NEXT_PASS (pass_tail_recursion);