This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Re-organize initial optimization passes


Hi Diego,

> It's modelled after the pipeline as it existed in TCB but without
> the replication of CCP/FRE/COPY-PROP that was causing some serious
> compile time slowdowns in TCB.

Really?  I just tried the attached patch, which limits DOM to a single
iteration and does pretty much everything that DOM is not good at
before running DOM.  It turns out that this patch speeds up the
compiler.
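
To make the effect of the one-iteration limit concrete, here is a minimal sketch of a do/while loop shaped like the one in tree_ssa_dominator_optimize.  It is not the GCC code: run_dom_like_pass, walk_fn and countdown_walk are hypothetical names made up for illustration, and the "walk" is a toy that just reports whether the CFG changed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one dominator walk.  Returns true if the
   walk altered the CFG (e.g. by threading a jump).  */
typedef bool (*walk_fn) (void *state);

/* Run WALK at least once.  If ITERATE is true, keep going while the
   previous walk altered the CFG; if ITERATE is false, the loop
   condition is always false, so the body runs exactly once.  Returns
   the number of walks performed.  */
static int
run_dom_like_pass (walk_fn walk, void *state, bool iterate)
{
  int walks = 0;
  bool cfg_altered;

  assert (walk);

  do
    {
      cfg_altered = walk (state);
      walks++;
    }
  while (iterate && cfg_altered);

  return walks;
}

/* Toy walk: pretends the CFG changes until the counter reaches zero.  */
static bool
countdown_walk (void *state)
{
  int *remaining = state;

  if (*remaining > 0)
    {
      (*remaining)--;
      return true;
    }
  return false;
}
```

With a counter of 3, the iterating form performs four walks (three that change the CFG plus one that finds nothing left to do), while the non-iterating form stops after the first walk and leaves the remaining opportunities for later passes.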

Here are the numbers of jumps threaded at the tree and RTL levels and
the time taken to compile the cc1-i files.

       original patched     diff%
---------------------------------
DOM1      27774   23420  -15.676%
DOM2       1501    7995 +432.644%
DOM3       2785    3179  +14.147%
BYPASS     1593    1770  +11.111%
TOTAL     33653   36364   +8.055%
Time    235.881 232.229   -1.548%

So we see some reduction in the number of jumps threaded in DOM1, but
we make it up in DOM2.  In the end, we are threading more jumps while
improving the compile time.  Diego, is there any way you could throw
this patch at your nightly SPEC testing?
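
For reference, the diff% column is the relative change from "original" to "patched".  A small sketch of that arithmetic follows; diff_percent is a hypothetical helper name, not something from the GCC tree.

```c
#include <assert.h>

/* Relative change from ORIGINAL to PATCHED, in percent.  A negative
   result means the patched value is smaller.  */
static double
diff_percent (double original, double patched)
{
  assert (original != 0.0);
  return (patched - original) / original * 100.0;
}
```

For example, diff_percent (27774, 23420) gives the DOM1 change of roughly -15.68%, and diff_percent (235.881, 232.229) gives the compile-time change of roughly -1.55%.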

We may be able to do even better by moving store_ccp and
store_copy_prop to where ccp and copy_prop are in the second hunk of
this patch, just as TCB does.  Then we wouldn't be running ccp and
copy-prop too many times (although they are pretty fast anyway).

Another thing I've tried (but have not submitted due to the current
mainline breakage) is a patch that consists of nothing but the first
hunk of the patch below.  Then I get these numbers.

       original patched    diff%
--------------------------------
DOM1      27774   29937  +7.787%
DOM2       1501    1098 -26.848%
DOM3       2785    2461 -11.633%
BYPASS     1593    1590  -0.188%
TOTAL     33653   35086  +4.258%
Time    235.881 236.298  +0.176%

The idea is to do the vast majority of jump threading in DOM1.  Note
the significant decrease in DOM2 and DOM3.  Although I have not tried
it, it would be worth trying the patch below with DOM1's iteration
left unlimited and only DOM2's and DOM3's iterations limited.
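
A rough sketch of what such a per-instance limit could look like is given here.  This is purely illustrative: struct dom_instance, its max_walks field, run_with_cap and countdown_walk are hypothetical and do not exist in GCC, where the DOM instances currently share a single loop condition.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical description of one DOM instance in the pipeline.  */
struct dom_instance
{
  const char *name;   /* "DOM1", "DOM2", ...  */
  int max_walks;      /* INT_MAX means effectively unlimited.  */
};

/* Run WALK at least once and repeat while it keeps altering the CFG,
   but never more than INST->max_walks times.  Returns the number of
   walks performed.  */
static int
run_with_cap (const struct dom_instance *inst,
              bool (*walk) (void *), void *state)
{
  int walks = 0;
  bool cfg_altered;

  assert (inst && walk);

  do
    {
      cfg_altered = walk (state);
      walks++;
    }
  while (cfg_altered && walks < inst->max_walks);

  return walks;
}

/* Toy walk: pretends the CFG changes until the counter reaches zero.  */
static bool
countdown_walk (void *state)
{
  int *remaining = state;

  if (*remaining > 0)
    {
      (*remaining)--;
      return true;
    }
  return false;
}
```

Under this sketch, DOM1 would be described as { "DOM1", INT_MAX } and DOM2 and DOM3 as { "DOM2", 1 } and { "DOM3", 1 }, so only the first instance iterates to a fixed point.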

Anyway, I am pretty sure that we can speed up the compiler along these
lines while retaining or even improving the quality of generated code.
We may be able to do even better both in compile time and quality of
generated code by taking advantage of SSA_NAME_VALUE_RANGE, which we
don't use anywhere but VRP.

Kazu Hirata

Index: tree-optimize.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-optimize.c,v
retrieving revision 2.90
diff -u -d -p -r2.90 tree-optimize.c
--- tree-optimize.c	11 May 2005 02:24:42 -0000	2.90
+++ tree-optimize.c	12 May 2005 04:21:27 -0000
@@ -373,9 +373,9 @@ init_tree_optimization_passes (void)
   NEXT_PASS (pass_vrp);
   NEXT_PASS (pass_copy_prop);
   NEXT_PASS (pass_dce);
+  NEXT_PASS (pass_merge_phi);
   NEXT_PASS (pass_dominator);
 
-  NEXT_PASS (pass_merge_phi);
   NEXT_PASS (pass_phiopt);
   NEXT_PASS (pass_may_alias);
   NEXT_PASS (pass_tail_recursion);
@@ -388,6 +388,14 @@ init_tree_optimization_passes (void)
      pass_may_alias should be a TODO item.  */
   NEXT_PASS (pass_may_alias);
   NEXT_PASS (pass_rename_ssa_copies);
+
+  NEXT_PASS (pass_ccp);
+  NEXT_PASS (pass_fre);
+  NEXT_PASS (pass_copy_prop);
+  NEXT_PASS (pass_dce);
+  NEXT_PASS (pass_forwprop);
+  NEXT_PASS (pass_merge_phi);
+
   NEXT_PASS (pass_dominator);
   NEXT_PASS (pass_copy_prop);
   NEXT_PASS (pass_dce);
@@ -406,6 +414,14 @@ init_tree_optimization_passes (void)
   NEXT_PASS (pass_pre);
   NEXT_PASS (pass_sink_code);
   NEXT_PASS (pass_loop);
+
+  NEXT_PASS (pass_ccp);
+  NEXT_PASS (pass_fre);
+  NEXT_PASS (pass_copy_prop);
+  NEXT_PASS (pass_dce);
+  NEXT_PASS (pass_forwprop);
+  NEXT_PASS (pass_merge_phi);
+
   NEXT_PASS (pass_dominator);
   NEXT_PASS (pass_copy_prop);
   NEXT_PASS (pass_dce);
Index: tree-ssa-dom.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-dom.c,v
retrieving revision 2.110
diff -u -d -p -r2.110 tree-ssa-dom.c
--- tree-ssa-dom.c	10 May 2005 20:21:27 -0000	2.110
+++ tree-ssa-dom.c	12 May 2005 04:21:27 -0000
@@ -503,7 +503,7 @@ tree_ssa_dominator_optimize (void)
 	    SSA_NAME_VALUE (name) = NULL;
 	}
     }
-  while (optimize > 1 && cfg_altered);
+  while (0 && optimize > 1 && cfg_altered);
 
   /* Debugging dumps.  */
   if (dump_file && (dump_flags & TDF_STATS))

