This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Speedup loop header copying [part of PR 46590]

From: Michael Matz <matz at suse dot de>
To: gcc-patches at gcc dot gnu dot org
Date: Sun, 2 Sep 2012 21:35:10 +0200 (CEST)
Subject: Speedup loop header copying [part of PR 46590]

Hi,

as the bug report tells us one speed problem is loop header copying, in 
particular the update_ssa call that is done for each and every copied loop 
header but touches all blocks in a function.

Now, one idea was to use an optimized update_ssa that works only on the 
relevant subset of blocks (it's dominance frontiers that are the problem).  
I've experimented with the original formulation of frontiers as per Cytron 
which allows to calculate the domfrontier of one basic block lazily.  The 
end result was none, no speedup, no slowdown.  I haven't investigated, 
but I guess the problem is that too often most of the blocks are relevant 
for most of the header copies, either because of virtual ops or because 
calculating the domfrontier of a block needs all domfrontiers of all 
dominated childs (i.e. the domfrontier of the entry needs domfrontiers of 
all blocks).  So, no cake there.

But actually there's no reason that we need to keep SSA form uptodate 
during the multiple header copyings.  We use gimple_duplicate_sese_region 
to do the copying which updates the SSA web before returning (actually 
loop header copying is the only caller of it).  The next thing done is 
just another call to gimple_duplicate_sese_region to copy some other BBs, 
then some split edges, repeat from start.  We can just as well defer the 
whole SSA web updating to after we've duplicated everything we want.

That's what this patch does.  Time for various things on the testcase 
(with -O1):

                         without     with patch
tree SSA rewrite         26.2s         4.8s
tree SSA incremental     21.7s         4.6s
dominance computation    15.0s         4.2s
dominance frontiers      25.6s         6.7s
TOTAL                   135.6s        67.8s

Regstrapped on x86_64-linux, no regressions.  Okay for trunk?


Ciao,
Michael.
------------------
	PR tree-optimization/46590

	* tree-cfg.c (gimple_duplicate_sese_region): Don't update
	SSA web here ...
	* tree-ssa-loop-ch.c (copy_loop_headers): ... but here.

Index: tree-cfg.c
===================================================================
--- tree-cfg.c	(revision 190803)
+++ tree-cfg.c	(working copy)
@@ -5530,9 +5530,10 @@ add_phi_args_after_copy (basic_block *re
    important exit edge EXIT.  By important we mean that no SSA name defined
    inside region is live over the other exit edges of the region.  All entry
    edges to the region must go to ENTRY->dest.  The edge ENTRY is redirected
-   to the duplicate of the region.  SSA form, dominance and loop information
-   is updated.  The new basic blocks are stored to REGION_COPY in the same
-   order as they had in REGION, provided that REGION_COPY is not NULL.
+   to the duplicate of the region.  Dominance and loop information is
+   updated, but not the SSA web.  The new basic blocks are stored to
+   REGION_COPY in the same order as they had in REGION, provided that
+   REGION_COPY is not NULL.
    The function returns false if it is unable to copy the region,
    true otherwise.  */
 
@@ -5593,8 +5594,6 @@ gimple_duplicate_sese_region (edge entry
       free_region_copy = true;
     }
 
-  gcc_assert (!need_ssa_update_p (cfun));
-
   /* Record blocks outside the region that are dominated by something
      inside.  */
   doms = NULL;
@@ -5663,9 +5662,6 @@ gimple_duplicate_sese_region (edge entry
   /* Add the other PHI node arguments.  */
   add_phi_args_after_copy (region_copy, n_region, NULL);
 
-  /* Update the SSA web.  */
-  update_ssa (TODO_update_ssa);
-
   if (free_region_copy)
     free (region_copy);
 
Index: tree-ssa-loop-ch.c
===================================================================
--- tree-ssa-loop-ch.c	(revision 190803)
+++ tree-ssa-loop-ch.c	(working copy)
@@ -241,6 +241,7 @@ copy_loop_headers (void)
       split_edge (loop_latch_edge (loop));
     }
 
+  update_ssa (TODO_update_ssa);
   free (bbs);
   free (copied_bbs);

Follow-Ups:
- Re: Speedup loop header copying [part of PR 46590]
  - From: Richard Guenther

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]