This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[RFC/RFA] PR/26830 part 2, delay SSA updating in loop header copying


Hello,

this patch aims at fixing the compile-time-hog part of PR/26830. Here we have a BB with several thousand PHIs, each with a thousand arguments. We also have more than 500 loops whose headers we can copy, and each time a header is copied, tree_duplicate_sese_region calls update_ssa, which walks the PHIs to rewrite them.

Zdenek originally wrote special code to update the SSA form, but it was subsequently removed as part of the tree-cleanup-branch merge. It seems to me, however, that tree_duplicate_sese_region is only concerned with the CFG (except for add_phi_args_after_copy, but this function mostly looks at the CFG as well to get the PHI arguments). If this is true, we can completely delay the SSA update until the end of the loop header copying pass. This is what the patch does.
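
To make the cost argument concrete, here is a toy model in C. It is not GCC code: struct toy_cfg and all the toy_* functions are hypothetical names, and the model assumes that one SSA update visits every PHI argument of the big block exactly once and that the block does not grow as headers are copied. Under those assumptions it counts how many PHI arguments get walked when the update runs after every copy versus once at the end of the pass.

```c
#include <stddef.h>

/* Toy model only, not GCC code.  All names here are hypothetical.  */
struct toy_cfg
{
  unsigned long n_phis;        /* PHI nodes in the large block.  */
  unsigned long args_per_phi;  /* Arguments carried by each PHI.  */
};

/* Assumed cost of one full update: every PHI argument is visited once.  */
static unsigned long
toy_update_ssa (const struct toy_cfg *cfg)
{
  return cfg->n_phis * cfg->args_per_phi;
}

/* Eager scheme: the update runs after each copied header.  */
unsigned long
toy_copy_headers_eager (const struct toy_cfg *cfg, unsigned n_loops)
{
  unsigned long visited = 0;
  unsigned i;

  for (i = 0; i < n_loops; i++)
    visited += toy_update_ssa (cfg);

  return visited;
}

/* Delayed scheme: count the copies and update once if there were any.  */
unsigned long
toy_copy_headers_delayed (const struct toy_cfg *cfg, unsigned n_loops)
{
  unsigned n_copied = 0;
  unsigned i;

  for (i = 0; i < n_loops; i++)
    n_copied++;  /* Stands in for a successful header duplication.  */

  return n_copied ? toy_update_ssa (cfg) : 0;
}
```

With illustrative values of 2000 PHIs, 1000 arguments each and 500 loops, the eager count is 500 times the delayed one. The real update_ssa cost depends on which names are marked for renaming, so this only sketches the shape of the problem.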

With it, compile times for PR/26830 are back to sane values (twice as slow as 4.0, but that's expected because salias is still creating all those memory tags). Memory use still rockets up to 1.4 gigabytes, though, and that's also an aliasing problem as far as I can tell.

I don't have any confidence in the patch, though, and I'm only submitting it because it survived bootstrap (regtest is running) and because I cannot afford to sit and write the manual SSA update code. Zdenek, do you remember why you didn't do the update just once?

Ok for mainline if Zdenek confirms it's not a problem and if it passes regtesting (and maybe SPEC)?

Paolo

2006-03-30  Paolo Bonzini  <bonzini@gnu.org>

	PR tree-optimization/26830

	* tree-cfg.c (tree_duplicate_sese_region): Do not update SSA.
	* tree-ssa-loop-ch.c (copy_loop_headers): Count successfully duplicated
	headers and, if there was any, update SSA at the end.

Index: tree-cfg.c
===================================================================
--- tree-cfg.c	(revision 112529)
+++ tree-cfg.c	(working copy)
@@ -4416,10 +4416,10 @@ add_phi_args_after_copy (basic_block *re
    important exit edge EXIT.  By important we mean that no SSA name defined
    inside region is live over the other exit edges of the region.  All entry
    edges to the region must go to ENTRY->dest.  The edge ENTRY is redirected
-   to the duplicate of the region.  SSA form, dominance and loop information
-   is updated.  The new basic blocks are stored to REGION_COPY in the same
-   order as they had in REGION, provided that REGION_COPY is not NULL.
-   The function returns false if it is unable to copy the region,
+   to the duplicate of the region.  SSA form is not updated, but dominance
+   and loop information is.  The new basic blocks are stored to REGION_COPY
+   in the same order as they had in REGION, provided that REGION_COPY is not
+   NULL.  The function returns false if it is unable to copy the region,
    true otherwise.  */
 
 bool
@@ -4479,7 +4479,6 @@ tree_duplicate_sese_region (edge entry, 
       free_region_copy = true;
     }
 
-  gcc_assert (!need_ssa_update_p ());
 
   /* Record blocks outside the region that are dominated by something
      inside.  */
@@ -4549,9 +4549,6 @@ tree_duplicate_sese_region (edge entry, 
   /* Add the other PHI node arguments.  */
   add_phi_args_after_copy (region_copy, n_region);
 
-  /* Update the SSA web.  */
-  update_ssa (TODO_update_ssa);
-
   if (free_region_copy)
     free (region_copy);
 
Index: tree-ssa-loop-ch.c
===================================================================
--- tree-ssa-loop-ch.c	(revision 112529)
+++ tree-ssa-loop-ch.c	(working copy)
@@ -129,7 +129,7 @@ copy_loop_headers (void)
   basic_block header;
   edge exit, entry;
   basic_block *bbs, *copied_bbs;
-  unsigned n_bbs;
+  unsigned n_bbs, n_copied;
   unsigned bbs_size;
 
   loops = loop_optimizer_init (LOOPS_HAVE_PREHEADERS
@@ -145,7 +145,7 @@ copy_loop_headers (void)
   copied_bbs = XNEWVEC (basic_block, n_basic_blocks);
   bbs_size = n_basic_blocks;
 
-  for (i = 1; i < loops->num; i++)
+  for (n_copied = 0, i = 1; i < loops->num; i++)
     {
       /* Copy at most 20 insns.  */
       int limit = 20;
@@ -198,7 +198,9 @@ copy_loop_headers (void)
 
       entry = loop_preheader_edge (loop);
 
-      if (!tree_duplicate_sese_region (entry, exit, bbs, n_bbs, copied_bbs))
+      if (tree_duplicate_sese_region (entry, exit, bbs, n_bbs, copied_bbs))
+	n_copied++;
+      else
 	{
 	  fprintf (dump_file, "Duplication failed.\n");
 	  continue;
@@ -210,6 +212,9 @@ copy_loop_headers (void)
       loop_split_edge_with (loop_latch_edge (loop), NULL);
     }
 
+  if (n_copied)
+    update_ssa (TODO_update_ssa);
+
   free (bbs);
   free (copied_bbs);
 
