This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[RFC/RFA] PR/26830 part 2, delay SSA updating in loop header copying
- From: Paolo Bonzini <paolo dot bonzini at lu dot unisi dot ch>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>, Diego Novillo <dnovillo at redhat dot com>, Zdenek Dvorak <rakdver at atrey dot karlin dot mff dot cuni dot cz>
- Date: Thu, 30 Mar 2006 16:27:57 +0200
- Subject: [RFC/RFA] PR/26830 part 2, delay SSA updating in loop header copying
Hello,
this patch aims to fix the compile-time-hog part of PR/26830. Here
we have a BB with several thousand PHIs, each with a thousand
arguments. We also have more than 500 loops whose headers we can copy
and, each time a header is copied, tree_duplicate_sese_region calls
update_ssa, which walks the PHIs to rewrite them.
Zdenek originally wrote special code to update the SSA form, but it was
subsequently removed as part of the tree-cleanup-branch merge. It seems
to me, however, that tree_duplicate_sese_region is only concerned with
the CFG (except for add_phi_args_after_copy, but this function mostly
looks at the CFG as well to get the PHI arguments). If this is true, we
can completely delay the SSA update until the end of the loop header
copying pass. This is what the patch does.
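As a rough illustration of why delaying the update matters, here is a toy C model. It is not GCC code: the toy_* names, the struct, and the figure of 3000 PHIs used in the note below are assumptions made up for this sketch. It only counts how many PHI arguments get walked when the update runs after every copy versus once at the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not GCC code: counts how many PHI arguments
   an "SSA update" would visit.  */
struct toy_cfg
{
  size_t n_phis;         /* PHI nodes in the big block.  */
  size_t n_args_per_phi; /* Arguments of each PHI node.  */
  size_t visited;        /* Total PHI arguments walked so far.  */
  bool needs_update;     /* Set by a copy, cleared by an update.  */
};

/* Stand-in for update_ssa: walks every PHI argument once.  */
static void
toy_update_ssa (struct toy_cfg *cfg)
{
  cfg->visited += cfg->n_phis * cfg->n_args_per_phi;
  cfg->needs_update = false;
}

/* Stand-in for tree_duplicate_sese_region: only touches the CFG and
   leaves the SSA form stale.  Always succeeds in this model.  */
static bool
toy_copy_header (struct toy_cfg *cfg)
{
  cfg->needs_update = true;
  return true;
}

/* Current scheme: update after every successful copy.  */
static size_t
copy_headers_eager (struct toy_cfg *cfg, size_t n_loops)
{
  for (size_t i = 0; i < n_loops; i++)
    if (toy_copy_header (cfg))
      toy_update_ssa (cfg);
  assert (!cfg->needs_update);
  return cfg->visited;
}

/* Proposed scheme: count the copies, update once at the end.  */
static size_t
copy_headers_delayed (struct toy_cfg *cfg, size_t n_loops)
{
  size_t n_copied = 0;
  for (size_t i = 0; i < n_loops; i++)
    if (toy_copy_header (cfg))
      n_copied++;
  if (n_copied)
    toy_update_ssa (cfg);
  assert (!cfg->needs_update);
  return cfg->visited;
}
```

With 3000 PHIs of 1000 arguments each and 500 copied headers, the eager variant walks 1.5 billion PHI arguments in this model and the delayed one 3 million. The real update_ssa does more than walk PHIs, so treat the ratio as indicative only.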
With it, compile times for PR/26830 are back to sane values (twice as
slow as 4.0, but that's expected because salias is still creating all those
memory tags). Memory use still rockets up to 1.4 gigabytes though, and
that also looks like an aliasing problem as far as I can tell.
I don't have any confidence in it, though, and I'm only submitting it
because it survived bootstrap (regtest is running) and because I cannot
afford to sit and write the manual SSA update code. Zdenek, do you
remember why you didn't do the update just once?
Ok for mainline if Zdenek confirms it's not a problem and if it passes
regtesting (and maybe SPEC)?
Paolo
2006-03-30 Paolo Bonzini <bonzini@gnu.org>
PR tree-optimization/26830
* tree-cfg.c (tree_duplicate_sese_region): Do not update SSA.
* tree-ssa-loop-ch.c (copy_loop_headers): Count successfully duplicated
headers and, if there were any, update SSA at the end.
Index: tree-cfg.c
===================================================================
--- tree-cfg.c (revision 112529)
+++ tree-cfg.c (working copy)
@@ -4416,10 +4416,10 @@ add_phi_args_after_copy (basic_block *re
important exit edge EXIT. By important we mean that no SSA name defined
inside region is live over the other exit edges of the region. All entry
edges to the region must go to ENTRY->dest. The edge ENTRY is redirected
- to the duplicate of the region. SSA form, dominance and loop information
- is updated. The new basic blocks are stored to REGION_COPY in the same
- order as they had in REGION, provided that REGION_COPY is not NULL.
- The function returns false if it is unable to copy the region,
+ to the duplicate of the region. SSA form is not updated, but dominance
+ and loop information is. The new basic blocks are stored to REGION_COPY
+ in the same order as they had in REGION, provided that REGION_COPY is not
+ NULL. The function returns false if it is unable to copy the region,
true otherwise. */
bool
@@ -4479,7 +4479,6 @@ tree_duplicate_sese_region (edge entry,
free_region_copy = true;
}
- gcc_assert (!need_ssa_update_p ());
/* Record blocks outside the region that are dominated by something
inside. */
@@ -4549,9 +4549,6 @@ tree_duplicate_sese_region (edge entry,
/* Add the other PHI node arguments. */
add_phi_args_after_copy (region_copy, n_region);
- /* Update the SSA web. */
- update_ssa (TODO_update_ssa);
-
if (free_region_copy)
free (region_copy);
Index: tree-ssa-loop-ch.c
===================================================================
--- tree-ssa-loop-ch.c (revision 112529)
+++ tree-ssa-loop-ch.c (working copy)
@@ -129,7 +129,7 @@ copy_loop_headers (void)
basic_block header;
edge exit, entry;
basic_block *bbs, *copied_bbs;
- unsigned n_bbs;
+ unsigned n_bbs, n_copied;
unsigned bbs_size;
loops = loop_optimizer_init (LOOPS_HAVE_PREHEADERS
@@ -145,7 +145,7 @@ copy_loop_headers (void)
copied_bbs = XNEWVEC (basic_block, n_basic_blocks);
bbs_size = n_basic_blocks;
- for (i = 1; i < loops->num; i++)
+ for (n_copied = 0, i = 1; i < loops->num; i++)
{
/* Copy at most 20 insns. */
int limit = 20;
@@ -198,7 +198,9 @@ copy_loop_headers (void)
entry = loop_preheader_edge (loop);
- if (!tree_duplicate_sese_region (entry, exit, bbs, n_bbs, copied_bbs))
+ if (tree_duplicate_sese_region (entry, exit, bbs, n_bbs, copied_bbs))
+ n_copied++;
+ else
{
fprintf (dump_file, "Duplication failed.\n");
continue;
@@ -210,6 +212,9 @@ copy_loop_headers (void)
loop_split_edge_with (loop_latch_edge (loop), NULL);
}
+ if (n_copied)
+ update_ssa (TODO_update_ssa);
+
free (bbs);
free (copied_bbs);