[patch] Use a smaller, dynamic worklist in compute_global_livein

Tue Aug 7 08:31:00 GMT 2012

On Tue, Aug 7, 2012 at 8:24 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> Hello,
>
> In the test case for PR54146, compute_global_livein allocates/frees a
> worklist for >400,000 basic blocks on each invocation. And it's called
> a lot, for rewrite_into_loop_closed_ssa. But the maximum number of
> basic blocks ever on the work list was only ~6500. So the work list
> can be much smaller most of the time.
>
> Bootstrapped&tested on x86_64-unknown-linux-gnu. OK for trunk?

Index: tree-ssa-loop-manip.c
===================================================================

--- tree-ssa-loop-manip.c       (revision 190176)
+++ tree-ssa-loop-manip.c       (working copy)
@@ -162,10 +162,8 @@ add_exit_phis_var (tree var, bitmap live
   basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (var));
   bitmap_iterator bi;

-  if (is_gimple_reg (var))
-    bitmap_clear_bit (livein, def_bb->index);
-  else
-    bitmap_set_bit (livein, def_bb->index);
+  gcc_checking_assert (is_gimple_reg (var));
+  bitmap_clear_bit (livein, def_bb->index);

!is_gimple_reg is true for virtual operand PHIs ... and at least
find_uses_to_rename_stmt loops over all uses (so, I suppose given the
comments elsewhere about tracking vops can you fix that to
s/SSA_OP_ALL_USES/SSA_OP_USE/?

 void
-compute_global_livein (bitmap livein ATTRIBUTE_UNUSED, bitmap
def_blocks ATTRIBUTE_UNUSED)
+compute_global_livein (bitmap livein, bitmap def_blocks)
 {
-  basic_block bb, *worklist, *tos;
   unsigned i;
   bitmap_iterator bi;
+  VEC (basic_block, heap) *worklist;

-  tos = worklist
-    = (basic_block *) xmalloc (sizeof (basic_block) * (last_basic_block + 1));
+  /* Normally the work list size is bounded by the number of basic
+     blocks in the largest loop.  We don't know this number, but we
+     can be fairly sure that it will be relatively small.  */
+  worklist = VEC_alloc (basic_block, heap, MAX (8, n_basic_blocks / 100));

I suppose if we really want to optimize this we can make worklist static
so we at most grow once.  (in case you want to optimize allocation overhead,
not memory used).  Other structures in the ssa rewriter have this kind of
lifetime, too (the per-SSA name info for example).

Ok with the first change, whatever you prefer on the
compute_global_livein thing.

Thanks,
Richard.

> Ciao!
> Steven