This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch] PR middle-end/39326 - limit LIM


On Sun, Mar 10, 2013 at 2:08 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> Hello,
>
> The attached patch fixes another one of the scalability problems
> reported in PR middle-end/39326.
>
> This problem is that tree loop-invariant code motion explodes on basic
> blocks with many memory references. Compile time is quadratic in the
> number of memory references in the basic block, and so are the memory
> requirements when the dependences or independences are propagated
> bottom-up through the loop tree.
>
> The fix is to give up on loops with too many memory references to
> handle. There is already a param for that for loop dependence
> analysis, and this patch makes LIM use the same param.
>
> Bootstrapped&tested on {x86_64,powerpc64}-unknown-linux-gnu.
> OK for trunk?

Given

+   well.  Return true if all is well, false if something happened
+   that is fatal to the rest of the LIM pass.  */

-static void
+static bool
 gather_mem_refs_stmt (struct loop *loop, gimple stmt)

and

   FOR_EACH_BB (bb)
     {
...
+      for (bsi = gsi_start_bb (bb);
+          !gsi_end_p (bsi) && all_ok;
+          gsi_next (&bsi))
+       all_ok = gather_mem_refs_stmt (loop, gsi_stmt (bsi));
+
+      if (! all_ok)
+        bitmap_set_bit (loops_with_too_many_memrefs, loop->num);
+    }
+
+  /* Propagate the information about loops with too many memory
+     references up the loop hierarchy.  */
+  FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
+    {
+      struct loop *outer = loop_outer (loop);
+      if (outer == current_loops->tree_root
+         || ! bitmap_bit_p (loops_with_too_many_memrefs, loop->num))
+       continue;
+      bitmap_set_bit (loops_with_too_many_memrefs, outer->num);
     }

I don't see how this propagation works correctly as you start to mark
BBs as not-ok starting from a "random" basic-block in the loop tree.
You of course also end up disqualifying very small loops completely
if they happen to be analyzed after a very big one you disqualify.
Of course that's partly because memory_accesses contains all
memory accesses in the function - so I think rather than limiting
on length of memory_accesses you want to limit on the length of
memory_accesses.refs_in_loop (well, on memory_accesses.all_refs_in_loop).
And you want the initial walk over all BBs to instead walk on BBs
FOR_EACH_LOOP and LI_FROM_INNERMOST (you can then do the
propagation to fill all_refs_in_loop there, too).  I realize there isn't a good
existing BB walker for this (with a suitable place to re-set all-ok) - a simple
walk over get_loop_body via LI_FROM_INNERMOST would get you
visit sub-loop bodies multiple times (easily skipped by a bb->loop_father
test, of course, but still ...)

That is, sth like the following preparatory patch to change the iteration:

Index: gcc/tree-ssa-loop-im.c
===================================================================
--- gcc/tree-ssa-loop-im.c      (revision 196547)
+++ gcc/tree-ssa-loop-im.c      (working copy)
@@ -1629,29 +1629,30 @@ gather_mem_refs_in_loops (void)
   loop_iterator li;
   bitmap lrefs, alrefs, alrefso;

-  FOR_EACH_BB (bb)
-    {
-      loop = bb->loop_father;
-      if (loop == current_loops->tree_root)
-       continue;
-
-      for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
-       gather_mem_refs_stmt (loop, gsi_stmt (bsi));
-    }
-
-  /* Propagate the information about accessed memory references up
-     the loop hierarchy.  */
   FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
     {
+      basic_block bbs = get_loop_body (loop);
+      for (i = 0; i < loop->num_nodes; ++i)
+       {
+         bb = bbs[i];
+         if (bb->loop_father != loop)
+           continue;
+         for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
+           gather_mem_refs_stmt (loop, gsi_stmt (bsi));
+       }
+      free (bbs);
+
+      /* Propagate the information about accessed memory references up
+        the loop hierarchy.  */
       lrefs = memory_accesses.refs_in_loop[loop->num];
       alrefs = memory_accesses.all_refs_in_loop[loop->num];
       bitmap_ior_into (alrefs, lrefs);

-      if (loop_outer (loop) == current_loops->tree_root)
-       continue;
-
-      alrefso = memory_accesses.all_refs_in_loop[loop_outer (loop)->num];
-      bitmap_ior_into (alrefso, alrefs);
+      if (loop_outer (loop) != current_loops->tree_root)
+       {
+         alrefso = memory_accesses.all_refs_in_loop[loop_outer (loop)->num];
+         bitmap_ior_into (alrefso, alrefs);
+       }
     }
 }


At this point this should be stage1 material, eventually backported for 4.8.1.

And yes, aside from the above the rest of the patch looks good to me
(move loops_with_too_many_memrefs into the memory_accesses struct?)

Thanks,
Richard.

> Ciao!
> Steven
>
> (The ChangeLog is a bit long but the patch is relatively straight-forward.)
>
>         * tree-flow.h (enum move_pos): Moved to tree-ssa-loop-im.c.
>         * tree-ssa-loop-im.c: Include diagnostic-core.h for warning_at()
>         (enum move_pos): Moved here.
>         (movement_possibility): Made static.  Reject memory operations in
>         loops with too many memory references to handle.
>         (determine_max_movement): Take loops_with_too_many_memrefs argument.
>         For statements referencing memory, find the outermost superloop
>         that is not in the loops_with_too_many_memrefs set.
>         (determine_invariantness_stmt): Take loops_with_too_many_memrefs
>         via dom_walk_data.global_data, and pass it along where necessary.
>         Hoist "pos == MOVE_POSSIBLE" test.
>         (determine_invariantness): Take loops_with_too_many_memrefs argument.
>         (move_computations): Likewise, but unused for now.
>         (gather_mem_refs_stmt): Fail if there are too many memory references.
>         Use PARAM_LOOP_MAX_DATAREFS_FOR_DATADEPS as threshold.  Add disabled
>         optimization warning.
>         (gather_mem_refs_in_loops): Take loops_with_too_many_memrefs argument.
>         Propagate it from inner loops to outer loops.  Do not propagate
>         recorded memory references for loops on which memory optimizations
>         are disabled.
>         (create_vop_ref_mapping): Take loops_with_too_many_memrefs argument.
>         Don't create a mapping on loops that are in this set.
>         (analyze_memory_references): Take loops_with_too_many_memrefs argument
>         and let subroutines fill it.
>         (store_motion): Take loops_with_too_many_memrefs argument.
>         Skip loops that are in this set.
>         (tree_ssa_lim): Allocate, pass, and free loops_with_too_many_memrefs.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]