This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix code quality regression on UltraSPARC

> I was just thinking out loud about further refinements (you did ask
> in your original post).  For example, I started off by thinking that
> you could refine your patch from "A follows B", to "A is dominated
> by B", or "A is immediately dominated by B".  However, I believe that
> the correct placement is determined purely by the "fallthru" edges.
> Graph-theoretically, this is the only property affected by your
> proposed change.

I agree, this seems to all boil down to the "fallthruness".  The idea is to 
avoid breaking any fallthruness relationship between blocks.  I don't know 
whether any further refinement would be of any help.

> [I apologise for any confusion with the terminology below, when I
> talk about the fallthru edges into the loop header, I'm talking about
> the placement of edges *before* we create the preheader.  After we
> insert the preheader there should only be one incoming edge to the
> loop header, which is from the new preheader].

Right (modulo the latch edge of course).

> The reason this code is here at all, and we just don't always place
> the preheader immediately before the loop header, is to avoid badly
> placing the preheader "in the loop", i.e. when the latch edge is a
> fallthru edge, such as a rotated "while (..) {}" loop.  In this case,
> placing the preheader there, changes loop edges from fallthru to
> unconditional jumps, which obviously hurts performance.  In this case,
> we the preheader should be placed after an arbitrary loop entry edge.
> In all other cases, (the latch edge isn't a fallthru edge) its best
> to place to preheader immediately before the loop header.  If there
> are no fallthru edges into the loop header, the preheader can also
> alternatively be placed after an arbitrary predecessor block with no
> adverse increase/decrease in the number of unconditional jumps.
> Your "follows" heuristic captures some aspects of this logic.

Nice explanation.  So I guess we can further improve the heuristics by testing 
the fallthruness of the latch edge.

> I also want to thank Kazu for pointing out on IRC, that we already
> loop over all incoming edges at the top of create_preheader.  This
> means that we can put some more effort into placing the preheader
> intelligently without affecting the big-O of this routine.  I believe
> that just inspecting the first N edges for a fallthru edge, where
> N is four or five, should be sufficient to capture most of the benefit
> without any severe performance impact.

Indeed, I missed that.  But do we need to scan for a fallthru edge if we test 
the fallthruness of the latch edge?  I don't think so.

> I hope this makes my comments somewhat clearer.  If not, post the
> updated patch that you have (which I'm happy to approve) to address
> the urgent performance regression, and refinements can be investigated
> as follow-up patches.

We're not in a hurry I think.  What about something like this?  Do you have 
any opinion on the ??? note?

Eric Botcazou
Index: cfgloopmanip.c
RCS file: /cvs/gcc/gcc/gcc/cfgloopmanip.c,v
retrieving revision 1.39
diff -u -p -r1.39 cfgloopmanip.c
--- cfgloopmanip.c	8 Dec 2004 18:52:48 -0000	1.39
+++ cfgloopmanip.c	12 Dec 2004 22:35:01 -0000
@@ -1139,19 +1139,24 @@ create_preheader (struct loop *loop, int
   edge e, fallthru;
   basic_block dummy;
+  basic_block one_succ_pred = 0;
   struct loop *cloop, *ploop;
   int nentry = 0;
   bool irred = false;
+  bool latch_edge_was_fallthru;
   edge_iterator ei;
   cloop = loop->outer;
   FOR_EACH_EDGE (e, ei, loop->header->preds)
-      if (e->src == loop->latch)
+      basic_block src = e->src;
+      if (src == loop->latch)
       irred |= (e->flags & EDGE_IRREDUCIBLE_LOOP) != 0;
+      if (EDGE_COUNT (src->succs) == 1)
+	one_succ_pred = src;
   gcc_assert (nentry);
   if (nentry == 1)
@@ -1166,6 +1171,7 @@ create_preheader (struct loop *loop, int
   mfb_kj_edge = loop_latch_edge (loop);
+  latch_edge_was_fallthru = (mfb_kj_edge->flags & EDGE_FALLTHRU) != 0;
   fallthru = make_forwarder_block (loop->header, mfb_keep_just,
   dummy = fallthru->src;
@@ -1177,13 +1183,36 @@ create_preheader (struct loop *loop, int
     if (ploop->latch == dummy)
       ploop->latch = fallthru->dest;
-  /* Reorganize blocks so that the preheader is not stuck in the middle of the
-     loop.  */
-  /* Get an edge that is different from the one from loop->latch to
-     dummy.  */
-  e = EDGE_PRED (dummy, EDGE_PRED (dummy, 0)->src == loop->latch);
-  move_block_after (dummy, e->src);
+  /* Try to be clever in placing the newly created preheader.  The idea is to
+     avoid breaking any "fallthruness" relationship between blocks.
+     The preheader was created just before the header and all incoming edges
+     to the header were redirected to the preheader, except the latch edge.
+     So the only problematic case is when this latch edge was a fallthru
+     edge: it is not anymore after the preheader creation so we have broken
+     the fallthruness.  We're therefore going to look for a better place.  */
+  if (latch_edge_was_fallthru)
+    {
+      basic_block best_pred = 0;
+      if (one_succ_pred)
+	best_pred = one_succ_pred;
+      else
+	FOR_EACH_EDGE (e, ei, dummy->preds)
+	  {
+	    /* ??? Can this ever happen?  */
+	    if (e->src == loop->latch)
+	      continue;
+	    /* Do not put the preheader between consecutive predecessors.  */
+	    if (!best_pred || best_pred->next_bb == e->src)
+	      best_pred = e->src;
+	    else
+	      break;
+	  }
+      move_block_after (dummy, best_pred);
+    }
   loop->header->loop_father = loop;
   add_bb_to_loop (dummy, cloop);

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]