[PATCH 2/4] bb-reorder: Add the "simple" algorithm

Segher Boessenkool segher@kernel.crashing.org
Thu Sep 24 13:48:00 GMT 2015

On Thu, Sep 24, 2015 at 12:32:59PM +0200, Bernd Schmidt wrote:
> On 09/24/2015 12:06 AM, Segher Boessenkool wrote:
> >This is the meat of this series: a new algorithm to do basic block
> >reordering.  It uses the simple greedy approach to maximum weighted
> >matching, where the weights are the predicted execution frequency of
> >the edges.  This always finds a solution that is within a factor two
> >of optimal, if you disregard loops (which we cannot allow) and the
> >complications of block partitioning.
> Looks really good for the most part.
> The comment at the top of the file should be updated to mention both 
> algorithms.

Will do.

> >+  /* Sort the edges, the most desirable first.  */
> >+
> >+  std::stable_sort (edges, edges + n, edge_order);
> Any thoughts on this vs qsort? Do you need a stable sort?

We always need stable sorts in GCC; things are not reproducible across
targets with qsort (not every qsort is the same).

Also, you sometimes have two edges back-to-back, with the same weights;
a stable sort ensures we don't put a jump in the middle of that if we
can help it.

> >+  int j;
> >+  for (j = 0; j < n; j++)
> for (int j ...
> here and in the other loop that uses j.

That is so ugly.  Will change though :-)

> >+  /* If the entry edge no longer falls through we have to make a new
> >+     block so it can do so again.  */
> >+
> >+  edge e = EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0);
> >+  if (e->dest != ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux)
> >+    {
> >+      force_nonfallthru (e);
> >+      e->src->aux = ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux;
> >+      BB_COPY_PARTITION (e->src, e->dest);
> >+    }
> >+}
> That's a bit odd, can this situation be prevented earlier?

We could always add an extra jump at the start, but that's about the
same code so not helpful.

> Why wouldn't we force the entry edge to fall thru?

Because it pessimises code.  If the model thinks the first block after
entry belongs somewhere in the middle of a fall-through sequence, it
usually is right (typical examples are loops that start with the loop
test).  This algorithm does not peel loops (it does no duplication at

All the optimisable blocks end with an unconditional jump, and this algo
tries to remove as many of those as it can; logically, the start block
has such a jump as well.


More information about the Gcc-patches mailing list