[PATCH] bb-reorder: Improve the simple algorithm for -Os (PR67864)

Richard Biener richard.guenther@gmail.com
Fri Oct 9 09:29:00 GMT 2015


On Thu, Oct 8, 2015 at 6:57 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> As the PR points out, the "simple" reorder algorithm makes bigger code
> than the STC algorithm did, for -Os, for x86.  I now tested it for many
> different targets and it turns out to be worse everywhere.
>
> This simple patch tunes "simple" a bit; this makes it better than STC
> almost everywhere.  The only exceptions (for the targets where I have
> results) are x86 and mn10300.  For those targets it may be best to switch
> the default algorithm for -Os to STC.
>
> The raw results.  This is text size for vmlinux for 31 targets; as you
> can see many did not build, but at least all primary targets did.
> "none" is no basic block reordering; "trunk" is current trunk; "stc" is
> with the STC algorithm; "nodyn" is with simple, but considering all edges
> equally likely; "swaps" is that, but prefering the more likely edge from
> conditional branches; and "falls" prefers the edge from conditional
> branches that is already the fallthrough edge.  This patch is "falls".
>
>    none    trunk     stc     nodyn    swaps    falls    best
>  3728663  3721479  3700831  3706407  3717971  3690367  falls  arm
>  2097684  2095560  2089484  2094024  2094212  2086720  falls  blackfin
>  2096118  2107356  2081894  2092276  2103732  2077162  falls  cris
>  3204044  3200972  3187196  3191932  3198156  3177980  falls  frv
>  9254222  9340258  9208805  9265886  9331362  9247119   stc   i386
>  3353034  3355482  3334726  3334466  3349710  3314970  falls  m32r
>  4545720  4549824  4514256  4541832  4544540  4498416  falls  microblaze
>  4276743  4266363  4246255  4259923  4261367  4227723  falls  mips
>  5779199  5770567  5741663  5757663  5764475  5721803  falls  mips64
>  2059078  2086000  2051005  2064365  2083923  2055705   stc   mn10300
>  3311925  3320113  3293873  3305949  3317865  3284081  falls  nios2
>  6738701  6747077  6710253  6742917  6740965  6696757  falls  parisc64
>  8312408  8312744  8261016  8294480  8306488  8237188  falls  powerpc
> 17782722 17788382 17722326 17749526 17780642 17683810  falls  powerpc64
> 11016289 11029481 10970081 10980065 11024617 10942409  falls  s390
>  1406832  1417224  1400344  1409392  1416172  1399428  falls  shnommu
>  3771111  3776223  3751623  3768459  3771455  3732967  falls  sparc
>  6113427  6113527  6074875  6091167  6106015  6049571  falls  sparc64
> 10449681 10529883 10398908 10458240 10522149 10440814   stc   x86_64
>  1905733  1905733  1905733  1905733  1905733  1905733  -----  xtensa
>
>
> Is this okay for trunk?

I think the patch makes sense but it also raises a question for me - how
did we decide what edge gets EDGE_FALLTHRU when going out-of-cfglayout?
And isn't _that_ mechanism then not part of basic-block reordering that
needs to be tweaked for choosing the EDGE_FALLTHRU as better?

Thanks,
Richard.

>
> Segher
>
>
> 2015-10-08  Segher Boessenkool  <segher@kernel.crashing.org>
>
>         PR rtl-optimization/67864
>         * gcc/bb-reorder (reorder_basic_blocks_simple): Prefer existing
>         fallthrough edges for conditional jumps.  Don't sort candidate
>         edges if not optimizing for speed.
>
> ---
>  gcc/bb-reorder.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c
> index cb001e8..3b7098e 100644
> --- a/gcc/bb-reorder.c
> +++ b/gcc/bb-reorder.c
> @@ -2318,16 +2318,24 @@ reorder_basic_blocks_simple (void)
>
>        if (any_condjump_p (end))
>         {
> -         edges[n++] = EDGE_SUCC (bb, 0);
> -         edges[n++] = EDGE_SUCC (bb, 1);
> +         edge e0 = EDGE_SUCC (bb, 0);
> +         edge e1 = EDGE_SUCC (bb, 1);
> +         /* When optimizing for size it is best to keep the original
> +            fallthrough edges.  */
> +         if (e1->flags & EDGE_FALLTHRU)
> +           std::swap (e0, e1);
> +         edges[n++] = e0;
> +         edges[n++] = e1;
>         }
>        else if (single_succ_p (bb))
>         edges[n++] = EDGE_SUCC (bb, 0);
>      }
>
> -  /* Sort the edges, the most desirable first.  */
> +  /* Sort the edges, the most desirable first.  When optimizing for size
> +     all edges are equally desirable.  */
>
> -  std::stable_sort (edges, edges + n, edge_order);
> +  if (optimize_function_for_speed_p (cfun))
> +    std::stable_sort (edges, edges + n, edge_order);
>
>    /* Now decide which of those edges to make fallthrough edges.  We set
>       BB_VISITED if a block already has a fallthrough successor assigned
> --
> 1.8.1.4
>



More information about the Gcc-patches mailing list