[Bug rtl-optimization/71785] Computed gotos are mostly optimized away

Thu Nov 21 19:44:00 GMT 2019

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71785

--- Comment #20 from Aleksey <rndfax at yandex dot ru> ---
(In reply to Segher Boessenkool from comment #19)
> '-freorder-blocks'
>      Reorder basic blocks in the compiled function in order to reduce
>      number of taken branches and improve code locality.
> 
>      Enabled at levels '-O', '-O2', '-O3', '-Os'.
> 
> If you disable this option, you get more taken branches and a less linear
> control flow.  As documented.

If compgoto depended on this option, that would explain the observed
behavior. But compgoto does not depend on it. And its own comment says:
/* Duplicate the blocks containing computed gotos.  This basically unfactors
   computed gotos that were factored early on in the compilation process to
   speed up edge based data flow.  We used to not unfactor them again, which
   can seriously pessimize code with many computed jumps in the source code,
   such as interpreters.  See e.g. PR15242.  */
static void
duplicate_computed_gotos (function *fun)

"This basically unfactors computed gotos that were factored early"

And it works fine until only one predecessor is left. Why does it stop at the
last predecessor? That's the point.
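
For reference, a minimal sketch of the kind of source this pass is about. It
uses the GNU C "labels as values" extension; the opcode names and the run()
function are made up for illustration and are not taken from the testcase in
this PR. Each handler ends with its own "goto *", and per the comment above
these get factored into one block early on and are meant to be unfactored
again later:

```c
/* Illustrative only: a tiny computed-goto interpreter (GNU C extension).
   OP_INC, OP_DEC, OP_HALT and run() are hypothetical names.  */
enum { OP_INC, OP_DEC, OP_HALT };

static int
run (const unsigned char *pc)
{
  /* One label address per opcode, indexed by opcode value.  */
  static void *const dispatch[] = { &&do_inc, &&do_dec, &&do_halt };
  int acc = 0;

  goto *dispatch[*pc++];        /* the very first "goto *xxx" (START) */

do_inc:
  acc++;
  goto *dispatch[*pc++];        /* dispatch from the INC handler */

do_dec:
  acc--;
  goto *dispatch[*pc++];        /* dispatch from the DEC handler */

do_halt:
  return acc;
}
```

In the source there are three separate "goto *" sites here. The question in
this PR is whether all of them stay separate in the generated code.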

> 
> > START - the very first "goto *xxx" - was not optimized, since "bb 5" has
> > only 1 predecessor.
> > But it's optimized in later step "bbro". That's exactly why option
> > "-fno-reorder-blocks" breaks first jump optimization.
> > 
> > Can someone explain why there are such conditions:
> >       if (single_pred_p (bb))
> >           return false;
> > 
> >       if (single_pred_p (bb))
> >           continue;
> > in maybe_duplicate_computed_goto function?
> 
> It duplicates the code, one copy for every predecessor to jump to.

I showed that it does not do so for every predecessor.

> It does not move existing blocks elsewhere.

I provided an RTL dump, and it says that edges are removed. So it moves the
code. The source even says this:

/* Duplicates basic block BB and redirects edge E to it.
...
basic_block
duplicate_block (basic_block bb, edge e, basic_block after, copy_bb_data *id)

The edge count in bb decreases, so the code is moving.
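
To make the edge counting concrete, here is a toy model. It is not GCC code:
struct toy_block and both functions are hypothetical, and a block only
remembers how many incoming edges it has. It just mirrors what the quoted
comment describes (duplicate the block, redirect one edge to the copy) plus a
single-predecessor guard like the one I am asking about:

```c
/* Toy model, not GCC internals: a block is just its incoming edge count.  */
struct toy_block { int n_preds; };

/* Mimics "duplicate BB and redirect edge E to it": the copy receives one
   incoming edge, and the original loses that edge.  */
static struct toy_block
toy_duplicate_and_redirect (struct toy_block *bb)
{
  struct toy_block copy = { 1 };
  bb->n_preds--;
  return copy;
}

/* Mimics a single-predecessor guard: keep duplicating while more than one
   predecessor remains, then stop.  Returns the number of copies made.  */
static int
toy_unfactor (struct toy_block *bb)
{
  int copies = 0;
  while (bb->n_preds > 1)
    {
      toy_duplicate_and_redirect (bb);
      copies++;
    }
  return copies;
}
```

With four predecessors this model makes three copies and leaves the original
block with exactly one incoming edge, which is the situation I see for the
first jump.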

And by "answering the question" I meant: what are these conditions for?
Why is a bb with one predecessor not optimized?
Why not just optimize it right here, right now?