[PATCH] bb-reorder: Remove a misfiring micro-optimization (PR96475)

Segher Boessenkool segher@kernel.crashing.org
Wed Aug 19 17:40:39 GMT 2020

On Wed, Aug 19, 2020 at 01:10:36PM +0100, Richard Sandiford wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > When the compgotos pass copies the tail of blocks ending in an indirect
> > jump, there is a micro-optimization to not copy the last one, since the
> > original block will then just be deleted.  This does not work properly
> > if cleanup_cfg does not merge all pairs of blocks we expect it to.
> >
> >
> > v2: This also deletes the other use of single_pred_p, which has the same
> > problem in principle, I just never have triggered it so far.
> Could you go into more details?  I thought the reason you wanted to
> remove the test in the loop was that each successful iteration of the
> loop removes one incoming edge, meaning that if all previous iterations
> succeed, the final iteration is bound to see a single predecessor.
> And there's no guarantee in that case that the jump in the final
> block will be removed.  I.e. the loop is unnecessarily leaving
> cfgcleanup to finish off the last step for it.  So I agree that the
> loop part looks good.  (I assume that was v1, but I missed it, sorry.)

I didn't Cc: you, I should have :-)

Yes, that is part of the problem.  The case where it was noticed however
was some release of GCC (some 9 release iirc) that did something wrong
with cfg_cleanup apparenetly, so not everything was threaded optimally,
making duplicate_computed_gotos run into jumps to jumps and stopping.
Since it is such an important pass (for performance, for things with
bigger jump tables or with threaded FSMs etc.), it seemed a good idea to
make it more robust instead of slightly faster.

> But it looks like the point of the initial test is to avoid “duplicating”
> the block if there would still only be one copy of the code.  That test
> still seems reasonable since the case it rejects should be handled by
> other cfg cleanups.  If for some reason it isn't, there doesn't seem
> to be any reason to apply the max_size limit for a single predecessor,
> since no code duplication will take place.  So ISTM that removing the
> first condition is really a separate thing that should be its own patch.

I have never seen the second case misfiring in practice, only the first

The original patch is at
btw.  I'll prepare a testcase for that (it doesn't trigger with recent
GCC versions though :-/ )



More information about the Gcc-patches mailing list