[PATCH 2/2] shrink-wrap: Rewrite try_shrink_wrapping
Tue Sep 15 16:07:00 GMT 2015
On Thu, Sep 10, 2015 at 8:14 AM, Segher Boessenkool wrote:
> This patch rewrites the shrink-wrapping algorithm, allowing non-linear
> pieces of CFG to be duplicated for use without prologue instead of just
> linear pieces.
> On PowerPC, this enables shrink-wrapping of about 2%-3% more functions.
> I expected more, but in most cases where this would help we cannot yet
> shrink-wrap because non-volatile registers are used, often in the first
> block already.
> Since with this patch you still get only one prologue, it doesn't do
> much either for the case where there are many no-return error paths
> (common in an enable-checking compiler build); all those paths end in
> a no-return call, and those need a prologue (are not sibling calls).
> There are PRs about this. For shrink-wrapping, because all those
> paths want a prologue we put a prologue early in the function, although
> none of the "regular" code needs it.
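To make the no-return case concrete, here is a minimal sketch in C. The names check_failed and scale are hypothetical and not taken from GCC or the patch. The regular path makes no calls, while each error path ends in a call to a noreturn function, which per the description above needs a prologue. Since only one prologue is placed, it would presumably have to go before the first check.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical checking helper (not from GCC): reports and never returns.  */
__attribute__ ((noreturn, noinline)) void
check_failed (const char *msg)
{
  fprintf (stderr, "check failed: %s\n", msg);
  abort ();
}

/* Hypothetical function.  The regular path is the final return and makes
   no calls.  Each error path ends in a no-return call, and those calls
   are what require a prologue.  */
int
scale (const int *p, int k)
{
  if (p == NULL)
    check_failed ("null pointer");
  if (k < 0)
    check_failed ("negative scale");
  return *p * k;
}
```

With valid arguments, scale never reaches an error path, so a prologue at the top of the function is pure overhead for that path.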
> I instrumented things a bit (not in the patch). We can get about 10%
> to 20% more functions shrink-wrapped by allowing multiple edges that
> need a prologue inserted (edges to one and the same block); this can be
> easily done by just inserting an extra block. I'll work on this.
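The "insert an extra block" idea can be sketched on a toy CFG. This is an illustration under assumptions, not GCC's basic_block/edge API and not code from the patch: struct block, has_prologue and insert_prologue_block are made-up names, and each block is limited to two successors. All edges into the target that need a prologue are redirected to one new block, which holds the prologue and falls through to the target.

```c
#include <stddef.h>

/* Toy CFG node (hypothetical, not GCC's basic_block): at most two
   successor edges per block.  */
struct block
{
  struct block *succ[2];
  int has_prologue;
};

/* Redirect every edge from PREDS[0..N-1] to TARGET so that it goes to
   NEW_BB instead, then make NEW_BB fall through to TARGET and mark it
   as the single place holding the prologue.  The caller is assumed to
   pass only those predecessors whose edges need the prologue.
   Returns the number of edges redirected.  */
int
insert_prologue_block (struct block **preds, size_t n,
                       struct block *target, struct block *new_bb)
{
  int redirected = 0;

  for (size_t i = 0; i < n; i++)
    for (int j = 0; j < 2; j++)
      if (preds[i]->succ[j] == target)
        {
          preds[i]->succ[j] = new_bb;
          redirected++;
        }

  new_bb->succ[0] = target;
  new_bb->succ[1] = NULL;
  new_bb->has_prologue = 1;
  return redirected;
}
```

Predecessors that are not passed in keep their direct edge to the target, so only the listed edges go through the prologue.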
> Of the blocks chosen to have the prologue inserted, about 70% need a
> prologue because there is a call, 25% for other reasons (non-volatile
> register sets mostly), and only 5% do not themselves need a prologue.
> There are also cases where no block needs a prologue at all, but GCC
> thinks the function needs one nevertheless. This happens for example
> if a stack frame was created for an address-taken local variable, but
> that variable was optimised away later. This doesn't happen much in
> most cases (one in a thousand or so). There are some cases (like -pg)
> where the compiler forces a stack frame even if nothing uses it.
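A hypothetical shape for the address-taken case, as a sketch only: store and add_one are made-up names, and whether a given GCC version really leaves an unused frame behind for this input is an assumption, not something stated above. The local tmp has its address taken, so it may be given a stack slot early on. Once store is inlined, tmp can live in a register and nothing in the function uses the frame any more.

```c
/* Hypothetical helper: writes V through OUT.  Taking the address of a
   local in order to call this may give that local a stack slot before
   inlining happens.  */
static inline void
store (int *out, int v)
{
  *out = v;
}

/* Hypothetical function: after STORE is inlined, TMP no longer needs to
   live in memory, so no block here needs a prologue.  */
int
add_one (int x)
{
  int tmp;

  store (&tmp, x + 1);
  return tmp;
}
```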
> Shrink-wrapping is run at -O1, and basic block reordering is not.
> Shrink-wrapping would often benefit from some simple reordering. There
> are quite a few targets that do not want the STC bbro at all, either;
> we should have a simple bbro that runs at -O1 as well, does not increase
> code size, and can be used for those targets that do not want STC.
> It also would be nice to get rid of the silly games shrink-wrapping
> plays (together with function.c) making fake edges for where the
> simple_returns should be inserted. It would simplify a lot of code
> if we would (could) just insert them directly.
> Bootstrapped and regression tested on powerpc64-linux. Is this okay
> for mainline?
>
> 2015-09-10  Segher Boessenkool  <email@example.com>
>
> 	* shrink-wrap.c (requires_stack_frame_p): Fix formatting.
> 	(dup_block_and_redirect): Delete function.
> 	(can_dup_for_shrink_wrapping): New function.
> 	(fix_fake_fallthrough_edge): New function.
> 	(try_shrink_wrapping): Rewrite function.
> 	(convert_to_simple_return): Call fix_fake_fallthrough_edge.