[PATCH] New Optimization: Partitioning hot & cold basic blocks

Mon Oct 13 16:32:00 GMT 2003

On Friday, October 10, 2003, at 5:11 PM, Richard Henderson wrote:

> Perhaps I'm not looking hard enough, but how do you handle branches
> between the hot and cold sections?  Most targets don't have
> any conditional branches that can span arbitrary distances, directly
> or indirectly.  Similarly, direct branches may need to be turned into
> indirect branches in order to reach.

The short answer:  I convert conditional jumps that cross between
sections into unconditional jumps.

The long answer (gory details):  There are two things I do.

1) I go through the CFG and find all fall-thru edges that cross between
hot and cold sections.  I then insert a new basic block for the
fall-thru edge to fall into, making sure the source and destination of
the fall-thru edge are in the same (hot or cold) section.  The new basic
block contains an unconditional jump to the old fall-thru destination,
in the other section.  That takes care of fall-thru edges.

2) For dealing with conditional branches, I attempted to use the
existing machinery as much as possible.  Since not all architectures
have "short" conditional branches, I define a macro,
"LONG_COND_BRANCH_SIZE", to indicate the size that should be used in
"shorten_branches" to determine the actual size of conditional branching
instructions.  I identify those conditional branches that cross between
sections and mark a field in the containing basic block indicating that
the block contains such a branch.  In "shorten_branches", I check to see
if LONG_COND_BRANCH_SIZE has been defined (to excuse those architectures
that don't need this).  If so, and if we are performing this
optimization, and if the jump instruction is in a basic block marked as
containing a conditional jump that crosses section boundaries, then the
size of the jump instruction is set to LONG_COND_BRANCH_SIZE (very
large).

Later, output_cbranch (in rs6000.c) checks the size of the jump; if it
is LONG_COND_BRANCH_SIZE, it determines that a "long" jump is needed.
It then inserts an unconditional jump to the conditional jump's target
immediately after the conditional jump, inverts the condition on the
conditional jump, and makes it jump over the new long jump to the
instruction after it.  (All of this machinery, with the exception of
testing LONG_COND_BRANCH_SIZE, was already there.)
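
To make the fall-thru handling (the first of the two steps above) more
concrete, here is a minimal sketch in C over a toy CFG.  It is not the
code from the patch and does not use GCC's real basic_block or edge
structures: "struct block", "fix_crossing_fallthru" and "MAX_BLOCKS"
are hypothetical names invented for this illustration, and blocks are
referred to by array index.

```c
#include <assert.h>

enum section { HOT, COLD };

/* Toy basic block (hypothetical; not GCC's basic_block). */
struct block {
    enum section sec;   /* section the block is placed in */
    int fallthru;       /* index of fall-thru successor, or -1 */
    int jump_target;    /* index of unconditional jump target, or -1 */
};

#define MAX_BLOCKS 64   /* assumed capacity of the caller's array */

/* For each fall-thru edge whose source and destination are in
   different sections, append a new block in the source's section
   that holds only an unconditional jump to the old destination, and
   make the source fall through into that new block instead.
   Returns the new block count, or -1 if the array is full.  */
int fix_crossing_fallthru(struct block *bb, int n)
{
    int orig = n;

    assert(n >= 0 && n <= MAX_BLOCKS);
    for (int i = 0; i < orig; i++) {
        int dest = bb[i].fallthru;

        if (dest < 0 || bb[dest].sec == bb[i].sec)
            continue;               /* no edge, or it does not cross */
        if (n >= MAX_BLOCKS)
            return -1;

        bb[n].sec = bb[i].sec;      /* same section as the source */
        bb[n].fallthru = -1;
        bb[n].jump_target = dest;   /* jump into the other section */
        bb[i].fallthru = n;         /* fall-thru edge no longer crosses */
        n++;
    }
    return n;
}
```

After the call, every remaining fall-thru edge connects two blocks in
the same section; the only edges that cross are explicit jumps.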

So for any architecture that uses short conditional jumps,
LONG_COND_BRANCH_SIZE should be defined, and the code for the
architecture that converts short jumps to long jumps where needed should
use LONG_COND_BRANCH_SIZE as part of its test.  That's all that needs to
be done for any such architecture.
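
A rough sketch in C of what this amounts to on the port side may help.
Everything here is an assumption made for illustration: the numeric
values of the two size macros, "struct cond_jump", and the helpers
"set_branch_length" and "emit_cond_branch" are hypothetical and are not
the real shorten_branches or output_cbranch code, and the emitted text
is only PowerPC-flavoured pseudo-assembly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical values; the real macro is defined per target. */
#define LONG_COND_BRANCH_SIZE  0x10000  /* "very large" marker size */
#define SHORT_COND_BRANCH_SIZE 4

/* Toy conditional jump (not a GCC rtx). */
struct cond_jump {
    const char *cond;       /* condition mnemonic suffix, e.g. "eq" */
    const char *inv_cond;   /* inverted condition, e.g. "ne" */
    const char *target;     /* label of the branch target */
    int crosses_sections;   /* set when the containing block is marked */
    int length;             /* size chosen for the instruction */
};

/* Size selection: a crossing conditional jump gets the long size,
   but only when the partitioning optimization is enabled.  */
void set_branch_length(struct cond_jump *j, int partitioning_enabled)
{
    if (partitioning_enabled && j->crosses_sections)
        j->length = LONG_COND_BRANCH_SIZE;
    else
        j->length = SHORT_COND_BRANCH_SIZE;
}

/* Emission: for a long jump, invert the condition so the conditional
   branch skips over an unconditional jump to the real target.
   Returns the snprintf result.  */
int emit_cond_branch(const struct cond_jump *j, char *buf, size_t size)
{
    if (j->length == LONG_COND_BRANCH_SIZE)
        return snprintf(buf, size, "b%s $+8\n\tb %s\n",
                        j->inv_cond, j->target);
    return snprintf(buf, size, "b%s %s\n", j->cond, j->target);
}
```

The only partitioning-specific part is the comparison against
LONG_COND_BRANCH_SIZE; the inversion trick itself is the kind of
short-to-long conversion such a port is assumed to have already.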

>
> I'd also like to see changes to dwarf2 output so that this split is
> properly represented in the debug info.  Given that this feature is
> not enabled except via explicit switch, this isn't imperative, but
> it should be done once this patch is accepted.
>
>
> r~
>