This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] New Optimization: Partitioning hot & cold basic blocks
> On Friday, October 10, 2003, at 5:11 PM, Richard Henderson wrote:
>
> >Perhaps I'm not looking hard enough, but how do you handle branches
> >between the hot and cold sections? Most targets don't have
> >any conditional branch that can span arbitrary distances, directly
> >or indirectly. Similarly, direct branches may need to be turned into
> >indirect branches in order to reach.
>
>
> The short answer: I convert conditional jumps crossing between sections
> into unconditional jumps.
>
> The long answer (gory details): There are two things I do: 1) I go through
> the CFG and find all fall-thru edges that cross between hot and cold
> sections. I then insert a new basic block for the fall-thru edge to fall
> into (making sure the source and destination of the fall-thru edge are in
> the same (hot or cold) section). The new
I see this is done in fix_up_fallthru_edges. There is a force_nonfallthru
function that does precisely that. It would be better to avoid code
duplication.
I was also thinking a bit about your changes to cfg-cleanup. With my
changes to hookize cfg manipulation you can do cfg-cleanup before BB
reordering in cfg_layout mode (I believe it is done now), so you should no
longer need to run cfg-cleanup after BB reordering and thus you should
not need the changes... That would simplify the patch a bit :)
Honza
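To make the fall-thru fix-up under discussion concrete, here is a minimal
sketch in C on a toy CFG. Everything in it is hypothetical: the toy_block
type and the name fix_crossing_fallthru are made up for illustration and are
not GCC's basic_block/edge structures, nor the real fix_up_fallthru_edges or
force_nonfallthru code. It only shows the idea of routing a section-crossing
fall-thru edge through a new block that ends in an unconditional jump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these are NOT GCC's real CFG data structures. */
enum section { SECTION_HOT, SECTION_COLD };

struct toy_block {
    enum section section;
    struct toy_block *fallthru; /* block reached by falling off the end, or NULL */
    struct toy_block *jump;     /* target of an explicit unconditional jump, or NULL */
};

/* If SRC falls through into a block in the other section, redirect the
   fall-thru edge into TRAMPOLINE (a fresh block supplied by the caller).
   TRAMPOLINE is placed in SRC's section and jumps unconditionally to the
   old destination. Returns true if the edge was rewritten. */
bool fix_crossing_fallthru(struct toy_block *src, struct toy_block *trampoline)
{
    assert(src != NULL && trampoline != NULL);

    struct toy_block *dest = src->fallthru;
    if (dest == NULL || dest->section == src->section)
        return false;               /* no fall-thru edge, or it does not cross */

    trampoline->section = src->section;
    trampoline->fallthru = NULL;    /* the new block ends in a jump */
    trampoline->jump = dest;        /* explicit jump into the other section */
    src->fallthru = trampoline;     /* fall-thru now stays within one section */
    return true;
}
```

After the call, the only edge that crosses sections is an explicit
unconditional jump, which is the property the patch description relies on.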
> basic block contains an unconditional jump to the old fall-thru
> destination, in the other section. That takes care of fall-thru edges.
> 2) For dealing with conditional branches, I attempted to use the existing
> machinery as much as possible. Since not all architectures have "short"
> conditional branches, I define a macro "LONG_COND_BRANCH_SIZE" to indicate
> the size that should be used in "shorten_branches" to determine the actual
> size of conditional branching instructions. I identify those conditional
> branches that cross between sections and mark a field in the containing
> basic block indicating that the block contains such a branch. In
> "shorten_branches", I check to see if LONG_COND_BRANCH_SIZE has been
> defined (to excuse those architectures that don't need this). If so, and
> if we are performing this optimization, and if the jump instruction is in
> a basic block marked as containing a conditional jump that crosses section
> boundaries, then the size of the jump instruction is set to
> LONG_COND_BRANCH_SIZE (very large). Later, output_cbranch (in rs6000.c)
> checks the size of the jump, and if it is LONG_COND_BRANCH_SIZE, it
> determines that a "long" jump is needed. It then inserts an unconditional
> jump to the conditional jump's target immediately after the conditional
> jump, inverts the condition on the conditional jump, and makes it jump
> over the new long jump to the instruction after it. (All of this
> machinery, with the exception of testing LONG_COND_BRANCH_SIZE, was
> already there.)
>
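The branch inversion described in the quoted text can be illustrated with a
small, self-contained C sketch. This is not the output_cbranch code from
rs6000.c: the condition codes, the mnemonics ("beq", "b") and the ".Lskip"
label are a made-up toy assembly syntax, and emit_cond_branch is a
hypothetical helper name. It shows only the transformation itself: invert
the condition, branch over an unconditional jump, and let that jump reach
the far target.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical condition codes for a toy target. */
enum cond { COND_EQ, COND_NE, COND_LT, COND_GE };

/* Return the logical inverse of condition C. */
enum cond invert_cond(enum cond c)
{
    switch (c) {
    case COND_EQ: return COND_NE;
    case COND_NE: return COND_EQ;
    case COND_LT: return COND_GE;
    case COND_GE: return COND_LT;
    }
    assert(!"unknown condition");
    return c;
}

/* Mnemonic suffix for condition C in the toy syntax. */
const char *cond_name(enum cond c)
{
    static const char *const names[] = { "eq", "ne", "lt", "ge" };
    return names[c];
}

/* Write the branch sequence for "if C goto TARGET" into BUF.
   Short form: a single conditional branch.
   Long form:  the inverted condition skips over an unconditional jump,
               and the unconditional jump reaches TARGET.
   Returns the value of snprintf. */
int emit_cond_branch(char *buf, size_t size, enum cond c,
                     const char *target, int need_long)
{
    if (!need_long)
        return snprintf(buf, size, "b%s %s\n", cond_name(c), target);

    return snprintf(buf, size, "b%s .Lskip\nb %s\n.Lskip:\n",
                    cond_name(invert_cond(c)), target);
}
```

For example, with need_long set, a "beq cold_part" becomes "bne .Lskip"
followed by "b cold_part" and the ".Lskip:" label, so the conditional branch
only ever has to span one instruction.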
> So for any architecture that uses short conditional jumps,
> LONG_COND_BRANCH_SIZE should be defined, and the code for the architecture
> that converts short jumps to long jumps where needed should use
> LONG_COND_BRANCH_SIZE as part of its test. That's all that needs to be
> done for any such architecture.
>
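The length-marking scheme described above might look roughly like the
following C sketch. It is an illustration under stated assumptions, not the
patch's code: the numeric value given to LONG_COND_BRANCH_SIZE is invented
(the mail only says "very large"), and toy_jump, set_branch_length and
needs_long_branch are hypothetical names standing in for the real insn
length bookkeeping in shorten_branches and the target's branch output code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for illustration only; a target that does not need the
   scheme would simply leave the macro undefined. */
#define LONG_COND_BRANCH_SIZE 32768

/* Toy stand-in for a jump instruction and its containing block's flag. */
struct toy_jump {
    bool block_has_crossing_branch; /* block marked: branch crosses sections */
    int length;                     /* length assigned during branch shortening */
};

/* Assign a length to JUMP. Normally this is NORMAL_LENGTH; when the macro
   is defined, partitioning is enabled, and the block is marked, the length
   is forced to LONG_COND_BRANCH_SIZE instead. */
void set_branch_length(struct toy_jump *jump, bool partitioning_enabled,
                       int normal_length)
{
    assert(jump != NULL);
    jump->length = normal_length;
#ifdef LONG_COND_BRANCH_SIZE
    if (partitioning_enabled && jump->block_has_crossing_branch)
        jump->length = LONG_COND_BRANCH_SIZE;
#endif
}

/* The target's short-to-long conversion test: a length equal to the
   sentinel means the long form must be emitted. */
bool needs_long_branch(const struct toy_jump *jump)
{
#ifdef LONG_COND_BRANCH_SIZE
    return jump->length == LONG_COND_BRANCH_SIZE;
#else
    (void) jump;
    return false;
#endif
}
```

The design point is that the sentinel length is the only information passed
from the partitioning pass to the target's output code; a target's own
distance-based test for long branches would sit alongside this check.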
> >
> >I'd also like to see changes to dwarf2 output so that this split is
> >properly represented in the debug info. Given that this feature is
> >not enabled except via explicit switch, this isn't imperative, but
> >it should be done once this patch is accepted.
> >
> >
> >r~
> >