This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)


On Fri, May 10, 2013 at 2:00 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Fri, May 10, 2013 at 5:54 PM, Teresa Johnson wrote:
>> The main issue I had here, and why I made this change, is that we go
>> in and out of cfglayout mode several times after bb partitioning and
>> then out_of_cfglayout. The problem was that when we subsequently went
>> in and out of cfglayout mode, the switch text section notes that had
>> been inserted by bbpart were getting messed up (they were moved into
>> the bb header when we enter cfglayout mode and then not being
>> transferred to the correct location upon exit).
>>
>> I investigated trying
>> to keep those in sync, but it is really difficult/impossible to do
>> during cfglayout mode when they are in the header. So I simply strip
>> them out completely on entry to cfglayout mode, and if there were any
>> there on entry, this change ensures that they are restored in the
>> appropriate location upon exit. I'm not sure what is a good
>> alternative?
>
> The problem is that the note exists at all. I'd like to see the note
> go away completely eventually (when we have a CFG all the way through
> pass_final). As a stop-gap, we should not emit the note until
> pass_free_cfg. Up to that point the basic blocks tell what partition
> they're in, and all hot block and cold blocks should be in a sequence.
> So during pass_free_cfg, just walk the basic blocks chain until
> there's a partition change, and emit the note between those two
> blocks.

Ok, let me take a stab at that.

>
>
>> I triggered the same error in 445.gobmk once I applied the
>> thread_prologue_and_epilogue_insns fixes. This is an assert in the
>> dwarf CFI code that complains about a NOTE_INSN_SWITCH_TEXTS_SECTION
>> note not being preceeded by a barrier:
>
> The problem here is that fixup_reorder_chain should force a jump for
> basic blocks in one partition falling through to bb->next. That should
> be dealt with in can_fallthru, which should return false if
> BB_PARTITION (target) != BB_PARTITION (src).

That wasn't the issue in this case.

Here there was a block that happened to be laid out at the very start
of the cold section (it was jumped to from elsewhere, not reached via
fall through from its layout predecessor). Thus it was preceded by a
switch section note, which was put into the bb header when we entered
cfglayout mode for compgoto. The note ended up in the middle of the
block when we did some block combining with its cfg predecessor (not
the block that preceded it in the layout chain, which was the last hot
block in the reorder chain).

So the issue here wasn't making sure there is no fall-through across
section boundaries. There was already code to prevent this, but my
patch contains a few fixes to make sure that is always happening.

>
>
>> The correct solution in my opinion is to strip out the SWITCH note
>> every time we enter cfglayout mode after bbro, and then invoke
>> insert_section_boundary_note when leaving cfglayout (if one was found
>> on entry to that cfglayout mode) to reapply it.
>
> Please make the note go away completely before pass_free_cfg, and earn
> greater admiration than Zeus. The note always was wrong, and now
> you've shown it's also a problem.

Ok, I will try. =)

>
>
>>> * Fixup redirected edges that did not cross partitions before but
>>> apparently do after redirection. This is not supposed to happen in the
>>> first place, so fixing up any of this just papers over an error
>>> elsewhere in the compiler.
>>
>> Looking back through the earlier email exchanges from last fall I
>> found some discussion on this where I had found a couple places
>> causing this to happen. Two were in
>> thread_prologue_and_epilogue_insns: when we duplicated tail blocks or
>> created a new block to hold a simple return, and redirected some of
>> the edges to the new copy. In some cases this caused edges that were
>> previously region crossing to become non-region crossing, and in other
>> cases the reverse happened. I think there are a couple of other cases
>> mentioned in the emails too, but I would need to do some more digging
>> to find them all.
>>
>> There were a few places in the code that tried to detect this type of
>> issue, and fix them up, but they weren't consistent and there were
>> other places that had the same issue. I've centralized all that
>> handling in this patch (fixup_partitions and fixup_partition_crossing,
>> called from a few places where we redirect edges) so that it is more
>> consistent and comprehensive.
>
> Right, I think it's good that you've centralized this code. But it
> seems to me that we should make sure that the hot blocks and cold
> blocks are still grouped (i.e. there is only one basic block B such
> that BB_PARTITION(B) != BB_PARTITION(B->next_bb). That is something
> your code doesn't handle, AFAICT. It's just one thing that's difficult
> to maintain, probably there are others. It's also something the
> partitioning verifier should check, i.e. that the basic blocks in hot
> and cold partitions are properly grouped.

Actually, there is already code that verifies this at the end of bbro
(verify_hot_cold_block_grouping()). Before bb reordering it doesn't
make sense to check this.

And AFAICT, after bbro the only place we go into and out of cfglayout
mode is compgoto, which duplicates blocks along edges only if they
don't cross a partition boundary, and lays out the duplicated block
adjacent to the original. I haven't seen any places where this is
violated, probably as a result. But it wouldn't be a bad idea to call
verify_hot_cold_block_grouping again during the flow verification code
once we detect/flag that bbro is complete.

>
>
> BTW1: I don't understand this comment:
>
>> +  /* Invoke the cleanup again once we are out of cfg layout mode
>> +     after committing the final bb layout above. This enables removal
>> +     of forwarding blocks across the hot/cold section boundary when
>> +     splitting is enabled that were necessary while in cfg layout
>> +     mode.  */
>> +  if (crtl->has_bb_partition)
>> +    cleanup_cfg (CLEANUP_EXPENSIVE);
>
> There shouldn't be any forwarder blocks in cfg layout mode. What did
> you need this for?

This was a performance fix.

There is code in try_forward_edges, called from try_optimize_cfg that
we call from cleanup_cfg, typically in cfglayout mode, that will not
eliminate forwarding blocks when either the given block "b" or its
successor block ends with a region-crossing jump. The comments
indicate that these need to be left in to ensure we don't fall through
across section boundaries, which makes sense. The issue here was that
I saw the blocks in the hot partition ending in conditional branches,
which had a fall-through to another hot section block, and the
conditional jump led to yet another block in the hot section that
simply contained an unconditional jump to a cold section block. So in
this case when try_forward_edges was called with the block with the
conditional branch, when we look at its successor (the forwarding
block), we can't eliminate it since it ends in a region crossing
branch. I guess the concern is that if the conditional branch sense
was reversed in cfglayout mode we would end up falling through to a
different region. But once we leave cfglayout mode that should not
occur. So I loosened up the checks on the successor block so that it
is ok if it ends in a region crossing branch when we are in cfgrtl
mode (and added this call). That way, these forwarding blocks are
eliminated and we are able to have a region crossing conditional jump
directly to the cold section block, without the intervening forwarding
block.

>
> BTW2: We badly need to figure out a way to create test cases for FDO... :-(

Yes. I had tried testing awhile back with the gcc regression tests and
enabling -freorder-blocks-and-partition, but none of the issues I was
having with larger benchmarks fired. I think there just aren't enough
(or large/complex enough?) FDO tests in gcc.dg/tree-prof and elsewhere
to trigger this. I was able to trigger many of the issues when
compiling cpu2006 with fdo and partitioning enabled, but it will take
some work to cut them down.

Thanks,
Teresa

>
> Ciao!
> Steven



--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]