This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR 53743 and other -freorder-blocks-and-partition failures (issue6823047)


On Thu, Nov 1, 2012 at 10:19 AM, Teresa Johnson <tejohnson@google.com> wrote:
> On Wed, Oct 31, 2012 at 4:02 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>> On Tue, Oct 30, 2012 at 10:43 PM, Teresa Johnson wrote:
>>> Sure, I will give this a try after your verification patch tests
>>> complete. Does this mean that the patch you posted above to
>>> force_nonfallthru_and_redirect is no longer needed either? I'll see if
>>> I can avoid the need for some of my fixes, although I believe at least
>>> the function.c one will still be needed. I'll check.
>>
>> The force_nonfallthru change is still necessary, because
>> force_nonfallthru should be almost a no-op in cfglayout mode. The
>> whole concept of a fallthru edge doesn't exist in cfglayout mode: any
>> single_succ edge is a fallthru edge until the order of the basic
>> blocks has been determined and the insn chain is re-linked (cfglayout
>> mode originally was developed for bb-reorder, to move blocks around
>> more easily). So the correct patch would actually be:
>>
>> Index: cfgrtl.c
>> ===================================================================
>> --- cfgrtl.c    (revision 193046)
>> +++ cfgrtl.c    (working copy)
>> @@ -4547,7 +4547,7 @@ struct cfg_hooks cfg_layout_rtl_cfg_hooks = {
>>    cfg_layout_split_edge,
>>    rtl_make_forwarder_block,
>>    NULL, /* tidy_fallthru_edge */
>> -  rtl_force_nonfallthru,
>> +  NULL, /* force_nonfallthru */
>>    rtl_block_ends_with_call_p,
>>    rtl_block_ends_with_condjump_p,
>>    rtl_flow_call_edges_add,
>>
>> (Or better yet: Remove the force_nonfallthru and tidy_fallthru_edge
>> hooks, they are cfgrtl-only.)
>>
>> But obviously that won't work because
>> bb-reorder.c:fix_up_fall_thru_edges calls this hook while we're in
>> cfglayout mode. That is a bug. The call to force_nonfallthru results
>> in a "dangling" barrier:
>>
>> cfgrtl.c:1523  emit_barrier_after (BB_END (jump_block));
>>
>> In cfglayout mode, barriers don't exist in the insns chain, and they
>> don't have to because every edge is a fallthru edge. If there are
>> barriers before cfglayout mode, they are either removed or linked in
>> the basic block footer, and fixup_reorder_chain restores or inserts
>> barriers where necessary to drop out of cfglayout mode. This
>> emit_barrier_after call hangs a barrier after BB_END but not in the
>> footer, and I'm pretty sure the result will be that the barrier is
>> lost in fixup_reorder_chain. See also emit_barrier_after_bb for how
>> inserting a barrier should be done in cfglayout mode.
>>
>> So in short, bbpart doesn't know what it wants to be: a cfgrtl or a
>> cfglayout pass. It doesn't work without cfglayout but it's doing
>> things that are only correct in the cfgrtl world and Very Wrong Indeed
>> in cfglayout-land.
>>
>>
>>> Regarding your earlier question about why we needed to add the
>>> barrier, I need to dig up the details again but essentially I found
>>> that the barriers were being added by bbpart, but bbro was reordering
>>> things and the block that ended up at the border between the hot and
>>> cold section didn't necessarily have a barrier on it because it was
>>> not previously at the region boundary.
>>
>> That doesn't sound right. bbpart doesn't actually re-order the basic
>> blocks, it only marks the blocks with the partition they will be
>> assigned to. Whatever ends up at the border between the two partitions
>> is not relevant: the hot section cannot end in a fall-through edge to
>> the cold section (verify_flow_info even checks for that, see "fallthru
>> edge crosses section boundary (bb %i)") so it must end in some
>> explicit jump. Such jumps are always followed by a barrier. The only
>> reason I can think of why there might be a missing barrier, is because
>> fixup_reorder_chain has a bug and forgets to insert the barrier in
>> some cases (and I suspect this may be the case for return patterns, or
>> the a.m. issue of a dropper barrier).
>>
>> I would like to work on debugging this, but it's hard without test cases...
>
> I'm working on trying to reproduce some of these failures in a test
> case I can share. So far, I have only been able to reproduce the
> failure reported in PR 53743 in spec2006 (456.hmmer/sre_math.c). Still
> working on getting a smaller/shareable test case for the other 2
> issues.
>
> The failure in PR 53743 (assert in cfg_layout_merge_blocks) is what I
> had fixed with my original changes to cfgrtl.c. Need to understand why
> there is a reg crossing note between 2 bbs in the same partition.

Interestingly, this turned out to be the same root cause as the
verify_flow_info failures below. It is fixed by the same fix to
thread_prologue_and_epilogue_insns. When the code below created the
copy_bb and put it in e->src's partition, it made it insufficient for
the merge blocks routine to check if the two bbs were in the same
partition, because they were in the same partition but separated by
the region crossing jump.

I'll do some testing of the fix below, but do you have any comments on
the correctness or the potential issue I raised (see my note just
below the patch)?

Do you recommend pursuing the move of the bb partition phase until
later, after we leave cfglayout mode? I need to revisit to see if my
prologue/epilogue fix below also addresses the issue I saw when I
tried moving it.

Thanks,
Teresa

>
> In the hmmer test case I also hit a failures in rtl_verify_flow_info
> and rtl_verify_flow_info_1:
>
> gcc  -c -o sre_math.o -DSPEC_CPU -D
> NDEBUG    -fprofile-use -freorder-blocks-and-partition   -O2
>      sre_math.c
> sre_math.c: In function ‘Gammln’:
> sre_math.c:161:1: error: EDGE_CROSSING incorrectly set across same section
>  }
>  ^
> sre_math.c:161:1: error: missing barrier after block 6
> sre_math.c:161:1: internal compiler error: verify_flow_info failed
>
>
> This was due to some code in thread_prologue_and_epilogue_insns that
> duplicated tail blocks:
>
>                 if (e)
>                   {
>                     copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
>                                                   NULL_RTX, e->src);
>                     BB_COPY_PARTITION (copy_bb, e->src);
>                   }
>
> In this case e->src (bb 6) was in the cold section and e->dest was in
> the hot section, and e->src ended with a REG_CROSSING_JUMP followed by
> a barrier. The new copy_bb got put into the cold section by the copy
> partition above, leading to the first error. And because the
> create_basic_block call inserted the new copy_bb before NEXT_INSN
> (BB_END (e->src)), which in this case was the barrier, we ended up
> without the barrier after the crossing edge.
>
> I fixed this by making the following change:
>
> --- function.c  (revision 192692)
> +++ function.c  (working copy)
> @@ -6249,9 +6249,18 @@ thread_prologue_and_epilogue_insns (void)
>                     break;
>                 if (e)
>                   {
> +                    rtx note;
>                     copy_bb = create_basic_block (NEXT_INSN (BB_END (e->src)),
>                                                   NULL_RTX, e->src);
>                     BB_COPY_PARTITION (copy_bb, e->src);
> +                    /* Remove the region crossing note from jump at end of
> +                       e->src if it exists.  */
> +                    note = find_reg_note (BB_END (e->src),
> REG_CROSSING_JUMP, NULL_RTX);
> +                    if (note)
> +                      /* There would also have been a barrier after
> e->src, that
> +                         is now after copy_bb, but that shouldn't be a
> +                         problem?.  */
> +                      remove_note (BB_END (e->src), note);
>                   }
>                 else
>                   {
>
> But I am not sure this is really correct in all cases - for example,
> what if another hot bb that also didn't require a prologue branched
> into the new cloned tail sequence, which is now cold? E.g.
> dup_block_and_redirect will redirect all predecessors that don't need
> a prologue to the new copy.
>
> I'm going to see if I can get the other 2 failures I had found to
> trigger on spec or a smaller test case.
>
> Teresa
>
>>
>> Ciao!
>> Steven
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413



-- 
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]