[Bug bootstrap/63432] [5 Regression] profiledbootstrap failure with bootstrap-lto

tejohnson at google dot com gcc-bugzilla@gcc.gnu.org
Sat Oct 4 19:30:00 GMT 2014


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63432

--- Comment #16 from Teresa Johnson <tejohnson at google dot com> ---
On Sat, Oct 4, 2014 at 8:34 AM, Teresa Johnson <tejohnson@google.com> wrote:
> On Fri, Oct 3, 2014 at 11:15 PM, Teresa Johnson <tejohnson@google.com> wrote:
>> On Fri, Oct 3, 2014 at 3:35 PM, Teresa Johnson <tejohnson@google.com> wrote:
>>> Thanks to H.J. for the test case, I have reproduced the issue. It
>>> exposed two separate problems. Cc'ing Honza and Jeff for input on the
>>> profile count and jump threading issues, respectively.
>>>
>>> The first is that we are calling my new freqs_to_counts_path in cases
>>> where the functions do have non-zero profile counts. I am invoking
>>> this when either the profile status is != PROFILE_READ, or when the
>>> entry block count is 0. The latter is the case where the read in
>>> profile had zero counts for the function, so we kept the guessed
>>> frequencies (see counts_to_freqs in predict.c). In some cases I am
>>> finding the the profile count on the entry block is 0, but the rest of
>>> the blocks (bb 2 and descendants) have non-zero profile counts. I
>>> thought the way to check for this case was to look at the entry
>>> block's count, and in fact that is what we seem to do in other places.
>>> I could also check the entry block successor counts, or I could simply
>>> check the blocks along the path (to see if they all have 0 counts but
>>> non-0 frequencies). I don't want to check all bbs in the function for
>>> every path. Honza, is there a good way to detect this case (function
>>> is marked PROFILE_READ but has all-0 bb counts and we kept the
>>> estimated frequencies)?
>
> I am also seeing cases where the entry bb 0 has a non-zero count, but
> the rest of the bbs in the function have 0 counts and estimated
> frequencies. So I think the safest thing to do here, and what I have
> implemented, is if the profile status is PROFILE_READ, then check all
> the bbs/edges along the path - if and only if they all have 0 counts
> and there is at least one non-zero frequency do we do the
> freqs_to_counts.
>
>>>
>>> Even when I skip freqs_to_counts_path in this case, we still get the
>>> insane edge probability. I dumped out some more verbose info while
>>> jump threading, and I am seeing a jump threading path that I don't
>>> understand. Since it violates my assumptions, the new profile update
>>> computations go haywire. Here is the path that we are attempting to
>>> thread:
>>>
>>> (119, 150) incoming edge;  (150, 155) joiner;  (13, 15) normal;
>>>
>>> Notice that the normal edge is not connected to the rest of the path.
>>> This path appears to be constructed during jump threading, as block
>>> 155 was created by an earlier threading opportunity. In fact, the
>>> earlier threadings that created bb 155 removed all predecessors of bb
>>> 13. We originally had
>>>
>>> bb 150 with successors bb 12 and bb 13
>>> bb 12 with successor bb 13.
>>>
>>> Then we do:
>>>   Threaded jump 12 --> 13 to 155
>>>   Threaded jump 150 --> 13 to 155
>>> and bb 13 has no more predecessors. So I'm not sure what it means to
>>> have a jump threading path through 150 and 13?
>>>
>>> Jeff, is the above type of jump threading path expected and
>>> reasonable? If so, I need to redo some of the profile update code to
>>> handle it better.
>>
>> I see what happened - 155 is a duplicate of 13 from the earlier
>> threading. So the path that was originally:
>>
>> (119, 150) incoming edge;  (150, 13) joiner;  (13, 15) normal;
>>
>> became
>>
>> (119, 150) incoming edge;  (150, 155) joiner;  (13, 15) normal;
>>
>> This happened naturally when we redirected the incoming edge 150->13
>> to 155 on the earlier jump threading. During the same threading we
>> created duplicate bb 154 of bb 15, and redirect bb 155 to bb 154. This
>> redirection does not affect the original 13->15 edge, and thus 13->15
>> remains on the above path. The 155->154 edge should be the new edge in
>> the path instead of 13->15. But the edges recorded in other paths are
>> not explicitly updated, so the normal edge was left alone.
>>
>> This works out ok from a correctness standpoint, since bb 13 still
>> exists and has the same outgoing edges as it did before, which were
>> duplicated onto bb 154. However, those outgoing edges have had their
>> profile counts updated and it no longer works out to use those profile
>> counts to update the counts along this new threading path. We really
>> want to look at 155->154 for the profile to update.
>>
>> Probably we should be updating remaining jump threading paths when we
>> perform jump threading. I need to think about the best way to do that.
>
> Looking at the code, I think it attempts to check for this case and
> prevent it but that code does not work in this case because of the
> order the paths are identified and subsequently iterated in the paths
> vec. In mark_threaded_blocks it looks at paths with joiners and if
> there are any edges along the path already part of another path it
> will skip that path. But in this case, we registered the paths in this
> order:
>
>   Registering jump thread: (119, 150) incoming edge;  (150, 13)
> joiner;  (13, 15) normal;
> ...
>   Registering jump thread: (150, 13) incoming edge;  (13, 15) joiner;
> (15, 17) normal;
>
> For the first path, the path is attached to incoming edge 119->150. So
> when we walk edges in the 2nd path we don't see the first path. If we
> looked at the paths in the reverse order we would have seen the path
> attached to incoming edge 150->13 and skipped the 119->150 path. Note
> that we end up doing the actual threading in the reverse order - first
> we do the threading through 13 (the second path), then later the
> threading through 150 (the first path), leading to the issue I am
> seeing.
>
> Jeff, what is intended here - should we not be threading both of these paths?

I have a patch to make the mark_threaded_blocks checking of paths work
regardless of the ordering of paths in the vec. This fixes the
failure.

The other approach is whenever we finish threading a path, go through
the vec of remaining paths and update the edges for any that have been
affected by the threading and that should instead include the
duplicated edges.

Teresa

>
> Thanks,
> Teresa
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413



More information about the Gcc-bugs mailing list