This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] decide edge's hotness when there is profile info
- From: Dehao Chen <dehao at google dot com>
- To: Jan Hubicka <hubicka at ucw dot cz>
- Cc: Xinliang David Li <davidxl at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 15 Oct 2013 07:56:29 -0700
- Subject: Re: [PATCH] decide edge's hotness when there is profile info
- Authentication-results: sourceware.org; auth=none
- References: <CAO2gOZVX2QuoUGRy6uRiT4A+=xdeCXmq04tF0Qr1kqezP-y5-A at mail dot gmail dot com> <CAAkRFZLHHDgtfCQQE06O8NKMfSzQuXyweHSoBUwTWseUu7KTnQ at mail dot gmail dot com> <CAO2gOZXwT1Nu4z5NKr1y6xkY2o1dQ6rNxvhT5ap1Kv-WAEo=Pg at mail dot gmail dot com> <20131014194908 dot GA16717 at kam dot mff dot cuni dot cz> <CAO2gOZU8rdLYJm240xz1T=s_+uOx=DV4nM8YEzBHMY9rC5dw4A at mail dot gmail dot com> <CAAkRFZJWSXd6wkVXwsvPf_zkjY=5DF0YhRL6LCnkPn1qttebUw at mail dot gmail dot com> <CAO2gOZVPv8G5yy6K+itKN=Ra6JCUkpZd2HGGfb34ZKkP75pmuA at mail dot gmail dot com> <20131014215027 dot GD17422 at kam dot mff dot cuni dot cz> <CAO2gOZXeGP8nZsa0aoji1dVxkRjT69G_ySU3iWbK9oJrjF2n_g at mail dot gmail dot com>
The patch updated:
Index: gcc/cgraph.c
===================================================================
--- gcc/cgraph.c (revision 203609)
+++ gcc/cgraph.c (working copy)
@@ -877,6 +877,10 @@ cgraph_create_edge_1 (struct cgraph_node *caller,
if (call_stmt && caller->call_site_hash)
cgraph_add_edge_to_call_site_hash (edge);
+ if (count > 0 && edge->callee
+ && edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
+ edge->callee->frequency = NODE_FREQUENCY_NORMAL;
+
return edge;
}
Thanks,
Dehao
On Mon, Oct 14, 2013 at 3:26 PM, Dehao Chen <dehao@google.com> wrote:
> On Mon, Oct 14, 2013 at 2:50 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>> For my test case, the entire inline instance is optimized away, so
>>> there is no info about it in the profile. I can do some fixup in the
>>> rebuild_cgraph_edge though.
>>
>> Yep, I understand that. In this case we should turn PROFILE_READ to PROFILE_GUESSED
>> and guess the profile when we detect this (i.e. we have edges with non-0 counts into
>> functions with 0 profile). That should prvent these from getting UNLIKELY_EXECUTED
>> and they will be inlined normal way.
>
> Oh, actually in AutoFDO, only functions with samples will be marked as
> PROFILE_READ. Others will all be marked as PROFILE_GUESSED.
>
> Dehao
>
>>
>> Honza
>>>
>>> Dehao
>>>
>>> On Mon, Oct 14, 2013 at 2:27 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> > Is it possible to update the callee node summary after profile
>>> > annotate (using information from inline instances which are not
>>> > inlined in early inline)?
>>> >
>>> > David
>>> >
>>> > On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen <dehao@google.com> wrote:
>>> >> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>> >>>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
>>> >>>> could be a potential risk because some callee is marked unlikely
>>> >>>> executed simply because they are inlined and eliminated in the O2
>>> >>>> binary. But in ipa-inline it will not get inlined because the edge is
>>> >>>> not hot from cgraph_maybe_hot_edge_p (because callee is
>>> >>>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
>>> >>>
>>> >>> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
>>> >>> It seems that having profile set incorrectly will lead to other problems later, too.
>>> >>> We discussed similar problem with Teresa about the missing profiles for comdat,
>>> >>> basically one should detect these cases as profile being lost and go with guessed
>>> >>> profile. (I believe patch for that was posted, too, and so far it seems best approach
>>> >>> to this issue)
>>> >>
>>> >> The current AutoFDO implementation will take all functions that do not
>>> >> have have profile as normally executed, thus use guessed profile for
>>> >> it. This is like using profile for truly hot functions, and using O2
>>> >> for other functions. This works fine. However, it leads to larger code
>>> >> size (approximately 10%~20% larger than FDO).
>>> >>
>>> >> I'd like to introduce another mode for users who care about both
>>> >> performance and code size, and can be sure that profile is
>>> >> representative. In this mode, we will mark all functions without
>>> >> sample as "unlikely executed". However, because AutoFDO use debug info
>>> >> (of optimized code) to represent profile, it's possible that some hot
>>> >> functions (say foo) are inlined and fully eliminated into another hot
>>> >> function (say bar). So in the profile, bar is cold, and because the
>>> >> profile for foo::bar is eliminated, bar will not be inlined into foo
>>> >> before the profile annotation. However, after profile annotate, we can
>>> >> infer from the bb count that foo->bar is hot, thus it should be
>>> >> inlined in ipa-inline phase. However, because bar itself is marked
>>> >> UNLIKELY_EXECUTED, it will not be inlined.
>>> >>
>>> >> One possible workaround would be that during rebuild_cgraph_edges, if
>>> >> we find an edge's callee is unlikely executed, add the edge count to
>>> >> the callee's count and recalculate callee's frequency.
>>> >>
>>> >> Dehao
>>> >>
>>> >>>
>>> >>> Honza