This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] decide edge's hotness when there is profile info
- From: Dehao Chen <dehao at google dot com>
- To: Teresa Johnson <tejohnson at google dot com>
- Cc: Jan Hubicka <hubicka at ucw dot cz>, Xinliang David Li <davidxl at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 15 Oct 2013 08:40:28 -0700
- Subject: Re: [PATCH] decide edge's hotness when there is profile info
- Authentication-results: sourceware.org; auth=none
- References: <CAO2gOZVX2QuoUGRy6uRiT4A+=xdeCXmq04tF0Qr1kqezP-y5-A at mail dot gmail dot com> <CAAkRFZLHHDgtfCQQE06O8NKMfSzQuXyweHSoBUwTWseUu7KTnQ at mail dot gmail dot com> <CAO2gOZXwT1Nu4z5NKr1y6xkY2o1dQ6rNxvhT5ap1Kv-WAEo=Pg at mail dot gmail dot com> <20131014194908 dot GA16717 at kam dot mff dot cuni dot cz> <CAO2gOZU8rdLYJm240xz1T=s_+uOx=DV4nM8YEzBHMY9rC5dw4A at mail dot gmail dot com> <CAAkRFZJWSXd6wkVXwsvPf_zkjY=5DF0YhRL6LCnkPn1qttebUw at mail dot gmail dot com> <CAO2gOZVPv8G5yy6K+itKN=Ra6JCUkpZd2HGGfb34ZKkP75pmuA at mail dot gmail dot com> <20131014215027 dot GD17422 at kam dot mff dot cuni dot cz> <CAO2gOZXeGP8nZsa0aoji1dVxkRjT69G_ySU3iWbK9oJrjF2n_g at mail dot gmail dot com> <CAAe5K+VJ70+Sj_J-uwcr+W6THPZXe5cy-hwtBCpJXms2=nPimg at mail dot gmail dot com>
Thanks for the pointer to Honza's patch. The patch does exactly what I
need. But it only resides in the instrumentation based FDO path. Can
we move the code to more common place (like rebuild_cgraph_edges)?
Thanks,
Dehao
On Tue, Oct 15, 2013 at 7:59 AM, Teresa Johnson <tejohnson@google.com> wrote:
> On Mon, Oct 14, 2013 at 3:26 PM, Dehao Chen <dehao@google.com> wrote:
>> On Mon, Oct 14, 2013 at 2:50 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>> For my test case, the entire inline instance is optimized away, so
>>>> there is no info about it in the profile. I can do some fixup in the
>>>> rebuild_cgraph_edge though.
>>>
>>> Yep, I understand that. In this case we should turn PROFILE_READ to PROFILE_GUESSED
>>> and guess the profile when we detect this (i.e. we have edges with non-0 counts into
>>> functions with 0 profile). That should prvent these from getting UNLIKELY_EXECUTED
>>> and they will be inlined normal way.
>>
>> Oh, actually in AutoFDO, only functions with samples will be marked as
>> PROFILE_READ. Others will all be marked as PROFILE_GUESSED.
>
> Here is Honza's patch that he was referring to:
>
> Index: tree-profile.c
> ===================================================================
> --- tree-profile.c (revision 201838)
> +++ tree-profile.c (working copy)
> @@ -604,6 +604,34 @@
>
> pop_cfun ();
> }
> + /* See if 0 count function has non-0 count callers. In this case we
> + lost some profile. Drop its function profile to PROFILE_GUESSED. */
> + FOR_EACH_DEFINED_FUNCTION (node)
> + {
> + struct cgraph_edge *e;
> + bool called = false;
> + if (node->count)
> + continue;
> + for (e = node->callers; e; e = e->next_caller)
> + {
> + if (e->count)
> + called = true;
> + if (cgraph_maybe_hot_edge_p (e))
> + break;
> + }
> + if (e || called
> + && profile_status_for_function
> + (DECL_STRUCT_FUNCTION (node->symbol.decl)) == PROFILE_READ)
> + {
> + if (dump_file)
> + fprintf (stderr, "Dropping 0 profile for %s/%i.%s based on calls.\n",
> + cgraph_node_name (node), node->symbol.order,
> + e ? "function is hot" : "function is normal");
> + profile_status_for_function (DECL_STRUCT_FUNCTION (node->symbol.decl))
> + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT);
> + node->frequency = e ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL;
> + }
> + }
>
> del_node_map();
> return 0;
> Index: predict.c
> ===================================================================
> --- predict.c (revision 201838)
> +++ predict.c (working copy)
> @@ -2715,6 +2715,9 @@
> gcov_type count_max, true_count_max = 0;
> basic_block bb;
>
> + if (!ENTRY_BLOCK_PTR->count)
> + return 0;
> +
> FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
> true_count_max = MAX (bb->count, true_count_max);
>
>
> Which is discussed in this email:
> http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01185.html
>
> For COMDATs I need to extend this to do it a little later to do it
> recursively to catch the case of COMDATs feeding other COMDATs and I
> need to do some other handling to compute counts from the frequencies
> when inlining. I have been meaning to work on this for awhile but
> finally am getting to it this week. (Here's the last message from a
> later thread that forked off the above one:
> http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01907.html)
>
> In the meantime, perhaps Honza's patch will suffice?
>
> Teresa
>
>>
>> Dehao
>>
>>>
>>> Honza
>>>>
>>>> Dehao
>>>>
>>>> On Mon, Oct 14, 2013 at 2:27 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>> > Is it possible to update the callee node summary after profile
>>>> > annotate (using information from inline instances which are not
>>>> > inlined in early inline)?
>>>> >
>>>> > David
>>>> >
>>>> > On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen <dehao@google.com> wrote:
>>>> >> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>> >>>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
>>>> >>>> could be a potential risk because some callee is marked unlikely
>>>> >>>> executed simply because they are inlined and eliminated in the O2
>>>> >>>> binary. But in ipa-inline it will not get inlined because the edge is
>>>> >>>> not hot from cgraph_maybe_hot_edge_p (because callee is
>>>> >>>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
>>>> >>>
>>>> >>> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
>>>> >>> It seems that having profile set incorrectly will lead to other problems later, too.
>>>> >>> We discussed similar problem with Teresa about the missing profiles for comdat,
>>>> >>> basically one should detect these cases as profile being lost and go with guessed
>>>> >>> profile. (I believe patch for that was posted, too, and so far it seems best approach
>>>> >>> to this issue)
>>>> >>
>>>> >> The current AutoFDO implementation will take all functions that do not
>>>> >> have have profile as normally executed, thus use guessed profile for
>>>> >> it. This is like using profile for truly hot functions, and using O2
>>>> >> for other functions. This works fine. However, it leads to larger code
>>>> >> size (approximately 10%~20% larger than FDO).
>>>> >>
>>>> >> I'd like to introduce another mode for users who care about both
>>>> >> performance and code size, and can be sure that profile is
>>>> >> representative. In this mode, we will mark all functions without
>>>> >> sample as "unlikely executed". However, because AutoFDO use debug info
>>>> >> (of optimized code) to represent profile, it's possible that some hot
>>>> >> functions (say foo) are inlined and fully eliminated into another hot
>>>> >> function (say bar). So in the profile, bar is cold, and because the
>>>> >> profile for foo::bar is eliminated, bar will not be inlined into foo
>>>> >> before the profile annotation. However, after profile annotate, we can
>>>> >> infer from the bb count that foo->bar is hot, thus it should be
>>>> >> inlined in ipa-inline phase. However, because bar itself is marked
>>>> >> UNLIKELY_EXECUTED, it will not be inlined.
>>>> >>
>>>> >> One possible workaround would be that during rebuild_cgraph_edges, if
>>>> >> we find an edge's callee is unlikely executed, add the edge count to
>>>> >> the callee's count and recalculate callee's frequency.
>>>> >>
>>>> >> Dehao
>>>> >>
>>>> >>>
>>>> >>> Honza
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413