This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [GOOGLE] Fix AutoFDO size issue
- From: Dehao Chen <dehao at google dot com>
- To: Xinliang David Li <davidxl at google dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 17 Nov 2014 12:47:22 -0800
- Subject: Re: [GOOGLE] Fix AutoFDO size issue
The patch was updated to ignore comdat einline tuning for AutoFDO.
Performance testing is green.
OK for google-4_9?
Thanks,
Dehao
Index: gcc/auto-profile.c
===================================================================
--- gcc/auto-profile.c (revision 217523)
+++ gcc/auto-profile.c (working copy)
@@ -1771,6 +1771,7 @@ auto_profile (void)
free_dominance_info (CDI_DOMINATORS);
free_dominance_info (CDI_POST_DOMINATORS);
rebuild_cgraph_edges ();
+ compute_inline_parameters (cgraph_get_node (current_function_decl), true);
pop_cfun ();
}
Index: gcc/ipa-inline.c
===================================================================
--- gcc/ipa-inline.c (revision 217523)
+++ gcc/ipa-inline.c (working copy)
@@ -501,7 +501,7 @@ want_early_inline_function_p (struct cgraph_edge *
growth);
want_inline = false;
}
- else if (DECL_COMDAT (callee->decl)
+ else if (!flag_auto_profile && DECL_COMDAT (callee->decl)
&& growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_COMDAT))
;
else if ((n = num_calls (callee)) != 0
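
The effect of the ipa-inline.c change can be sketched in isolation. The snippet below is only a minimal model of the one condition touched by the hunk, not the real want_early_inline_function_p: the function name and its plain parameters are hypothetical stand-ins for flag_auto_profile, DECL_COMDAT (callee->decl), the estimated growth, and PARAM_EARLY_INLINING_INSNS_COMDAT. The surrounding branches, which the diff only partly shows, are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the patched condition.  Returns true when the
   COMDAT branch is taken, i.e. when the later num_calls check in the
   else-if chain is skipped for this edge.  */
bool
comdat_branch_taken (bool auto_profile, bool callee_is_comdat,
                     int growth, int comdat_limit)
{
  assert (comdat_limit >= 0);
  /* With the patch, the COMDAT allowance applies only when AutoFDO
     is off.  */
  return !auto_profile && callee_is_comdat && growth <= comdat_limit;
}
```

With illustrative values taken from the quoted message (growth 8, limit 11), comdat_branch_taken (false, true, 8, 11) is true, while the same edge under AutoFDO, comdat_branch_taken (true, true, 8, 11), is false and falls through to the remaining checks.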
On Thu, Nov 13, 2014 at 3:42 PM, Dehao Chen <dehao@google.com> wrote:
> We do not do sophisticated recursive call detection in einline phase.
> It only happens in ipa-inline phase.
>
> Dehao
>
> On Thu, Nov 13, 2014 at 3:18 PM, Xinliang David Li <davidxl@google.com> wrote:
>> On Thu, Nov 13, 2014 at 2:57 PM, Dehao Chen <dehao@google.com> wrote:
>>> IIRC, the actual iteration count for AutoFDO is mostly <3. But it
>>> should not harm to set max iter as 10.
>>>
>>> On Thu, Nov 13, 2014 at 2:51 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>> After inline summary is recomputed, the large code growth problem will
>>>> also be better controlled, right?
>>>
>>> For this case, recomputing inline summary does not help because the
>>> code was bloated in first einline phase.
>>
>> For recursive inlining, doesn't the inline summary for the cloned edges
>> need to be updated to prevent the growth?
>>
>> david
>>
>>>
>>> Dehao
>>>
>>>>
>>>> David
>>>>
>>>> On Thu, Nov 13, 2014 at 2:48 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>> Is there a need to have 10 iterations of early inline for autofdo?
>>>>>
>>>>> David
>>>>>
>>>>> On Thu, Nov 13, 2014 at 2:25 PM, Dehao Chen <dehao@google.com> wrote:
>>>>>> In AutoFDO, we increase einline iterations. This could lead to
>>>>>> extensive code bloat if we have recursive calls like:
>>>>>>
>>>>>> dtor() {
>>>>>>   destroy(node);
>>>>>> }
>>>>>>
>>>>>> destroy(node) {
>>>>>>   destroy(node->left);
>>>>>>   destroy(node->right);
>>>>>> }
>>>>>>
>>>>>> In this case, the size growth will be around 8, which is smaller than
>>>>>> the threshold (11). However, if we allow this to happen for 2 iterations,
>>>>>> it will expand the size by 1024X. To fix this problem, we want to set
>>>>>> a much smaller threshold in the AutoFDO case. This is because AutoFDO
>>>>>> does not rely on aggressive einline to gain more profile context.
>>>>>>
>>>>>> Also, in the AutoFDO pass, after we have processed a function, we need
>>>>>> to recompute its inline parameters because rebuild_cgraph_edges will
>>>>>> zero out all inline parameters.
>>>>>>
>>>>>> The patch is attached below, bootstrapped and perf test on-going. OK
>>>>>> for google-4_9?
>>>>>>
>>>>>> Thanks,
>>>>>> Dehao
>>>>>>
>>>>>> Index: gcc/auto-profile.c
>>>>>> ===================================================================
>>>>>> --- gcc/auto-profile.c (revision 217523)
>>>>>> +++ gcc/auto-profile.c (working copy)
>>>>>> @@ -1771,6 +1771,7 @@ auto_profile (void)
>>>>>> free_dominance_info (CDI_DOMINATORS);
>>>>>> free_dominance_info (CDI_POST_DOMINATORS);
>>>>>> rebuild_cgraph_edges ();
>>>>>> + compute_inline_parameters (cgraph_get_node (current_function_decl), true);
>>>>>> pop_cfun ();
>>>>>> }
>>>>>>
>>>>>> Index: gcc/opts.c
>>>>>> ===================================================================
>>>>>> --- gcc/opts.c (revision 217523)
>>>>>> +++ gcc/opts.c (working copy)
>>>>>> @@ -1853,6 +1853,12 @@ common_handle_option (struct gcc_options *opts,
>>>>>> maybe_set_param_value (
>>>>>> PARAM_EARLY_INLINER_MAX_ITERATIONS, 10,
>>>>>> opts->x_param_values, opts_set->x_param_values);
>>>>>> + maybe_set_param_value (
>>>>>> + PARAM_EARLY_INLINING_INSNS, 4,
>>>>>> + opts->x_param_values, opts_set->x_param_values);
>>>>>> + maybe_set_param_value (
>>>>>> + PARAM_EARLY_INLINING_INSNS_COMDAT, 4,
>>>>>> + opts->x_param_values, opts_set->x_param_values);
>>>>>> value = true;
>>>>>> /* No break here - do -fauto-profile processing. */
>>>>>> case OPT_fauto_profile:
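
For a rough sense of how the recursive destroy example quoted above can blow up, here is a small sketch under a deliberately simplified assumption: each early-inline iteration replaces every remaining recursive call site with a copy of the original body, which itself contains two recursive calls. This is not how GCC estimates growth, and the function name is hypothetical. Under this model the number of call sites doubles per iteration, so the 10 iterations set by PARAM_EARLY_INLINER_MAX_ITERATIONS would give 2^10 = 1024 copies.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model (an assumption, not GCC's size estimate): count the
   recursive call sites left after ITERATIONS rounds of inlining, when
   each inlined body brings in CALLS_PER_BODY new recursive calls.  */
uint64_t
call_sites_after (unsigned calls_per_body, unsigned iterations)
{
  uint64_t sites = 1;  /* the single call from dtor() */
  for (unsigned i = 0; i < iterations; i++)
    {
      /* Guard against overflow of the 64-bit counter.  */
      assert (calls_per_body == 0 || sites <= UINT64_MAX / calls_per_body);
      sites *= calls_per_body;
    }
  return sites;
}
```

For example, call_sites_after (2, 3) gives 8 and call_sites_after (2, 10) gives 1024, which is why capping the per-edge growth threshold matters more as the iteration limit rises.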