[GOOGLE] Fix AutoFDO size issue

Xinliang David Li davidxl@google.com
Mon Nov 17 22:58:00 GMT 2014


Ok for now as a workraround, but this is probably not a long term fix.

David


On Mon, Nov 17, 2014 at 12:47 PM, Dehao Chen <dehao@google.com> wrote:
> The patch was updated to ignore comdat einline tuning for AutoFDO.
> Performance testing is green.
>
> OK for google-4_9?
>
> Thanks,
> Dehao
>
> Index: gcc/auto-profile.c
> ===================================================================
> --- gcc/auto-profile.c (revision 217523)
> +++ gcc/auto-profile.c (working copy)
> @@ -1771,6 +1771,7 @@ auto_profile (void)
>        free_dominance_info (CDI_DOMINATORS);
>        free_dominance_info (CDI_POST_DOMINATORS);
>        rebuild_cgraph_edges ();
> +      compute_inline_parameters (cgraph_get_node
> (current_function_decl), true);
>        pop_cfun ();
>      }
>
> Index: gcc/ipa-inline.c
> ===================================================================
> --- gcc/ipa-inline.c (revision 217523)
> +++ gcc/ipa-inline.c (working copy)
> @@ -501,7 +501,7 @@ want_early_inline_function_p (struct cgraph_edge *
>       growth);
>    want_inline = false;
>   }
> -      else if (DECL_COMDAT (callee->decl)
> +      else if (!flag_auto_profile && DECL_COMDAT (callee->decl)
>                 && growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_COMDAT))
>          ;
>        else if ((n = num_calls (callee)) != 0
>
> On Thu, Nov 13, 2014 at 3:42 PM, Dehao Chen <dehao@google.com> wrote:
>> We do not do sophisticated recursive call detection in einline phase.
>> It only happens in ipa-inline phase.
>>
>> Dehao
>>
>> On Thu, Nov 13, 2014 at 3:18 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> On Thu, Nov 13, 2014 at 2:57 PM, Dehao Chen <dehao@google.com> wrote:
>>>> IIRC, AutoFDO the actual iteration for AutoFDO is mostly <3. But it
>>>> should not harm to set max iter as 10.
>>>>
>>>> On Thu, Nov 13, 2014 at 2:51 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>> After inline summary is recomputed, the large code growth problem will
>>>>> also be better controlled, right?
>>>>
>>>> For this case, recomputing inline summary does not help because the
>>>> code was bloated in first einline phase.
>>>
>>> For recursive inlining, the inline summary for the cloned edges need
>>> to be updated to prevent the growth?
>>>
>>> david
>>>
>>>>
>>>> Dehao
>>>>
>>>>>
>>>>> David
>>>>>
>>>>> On Thu, Nov 13, 2014 at 2:48 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>> Is there a need to have 10 iterations of early inline for autofdo?
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On Thu, Nov 13, 2014 at 2:25 PM, Dehao Chen <dehao@google.com> wrote:
>>>>>>> In AutoFDO, we increase einline iterations. This could lead to
>>>>>>> extensive code bloat if we have recursive calls like:
>>>>>>>
>>>>>>> dtor() {
>>>>>>>   destroy(node);
>>>>>>> }
>>>>>>>
>>>>>>> destroy(node) {
>>>>>>>   destroy(left)
>>>>>>>   destroy(right)
>>>>>>> }
>>>>>>>
>>>>>>> In this case, the size growth will be around 8 which is smaller than
>>>>>>> threshold (11). However, if we allow this to happen for 2 iterations,
>>>>>>> it will expand the size by 1024X. To fix this problem, we want to set
>>>>>>> a much smaller threshold in the AutoFDO case. This is because AutoFDO
>>>>>>> do not not rely on aggressive einline to gain more profile context.
>>>>>>>
>>>>>>> And also, in AutoFDO pass, after we processed a function, we need to
>>>>>>> recompute inline parameters because rebuild_cgraph_edges will zero out
>>>>>>> all inline parameters.
>>>>>>>
>>>>>>> The patch is attached below, bootstrapped and perf test on-going. OK
>>>>>>> for google-4_9?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Dehao
>>>>>>>
>>>>>>> Index: gcc/auto-profile.c
>>>>>>> ===================================================================
>>>>>>> --- gcc/auto-profile.c (revision 217523)
>>>>>>> +++ gcc/auto-profile.c (working copy)
>>>>>>> @@ -1771,6 +1771,7 @@ auto_profile (void)
>>>>>>>        free_dominance_info (CDI_DOMINATORS);
>>>>>>>        free_dominance_info (CDI_POST_DOMINATORS);
>>>>>>>        rebuild_cgraph_edges ();
>>>>>>> +      compute_inline_parameters (cgraph_get_node
>>>>>>> (current_function_decl), true);
>>>>>>>        pop_cfun ();
>>>>>>>      }
>>>>>>>
>>>>>>> Index: gcc/opts.c
>>>>>>> ===================================================================
>>>>>>> --- gcc/opts.c (revision 217523)
>>>>>>> +++ gcc/opts.c (working copy)
>>>>>>> @@ -1853,6 +1853,12 @@ common_handle_option (struct gcc_options *opts,
>>>>>>>        maybe_set_param_value (
>>>>>>>   PARAM_EARLY_INLINER_MAX_ITERATIONS, 10,
>>>>>>>   opts->x_param_values, opts_set->x_param_values);
>>>>>>> +      maybe_set_param_value (
>>>>>>> +        PARAM_EARLY_INLINING_INSNS, 4,
>>>>>>> +        opts->x_param_values, opts_set->x_param_values);
>>>>>>> +      maybe_set_param_value (
>>>>>>> +        PARAM_EARLY_INLINING_INSNS_COMDAT, 4,
>>>>>>> +        opts->x_param_values, opts_set->x_param_values);
>>>>>>>        value = true;
>>>>>>>      /* No break here - do -fauto-profile processing. */
>>>>>>>      case OPT_fauto_profile:



More information about the Gcc-patches mailing list