[GOOGLE] Unrestrict early inline restrictions for AutoFDO

Xinliang David Li davidxl@google.com
Mon Jun 3 04:36:00 GMT 2013


The patch is ok if performance test passes.  For a complete fix, Is it
better to tune down PARAM_EARLY_INLINE_INSNS from 11 to a small value
for autoFDO or use a different parameter?

David

On Sun, Jun 2, 2013 at 9:19 PM, Dehao Chen <dehao@google.com> wrote:
> I've updated the patch to check it at ipa-inline:
>
> Index: gcc/ipa-inline.c
> ===================================================================
> --- gcc/ipa-inline.c (revision 199593)
> +++ gcc/ipa-inline.c (working copy)
> @@ -434,6 +434,16 @@ want_early_inline_function_p (struct cgraph_edge *
>
>        if (growth <= PARAM_VALUE (PARAM_EARLY_INLINING_INSNS_ANY))
>   ;
> +      else if (flag_auto_profile)
> + {
> +  if (dump_file)
> +    fprintf (dump_file, "  will not early inline: %s/%i->%s/%i, "
> +     "call is cold in profiling and code would grow by %i\n",
> +     xstrdup (cgraph_node_name (e->caller)), e->caller->uid,
> +     xstrdup (cgraph_node_name (callee)), callee->uid,
> +     growth);
> +    want_inline = false;
> + }
>        else if (!cgraph_maybe_hot_edge_p (e))
>   {
>    if (dump_file)
>
> Thanks,
> Dehao
>
> On Sun, Jun 2, 2013 at 9:08 PM, Xinliang David Li <davidxl@google.com> wrote:
>> If the purpose of the fix is to filter early inlinings with code
>> growth in autoFDO, the proposed fix is the wrong way to do -- it
>> changes the meaning of cgraph_maybe_hot_edge_p.
>>
>> David
>>
>> On Sun, Jun 2, 2013 at 7:25 PM, Dehao Chen <dehao@google.com> wrote:
>>> On Sun, Jun 2, 2013 at 7:14 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>
>>>> auto profile info is not available yet in early inlining, why would
>>>> this change make any difference?
>>>
>>> Because the check of PARAM_EARLY_INLINING_INSNS is after the check of
>>> cgraph_maybe_hot_edge_p in early inline. If
>>> cgraph_maybe_hot_edge_p fails, the early inline will not happen even
>>> if growth is less than PARAM_EARLY_INLINING_INSNS.
>>>
>>>>
>>>> Can you just reset the max_iters to a
>>>> higher value for autoFDO?
>>>
>>> We could do that, but it could still lead to some code bloat because
>>> recursive inlines can happen for at most, say 10, iterations.
>>>
>>> Dehao
>>>
>>>>
>>>> David
>>>>
>>>> On Sun, Jun 2, 2013 at 6:21 PM, Dehao Chen <dehao@google.com> wrote:
>>>> > The patch was committed to google-4_8, but it causes problem because
>>>> > einline sets PARAM_EARLY_INLINING_INSNS = 11. This will cause
>>>> > recursive inlining at einline stage (e.g. main->foo, foo->bar,
>>>> > bar->foo) when autofdo is enabled.
>>>> >
>>>> > The following patch can fix the problem by doing more targetted early inlining:
>>>> >
>>>> > Index: gcc/predict.c
>>>> > ===================================================================
>>>> > --- gcc/predict.c (revision 199593)
>>>> > +++ gcc/predict.c (working copy)
>>>> > @@ -175,6 +175,8 @@ cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
>>>> >        && !maybe_hot_count_p (NULL,
>>>> >                               edge->count))
>>>> >      return false;
>>>> > +  if (flag_auto_profile)
>>>> > +    return false;
>>>> >    if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
>>>> >        || (edge->callee
>>>> >    && edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))
>>>> >
>>>> > Performance testing on-going...
>>>> >
>>>> > Dehao
>>>> >
>>>> > On Wed, May 29, 2013 at 3:44 PM, Dehao Chen <dehao@google.com> wrote:
>>>> >> OK, I'll commit the early inline part.
>>>> >>
>>>> >> Dehao
>>>> >>
>>>> >> On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>> >>> The early inlining part is ok. The tracer optimization should be
>>>> >>> revisited -- we should have more fine grain control on it (for
>>>> >>> instance, based on FDO summary -- but that should be common to
>>>> >>> FDO/LIPO).
>>>> >>>
>>>> >>> David
>>>> >>>
>>>> >>> On Wed, May 29, 2013 at 9:39 AM, Dehao Chen <dehao@google.com> wrote:
>>>> >>>> In gcc4-8, the max einline iterations are restricted to 1. For
>>>> >>>> AutoFDO, this is bad because early inline is not size restricted. This
>>>> >>>> patch allows einline to do multiple iterations in AutoFDO. It also
>>>> >>>> enables tracer optimization in AutoFDO.
>>>> >>>>
>>>> >>>> Bootstrapped and passed regression test.
>>>> >>>>
>>>> >>>> OK for googel-4_8?
>>>> >>>>
>>>> >>>> Thanks,
>>>> >>>> Dehao
>>>> >>>>
>>>> >>>> Index: gcc/ipa-inline.c
>>>> >>>> ===================================================================
>>>> >>>> --- gcc/ipa-inline.c (revision 199416)
>>>> >>>> +++ gcc/ipa-inline.c (working copy)
>>>> >>>> @@ -2161,7 +2161,8 @@ early_inliner (void)
>>>> >>>>      {
>>>> >>>>        /* We iterate incremental inlining to get trivial cases of indirect
>>>> >>>>   inlining.  */
>>>> >>>> -      while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
>>>> >>>> +      while ((flag_auto_profile
>>>> >>>> +      || iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS))
>>>> >>>>       && early_inline_small_functions (node))
>>>> >>>>   {
>>>> >>>>    timevar_push (TV_INTEGRATION);
>>>> >>>> Index: gcc/opts.c
>>>> >>>> ===================================================================
>>>> >>>> --- gcc/opts.c (revision 199416)
>>>> >>>> +++ gcc/opts.c (working copy)
>>>> >>>> @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
>>>> >>>>   opts->x_flag_peel_loops = value;
>>>> >>>>        if (!opts_set->x_flag_value_profile_transformations)
>>>> >>>>   opts->x_flag_value_profile_transformations = value;
>>>> >>>> +      if (!opts_set->x_flag_tracer)
>>>> >>>> + opts->x_flag_tracer = value;
>>>> >>>>        if (!opts_set->x_flag_inline_functions)
>>>> >>>>   opts->x_flag_inline_functions = value;
>>>> >>>>        if (!opts_set->x_flag_ipa_cp)



More information about the Gcc-patches mailing list