This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [GOOGLE] Unrestrict early inline restrictions for AutoFDO
- From: Dehao Chen <dehao at google dot com>
- To: Xinliang David Li <davidxl at google dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Teresa Johnson <tejohnson at google dot com>
- Date: Sun, 2 Jun 2013 18:21:27 -0700
- Subject: Re: [GOOGLE] Unrestrict early inline restrictions for AutoFDO
- References: <CAO2gOZVq76GUFzY5i=EsgS6CjD+_S1tiwN9EyoWLF3nr=OvE8Q at mail dot gmail dot com> <CAAkRFZKQbN3hf2HVYjxR+JAUqCMwUi=CNF3Os_uatW8LRm+L1w at mail dot gmail dot com> <CAO2gOZWmrU6NBuJ8N=SsqFeyYc+1ewsRHQvdDv-JpCxPO2gSXQ at mail dot gmail dot com>
The patch was committed to google-4_8, but it causes a problem because
einline sets PARAM_EARLY_INLINING_INSNS = 11. With AutoFDO enabled and
the iteration limit lifted, this leads to recursive inlining at the
einline stage (e.g. main->foo, foo->bar, bar->foo).
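
To illustrate why lifting the iteration limit matters, here is a toy
model of the loop condition from the original patch quoted further
down. It is only a sketch, not GCC code: toy_early_inliner_rounds,
rounds_with_work and safety_cap are hypothetical names, and
rounds_with_work stands in for how many consecutive times
early_inline_small_functions (node) would report progress.

```c
#include <stdbool.h>

/* Toy model of the patched einline loop condition.  Nothing here is
   GCC code; every name is hypothetical.

   rounds_with_work: number of consecutive rounds in which the stand-in
                     for early_inline_small_functions () finds something
                     to inline.
   safety_cap:       exists only so that the model itself terminates.  */
int
toy_early_inliner_rounds (bool flag_auto_profile, int max_iterations,
                          int rounds_with_work, int safety_cap)
{
  int iterations = 0;

  while ((flag_auto_profile || iterations < max_iterations)
         && iterations < rounds_with_work
         && iterations < safety_cap)
    iterations++;

  return iterations;
}
```

With flag_auto_profile off and max_iterations = 1, the model stops
after one round no matter how much work remains. With
flag_auto_profile on, it stops only when no work is left, so a call
cycle that keeps offering candidates (modelled here as a very large
rounds_with_work) runs until the artificial safety_cap.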
The following patch fixes the problem by doing more targeted early inlining:
Index: gcc/predict.c
===================================================================
--- gcc/predict.c (revision 199593)
+++ gcc/predict.c (working copy)
@@ -175,6 +175,8 @@ cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
       && !maybe_hot_count_p (NULL,
                              edge->count))
     return false;
+  if (flag_auto_profile)
+    return false;
   if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
       || (edge->callee
           && edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))
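
As a rough illustration of what the hunk above changes, here is a
stripped-down model of the predicate. It is a sketch under
assumptions, not the real function: the toy_* types and names are
hypothetical, the profile-count check that precedes the new lines is
omitted, and the final "return true" stands in for whatever checks
follow in the real cgraph_maybe_hot_edge_p.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the cgraph types; not GCC definitions.  */
enum toy_frequency { TOY_UNLIKELY_EXECUTED, TOY_NORMAL };

struct toy_node { enum toy_frequency frequency; };

struct toy_edge
{
  struct toy_node *caller;
  struct toy_node *callee;   /* may be NULL, as in the real check */
};

/* Simplified model of the patched predicate.  */
bool
toy_maybe_hot_edge_p (const struct toy_edge *edge, bool flag_auto_profile)
{
  /* The added lines: under AutoFDO this predicate never says "hot".  */
  if (flag_auto_profile)
    return false;

  if (edge->caller->frequency == TOY_UNLIKELY_EXECUTED
      || (edge->callee
          && edge->callee->frequency == TOY_UNLIKELY_EXECUTED))
    return false;

  /* Assumption: the remaining checks of the real function pass.  */
  return true;
}
```

In this model an edge between two normally executed functions counts
as maybe-hot without AutoFDO and as not hot with it, which is the
only behaviour the hunk itself guarantees.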
Performance testing is ongoing.
Dehao
On Wed, May 29, 2013 at 3:44 PM, Dehao Chen <dehao@google.com> wrote:
> OK, I'll commit the early inline part.
>
> Dehao
>
> On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li <davidxl@google.com> wrote:
>> The early inlining part is ok. The tracer optimization should be
>> revisited -- we should have finer-grained control over it (for
>> instance, based on the FDO summary -- but that should be common to
>> FDO/LIPO).
>>
>> David
>>
>> On Wed, May 29, 2013 at 9:39 AM, Dehao Chen <dehao@google.com> wrote:
>>> In GCC 4.8, the maximum number of einline iterations is restricted
>>> to 1. For AutoFDO, this is bad because early inlining is not size
>>> restricted. This patch allows einline to do multiple iterations in
>>> AutoFDO. It also enables the tracer optimization in AutoFDO.
>>>
>>> Bootstrapped and passed regression test.
>>>
>>> OK for google-4_8?
>>>
>>> Thanks,
>>> Dehao
>>>
>>> Index: gcc/ipa-inline.c
>>> ===================================================================
>>> --- gcc/ipa-inline.c (revision 199416)
>>> +++ gcc/ipa-inline.c (working copy)
>>> @@ -2161,7 +2161,8 @@ early_inliner (void)
>>>      {
>>>        /* We iterate incremental inlining to get trivial cases of indirect
>>>           inlining. */
>>> -      while (iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
>>> +      while ((flag_auto_profile
>>> +              || iterations < PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS))
>>>               && early_inline_small_functions (node))
>>>          {
>>>            timevar_push (TV_INTEGRATION);
>>> Index: gcc/opts.c
>>> ===================================================================
>>> --- gcc/opts.c (revision 199416)
>>> +++ gcc/opts.c (working copy)
>>> @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
>>>          opts->x_flag_peel_loops = value;
>>>        if (!opts_set->x_flag_value_profile_transformations)
>>>          opts->x_flag_value_profile_transformations = value;
>>> +      if (!opts_set->x_flag_tracer)
>>> +        opts->x_flag_tracer = value;
>>>        if (!opts_set->x_flag_inline_functions)
>>>          opts->x_flag_inline_functions = value;
>>>        if (!opts_set->x_flag_ipa_cp)