This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] decide edge's hotness when there is profile info
- From: Teresa Johnson <tejohnson at google dot com>
- To: Jan Hubicka <hubicka at ucw dot cz>
- Cc: Dehao Chen <dehao at google dot com>, Xinliang David Li <davidxl at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 18 Oct 2013 13:34:15 -0700
- Subject: Re: [PATCH] decide edge's hotness when there is profile info
- Authentication-results: sourceware.org; auth=none
- References: <CAAkRFZJWSXd6wkVXwsvPf_zkjY=5DF0YhRL6LCnkPn1qttebUw at mail dot gmail dot com> <CAO2gOZVPv8G5yy6K+itKN=Ra6JCUkpZd2HGGfb34ZKkP75pmuA at mail dot gmail dot com> <20131014215027 dot GD17422 at kam dot mff dot cuni dot cz> <CAO2gOZXeGP8nZsa0aoji1dVxkRjT69G_ySU3iWbK9oJrjF2n_g at mail dot gmail dot com> <CAAe5K+VJ70+Sj_J-uwcr+W6THPZXe5cy-hwtBCpJXms2=nPimg at mail dot gmail dot com> <CAO2gOZVmMLBEnEYdh4yFD8D5y0netp8V0VLNW1Wcwt9fcwEYkw at mail dot gmail dot com> <CAAe5K+V5k2i9gpo0Ak8b5MnC87uSeaJWJzkRUrZxtoF1M8crNg at mail dot gmail dot com> <CAO2gOZV-opRL_R-1sVQD8BdMaUDHac3H+WfF8Y-vkdHzRd__Sg at mail dot gmail dot com> <20131015210324 dot GA23980 at kam dot mff dot cuni dot cz> <CAAe5K+X8MzvUHdpGVG5jKpz35XpjnPLig--eaHWG9pCNFdS3Bw at mail dot gmail dot com> <20131015215741 dot GA22365 at kam dot mff dot cuni dot cz> <CAAe5K+V+Yidij=WZM=rwfqfewuh81UOcUvHKVwT_g-9YwpUs-A at mail dot gmail dot com> <CAAe5K+Vw8YbvEe+qSKF8js_TgCEehdJ-qKVD2HkFS0pBRF+e+g at mail dot gmail dot com>
Here is the patch updated to use the new parameter from r203830.
Passed bootstrap and regression tests.
2013-10-18 Jan Hubicka <jh@suse.cz>
Teresa Johnson <tejohnson@google.com>
* predict.c (handle_missing_profiles): New function.
(counts_to_freqs): Don't overwrite estimated frequencies
when function has no profile counts.
* predict.h (handle_missing_profiles): Declare.
* tree-profile.c (tree_profiling): Invoke handle_missing_profiles.
Index: predict.c
===================================================================
--- predict.c (revision 203830)
+++ predict.c (working copy)
@@ -2758,6 +2758,40 @@ estimate_loops (void)
BITMAP_FREE (tovisit);
}
+void
+handle_missing_profiles (void)
+{
+ struct cgraph_node *node;
+ int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION);
+ /* See if 0 count function has non-0 count callers. In this case we
+ lost some profile. Drop its function profile to PROFILE_GUESSED. */
+ FOR_EACH_DEFINED_FUNCTION (node)
+ {
+ struct cgraph_edge *e;
+ gcov_type call_count = 0;
+ struct function *fn = DECL_STRUCT_FUNCTION (node->symbol.decl);
+ if (node->count)
+ continue;
+ for (e = node->callers; e; e = e->next_caller)
+ call_count += e->count;
+ if (call_count
+ && fn && fn->cfg
+ && (call_count * unlikely_count_fraction >= profile_info->runs))
+ {
+ bool maybe_hot = maybe_hot_count_p (NULL, call_count);
+ if (dump_file)
+ fprintf (dump_file,
+ "Dropping 0 profile for %s/%i. %s based on calls.\n",
+ cgraph_node_name (node), node->symbol.order,
+ maybe_hot ? "Function is hot" : "Function is normal");
+ profile_status_for_function (fn)
+ = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT);
+ node->frequency
+ = maybe_hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL;
+ }
+ }
+}
+
/* Convert counts measured by profile driven feedback to frequencies.
Return nonzero iff there was any nonzero execution count. */
@@ -2767,6 +2801,9 @@ counts_to_freqs (void)
gcov_type count_max, true_count_max = 0;
basic_block bb;
+ if (!ENTRY_BLOCK_PTR->count)
+ return 0;
+
FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
true_count_max = MAX (bb->count, true_count_max);
Index: predict.h
===================================================================
--- predict.h (revision 203830)
+++ predict.h (working copy)
@@ -37,6 +37,7 @@ enum prediction
extern void predict_insn_def (rtx, enum br_predictor, enum prediction);
extern int counts_to_freqs (void);
+extern void handle_missing_profiles (void);
extern void estimate_bb_frequencies (bool);
extern const char *predictor_name (enum br_predictor);
extern tree build_predict_expr (enum br_predictor, enum prediction);
Index: tree-profile.c
===================================================================
--- tree-profile.c (revision 203830)
+++ tree-profile.c (working copy)
@@ -607,6 +607,8 @@ tree_profiling (void)
pop_cfun ();
}
+ handle_missing_profiles ();
+
del_node_map ();
return 0;
}
On Wed, Oct 16, 2013 at 12:02 PM, Teresa Johnson <tejohnson@google.com> wrote:
> On Tue, Oct 15, 2013 at 3:39 PM, Teresa Johnson <tejohnson@google.com> wrote:
>> On Tue, Oct 15, 2013 at 2:57 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>> On Tue, Oct 15, 2013 at 2:03 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>> >> Yes, that would work. So let's discard this patch because the fix for
>>>> >> comdat can also fix this problem.
>>>> >
>>>> > Unforutnately ipa-profile-estimate is an IPA pass and as such it does
>>>> > not have access to profile_status_to_function.
>>>>
>>>> I see. I was coding that up today but hadn't tested it yet.
>>>>
>>>> > You can probably just factor this out into a function that can be called
>>>> > and for normal FDO we call it where the loop stis now and for auto-FDO we can
>>>> > probably have another invocation from before early passes where auto-FDO is collected.
>>>>
>>>> Ok, let's go with that approach for now. It won't address the 0 count
>>>> COMDAT calling another 0 count COMDAT problem, but I will probably
>>>> just find a way to deal with this when inlining.
>>>
>>> You can still propagate, since tree-profile is an simple-ipa pass.
>>
>> Ok, I think I will leave that for the second patch, since I need to do
>> some testing on the effects - i.e. the propagation I had in mind would
>> make any 0-count routine called by a routine that was dropped to
>> guessed to also be dropped to guessed (and no longer unlikely). It may
>> be too aggressive, I need to check.
>>
>>>>
>>>> >> >>> + if (node->count)
>>>> >> >>> + continue;
>>>> > Also here we should sum the counts and consider function non unlikely executed
>>>> > in the same way as probably_never_executed does.
>>>>
>>>> I assume you mean by doing the same comparison to the number of
>>>> profile->runs. Yes, this makes sense.
>>>
>>> Yes.
>>>>
>>>> >
>>>> > I can prepare updated patch, but i am currently travelling, so i would not
>>>> > be disapointed if you beat me ;)
>>>>
>>>> I'm working on it, and I think based on Dehao's needs I am going to
>>>> split up the patch into two phases, the one that does just the part
>>>> you had sent a patch for (making sure 0 count routines with non-zero
>>>> calls are marked guessed and have their node frequency set
>>>> appropriately), and a subsequent one to do the count application when
>>>> we inline a 0-count routine into a non-zero callsite. I'll shoot for
>>>> getting this ready by tomorrow.
>>>>
>>>> BTW, in your original patch you are checking for both e->count or
>>>> cgraph_maybe_hot_edge_p(e), but AFAICT the call to
>>>> cgraph_maybe_hot_edge_p will never return true when e->count is zero.
>>>> When there is a profile it will return false via maybe_hot_count_p
>>>> since e->count == 0. When there is no profile it will return false
>>>> when the callee has NODE_FREQUENCY_UNLIKELY_EXECUTED. So I think just
>>>> checking for e->count >0 is sufficient here.
>>>
>>> I think I was checking caller count here (that is read) and the code
>>> was supposed to make functoin with hot caller to be hot...
>>
>> The code in your patch was just checking the edge count, not the
>> caller count. cgraph_maybe_hot_edge_p also doesn't check the caller
>> count, just the edge count. Do we want to make all functions with hot
>> callers also be hot, even when the edge count is not hot? This may be
>> to aggressive. I was simply going to make the code check e->count.
>
> Here is the patch from Honza, that I revised based on discussions and
> testing. Once this related patch goes in, I can change the check against the
> profile_info->runs that hardcodes the threshold with the new parameter
> proposed for probably_never_executed there:
> http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00743.html
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?
>
> Thanks, Teresa
>
> 2013-10-16 Jan Hubicka <jh@suse.cz>
> Teresa Johnson <tejohnson@google.com>
>
> * predict.c (handle_missing_profiles): New function.
> (counts_to_freqs): Don't overwrite estimated frequencies
> when function has no profile counts.
> * predict.h (handle_missing_profiles): Declare.
> * tree-profile.c (tree_profiling): Invoke handle_missing_profiles.
>
> Index: predict.c
> ===================================================================
> --- predict.c (revision 203633)
> +++ predict.c (working copy)
> @@ -2742,6 +2742,39 @@ estimate_loops (void)
> BITMAP_FREE (tovisit);
> }
>
> +void
> +handle_missing_profiles (void)
> +{
> + struct cgraph_node *node;
> + /* See if 0 count function has non-0 count callers. In this case we
> + lost some profile. Drop its function profile to PROFILE_GUESSED. */
> + FOR_EACH_DEFINED_FUNCTION (node)
> + {
> + struct cgraph_edge *e;
> + gcov_type call_count = 0;
> + struct function *fn = DECL_STRUCT_FUNCTION (node->symbol.decl);
> + if (node->count)
> + continue;
> + for (e = node->callers; e; e = e->next_caller)
> + call_count += e->count;
> + if (call_count
> + && fn && fn->cfg
> + && (call_count * 4 >= profile_info->runs))
> + {
> + bool maybe_hot = maybe_hot_count_p (NULL, call_count);
> + if (dump_file)
> + fprintf (dump_file,
> + "Dropping 0 profile for %s/%i. %s based on calls.\n",
> + cgraph_node_name (node), node->symbol.order,
> + maybe_hot ? "Function is hot" : "Function is normal");
> + profile_status_for_function (fn)
> + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT);
> + node->frequency
> + = maybe_hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL;
> + }
> + }
> +}
> +
> /* Convert counts measured by profile driven feedback to frequencies.
> Return nonzero iff there was any nonzero execution count. */
>
> @@ -2751,6 +2784,9 @@ counts_to_freqs (void)
> gcov_type count_max, true_count_max = 0;
> basic_block bb;
>
> + if (!ENTRY_BLOCK_PTR->count)
> + return 0;
> +
> FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb)
> true_count_max = MAX (bb->count, true_count_max);
>
> Index: predict.h
> ===================================================================
> --- predict.h (revision 203633)
> +++ predict.h (working copy)
> @@ -37,6 +37,7 @@ enum prediction
>
> extern void predict_insn_def (rtx, enum br_predictor, enum prediction);
> extern int counts_to_freqs (void);
> +extern void handle_missing_profiles (void);
> extern void estimate_bb_frequencies (bool);
> extern const char *predictor_name (enum br_predictor);
> extern tree build_predict_expr (enum br_predictor, enum prediction);
> Index: tree-profile.c
> ===================================================================
> --- tree-profile.c (revision 203633)
> +++ tree-profile.c (working copy)
> @@ -607,6 +607,8 @@ tree_profiling (void)
> pop_cfun ();
> }
>
> + handle_missing_profiles ();
> +
> del_node_map ();
> return 0;
> }
>
>
>>
>> Teresa
>>
>>>
>>> Honza
>>>>
>>>> Thanks,
>>>> Teresa
>>>>
>>>> >
>>>> > Honza
>>>>
>>>>
>>>>
>>>> --
>>>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413