[PATCH] [RFC, PGO+LTO] Missed function specialization + partial devirtualization
luoxhu
luoxhu@linux.ibm.com
Mon Jun 24 09:20:00 GMT 2019
On 2019/6/24 10:34, luoxhu wrote:
> Hi Honza,
> Thanks very much to get so many useful comments from you.
> As a newbie to GCC, not sure whether my questions are described clearly
> enough. Thanks for your patience in advance. :)
>
>
> On 2019/6/20 21:47, Jan Hubicka wrote:
>> Hi,
>> some comments on the ipa part of the patch
>> (and thanks for working on it - this was on my TODO list for years)
>>
>>> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
>>> index de82316d4b1..0d373a67d1b 100644
>>> --- a/gcc/cgraph.c
>>> +++ b/gcc/cgraph.c
>>> @@ -553,6 +553,7 @@ cgraph_node::get_create (tree decl)
>>> Â Â Â Â Â fprintf (dump_file, "Introduced new external node "
>>> Â Â Â Â Â Â Â Â Â Â "(%s) and turned into root of the clone tree.\n",
>>> Â Â Â Â Â Â Â Â Â Â node->dump_name ());
>>> +Â Â Â Â Â node->profile_id = first_clone->profile_id;
>>> Â Â Â Â Â }
>>> Â Â Â else if (dump_file)
>>> Â Â Â Â Â fprintf (dump_file, "Introduced new external node "
>>
>> This is independent of the rest of changes. Do you have example where
>> this matters? The inline clones are created in ipa-inline while
>> ipa-profile is run before it, so I can not think of such a scenario.
>> I see you also copy profile_id from function to clone. I would like to
>> know why you needed that.
>>
>> Also you mention that you hit some ICEs. If fixes are independent of
>> rest of your changes, send them separately.
>
> I copy the profile_id for cloned node as when in LTO ltrans, there is no
> references or referrings info for the specialized node/cloned node, so it
> is difficult to track the node's reference in
> cgraph_edge::speculative_call_info. I use it mainly for debug purpose now.
> Will remove it and split the patches in later version to include ICE fixes.
>
>>
>>> @@ -1110,6 +1111,7 @@ cgraph_edge::speculative_call_info (cgraph_edge
>>> *&direct,
>>> Â Â Â int i;
>>> Â Â Â cgraph_edge *e2;
>>> Â Â Â cgraph_edge *e = this;
>>> +Â cgraph_node *referred_node;
>>> Â Â Â if (!e->indirect_unknown_callee)
>>> Â Â Â Â Â for (e2 = e->caller->indirect_calls;
>>> @@ -1142,8 +1144,20 @@ cgraph_edge::speculative_call_info (cgraph_edge
>>> *&direct,
>>> Â Â Â Â Â && ((ref->stmt && ref->stmt == e->call_stmt)
>>> Â Â Â Â Â Â Â Â Â || (!ref->stmt && ref->lto_stmt_uid == e->lto_stmt_uid)))
>>> Â Â Â Â Â Â Â {
>>> -Â Â Â reference = ref;
>>> -Â Â Â break;
>>> +Â Â Â if (e2->indirect_info && e2->indirect_info->num_of_ics)
>>> +Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â referred_node = dyn_cast<cgraph_node *> (ref->referred);
>>> +Â Â Â Â Â Â Â if (strstr (e->callee->name (), referred_node->name ()))
>>> +Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â reference = ref;
>>> +Â Â Â Â Â Â Â break;
>>> +Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â }
>>> +Â Â Â else
>>> +Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â reference = ref;
>>> +Â Â Â Â Â Â Â break;
>>> +Â Â Â Â Â }
>>> Â Â Â Â Â Â Â }
>>
>> This function is intended to return everything related to the
>> speculative call, so if you add multiple direct targets, i would expect
>> it to tage auto_vec of cgraph_nodes for direct and auto_vec of
>> references.
>
> So will the signature becomes
> cgraph_edge::speculative_call_info (auto_vec<cgraph_edge *> *direct,
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â cgraph_edge *&indirect,
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â auto_vec<ipa_ref *> *reference)
>
> Seems a lot of code related to it, maybe should split to another patch.
> And will the sequence of direct and reference in each auto_vec be strictly
> mapped for iteration convenience?
> Second question is "this" is a direct edge will be pushed to auto_vec
> "direct", how can it get its next direct edge here? From e->caller->callees?
There maybe some misunderstanding here. The direct should be one edge
only, but reference could be multiple.
For example: two indirect edge on one single statement x = p(3);
the first speculative edge is main -> one;
the second speculative edge 2 is main -> two.
direct->call_stmt is: x_10 = p_3 (3);
call code in ipa-inline-transform.c:
for (e = node->callees; e; e = next)
{
next = e->next_callee;
e->redirect_call_stmt_to_callee ();
}
redirect_call_stmt_to_callee will call
e->speculative_call_info(e, e2, ref).
When e is âmain -> one" being redirected, The returned auto_vec reference
length will be 2.
So the map should be 1:N instead of N:N. (one direct edge will find N
reference nodes, but only one of it is correct, need iterate to find it
out.)
e2 is the indirect call(e->caller->indirect_calls) can only be set to false
speculative if all indirect targets are redirected by "next=e->next_callee"
Or else, the next speculative edge couldn't finish the redirect as the e2
is not speculative again in next round iteration.
As a result, maybe still need similar logic to check the returned reference
length, only set "e2->speculative = false;" when the length is 1. which
means all direct targets are redirected.
>
>
>>> Â Â Â /* Speculative edge always consist of all three components - direct
>>> edge,
>>> @@ -1199,7 +1213,14 @@ cgraph_edge::resolve_speculation (tree callee_decl)
>>>           in the functions inlined through it. */
>>> Â Â Â Â Â }
>>> Â Â Â edge->count += e2->count;
>>> -Â edge->speculative = false;
>>> +Â if (edge->indirect_info && edge->indirect_info->num_of_ics)
>>> +Â Â Â {
>>> +Â Â Â Â Â edge->indirect_info->num_of_ics--;
>>> +Â Â Â Â Â if (edge->indirect_info->num_of_ics == 0)
>>> +Â Â Â edge->speculative = false;
>>> +Â Â Â }
>>> +Â else
>>> +Â Â Â edge->speculative = false;
>>> Â Â Â e2->speculative = false;
>>> Â Â Â ref->remove_reference ();
>>> Â Â Â if (e2->indirect_unknown_callee || e2->inline_failed)
>>
>> This function should turn speculative call into direct call to DECL, so
>> I think it should remove all the other direct calls associated with stmt
>> and the indirect one.
>>
>> There are now two cases - in first case you want to turn speculative
>> call into direct call or give up on especulation completely, while in
>> other case you want to only remove one of speculations.
>>
>> I guess we want to have resolve_speculation(decl) for first and
>> remove_one_speculation(edge) for the second case?
>> The second case would be useful for the code below handling type
>> mismatches and also for inline when one of speculative targets seems not
>> useful to bother with.
>
> So the logic will be:
>
> if (edge->indirect_info->num_of_ics > 1)
> Â Â Â Â cgraph_edge::resolve_speculation (tree callee_decl);
> else
> Â Â Â Â remove_one_speculation(edge);
>
> cgraph_edge::resolve_speculation will call edge->speculative_call_info (e2,
> edge, ref) internally, at this time, e2 and ref will only contains one
> direct target?
>
>
>>> @@ -1333,7 +1354,14 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
>>> Â Â Â Â Â Â Â e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt,
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â false);
>>> Â Â Â Â Â Â Â e->count = gimple_bb (e->call_stmt)->count;
>>> -Â Â Â Â Â e2->speculative = false;
>>> +Â Â Â Â Â if (e2->indirect_info && e2->indirect_info->num_of_ics)
>>> +Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â e2->indirect_info->num_of_ics--;
>>> +Â Â Â Â Â Â Â Â Â if (e2->indirect_info->num_of_ics == 0)
>>> +Â Â Â Â Â Â Â e2->speculative = false;
>>> +Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â else
>>> +Â Â Â Â Â Â Â e2->speculative = false;
>>> Â Â Â Â Â Â Â e2->count = gimple_bb (e2->call_stmt)->count;
>>> Â Â Â Â Â Â Â ref->speculative = false;
>>> Â Â Â Â Â Â Â ref->stmt = NULL;
>>
>>> Â extern void debuginfo_early_init (void);
>>> Â extern void debuginfo_init (void);
>>> @@ -1638,11 +1639,17 @@ struct GTY(()) cgraph_indirect_call_info
>>> Â Â Â int param_index;
>>>    /* ECF flags determined from the caller. */
>>> Â Â Â int ecf_flags;
>>> - /* Profile_id of common target obtrained from profile. */
>>> + /* Profile_id of common target obtained from profile. */
>>> Â Â Â int common_target_id;
>>> Â Â Â /* Probability that call will land in function with
>>> COMMON_TARGET_ID. */
>>> Â Â Â int common_target_probability;
>>> + /* Profile_id of common target obtained from profile. */
>>> +Â int common_target_ids[GCOV_ICALL_TOPN_NCOUNTS / 2];
>>> +Â /* Probabilities that call will land in function with
>>> COMMON_TARGET_IDS. */
>>> +Â int common_target_probabilities[GCOV_ICALL_TOPN_NCOUNTS / 2];
>>
>> I would use vec of pairs (profile_id,probability) to hold this and do
>> not wire in GCOV_ICALL_TOPN_NCOUTS. Most of time this vec will be just
>> NULL pointer so it will result in less memory overhead and will avoid
>> hard limit on number of speculations we want to do.
>>
>> Note that the speculative edges may end up being redirected during IPA
>> optimization, for example when their target is cloned for particular
>> call context or when the function is detected identical to other
>> function. So one can not preserve the mapping between targets and
>> profile ids.
>>
>> Also this infrastructure is useful even w/o profile because we could use
>> ipa-devirt to devirtualize even when multiple polymorphic targets are
>> found. So I would not wirte in the limit GCOV_ICALL_TOPN_NCOUNTS and
>> just use dynamically allocated vectors instead.
>>
>> With ipa-devirt it is possible that we know that there are precisely 2
>> possible polymorphic targets. Other case i was considering was to
>> speculatively inline with -fPIC. I.e. when one has interposiable call
>> for foo() one can create foo.localalias() for the definition visible to
>> compiler and then speculate
>>
>> if (foo == foo.localalias())
>> Â Â inline_path
>> else
>> Â Â foo();
>>
>> In these cases we may end up with indirect
>> call that has no associated indirect edge (which we do not support
>> currently), so it may be interesting to move speculative call info away
>> from indirect call info to cgraph edge structure (but that can be done
>> incrementally based on what you do now - this code is not completely
>> easy so lets do it step by step).
>
> +
> +struct GTY (()) indirect_target_info
> +{
> + /* Profile_id of common target obtained from profile. */
> +Â int common_target_id;
> + /* Probability that call will land in function with COMMON_TARGET_ID. */
> +Â int common_target_probability;
> +};
> +
>
> Tested with "vec<indirect_target_info, va_gc> *indirect_call_targets;"
> works quite good, really much better.
>
> Is it correct that profile_id/common_target_id will only be used to
> generate speculative edges in ipa-profile, all new specialized node/cloned
> node will take(not copy) the speculative property from original node, so
> was the reason that no need to clone profile id?
>
>
>>
>>> @@ -212,6 +216,46 @@ ipa_profile_generate_summary (void)
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â gimple_remove_histogram_value (DECL_STRUCT_FUNCTION
>>> (node->decl),
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â stmt, h);
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â else if (h && type == HIST_TYPE_INDIR_CALL_TOPN)
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â unsigned j;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â struct cgraph_edge *e = node->get_edge (stmt);
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â if (e && !e->indirect_unknown_callee)
>>> +Â Â Â Â Â Â Â Â Â Â Â continue;
>>
>> I suppose you are going to change this for Martin's implementation, so i
>> have skipped it for now.
>>> @@ -558,7 +601,8 @@ ipa_profile (void)
>>> Â Â Â Â Â {
>>> Â Â Â Â Â Â Â if (n->count.initialized_p ())
>>> Â Â Â Â Â Â Â Â Â nindirect++;
>>> -Â Â Â Â Â if (e->indirect_info->common_target_id)
>>> +Â Â Â Â Â if (e->indirect_info->common_target_id
>>> +Â Â Â Â Â Â Â Â Â || (e->indirect_info && e->indirect_info->num_of_ics == 1))
>>> Â Â Â Â Â Â Â Â Â {
>>> Â Â Â Â Â Â Â Â Â Â Â if (!node_map_initialized)
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â init_node_map (false);
>>> @@ -613,7 +657,7 @@ ipa_profile (void)
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â if (dump_file)
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â fprintf (dump_file,
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "Not speculating: "
>>> -Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "parameter count mistmatch\n");
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "parameter count mismatch\n");
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â }
>>> Â Â Â Â Â Â Â Â Â Â Â else if (e->indirect_info->polymorphic
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â && !opt_for_fn (n->decl, flag_devirtualize)
>>> @@ -655,7 +699,130 @@ ipa_profile (void)
>>> Â Â Â Â Â Â Â Â Â Â Â nunknown++;
>>> Â Â Â Â Â Â Â Â Â }
>>> Â Â Â Â Â Â Â Â Â }
>>> -Â Â Â Â }
>>> +Â Â Â Â Â if (e->indirect_info && e->indirect_info->num_of_ics > 1)
>>> +Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â if (in_lto_p)
>>> +Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â if (dump_file)
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â fprintf (dump_file,
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "Updating hotness threshold in LTO mode.\n");
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â fprintf (dump_file, "Updated min count: %" PRId64 "\n",
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â (int64_t) threshold);
>>> +Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â set_hot_bb_threshold (threshold
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â / e->indirect_info->num_of_ics);
>>> +Â Â Â Â Â Â Â }
>>
>> Why do you need different paths for one target and for multiple targets?
>> Also why LTO is different here?
> Will remove the logic and the above change based on Martin'
> implementation.
>
>>> +Â Â Â Â Â Â Â Â Â if (!node_map_initialized)
>>> +Â Â Â Â Â Â Â init_node_map (false);
>>> +Â Â Â Â Â Â Â Â Â node_map_initialized = true;
>>> +Â Â Â Â Â Â Â Â Â ncommon++;
>>> +Â Â Â Â Â Â Â Â Â unsigned speculative = 0;
>>> +Â Â Â Â Â Â Â Â Â for (i = 0; i < (int)e->indirect_info->num_of_ics; i++)
>>> +Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â n2 = find_func_by_profile_id (
>>> +Â Â Â Â Â Â Â Â Â Â Â e->indirect_info->common_target_ids[i]);
>>> +Â Â Â Â Â Â Â Â Â if (n2)
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â if (dump_file)
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â fprintf (
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â dump_file,
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "Indirect call -> direct call from"
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â " other module %s => %s, prob %3.2f\n",
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â n->dump_name (), n2->dump_name (),
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â e->indirect_info->common_target_probabilities[i]
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â / (float) REG_BR_PROB_BASE);
>>> +Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â if (e->indirect_info->common_target_probabilities[i]
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â < REG_BR_PROB_BASE / 2)
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â nuseless++;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â if (dump_file)
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â fprintf (
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â dump_file,
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "Not speculating: probability is too low.\n");
>>> +Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â else if (!e->maybe_hot_p ())
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â nuseless++;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â if (dump_file)
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â fprintf (dump_file,
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "Not speculating: call is cold.\n");
>>> +Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â else if (n2->get_availability () <= AVAIL_INTERPOSABLE
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â && n2->can_be_discarded_p ())
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â nuseless++;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â if (dump_file)
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â fprintf (dump_file,
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "Not speculating: target is overwritable "
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "and can be discarded.\n");
>>> +Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â else if (ipa_node_params_sum && ipa_edge_args_sum
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â && (!vec_safe_is_empty (
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â IPA_NODE_REF (n2)->descriptors))
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â && ipa_get_param_count (IPA_NODE_REF (n2))
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â != ipa_get_cs_argument_count (
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â IPA_EDGE_REF (e))
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â && (ipa_get_param_count (IPA_NODE_REF (n2))
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â >= ipa_get_cs_argument_count (
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â IPA_EDGE_REF (e))
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â || !stdarg_p (TREE_TYPE (n2->decl))))
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â nmismatch++;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â if (dump_file)
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â fprintf (dump_file, "Not speculating: "
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "parameter count mismatch\n");
>>> +Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â else if (e->indirect_info->polymorphic
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â && !opt_for_fn (n->decl, flag_devirtualize)
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â && !possible_polymorphic_call_target_p (e, n2))
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â nimpossible++;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â if (dump_file)
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â fprintf (dump_file,
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "Not speculating: "
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "function is not in the polymorphic "
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "call target list\n");
>>> +Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â else
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â /* Target may be overwritable, but profile says that
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â control flow goes to this particular implementation
>>> +                of N2. Speculate on the local alias to allow
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â inlining.
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â */
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â if (!n2->can_be_discarded_p ())
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â cgraph_node *alias;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â alias = dyn_cast<cgraph_node *> (
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â n2->noninterposable_alias ());
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â if (alias)
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â n2 = alias;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â nconverted++;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â e->make_speculative (
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â n2, e->count.apply_probability (
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â e->indirect_info
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â ->common_target_probabilities[i]));
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â update = true;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â speculative++;
>>> +Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â else
>>> +Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â if (dump_file)
>>> +Â Â Â Â Â Â Â Â Â Â Â fprintf (dump_file,
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â "Function with profile-id %i not found.\n",
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â e->indirect_info->common_target_ids[i]);
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â nunknown++;
>>> +Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â if (speculative < e->indirect_info->num_of_ics)
>>> +Â Â Â Â Â Â Â e->indirect_info->num_of_ics = speculative;
>>> +Â Â Â Â Â Â Â }
>>> +Â Â Â }
>>> Â Â Â Â Â Â Â Â if (update)
>>> Â Â Â Â Â Â ipa_update_overall_fn_summary (n);
>>> Â Â Â Â Â Â }
>>> diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c
>>> index 79b250c3943..30347691029 100644
>>> --- a/gcc/ipa-utils.c
>>> +++ b/gcc/ipa-utils.c
>>> @@ -587,6 +587,11 @@ ipa_merge_profiles (struct cgraph_node *dst,
>>> Â Â Â Â Â Â Â update_max_bb_count ();
>>> Â Â Â Â Â Â Â compute_function_frequency ();
>>> Â Â Â Â Â Â Â pop_cfun ();
>>> +     /* When src is speculative, clone the referrings. */
>>> +Â Â Â Â Â if (src->indirect_call_target)
>>> +Â Â Â for (e = src->callers; e; e = e->next_caller)
>>> +Â Â Â Â Â if (e->callee == src && e->speculative)
>>> +Â Â Â Â Â Â Â dst->clone_referring (src);
>>
>> This looks wrong. Why do you need to copy all references from src
>> to target?
> Clonning the referrings from callee to the merged callee, not references.
> the information from "caller to src" is not passed to "caller to dst" when
> merge profiles, it is a workaround when fixing a SPEC ICE. Will root
> cause it later.
>
>
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â /* Speculative calls consist of two edges - direct
>>> +                  and indirect. Duplicate the whole thing and
>>> +                  distribute frequencies accordingly. */
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â if (edge->speculative)
>> I think here you want to handle the whole group with multiple targets.
> OK. Also an ICE fix in SPEC.
>
>
> Thanks
> Xionghu
>
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â struct ipa_ref *ref;
>>> +
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â gcc_assert (!edge->indirect_unknown_callee);
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â old_edge->speculative_call_info (direct,
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â indirect, ref);
>>> +
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â profile_count indir_cnt = indirect->count;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â indirect
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â = indirect->clone (id->dst_node, call_stmt,
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â gimple_uid (stmt), num,
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â den, true);
>>> +
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â profile_probability prob
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â = indir_cnt.probability_in (old_cnt
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â + indir_cnt);
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â indirect->count
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â = copy_basic_block->count.apply_probability (
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â prob);
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â edge->count
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â = copy_basic_block->count - indirect->count;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â id->dst_node->clone_reference (ref, stmt);
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â else
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â edge->count = copy_basic_block->count;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â /* If the indirect call contains more than one indirect
>>> +              targets, need clone all speculative edges here. */
>>> +Â Â Â Â Â Â Â Â Â Â Â if (old_edge && old_edge->next_callee
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â && old_edge->speculative && indirect
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â && indirect->indirect_info
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â && indirect->indirect_info->num_of_ics > 1)
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â {
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â edge = old_edge->next_callee;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â old_edge = old_edge->next_callee;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â if (edge->speculative)
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â next_speculative = true;
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â Â Â }
>>> +Â Â Â Â Â Â Â Â Â Â Â while (next_speculative);
>>> +Â Â Â Â Â Â Â Â Â }
>>> Â Â Â Â Â Â Â Â Â Â Â break;
>>> Â Â Â Â Â Â Â Â Â case CB_CGE_MOVE_CLONES:
>
More information about the Gcc-patches
mailing list