This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v3] Missed function specialization + partial devirtualization


On 7/31/19 7:29 AM, luoxhu wrote:
> This patch aims to fix PR69678 caused by PGO indirect call profiling
> performance issues.
> The bug that profiling data is never working was fixed by Martin's pull
> back of topN patches, performance got GEOMEAN ~1% improvement.
> Still, currently the default profile only generates SINGLE indirect target
> that called more than 75%.  This patch leverages MULTIPLE indirect
> targets use in LTO-WPA and LTO-LTRANS stage, as a result, function
> specialization, profiling, partial devirtualization, inlining and
> cloning could be done successfully based on it.
> Performance can get improved from 0.70 sec to 0.38 sec on simple tests.
> Details are:
>   1.  PGO with topn is enbaled by default now, but only one indirect
>   target edge will be generated in ipa-profile pass, so add variables to enable
>   multiple speculative edges through passes, speculative_id will record the direct edge
>   index bind to the indirect edge, num_of_ics records how many direct edges owned by
>   the indirect edge, postpone gimple_ic to ipa-profile like default as inline
>   pass will decide whether it is benefit to transform indirect call.
>   2.  Enable LTO WPA/LTRANS stage multiple indirect call targets analysis for
>   profile full support in ipa passes and cgraph_edge functions.  speculative_id
>   can be set by make_speculative id when multiple targets are binded to
>   one indirect edge, and cloned if new edge is cloned.  speculative_id
>   is streamed out and stream int by lto like lto_stmt_uid.
>   3.  Add 1 in module testcase and 2 cross module testcases.
>   4.  Bootstrap and regression test passed on Power8-LE.
> 
> v3 Changes:
>  1. Rebase to trunk.
>  2. Use speculative_id to track and search the reference node matched
>  with the direct edge's callee for multiple targets.  This could
>  eliminate the workaround strstr before.  Actually, it is the caller's
>  response to handle the direct edges mapped to same indirect edge.
>  speculative_call_info will still return one of the direct edge
>  specified, this will leverage current IPA edge process framework mostly.

Thank you for the next version of the patch. Inline comments follow:

> 
> gcc/ChangeLog
> 
> 	2019-07-31  Xiong Hu Luo  <luoxhu@linux.ibm.com>
> 
> 	PR ipa/69678
> 	* cgraph.c (symbol_table::create_edge): Init speculative_id.
> 	(cgraph_edge::make_speculative): Add param for setting speculative_id.
> 	(cgraph_edge::speculative_call_info): Find reference by
> 	speculative_id for multiple indirect targets.
> 	(cgraph_edge::resolve_speculation): Decrease the speculations
> 	for indirect edge, drop it's speculative if not direct target
> 	left.
> 	(cgraph_edge::redirect_call_stmt_to_callee): Likewise.
> 	(cgraph_node::verify_node): Don't report error if speculative
> 	edge not include statement.
> 	* cgraph.h (struct indirect_target_info): New struct.
> 	(indirect_call_targets): New vector variable.
> 	(num_of_ics): New variable.
> 	(make_speculative): Add param for setting speculative_id.
> 	(speculative_id): New variable.
> 	* cgraphclones.c (cgraph_node::create_clone): Clone speculative_id.
> 	* ipa-inline.c (inline_small_functions): Add iterator update.
> 	* ipa-profile.c (ipa_profile_generate_summary): Add indirect
> 	multiple targets logic.
> 	(ipa_profile): Likewise.
> 	* ipa-ref.h (speculative_id): New variable.
> 	* ipa.c (process_references): Fix typo.
> 	* lto-cgraph.c (lto_output_edge): Add indirect multiple targets
> 	logic.  Stream out speculative_id.
> 	(input_edge): Likewise.
> 	* predict.c (dump_prediction): Revome edges count assert to be
> 	precise.
> 	* symtab.c (symtab_node::create_reference): Init speculative_id.
> 	(symtab_node::clone_references): Clone speculative_id.
> 	(symtab_node::clone_referring): Clone speculative_id.
> 	(symtab_node::clone_reference): Clone speculative_id.
> 	(symtab_node::clear_stmts_in_references): Clear speculative_id.
> 	* tree-inline.c (copy_bb): Duplicate all the speculative edges
> 	if indirect call contains multiple speculative targets.
> 	* tree-profile.c (gimple_gen_ic_profiler): Use the new variable
> 	__gcov_indirect_call.counters and __gcov_indirect_call.callee.
> 	(gimple_gen_ic_func_profiler): Likewise.
> 	(pass_ipa_tree_profile::gate): Fix comment typos.
> 	* value-prof.c  (gimple_ic_transform): Handle topn case.
> 	Fix comment typos.
> 
> gcc/testsuite/ChangeLog
> 
> 	2019-07-31  Xiong Hu Luo  <luoxhu@linux.ibm.com>
> 
> 	PR ipa/69678
> 	* gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
> 	* gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
> 	* gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase.
> 	* gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase.
> ---
>  gcc/cgraph.c                                  |  70 +++++-
>  gcc/cgraph.h                                  |  28 ++-
>  gcc/cgraphclones.c                            |   1 +
>  gcc/ipa-inline.c                              |  15 +-
>  gcc/ipa-profile.c                             | 238 +++++++++++-------
>  gcc/ipa-ref.h                                 |   1 +
>  gcc/ipa.c                                     |   2 +-
>  gcc/lto-cgraph.c                              |  54 +++-
>  gcc/predict.c                                 |   1 -
>  gcc/symtab.c                                  |   5 +
>  .../tree-prof/crossmodule-indir-call-topn-1.c |  35 +++
>  .../crossmodule-indir-call-topn-1a.c          |  22 ++
>  .../tree-prof/crossmodule-indir-call-topn-2.c |  42 ++++
>  .../gcc.dg/tree-prof/indir-call-prof-topn.c   |  38 +++
>  gcc/tree-inline.c                             |  20 ++
>  gcc/tree-profile.c                            |  12 +-
>  gcc/value-prof.c                              |  80 +++---
>  17 files changed, 500 insertions(+), 164 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
> 
> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> index a7e3bcf2132..d7be9e9c545 100644
> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -869,6 +869,7 @@ symbol_table::create_edge (cgraph_node *caller, cgraph_node *callee,
>    edge->prev_callee = NULL;
>    edge->next_callee = NULL;
>    edge->lto_stmt_uid = 0;
> +  edge->speculative_id = 0;
>  
>    edge->count = count;
>  
> @@ -1063,7 +1064,8 @@ cgraph_edge::remove (void)
>     Return direct edge created.  */
>  
>  cgraph_edge *
> -cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
> +cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count,
> +			       unsigned int speculative_id)
>  {
>    cgraph_node *n = caller;
>    ipa_ref *ref = NULL;
> @@ -1081,11 +1083,13 @@ cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
>    else
>      e2->can_throw_external = can_throw_external;
>    e2->lto_stmt_uid = lto_stmt_uid;
> +  e2->speculative_id = speculative_id;
>    e2->in_polymorphic_cdtor = in_polymorphic_cdtor;
>    count -= e2->count;
>    symtab->call_edge_duplication_hooks (this, e2);
>    ref = n->create_reference (n2, IPA_REF_ADDR, call_stmt);
>    ref->lto_stmt_uid = lto_stmt_uid;
> +  ref->speculative_id = speculative_id;
>    ref->speculative = speculative;
>    n2->mark_address_taken ();
>    return e2;
> @@ -1099,6 +1103,38 @@ cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
>     call) and if one of them exists, all of them must exist.
>  
>     Given speculative call edge, return all three components.
> +
> +   For some indirect edge, it may maps to multiple direct edges, i.e. 1:N.
> +   check the speculative_id to return all the three components for specified
> +   direct edge or indirect edge.
> +   If input is indirect, caller of this function will get the direct edge one by
> +   one, get_edge will just return one of the direct edge mapped to the indirect
> +   edge, the returned direct edge will be resolved or redirected by the caller,
> +   then num_of_ics (speculations) is deceased in each access.
> +   If input is direct, this function will get the indirect edge and reference
> +   with matched speculative_id, the returned edge will also be resolved or
> +   redirected, decrease the speculations accordingly.
> +   Speculations of indirect edge will be dropped only if all direct edges
> +   be handled.
> +
> +   e.g. for indirect edge E statement "call call_dest":
> +
> +   Redirect N3 after redirected N2:
> +
> +   if (call_dest == N2)
> +     n2 ();
> +   else if (call_dest == N3)
> +     n3 ();
> +   else
> +     call call_dest
> +
> +   Resolve N3 and only redirect N2:
> +
> +   if (call_dest == N2)
> +     n2 ();
> +   else
> +     call call_dest
> +
>   */
>  
>  void
> @@ -1138,7 +1174,7 @@ cgraph_edge::speculative_call_info (cgraph_edge *&direct,
>  
>    reference = NULL;
>    for (i = 0; e->caller->iterate_reference (i, ref); i++)
> -    if (ref->speculative
> +    if (ref->speculative && ref->speculative_id == e->speculative_id
>  	&& ((ref->stmt && ref->stmt == e->call_stmt)
>  	    || (!ref->stmt && ref->lto_stmt_uid == e->lto_stmt_uid)))
>        {
> @@ -1199,7 +1235,19 @@ cgraph_edge::resolve_speculation (tree callee_decl)
>           in the functions inlined through it.  */
>      }
>    edge->count += e2->count;
> -  edge->speculative = false;
> +  /* edge is indirect, e2 is direct.  If edge contains multiple speculations,
> +     remove one of speculations for this indirect edge, then if edge still
> +     contains direct target, keep the speculation, next direct target
> +     will continue use it.  Give up speculation completely if no direct
> +     target is left for this indirect edge.  */
> +  if (edge->indirect_info && edge->indirect_info->num_of_ics)
> +    {
> +      edge->indirect_info->num_of_ics--;
> +      if (edge->indirect_info->num_of_ics == 0)
> +	edge->speculative = false;
> +    }
> +  else
> +    edge->speculative = false;
>    e2->speculative = false;
>    ref->remove_reference ();
>    if (e2->indirect_unknown_callee || e2->inline_failed)
> @@ -1333,7 +1381,19 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
>  	  e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt,
>  						     false);
>  	  e->count = gimple_bb (e->call_stmt)->count;
> -	  e2->speculative = false;
> +	  /* edge is direct, e2 is indirect here.  If e2 contains multiple
> +	     speculations, remove one of speculations for this indirect edge,
> +	     then if e2 still contains direct target, keep the speculation,
> +	     next direct target will continue use it.  Give up speculation
> +	     completely if no direct target is left for this indirect e2.  */
> +	  if (e2->indirect_info && e2->indirect_info->num_of_ics)
> +	    {
> +	      e2->indirect_info->num_of_ics--;
> +	      if (e2->indirect_info->num_of_ics == 0)
> +		e2->speculative = false;
> +	    }
> +	  else
> +	    e2->speculative = false;
>  	  e2->count = gimple_bb (e2->call_stmt)->count;
>  	  ref->speculative = false;
>  	  ref->stmt = NULL;
> @@ -3434,7 +3494,7 @@ cgraph_node::verify_node (void)
>  
>        for (e = callees; e; e = e->next_callee)
>  	{
> -	  if (!e->aux)
> +	  if (!e->aux && !e->speculative)
>  	    {
>  	      error ("edge %s->%s has no corresponding call_stmt",
>  		     identifier_to_locale (e->caller->name ()),
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index fa5224fb3a5..0b453c3534c 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -1630,6 +1630,16 @@ private:
>    void make_speculative (tree otr_type = NULL);
>  };
>  
> +/* Structure containing indirect target information from profile.  */
> +
> +struct GTY (()) indirect_target_info
> +{
> +  /* Profile_id of common target obtained from profile.  */
> +  unsigned int common_target_id;
> +  /* Probability that call will land in function with COMMON_TARGET_ID.  */
> +  int common_target_probability;
> +};
> +
>  /* Structure containing additional information about an indirect call.  */
>  
>  class GTY(()) cgraph_indirect_call_info
> @@ -1648,10 +1658,14 @@ public:
>    int param_index;
>    /* ECF flags determined from the caller.  */
>    int ecf_flags;
> -  /* Profile_id of common target obtrained from profile.  */
> -  int common_target_id;
> -  /* Probability that call will land in function with COMMON_TARGET_ID.  */
> -  int common_target_probability;
> +
> +  /* An indirect call may contain one or multiple call targets.  */
> +  vec<indirect_target_info, va_gc> *indirect_call_targets;
> +
> +  /* Number of indirect targets (speculations) for the indirect call.
> +     Increased by ipa-profile, decreased by resolve_speculation or redirect
> +     statement to callee.  */
> +  unsigned num_of_ics;

Would it be possible to use just the vector and replace all usages of num_of_ics
with indirect_call_targets->length () ? It would make the code easier.

>  
>    /* Set when the call is a virtual call with the parameter being the
>       associated object pointer rather than a simple direct call.  */
> @@ -1708,7 +1722,8 @@ public:
>    /* Turn edge into speculative call calling N2. Update
>       the profile so the direct call is taken COUNT times
>       with FREQUENCY.  */
> -  cgraph_edge *make_speculative (cgraph_node *n2, profile_count direct_count);
> +  cgraph_edge *make_speculative (cgraph_node *n2, profile_count direct_count,
> +				 unsigned int speculative_id = 0);
>  
>     /* Given speculative call edge, return all three components.  */
>    void speculative_call_info (cgraph_edge *&direct, cgraph_edge *&indirect,
> @@ -1786,6 +1801,9 @@ public:
>    /* The stmt_uid of call_stmt.  This is used by LTO to recover the call_stmt
>       when the function is serialized in.  */
>    unsigned int lto_stmt_uid;
> +  /* speculative id is used by multiple indirect targets when the function is
> +   speculated.  */
> +  unsigned int speculative_id;
>    /* Whether this edge was made direct by indirect inlining.  */
>    unsigned int indirect_inlining_edge : 1;
>    /* Whether this edge describes an indirect call with an undetermined
> diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
> index fd867ecac91..ab6b9dd86c8 100644
> --- a/gcc/cgraphclones.c
> +++ b/gcc/cgraphclones.c
> @@ -128,6 +128,7 @@ cgraph_edge::clone (cgraph_node *n, gcall *call_stmt, unsigned stmt_uid,
>    new_edge->inline_failed = inline_failed;
>    new_edge->indirect_inlining_edge = indirect_inlining_edge;
>    new_edge->lto_stmt_uid = stmt_uid;
> +  new_edge->speculative_id = speculative_id;
>    /* Clone flags that depend on call_stmt availability manually.  */
>    new_edge->can_throw_external = can_throw_external;
>    new_edge->call_stmt_cannot_inline_p = call_stmt_cannot_inline_p;
> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
> index 939d86ef94a..f5654198877 100644
> --- a/gcc/ipa-inline.c
> +++ b/gcc/ipa-inline.c
> @@ -1877,12 +1877,15 @@ inline_small_functions (void)
>  	}
>        if (has_speculative)
>  	for (edge = node->callees; edge; edge = next)
> -	  if (edge->speculative && !speculation_useful_p (edge,
> -							  edge->aux != NULL))
> -	    {
> -	      edge->resolve_speculation ();
> -	      update = true;
> -	    }
> +	  {
> +	    next = edge->next_callee;
> +	    if (edge->speculative
> +		&& !speculation_useful_p (edge, edge->aux != NULL))
> +	      {
> +		edge->resolve_speculation ();
> +		update = true;
> +	      }
> +	  }
>        if (update)
>  	{
>  	  struct cgraph_node *where = node->global.inlined_to
> diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
> index 970dba39c80..26214eba27e 100644
> --- a/gcc/ipa-profile.c
> +++ b/gcc/ipa-profile.c
> @@ -192,23 +192,39 @@ ipa_profile_generate_summary (void)
>  		  if (h)
>  		    {
>  		      gcov_type val, count, all;
> -		      if (get_nth_most_common_value (NULL, "indirect call", h,
> -						     &val, &count, &all))
> +		      struct cgraph_edge *e = node->get_edge (stmt);
> +		      if (e && !e->indirect_unknown_callee)
> +			continue;
> +
> +		      e->indirect_info->num_of_ics = 0;
> +		      for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++)
>  			{
> -			  struct cgraph_edge * e = node->get_edge (stmt);
> -			  if (e && !e->indirect_unknown_callee)
> +			  if (!get_nth_most_common_value (NULL, "indirect call",
> +							  h, &val, &count, &all,
> +							  j))
> +			    continue;
> +
> +			  if (val == 0)
>  			    continue;
>  
> -			  e->indirect_info->common_target_id = val;
> -			  e->indirect_info->common_target_probability
> +			  struct indirect_target_info item;
> +			  item.common_target_id = val;
> +			  item.common_target_probability
>  			    = GCOV_COMPUTE_SCALE (count, all);
> -			  if (e->indirect_info->common_target_probability > REG_BR_PROB_BASE)
> +			  if (item.common_target_probability > REG_BR_PROB_BASE)
>  			    {
>  			      if (dump_file)
> -				fprintf (dump_file, "Probability capped to 1\n");
> -			      e->indirect_info->common_target_probability = REG_BR_PROB_BASE;
> +				fprintf (dump_file,
> +					 "Probability capped to 1\n");
> +			      item.common_target_probability = REG_BR_PROB_BASE;
>  			    }
> +			  vec_safe_push (
> +			    e->indirect_info->indirect_call_targets, item);
> +			  e->indirect_info->num_of_ics++;
>  			}
> +		      gcc_assert (e->indirect_info->num_of_ics
> +				  <= GCOV_TOPN_VALUES);
> +
>  		      gimple_remove_histogram_value (DECL_STRUCT_FUNCTION (node->decl),
>  						      stmt, h);
>  		    }
> @@ -492,6 +508,7 @@ ipa_profile (void)
>    int nindirect = 0, ncommon = 0, nunknown = 0, nuseless = 0, nconverted = 0;
>    int nmismatch = 0, nimpossible = 0;
>    bool node_map_initialized = false;
> +  gcov_type threshold;
>  
>    if (dump_file)
>      dump_histogram (dump_file, histogram);
> @@ -500,14 +517,12 @@ ipa_profile (void)
>        overall_time += histogram[i]->count * histogram[i]->time;
>        overall_size += histogram[i]->size;
>      }
> +  threshold = 0;
>    if (overall_time)
>      {
> -      gcov_type threshold;
> -
>        gcc_assert (overall_size);
>  
>        cutoff = (overall_time * PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE) + 500) / 1000;
> -      threshold = 0;
>        for (i = 0; cumulated < cutoff; i++)
>  	{
>  	  cumulated += histogram[i]->count * histogram[i]->time;
> @@ -543,7 +558,7 @@ ipa_profile (void)
>    histogram.release ();
>    histogram_pool.release ();
>  
> -  /* Produce speculative calls: we saved common traget from porfiling into
> +  /* Produce speculative calls: we saved common target from profiling into
>       e->common_target_id.  Now, at link time, we can look up corresponding
>       function node and produce speculative call.  */
>  
> @@ -558,104 +573,133 @@ ipa_profile (void)
>  	{
>  	  if (n->count.initialized_p ())
>  	    nindirect++;
> -	  if (e->indirect_info->common_target_id)
> +	  if (e->indirect_info && e->indirect_info->num_of_ics)
>  	    {
> -	      if (!node_map_initialized)
> -	        init_node_map (false);
> -	      node_map_initialized = true;
> -	      ncommon++;
> -	      n2 = find_func_by_profile_id (e->indirect_info->common_target_id);
> -	      if (n2)
> +	      if (in_lto_p)
>  		{

Starting from here, it's very difficult to read the patch. Reason is that you
wrapped all this in a FOR_EACH_VEC_SAFE_ELT. Would it be please possible
to make a refactoring patch that will put code with all the if-else-else-else..
into a separate function? Then you'll be able to call the function in a small loop:

FOR_EACH_VEC_SAFE_ELT (e->indirect_info->indirect_call_targets, i, item)
   new_function_call...

It will simplify the patch review a lot.


>  		  if (dump_file)
>  		    {
> -		      fprintf (dump_file, "Indirect call -> direct call from"
> -			       " other module %s => %s, prob %3.2f\n",
> -			       n->dump_name (),
> -			       n2->dump_name (),
> -			       e->indirect_info->common_target_probability
> -			       / (float)REG_BR_PROB_BASE);
> -		    }
> -		  if (e->indirect_info->common_target_probability
> -		      < REG_BR_PROB_BASE / 2)
> -		    {
> -		      nuseless++;
> -		      if (dump_file)
> -			fprintf (dump_file,
> -				 "Not speculating: probability is too low.\n");
> -		    }
> -		  else if (!e->maybe_hot_p ())
> -		    {
> -		      nuseless++;
> -		      if (dump_file)
> -			fprintf (dump_file,
> -				 "Not speculating: call is cold.\n");
> -		    }
> -		  else if (n2->get_availability () <= AVAIL_INTERPOSABLE
> -			   && n2->can_be_discarded_p ())
> -		    {
> -		      nuseless++;
> -		      if (dump_file)
> -			fprintf (dump_file,
> -				 "Not speculating: target is overwritable "
> -				 "and can be discarded.\n");
> +		      fprintf (dump_file,
> +			       "Updating hotness threshold in LTO mode.\n");
> +		      fprintf (dump_file, "Updated min count: %" PRId64 "\n",
> +			       (int64_t) threshold);
>  		    }
> -		  else if (ipa_node_params_sum && ipa_edge_args_sum
> -			   && (!vec_safe_is_empty
> -			       (IPA_NODE_REF (n2)->descriptors))
> -			   && ipa_get_param_count (IPA_NODE_REF (n2))
> -			      != ipa_get_cs_argument_count (IPA_EDGE_REF (e))
> -			    && (ipa_get_param_count (IPA_NODE_REF (n2))
> -				>= ipa_get_cs_argument_count (IPA_EDGE_REF (e))
> -				|| !stdarg_p (TREE_TYPE (n2->decl))))
> +		  set_hot_bb_threshold (threshold
> +					/ e->indirect_info->num_of_ics);
> +		}
> +	      if (!node_map_initialized)
> +		init_node_map (false);
> +	      node_map_initialized = true;
> +	      ncommon++;
> +	      unsigned speculative_id = 0;
> +	      struct indirect_target_info *item;
> +	      FOR_EACH_VEC_SAFE_ELT (e->indirect_info->indirect_call_targets, i,
> +				     item)
> +		{
> +		  if ((unsigned) i > e->indirect_info->num_of_ics)
> +		    break;
> +		  n2 = find_func_by_profile_id (item->common_target_id);
> +		  if (n2)
>  		    {
> -		      nmismatch++;
>  		      if (dump_file)
> -			fprintf (dump_file,
> -				 "Not speculating: "
> -				 "parameter count mistmatch\n");
> +			{
> +			  fprintf (dump_file,
> +				   "Indirect call -> direct call from"
> +				   " other module %s => %s, prob %3.2f\n",
> +				   n->dump_name (), n2->dump_name (),
> +				   item->common_target_probability
> +				     / (float) REG_BR_PROB_BASE);
> +			}
> +		      if (item->common_target_probability
> +			  < REG_BR_PROB_BASE / 2)
> +			{
> +			  nuseless++;
> +			  if (dump_file)
> +			    fprintf (
> +			      dump_file,
> +			      "Not speculating: probability is too low.\n");
> +			}
> +		      else if (!e->maybe_hot_p ())
> +			{
> +			  nuseless++;
> +			  if (dump_file)
> +			    fprintf (dump_file,
> +				     "Not speculating: call is cold.\n");
> +			}
> +		      else if (n2->get_availability () <= AVAIL_INTERPOSABLE
> +			       && n2->can_be_discarded_p ())
> +			{
> +			  nuseless++;
> +			  if (dump_file)
> +			    fprintf (dump_file,
> +				     "Not speculating: target is overwritable "
> +				     "and can be discarded.\n");
> +			}
> +		      else if (ipa_node_params_sum && ipa_edge_args_sum
> +			       && (!vec_safe_is_empty (
> +				 IPA_NODE_REF (n2)->descriptors))
> +			       && ipa_get_param_count (IPA_NODE_REF (n2))
> +				    != ipa_get_cs_argument_count (
> +				      IPA_EDGE_REF (e))
> +			       && (ipa_get_param_count (IPA_NODE_REF (n2))
> +				     >= ipa_get_cs_argument_count (
> +				       IPA_EDGE_REF (e))
> +				   || !stdarg_p (TREE_TYPE (n2->decl))))
> +			{
> +			  nmismatch++;
> +			  if (dump_file)
> +			    fprintf (dump_file, "Not speculating: "
> +						"parameter count mismatch\n");
> +			}
> +		      else if (e->indirect_info->polymorphic
> +			       && !opt_for_fn (n->decl, flag_devirtualize)
> +			       && !possible_polymorphic_call_target_p (e, n2))
> +			{
> +			  nimpossible++;
> +			  if (dump_file)
> +			    fprintf (dump_file,
> +				     "Not speculating: "
> +				     "function is not in the polymorphic "
> +				     "call target list\n");
> +			}
> +		      else
> +			{
> +			  /* Target may be overwritable, but profile says that
> +			     control flow goes to this particular implementation
> +			     of N2.  Speculate on the local alias to allow
> +			     inlining.
> +			     */
> +			  if (!n2->can_be_discarded_p ())
> +			    {
> +			      cgraph_node *alias;
> +			      alias = dyn_cast<cgraph_node *> (
> +				n2->noninterposable_alias ());
> +			      if (alias)
> +				n2 = alias;
> +			    }
> +			  nconverted++;
> +			  e->make_speculative (
> +			    n2,
> +			    e->count.apply_probability (
> +			      item->common_target_probability),
> +			    speculative_id);
> +			  update = true;
> +			  speculative_id++;
> +			}
>  		    }
> -		  else if (e->indirect_info->polymorphic
> -			   && !opt_for_fn (n->decl, flag_devirtualize)
> -			   && !possible_polymorphic_call_target_p (e, n2))
> +		  else
>  		    {
> -		      nimpossible++;
>  		      if (dump_file)
>  			fprintf (dump_file,
> -				 "Not speculating: "
> -				 "function is not in the polymorphic "
> -				 "call target list\n");
> +				 "Function with profile-id %i not found.\n",
> +				 item->common_target_id);
> +		      nunknown++;
>  		    }
> -		  else
> -		    {
> -		      /* Target may be overwritable, but profile says that
> -			 control flow goes to this particular implementation
> -			 of N2.  Speculate on the local alias to allow inlining.
> -		       */
> -		      if (!n2->can_be_discarded_p ())
> -			{
> -			  cgraph_node *alias;
> -			  alias = dyn_cast<cgraph_node *> (n2->noninterposable_alias ());
> -			  if (alias)
> -			    n2 = alias;
> -			}
> -		      nconverted++;
> -		      e->make_speculative
> -			(n2,
> -			 e->count.apply_probability
> -				     (e->indirect_info->common_target_probability));
> -		      update = true;
> -		    }
> -		}
> -	      else
> -		{
> -		  if (dump_file)
> -		    fprintf (dump_file, "Function with profile-id %i not found.\n",
> -			     e->indirect_info->common_target_id);
> -		  nunknown++;
>  		}
> +	      if (speculative_id < e->indirect_info->num_of_ics)
> +		e->indirect_info->num_of_ics = speculative_id;
>  	    }
> -	 }
> +	}
>         if (update)
>  	 ipa_update_overall_fn_summary (n);
>       }
> diff --git a/gcc/ipa-ref.h b/gcc/ipa-ref.h
> index 0d8e509c932..3e6562ec9d1 100644
> --- a/gcc/ipa-ref.h
> +++ b/gcc/ipa-ref.h
> @@ -59,6 +59,7 @@ public:
>    symtab_node *referred;
>    gimple *stmt;
>    unsigned int lto_stmt_uid;
> +  unsigned int speculative_id;
>    unsigned int referred_index;
>    ENUM_BITFIELD (ipa_ref_use) use:3;
>    unsigned int speculative:1;
> diff --git a/gcc/ipa.c b/gcc/ipa.c
> index 2496694124c..c1fe081a72d 100644
> --- a/gcc/ipa.c
> +++ b/gcc/ipa.c
> @@ -166,7 +166,7 @@ process_references (symtab_node *snode,
>     devirtualization happens.  After inlining still keep their declarations
>     around, so we can devirtualize to a direct call.
>  
> -   Also try to make trivial devirutalization when no or only one target is
> +   Also try to make trivial devirtualization when no or only one target is
>     possible.  */
>  
>  static void
> diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
> index bc0f0107333..1dc856d3c64 100644
> --- a/gcc/lto-cgraph.c
> +++ b/gcc/lto-cgraph.c
> @@ -238,6 +238,7 @@ lto_output_edge (struct lto_simple_output_block *ob, struct cgraph_edge *edge,
>    unsigned int uid;
>    intptr_t ref;
>    struct bitpack_d bp;
> +  unsigned len;
>  
>    if (edge->indirect_unknown_callee)
>      streamer_write_enum (ob->main_stream, LTO_symtab_tags, LTO_symtab_last_tag,
> @@ -265,6 +266,7 @@ lto_output_edge (struct lto_simple_output_block *ob, struct cgraph_edge *edge,
>    bp_pack_enum (&bp, cgraph_inline_failed_t,
>  	        CIF_N_REASONS, edge->inline_failed);
>    bp_pack_var_len_unsigned (&bp, uid);
> +  bp_pack_var_len_unsigned (&bp, edge->speculative_id);
>    bp_pack_value (&bp, edge->indirect_inlining_edge, 1);
>    bp_pack_value (&bp, edge->speculative, 1);
>    bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1);
> @@ -291,11 +293,28 @@ lto_output_edge (struct lto_simple_output_block *ob, struct cgraph_edge *edge,
>    streamer_write_bitpack (&bp);
>    if (edge->indirect_unknown_callee)
>      {
> +      struct indirect_target_info *item;
> +      unsigned int i;
> +      len = edge->indirect_info->num_of_ics;
> +      gcc_assert (len <= GCOV_TOPN_VALUES);
> +
>        streamer_write_hwi_stream (ob->main_stream,
> -			         edge->indirect_info->common_target_id);
> -      if (edge->indirect_info->common_target_id)
> -	streamer_write_hwi_stream
> -	   (ob->main_stream, edge->indirect_info->common_target_probability);
> +				 edge->indirect_info->num_of_ics);
> +
> +      if (len)
> +	{
> +	  FOR_EACH_VEC_SAFE_ELT (edge->indirect_info->indirect_call_targets, i,
> +				 item)
> +	    {
> +	      if (i == edge->indirect_info->num_of_ics)
> +		break;

This is one of the places which will simplify with the vec::length.

> +	      streamer_write_hwi_stream (ob->main_stream,
> +					 item->common_target_id);
> +	      if (item->common_target_id)
> +		streamer_write_hwi_stream (ob->main_stream,
> +					   item->common_target_probability);
> +	    }
> +	}
>      }
>  }
>  
> @@ -688,6 +707,7 @@ lto_output_ref (struct lto_simple_output_block *ob, struct ipa_ref *ref,
>        if (ref->stmt)
>  	uid = gimple_uid (ref->stmt) + 1;
>        streamer_write_hwi_stream (ob->main_stream, uid);
> +      streamer_write_hwi_stream (ob->main_stream, ref->speculative_id);
>      }
>  }
>  
> @@ -1420,6 +1440,8 @@ input_ref (class lto_input_block *ib,
>    ref->speculative = speculative;
>    if (is_a <cgraph_node *> (referring_node))
>      ref->lto_stmt_uid = streamer_read_hwi (ib);
> +  if (is_a <cgraph_node *> (referring_node))
> +    ref->speculative_id = streamer_read_hwi (ib);

Here you want to write the condition just once.

>  }
>  
>  /* Read an edge from IB.  NODES points to a vector of previously read nodes for
> @@ -1433,11 +1455,12 @@ input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
>  {
>    struct cgraph_node *caller, *callee;
>    struct cgraph_edge *edge;
> -  unsigned int stmt_id;
> +  unsigned int stmt_id, speculative_id;
>    profile_count count;
>    cgraph_inline_failed_t inline_failed;
>    struct bitpack_d bp;
>    int ecf_flags = 0;
> +  unsigned i, len;
>  
>    caller = dyn_cast<cgraph_node *> (nodes[streamer_read_hwi (ib)]);
>    if (caller == NULL || caller->decl == NULL_TREE)
> @@ -1457,6 +1480,7 @@ input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
>    bp = streamer_read_bitpack (ib);
>    inline_failed = bp_unpack_enum (&bp, cgraph_inline_failed_t, CIF_N_REASONS);
>    stmt_id = bp_unpack_var_len_unsigned (&bp);
> +  speculative_id = bp_unpack_var_len_unsigned (&bp);
>  
>    if (indirect)
>      edge = caller->create_indirect_edge (NULL, 0, count);
> @@ -1466,6 +1490,7 @@ input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
>    edge->indirect_inlining_edge = bp_unpack_value (&bp, 1);
>    edge->speculative = bp_unpack_value (&bp, 1);
>    edge->lto_stmt_uid = stmt_id;
> +  edge->speculative_id = speculative_id;
>    edge->inline_failed = inline_failed;
>    edge->call_stmt_cannot_inline_p = bp_unpack_value (&bp, 1);
>    edge->can_throw_external = bp_unpack_value (&bp, 1);
> @@ -1485,9 +1510,22 @@ input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
>        if (bp_unpack_value (&bp, 1))
>  	ecf_flags |= ECF_RETURNS_TWICE;
>        edge->indirect_info->ecf_flags = ecf_flags;
> -      edge->indirect_info->common_target_id = streamer_read_hwi (ib);
> -      if (edge->indirect_info->common_target_id)
> -        edge->indirect_info->common_target_probability = streamer_read_hwi (ib);
> +
> +      len = streamer_read_hwi (ib);
> +      edge->indirect_info->num_of_ics = len;
> +
> +      gcc_assert (len <= GCOV_TOPN_VALUES);
> +
> +      if (len)
> +	{
> +	  for (i = 0; i < len; i++)
> +	    {
> +	      indirect_target_info item;

I would also suggest a constructor for indirect_target_info. You initialize it
multiple times.

> +	      item.common_target_id = streamer_read_hwi (ib);
> +	      item.common_target_probability = streamer_read_hwi (ib);
> +	      vec_safe_push (edge->indirect_info->indirect_call_targets, item);
> +	    }
> +	}
>      }
>  }
>  
> diff --git a/gcc/predict.c b/gcc/predict.c
> index 07f66aab7a3..67a859ea436 100644
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -763,7 +763,6 @@ dump_prediction (FILE *file, enum br_predictor predictor, int probability,
>        && bb->count.precise_p ()
>        && reason == REASON_NONE)
>      {
> -      gcc_assert (e->count ().precise_p ());
>        fprintf (file, ";;heuristics;%s;%" PRId64 ";%" PRId64 ";%.1f;\n",
>  	       predictor_info[predictor].name,
>  	       bb->count.to_gcov_type (), e->count ().to_gcov_type (),
> diff --git a/gcc/symtab.c b/gcc/symtab.c
> index b1589ea8a40..90a1f62d6a9 100644
> --- a/gcc/symtab.c
> +++ b/gcc/symtab.c
> @@ -603,6 +603,7 @@ symtab_node::create_reference (symtab_node *referred_node,
>    ref->referred = referred_node;
>    ref->stmt = stmt;
>    ref->lto_stmt_uid = 0;
> +  ref->speculative_id = 0;
>    ref->use = use_type;
>    ref->speculative = 0;
>  
> @@ -660,6 +661,7 @@ symtab_node::clone_references (symtab_node *node)
>        ref2 = create_reference (ref->referred, ref->use, ref->stmt);
>        ref2->speculative = speculative;
>        ref2->lto_stmt_uid = stmt_uid;
> +      ref2->speculative_id = ref->speculative_id;
>      }
>  }
>  
> @@ -678,6 +680,7 @@ symtab_node::clone_referring (symtab_node *node)
>        ref2 = ref->referring->create_reference (this, ref->use, ref->stmt);
>        ref2->speculative = speculative;
>        ref2->lto_stmt_uid = stmt_uid;
> +      ref2->speculative_id = ref->speculative_id;
>      }
>  }
>  
> @@ -693,6 +696,7 @@ symtab_node::clone_reference (ipa_ref *ref, gimple *stmt)
>    ref2 = create_reference (ref->referred, ref->use, stmt);
>    ref2->speculative = speculative;
>    ref2->lto_stmt_uid = stmt_uid;
> +  ref2->speculative_id = ref->speculative_id;
>    return ref2;
>  }
>  
> @@ -747,6 +751,7 @@ symtab_node::clear_stmts_in_references (void)
>        {
>  	r->stmt = NULL;
>  	r->lto_stmt_uid = 0;
> +	r->speculative_id = 0;
>        }
>  }
>  
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
> new file mode 100644
> index 00000000000..e0a83c2e067
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
> @@ -0,0 +1,35 @@
> +/* { dg-require-effective-target lto } */
> +/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */
> +/* { dg-require-profiling "-fprofile-generate" } */
> +/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param indir-call-topn-profile=1" } */
> +
> +#include <stdio.h>
> +
> +typedef int (*fptr) (int);
> +int
> +one (int a);
> +
> +int
> +two (int a);
> +
> +fptr table[] = {&one, &two};
> +
> +int
> +main()
> +{
> +  int i, x;
> +  fptr p = &one;
> +
> +  x = one (3);
> +
> +  for (i = 0; i < 350000000; i++)
> +    {
> +      x = (*p) (3);
> +      p = table[x];
> +    }
> +  printf ("done:%d\n", x);
> +}
> +
> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* one transformation on insn" "profile_estimate" } } */
> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* two transformation on insn" "profile_estimate" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
> new file mode 100644
> index 00000000000..a8c6e365fb9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
> @@ -0,0 +1,22 @@
> +/* It seems there is no way to avoid the other source of mulitple
> +   source testcase from being compiled independently.  Just avoid
> +   error.  */
> +#ifdef DOJOB
> +int
> +one (int a)
> +{
> +  return 1;
> +}
> +
> +int
> +two (int a)
> +{
> +  return 0;
> +}
> +#else
> +int
> +main()
> +{
> +  return 0;
> +}
> +#endif
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
> new file mode 100644
> index 00000000000..aa3887fde83
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
> @@ -0,0 +1,42 @@
> +/* { dg-require-effective-target lto } */
> +/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */
> +/* { dg-require-profiling "-fprofile-generate" } */
> +/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param indir-call-topn-profile=1" } */
> +
> +#include <stdio.h>
> +
> +typedef int (*fptr) (int);
> +int
> +one (int a);
> +
> +int
> +two (int a);
> +
> +fptr table[] = {&one, &two};
> +
> +int foo ()
> +{
> +  int i, x;
> +  fptr p = &one;
> +
> +  x = one (3);
> +
> +  for (i = 0; i < 350000000; i++)
> +    {
> +      x = (*p) (3);
> +      p = table[x];
> +    }
> +  return x;
> +}
> +
> +int
> +main()
> +{
> +  int x = foo ();
> +  printf ("done:%d\n", x);
> +}
> +
> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* one transformation on insn" "profile_estimate" } } */
> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* two transformation on insn" "profile_estimate" } } */
> +
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
> new file mode 100644
> index 00000000000..951bc7ddd19
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
> @@ -0,0 +1,38 @@
> +/* { dg-require-profiling "-fprofile-generate" } */
> +/* { dg-options "-O2 -fdump-ipa-profile --param indir-call-topn-profile=1" } */
> +
> +#include <stdio.h>
> +
> +typedef int (*fptr) (int);
> +int
> +one (int a)
> +{
> +  return 1;
> +}
> +
> +int
> +two (int a)
> +{
> +  return 0;
> +}
> +
> +fptr table[] = {&one, &two};
> +
> +int
> +main()
> +{
> +  int i, x;
> +  fptr p = &one;
> +
> +  one (3);
> +
> +  for (i = 0; i < 350000000; i++)
> +    {
> +      x = (*p) (3);
> +      p = table[x];
> +    }
> +  printf ("done:%d\n", x);
> +}
> +
> +/* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct call.* one transformation on insn" "profile" } } */
> +/* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct call.* two transformation on insn" "profile" } } */
> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
> index 4311309acce..93d9269aa8f 100644
> --- a/gcc/tree-inline.c
> +++ b/gcc/tree-inline.c
> @@ -2167,6 +2167,26 @@ copy_bb (copy_body_data *id, basic_block bb,
>  
>  			  gcc_assert (!edge->indirect_unknown_callee);
>  			  old_edge->speculative_call_info (direct, indirect, ref);
> +			  while (old_edge->next_callee
> +				 && old_edge->next_callee->speculative
> +				 && indirect->indirect_info
> +				 && indirect->indirect_info->num_of_ics > 1)

Maybe indirect->indirect_info && indirect->indirect_info->num_of_ics can be
put into a function indirect->has_indirect_call_p () ?

> +			    {
> +			      /* Some speculative calls may contain more than
> +				 one direct target, loop iterate it to clone all
> +				 related direct edges before cloning the related
> +				 indirect edge.  */
> +			      id->dst_node->clone_reference (ref, stmt);
> +
> +			      edge = old_edge->next_callee;
> +			      edge = edge->clone (id->dst_node, call_stmt,
> +						  gimple_uid (stmt), num, den,
> +						  true);
> +			      old_edge = old_edge->next_callee;
> +			      gcc_assert (!edge->indirect_unknown_callee);
> +			      old_edge->speculative_call_info (direct, indirect,
> +							       ref);
> +			    }
>  
>  			  profile_count indir_cnt = indirect->count;
>  			  indirect = indirect->clone (id->dst_node, call_stmt,
> diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
> index 554a8c98419..f35bd3f9d7a 100644
> --- a/gcc/tree-profile.c
> +++ b/gcc/tree-profile.c
> @@ -74,8 +74,8 @@ static GTY(()) tree ic_tuple_callee_field;
>  /* Do initialization work for the edge profiler.  */
>  
>  /* Add code:
> -   __thread gcov*	__gcov_indirect_call_counters; // pointer to actual counter
> -   __thread void*	__gcov_indirect_call_callee; // actual callee address
> +   __thread gcov*	__gcov_indirect_call.counters; // pointer to actual counter
> +   __thread void*	__gcov_indirect_call.callee; // actual callee address
>     __thread int __gcov_function_counter; // time profiler function counter
>  */
>  static void
> @@ -384,7 +384,7 @@ gimple_gen_ic_profiler (histogram_value value, unsigned tag, unsigned base)
>        f_1 = foo;
>        __gcov_indirect_call.counters = &__gcov4.main[0];
>        PROF_9 = f_1;
> -      __gcov_indirect_call_callee = PROF_9;
> +      __gcov_indirect_call.callee = PROF_9;
>        _4 = f_1 ();
>     */
>  
> @@ -447,11 +447,11 @@ gimple_gen_ic_func_profiler (void)
>  
>    /* Insert code:
>  
> -     if (__gcov_indirect_call_callee != NULL)
> +     if (__gcov_indirect_call.callee != NULL)
>         __gcov_indirect_call_profiler_v3 (profile_id, &current_function_decl);
>  
>       The function __gcov_indirect_call_profiler_v3 is responsible for
> -     resetting __gcov_indirect_call_callee to NULL.  */
> +     resetting __gcov_indirect_call.callee to NULL.  */
>  
>    gimple_stmt_iterator gsi = gsi_start_bb (cond_bb);
>    void0 = build_int_cst (ptr_type_node, 0);
> @@ -893,7 +893,7 @@ pass_ipa_tree_profile::gate (function *)
>  {
>    /* When profile instrumentation, use or test coverage shall be performed.
>       But for AutoFDO, this there is no instrumentation, thus this pass is
> -     diabled.  */
> +     disabled.  */
>    return (!in_lto_p && !flag_auto_profile
>  	  && (flag_branch_probabilities || flag_test_coverage
>  	      || profile_arc_flag));
> diff --git a/gcc/value-prof.c b/gcc/value-prof.c
> index 759458868a8..cbcc104bd5c 100644
> --- a/gcc/value-prof.c
> +++ b/gcc/value-prof.c

Please rebase changes in this file. I installed recently changes that collide with your patch.

Thanks,
Martin

> @@ -1404,11 +1404,10 @@ gimple_ic (gcall *icall_stmt, struct cgraph_node *direct_call,
>    return dcall_stmt;
>  }
>  
> -/*
> -  For every checked indirect/virtual call determine if most common pid of
> -  function/class method has probability more than 50%. If yes modify code of
> -  this call to:
> - */
> +/* There maybe multiple indirect targets in histogram.  Check every
> +   indirect/virtual call if callee function exists, if not exist, leave it to
> +   LTO stage for later process.  Modify code of this indirect call to an if-else
> +   structure in ipa-profile finally.  */
>  
>  static bool
>  gimple_ic_transform (gimple_stmt_iterator *gsi)
> @@ -1432,54 +1431,65 @@ gimple_ic_transform (gimple_stmt_iterator *gsi)
>    if (!histogram)
>      return false;
>  
> -  if (!get_nth_most_common_value (NULL, "indirect call", histogram, &val,
> -				  &count, &all))
> -    return false;
> +  count = 0;
> +  all = histogram->hvalue.counters[0];
>  
> -  if (4 * count <= 3 * all)
> -    return false;
> +  for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++)
> +    {
> +      if (!get_nth_most_common_value (NULL, "indirect call", histogram, &val,
> +				      &count, &all, j))
> +	continue;
>  
> -  direct_call = find_func_by_profile_id ((int)val);
> +      if (4 * count <= all)
> +	continue;
>  
> -  if (direct_call == NULL)
> -    {
> -      if (val)
> +      direct_call = find_func_by_profile_id ((int) val);
> +
> +      if (direct_call == NULL)
> +	{
> +	  if (val)
> +	    {
> +	      if (dump_file)
> +		{
> +		  fprintf (dump_file,
> +			   "Indirect call -> direct call from other module");
> +		  print_generic_expr (dump_file, gimple_call_fn (stmt),
> +				      TDF_SLIM);
> +		  fprintf (dump_file, "=> %i (will resolve only with LTO)\n",
> +			   (int) val);
> +		}
> +	    }
> +	  continue;
> +	}
> +
> +      if (!check_ic_target (stmt, direct_call))
>  	{
>  	  if (dump_file)
>  	    {
> -	      fprintf (dump_file, "Indirect call -> direct call from other module");
> +	      fprintf (dump_file, "Indirect call -> direct call ");
>  	      print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM);
> -	      fprintf (dump_file, "=> %i (will resolve only with LTO)\n", (int)val);
> +	      fprintf (dump_file, "=> ");
> +	      print_generic_expr (dump_file, direct_call->decl, TDF_SLIM);
> +	      fprintf (dump_file,
> +		       " transformation skipped because of type mismatch");
> +	      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
>  	    }
> +	  gimple_remove_histogram_value (cfun, stmt, histogram);
> +	  return false;
>  	}
> -      return false;
> -    }
>  
> -  if (!check_ic_target (stmt, direct_call))
> -    {
>        if (dump_file)
>  	{
>  	  fprintf (dump_file, "Indirect call -> direct call ");
>  	  print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM);
>  	  fprintf (dump_file, "=> ");
>  	  print_generic_expr (dump_file, direct_call->decl, TDF_SLIM);
> -	  fprintf (dump_file, " transformation skipped because of type mismatch");
> +	  fprintf (dump_file,
> +		   " transformation on insn postponed to ipa-profile");
>  	  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> +	  fprintf (dump_file, "hist->count %" PRId64 " hist->all %" PRId64 "\n",
> +		   count, all);
>  	}
> -      gimple_remove_histogram_value (cfun, stmt, histogram);
> -      return false;
> -    }
> -
> -  if (dump_file)
> -    {
> -      fprintf (dump_file, "Indirect call -> direct call ");
> -      print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM);
> -      fprintf (dump_file, "=> ");
> -      print_generic_expr (dump_file, direct_call->decl, TDF_SLIM);
> -      fprintf (dump_file, " transformation on insn postponned to ipa-profile");
> -      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> -      fprintf (dump_file, "hist->count %" PRId64
> -	       " hist->all %" PRId64"\n", count, all);
>      }
>  
>    return true;
> 


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]