This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Speculative call support in the callgraph
- From: Xinliang David Li <davidxl at google dot com>
- To: Jan Hubicka <hubicka at ucw dot cz>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, marxin dot liska at gmail dot com, Martin Jambor <mjambor at suse dot cz>
- Date: Fri, 9 Aug 2013 09:44:11 -0700
- Subject: Re: Speculative call support in the callgraph
- References: <20130809121840 dot GA28721 at kam dot mff dot cuni dot cz>
I have not looked at the details. One high level question: this form
seems to only support one indirect target case. LIPO uses TOPN
indirect target profiling (tracking multiple targets), which can be
used by LTO as well (when the topn profiling gets into trunk).
thanks,
David
On Fri, Aug 9, 2013 at 5:18 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> this patch adds support for speculative calls into callgraph. The idea is that
> any IPA optimization that believes it knows likely target of an indirect call
> (currently I use it for cross-module indirect call profiling, but I expect
> Martin J. can easily add support for ipa-cp and I hope to add speculative
> devirtualization in foreseeable future since it should make difference for
> Firefox). Speculative call replaces indirect call
>
> call_target ()
>
> by:
>
> if (call_target == expected_fn)
> expected_fn ();
> else
> call_target ();
>
> We already know how to produce these by gimple_ic, however the idea is to not
> modify the body immediately (IPA passes can not do that), but do that only when
> we are applying ipa transformations.
>
> Every speculative call is represented by three components attached
> to the same call statement (the original indirect call):
> 1) a direct call (to expected_fn)
> 2) an indirect call (to call_target)
> 3) a IPA_REF_ADDR refrence to expected_fn.
>
> Each of those can be found by having "speculative" flag set.
>
> This representation allows IPA passes to be smart about speculative calls.
> 1) the direct calls can be inlined or redirected to cprop clones just as any
> ordinary calls. The reference still links to the original call target making
> it possible to later expand the test.
> 2) When indirect call target is found by ipa propagation, the call can be turned back
> to a direct call. Either by using the exisitng speculative direct call (if
> speculation was right) and keeping optimizations already attached to it
> or by turning the indirect call into new target if speculation was wrong.
> (this is done by resolve_speuclative_call)
> 3) at the end of IPA optimization we may get rid of speuclative calls that
> don't seem to buy anything. I think that on most modern cores the
> speculative sequence is actually worse than the indirect call when
> the expected_fn call stays uninlined or unpropagated.
> (not implemented yet)
> 4) inliner can adjust its metric knowing that inlining speuclative call
> is costier (disabling 3) (also not implemented yet).
>
> This patch implements the infrastructure only. I will send the corss-module
> indirect call profiling code in followup patch.
>
> I hoped this all to go quite smoothly into exisitng WHOPR infrastructure that
> juding from the size of patch is not true. I had to deal with some issues:
>
> 1) I am not linking the three components of calls explicitely together. Instead
> I rely on statement pointers in both edges&refs to allow later lookup of
> the components (by cgraph_speculative_call_info).
>
> ipa-reference call statements are however not streamed by LTO, so I had to
> implement the streaming. (hitting some surprises handled by preparation
> patches)
> 2) IPA-refs are not really maintained through versioning and inlining that is
> needed now for speculative expansion to work. Currently we update only
> callgraph edges and we rebuild ipa-refs (and wrongly so not updating
> clones that is harmless since they are used only for unreachable function
> removal).
>
> I solved this partly by updating speculative calls only (since I can use
> the existing callgraph edge path).
> tree-inline got fat&ugly, I need to push higher the plans to clean it up.
> 3) cgraph_edge statement hash expect only one call per statement. I fixed
> this by always recording the direct speculative call only.
> 4) ipa-prop and ipa-cp was not expecting the function to turn indirect call
> to direct call to return different edge that is passed to it (that happens
> when speculation was right). It also accesses indirect_call_info in
> edges that are already direct.
> I fixed these and added code to explicitely free the indirect call info
> when they are turned to direct edges. Until now it stayed dormant
> in memory.
> 5) Indirect call expansion breaks basic blocks. Currently it happens during
> call redirection performed by inliner. It needs to be moved later in
> tree-inline after the body is fully duplicated or we ICE on trying to fixup
> not yet finished loop structure in split_bb.
> 6) ipa-inline was not ready for callgraph edges to be removed as result
> of inlining. This was longer time on my TODO, so I fixed it by adding
> the missing edge removal hook.
>
> Bootstrapped/regtested x86_64-linux, will commit it later today. Comments welcome.
>
> Honza
> * cgraphbuild.c (cgraph_rebuild_references): Rebuild only non-speculative
> refs.
> * cgraph.c (cgraph_update_edge_in_call_site_hash): New function.
> (cgraph_add_edge_to_call_site_hash): Deal with speculative calls.
> (cgraph_set_call_stmt): Likewise.
> (cgraph_create_edge_1): Fix release checking compilatoin;
> clear lto_stmt_uid.
> (cgraph_free_edge): Free indirect info.
> (cgraph_turn_edge_to_speculative): New function.
> (cgraph_speculative_call_info): New function.
> (cgraph_make_edge_direct): Return direct edge; handle speculation.
> (cgraph_redirect_edge_call_stmt_to_callee): Expand speculative
> edges.
> (dump_cgraph_node): Dump speculation.
> (verify_edge_count_and_frequency): Accept speculative edges.
> (verify_edge_corresponds_to_fndecl): Handle partitioned cgraph.
> (verify_cgraph_node): Handle speculation.
> * cgraph.h (cgraph_edge): Add SPECULATIVE flag.
> (cgraph_set_call_stmt): Update prototype.
> (cgraph_make_edge_direct): Update prototype.
> (cgraph_speculative_call_info): Declare.
> * ipa-cp.c (ipcp_discover_new_direct_edges): Be ready for edge
> to change; update call of ipa_find_references.
> * ipa-ref.c (ipa_record_reference): Fix return value; clear
> lto_stmt_uid and speculative flags.
> (ipa_dump_references): Dump speculation.
> (ipa_clone_references): Clone speculative flag.
> (ipa_clone_referring): Likewise.
> (ipa_clone_ref): New function.
> (ipa_find_reference): Look into lto_stmt_uids
> (ipa_clear_stmts_in_references): Do not clear speculative calls.
> * ipa-ref.h (ipa_ref): Add lto_stmt_uid and speculative flags.
> (ipa_find_reference): Update declaration.
> (ipa_clone_ref): Declare.
> * lto-cgraph.c (lto_output_edge): Make lto_stmt_uids start from 0;
> stream speculative flag.
> (lto_output_ref): Stream statements uids and speculation.
> (input_ref): Likewise.
> (input_edge): Stream speuclation.
> * cgraphclones.c (cgraph_clone_edge): Clone speculation.
> (cgraph_set_call_stmt_including_clones): Handle speculation.
> * ipa-inline.c (heap_edge_removal_hook): New function.
> (inline_small_functions): Register it.
> * lto-streamer-in.c (fixup_call_stmt_edges_1): Bounds checking;
> also initialize refs.
> * ipa-prop.c (ipa_make_edge_direct_to_target): Be ready for
> edge to change.
> (try_make_edge_direct_simple_call): Likewise.
> (try_make_edge_direct_simple_call): Likewise.
> (update_indirect_edges_after_inlining): Likewise.
> (remove_described_reference): Look proper lto_stmt_uid.
> (propagate_controlled_uses): Likewise.
> (propagate_controlled_uses): Liekwise.
> * tree-inline.c (copy_bb): Copy speculative edges.
> (redirect_all_calls): New function.
> (copy_cfg_body): Do redirection after loop info
> is updated.
> (delete_unreachable_blocks_update_callgraph): Updadte
> speculation.
> * value-prof.c (gimple_ic): Export.
> * value-prof.h (gimple_ic): Declare.
>
> Index: cgraphbuild.c
> ===================================================================
> --- cgraphbuild.c (revision 201601)
> +++ cgraphbuild.c (working copy)
> @@ -483,8 +483,15 @@ cgraph_rebuild_references (void)
> basic_block bb;
> struct cgraph_node *node = cgraph_get_node (current_function_decl);
> gimple_stmt_iterator gsi;
> + struct ipa_ref *ref;
> + int i;
>
> - ipa_remove_all_references (&node->symbol.ref_list);
> + /* Keep speculative references for further cgraph edge expansion. */
> + for (i = 0; ipa_ref_list_reference_iterate (&node->symbol.ref_list, i, ref);)
> + if (!ref->speculative)
> + ipa_remove_reference (ref);
> + else
> + i++;
>
> node->count = ENTRY_BLOCK_PTR->count;
>
> Index: cgraph.c
> ===================================================================
> --- cgraph.c (revision 201601)
> +++ cgraph.c (working copy)
> @@ -674,14 +674,36 @@ edge_eq (const void *x, const void *y)
> /* Add call graph edge E to call site hash of its caller. */
>
> static inline void
> +cgraph_update_edge_in_call_site_hash (struct cgraph_edge *e)
> +{
> + void **slot;
> + slot = htab_find_slot_with_hash (e->caller->call_site_hash,
> + e->call_stmt,
> + htab_hash_pointer (e->call_stmt),
> + INSERT);
> + *slot = e;
> +}
> +
> +/* Add call graph edge E to call site hash of its caller. */
> +
> +static inline void
> cgraph_add_edge_to_call_site_hash (struct cgraph_edge *e)
> {
> void **slot;
> + /* There are two speculative edges for every statement (one direct,
> + one indirect); always hash the direct one. */
> + if (e->speculative && e->indirect_unknown_callee)
> + return;
> slot = htab_find_slot_with_hash (e->caller->call_site_hash,
> e->call_stmt,
> htab_hash_pointer (e->call_stmt),
> INSERT);
> - gcc_assert (!*slot);
> + if (*slot)
> + {
> + gcc_assert (((struct cgraph_edge *)*slot)->speculative);
> + return;
> + }
> + gcc_assert (!*slot || e->speculative);
> *slot = e;
> }
>
> @@ -732,14 +754,33 @@ cgraph_edge (struct cgraph_node *node, g
> }
>
>
> -/* Change field call_stmt of edge E to NEW_STMT. */
> +/* Change field call_stmt of edge E to NEW_STMT.
> + If UPDATE_SPECULATIVE and E is any component of speculative
> + edge, then update all components. */
>
> void
> -cgraph_set_call_stmt (struct cgraph_edge *e, gimple new_stmt)
> +cgraph_set_call_stmt (struct cgraph_edge *e, gimple new_stmt,
> + bool update_speculative)
> {
> tree decl;
>
> - if (e->caller->call_site_hash)
> + /* Speculative edges has three component, update all of them
> + when asked to. */
> + if (update_speculative && e->speculative)
> + {
> + struct cgraph_edge *direct, *indirect;
> + struct ipa_ref *ref;
> +
> + cgraph_speculative_call_info (e, direct, indirect, ref);
> + cgraph_set_call_stmt (direct, new_stmt, false);
> + cgraph_set_call_stmt (indirect, new_stmt, false);
> + ref->stmt = new_stmt;
> + return;
> + }
> +
> + /* Only direct speculative edges go to call_site_hash. */
> + if (e->caller->call_site_hash
> + && (!e->speculative || !e->indirect_unknown_callee))
> {
> htab_remove_elt_with_hash (e->caller->call_site_hash,
> e->call_stmt,
> @@ -755,7 +796,7 @@ cgraph_set_call_stmt (struct cgraph_edge
> struct cgraph_node *new_callee = cgraph_get_node (decl);
>
> gcc_checking_assert (new_callee);
> - cgraph_make_edge_direct (e, new_callee);
> + e = cgraph_make_edge_direct (e, new_callee);
> }
>
> push_cfun (DECL_STRUCT_FUNCTION (e->caller->symbol.decl));
> @@ -781,7 +822,10 @@ cgraph_create_edge_1 (struct cgraph_node
> {
> /* This is a rather expensive check possibly triggering
> construction of call stmt hashtable. */
> - gcc_checking_assert (!cgraph_edge (caller, call_stmt));
> +#ifdef ENABLE_CHECKING
> + struct cgraph_edge *e;
> + gcc_checking_assert (!(e=cgraph_edge (caller, call_stmt)) || e->speculative);
> +#endif
>
> gcc_assert (is_gimple_call (call_stmt));
> }
> @@ -804,6 +848,7 @@ cgraph_create_edge_1 (struct cgraph_node
> edge->next_caller = NULL;
> edge->prev_callee = NULL;
> edge->next_callee = NULL;
> + edge->lto_stmt_uid = 0;
>
> edge->count = count;
> gcc_assert (count >= 0);
> @@ -937,6 +982,9 @@ cgraph_free_edge (struct cgraph_edge *e)
> {
> int uid = e->uid;
>
> + if (e->indirect_info)
> + ggc_free (e->indirect_info);
> +
> /* Clear out the edge so we do not dangle pointers. */
> memset (e, 0, sizeof (*e));
> e->uid = uid;
> @@ -977,6 +1025,110 @@ cgraph_set_edge_callee (struct cgraph_ed
> e->callee = n;
> }
>
> +/* Turn edge E into speculative call calling N2. Update
> + the profile so the direct call is taken COUNT times
> + with FREQUENCY.
> +
> + At clone materialization time, the indirect call E will
> + be expanded as:
> +
> + if (call_dest == N2)
> + n2 ();
> + else
> + call call_dest
> +
> + At this time the function just creates the direct call,
> + the referencd representing the if conditional and attaches
> + them all to the orginal indirect call statement. */
> +
> +void
> +cgraph_turn_edge_to_speculative (struct cgraph_edge *e,
> + struct cgraph_node *n2,
> + gcov_type direct_count,
> + int direct_frequency)
> +{
> + struct cgraph_node *n = e->caller;
> + struct ipa_ref *ref;
> + struct cgraph_edge *e2;
> +
> + if (dump_file)
> + {
> + fprintf (dump_file, "Indirect call -> direct call from"
> + " other module %s/%i => %s/%i\n",
> + xstrdup (cgraph_node_name (n)), n->symbol.order,
> + xstrdup (cgraph_node_name (n2)), n2->symbol.order);
> + }
> + e->speculative = true;
> + e2 = cgraph_create_edge (n, n2, e->call_stmt, direct_count, direct_frequency);
> + initialize_inline_failed (e2);
> + e2->speculative = true;
> + e2->call_stmt = e->call_stmt;
> + e2->can_throw_external = e->can_throw_external;
> + e2->lto_stmt_uid = e->lto_stmt_uid;
> + e->count -= e2->count;
> + e->frequency -= e2->frequency;
> + cgraph_call_edge_duplication_hooks (e, e2);
> + ref = ipa_record_reference ((symtab_node)n, (symtab_node)n2,
> + IPA_REF_ADDR, e->call_stmt);
> + ref->lto_stmt_uid = e->lto_stmt_uid;
> + ref->speculative = e->speculative;
> +}
> +
> +/* Speculative call consist of three components:
> + 1) an indirect edge representing the original call
> + 2) an direct edge representing the new call
> + 3) ADDR_EXPR reference representing the speculative check.
> + All three components are attached to single statement (the indirect
> + call) and if one of them exists, all of them must exist.
> +
> + Given speculative call edge E, return all three components.
> + */
> +
> +void
> +cgraph_speculative_call_info (struct cgraph_edge *e,
> + struct cgraph_edge *&indirect,
> + struct cgraph_edge *&direct,
> + struct ipa_ref *&reference)
> +{
> + struct ipa_ref *ref;
> + int i;
> + struct cgraph_edge *e2;
> +
> + if (!e->indirect_unknown_callee)
> + for (e2 = e->caller->indirect_calls;
> + e2->call_stmt != e->call_stmt || e2->lto_stmt_uid != e->lto_stmt_uid;
> + e2 = e2->next_callee)
> + ;
> + else
> + {
> + e2 = e;
> + /* We can take advantage of the call stmt hash. */
> + if (e2->call_stmt)
> + {
> + e = cgraph_edge (e->caller, e2->call_stmt);
> + gcc_assert (!e->speculative && !e->indirect_unknown_callee);
> + }
> + else
> + for (e = e->caller->callees;
> + e2->call_stmt != e->call_stmt || e2->lto_stmt_uid != e->lto_stmt_uid;
> + e = e->next_callee)
> + ;
> + }
> + gcc_assert (e->speculative && e2->speculative);
> + indirect = e;
> + direct = e2;
> +
> + reference = NULL;
> + for (i = 0; ipa_ref_list_reference_iterate (&e->caller->symbol.ref_list, i, ref); i++)
> + if (ref->speculative
> + && ((ref->stmt && ref->stmt == e->call_stmt)
> + || (ref->lto_stmt_uid == e->lto_stmt_uid)))
> + {
> + reference = ref;
> + break;
> + }
> +}
> +
> /* Redirect callee of E to N. The function does not update underlying
> call expression. */
>
> @@ -990,14 +1142,74 @@ cgraph_redirect_edge_callee (struct cgra
> cgraph_set_edge_callee (e, n);
> }
>
> +/* Speculative call EDGE turned out to be direct call to CALLE_DECL.
> + Remove the speculative call sequence and return edge representing the call.
> + It is up to caller to redirect the call as appropriate. */
> +
> +static struct cgraph_edge *
> +cgraph_resolve_speculation (struct cgraph_edge *edge, tree callee_decl)
> +{
> + struct cgraph_edge *e2;
> + struct ipa_ref *ref;
> +
> + gcc_assert (edge->speculative);
> + cgraph_speculative_call_info (edge, e2, edge, ref);
> + if (ref->referred->symbol.decl != callee_decl)
> + {
> + if (dump_file)
> + {
> + fprintf (dump_file, "Speculative indirect call %s/%i => %s/%i has "
> + "turned out to have contradicitng known target ",
> + xstrdup (cgraph_node_name (edge->caller)), edge->caller->symbol.order,
> + xstrdup (cgraph_node_name (e2->callee)), e2->callee->symbol.order);
> + print_generic_expr (dump_file, callee_decl, 0);
> + fprintf (dump_file, "\n");
> + }
> + }
> + else
> + {
> + struct cgraph_edge *tmp = edge;
> + if (dump_file)
> + fprintf (dump_file, "Speculative call turned into direct call.\n");
> + edge = e2;
> + e2 = tmp;
> + }
> + edge->count += e2->count;
> + edge->frequency += e2->frequency;
> + edge->speculative = false;
> + e2->speculative = false;
> + if (e2->indirect_unknown_callee || e2->inline_failed)
> + cgraph_remove_edge (e2);
> + else
> + cgraph_remove_node_and_inline_clones (e2->callee, NULL);
> + if (edge->caller->call_site_hash)
> + cgraph_update_edge_in_call_site_hash (edge);
> + ipa_remove_reference (ref);
> + return edge;
> +}
> +
> /* Make an indirect EDGE with an unknown callee an ordinary edge leading to
> CALLEE. DELTA is an integer constant that is to be added to the this
> pointer (first parameter) to compensate for skipping a thunk adjustment. */
>
> -void
> +struct cgraph_edge *
> cgraph_make_edge_direct (struct cgraph_edge *edge, struct cgraph_node *callee)
> {
> + gcc_assert (edge->indirect_unknown_callee);
> +
> + /* If we are redirecting speculative call, make it non-speculative. */
> + if (edge->indirect_unknown_callee && edge->speculative)
> + {
> + edge = cgraph_resolve_speculation (edge, callee->symbol.decl);
> +
> + /* On successful speculation just return the pre existing direct edge. */
> + if (!edge->indirect_unknown_callee)
> + return edge;
> + }
> +
> edge->indirect_unknown_callee = 0;
> + ggc_free (edge->indirect_info);
> + edge->indirect_info = NULL;
>
> /* Get the edge out of the indirect edge list. */
> if (edge->prev_callee)
> @@ -1024,6 +1236,7 @@ cgraph_make_edge_direct (struct cgraph_e
>
> /* We need to re-determine the inlining status of the edge. */
> initialize_inline_failed (edge);
> + return edge;
> }
>
> /* If necessary, change the function declaration in the call statement
> @@ -1039,6 +1252,50 @@ cgraph_redirect_edge_call_stmt_to_callee
> struct cgraph_node *node;
> #endif
>
> + if (e->speculative)
> + {
> + struct cgraph_edge *e2;
> + gimple new_stmt;
> + struct ipa_ref *ref;
> +
> + cgraph_speculative_call_info (e, e, e2, ref);
> + if (gimple_call_fndecl (e->call_stmt))
> + e = cgraph_resolve_speculation (e, gimple_call_fndecl (e->call_stmt));
> + else
> + {
> + if (dump_file)
> + fprintf (dump_file, "Expanding speculative call of %s/%i -> %s/%i\n",
> + xstrdup (cgraph_node_name (e->caller)), e->caller->symbol.order,
> + xstrdup (cgraph_node_name (e->callee)), e->callee->symbol.order);
> + gcc_assert (e2->speculative);
> + push_cfun (DECL_STRUCT_FUNCTION (e->caller->symbol.decl));
> + new_stmt = gimple_ic (e->call_stmt, cgraph (ref->referred),
> + e->count || e2->count
> + ? RDIV (e->count * REG_BR_PROB_BASE,
> + e->count + e2->count)
> + : e->frequency || e2->frequency
> + ? RDIV (e->frequency * REG_BR_PROB_BASE,
> + e->frequency + e2->frequency)
> + : REG_BR_PROB_BASE / 2,
> + e->count, e->count + e2->count);
> + e->speculative = false;
> + cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt, new_stmt, false);
> + e->frequency = compute_call_stmt_bb_frequency (e->caller->symbol.decl,
> + gimple_bb (e->call_stmt));
> + e2->frequency = compute_call_stmt_bb_frequency (e2->caller->symbol.decl,
> + gimple_bb (e2->call_stmt));
> + e2->speculative = false;
> + ref->speculative = false;
> + ref->stmt = NULL;
> + /* Indirect edges are not both in the call site hash.
> + get it updated. */
> + if (e->caller->call_site_hash)
> + cgraph_update_edge_in_call_site_hash (e2);
> + pop_cfun ();
> + /* Continue redirecting E to proper target. */
> + }
> + }
> +
> if (e->indirect_unknown_callee
> || decl == e->callee->symbol.decl)
> return e->call_stmt;
> @@ -1099,7 +1356,7 @@ cgraph_redirect_edge_call_stmt_to_callee
> update_stmt (new_stmt);
> }
>
> - cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt, new_stmt);
> + cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt, new_stmt, false);
>
> if (cgraph_dump_file)
> {
> @@ -1603,6 +1860,8 @@ dump_cgraph_node (FILE *f, struct cgraph
> if (edge->frequency)
> fprintf (f, "(%.2f per call) ",
> edge->frequency / (double)CGRAPH_FREQ_BASE);
> + if (edge->speculative)
> + fprintf(f, "(speculative) ");
> if (!edge->inline_failed)
> fprintf(f, "(inlined) ");
> if (edge->indirect_inlining_edge)
> @@ -1616,6 +1875,8 @@ dump_cgraph_node (FILE *f, struct cgraph
> {
> fprintf (f, "%s/%i ", cgraph_node_asm_name (edge->callee),
> edge->callee->symbol.order);
> + if (edge->speculative)
> + fprintf(f, "(speculative) ");
> if (!edge->inline_failed)
> fprintf(f, "(inlined) ");
> if (edge->indirect_inlining_edge)
> @@ -2262,6 +2523,7 @@ verify_edge_count_and_frequency (struct
> }
> if (gimple_has_body_p (e->caller->symbol.decl)
> && !e->caller->global.inlined_to
> + && !e->speculative
> /* FIXME: Inline-analysis sets frequency to 0 when edge is optimized out.
> Remove this once edges are actually removed from the function at that time. */
> && (e->frequency
> @@ -2316,7 +2578,7 @@ verify_edge_corresponds_to_fndecl (struc
>
> /* We do not know if a node from a different partition is an alias or what it
> aliases and therefore cannot do the former_clone_of check reliably. */
> - if (!node || node->symbol.in_other_partition)
> + if (!node || node->symbol.in_other_partition || e->callee->symbol.in_other_partition)
> return false;
> node = cgraph_function_or_thunk_node (node, NULL);
>
> @@ -2625,7 +2887,7 @@ verify_cgraph_node (struct cgraph_node *
> }
> for (e = node->indirect_calls; e; e = e->next_callee)
> {
> - if (!e->aux)
> + if (!e->aux && !e->speculative)
> {
> error ("an indirect edge from %s has no corresponding call_stmt",
> identifier_to_locale (cgraph_node_name (e->caller)));
> Index: cgraph.h
> ===================================================================
> --- cgraph.h (revision 201568)
> +++ cgraph.h (working copy)
> @@ -483,6 +483,24 @@ struct GTY((chain_next ("%h.next_caller"
> unsigned int call_stmt_cannot_inline_p : 1;
> /* Can this call throw externally? */
> unsigned int can_throw_external : 1;
> + /* Edges with SPECULATIVE flag represents indirect calls that was
> + speculatively turned into direct (i.e. by profile feedback).
> + The final code sequence will have form:
> +
> + if (call_target == expected_fn)
> + expected_fn ();
> + else
> + call_target ();
> +
> + Every speculative call is represented by three components attached
> + to a same call statement:
> + 1) a direct call (to expected_fn)
> + 2) an indirect call (to call_target)
> + 3) a IPA_REF_ADDR refrence to expected_fn.
> +
> + Optimizers may later redirect direct call to clone, so 1) and 3)
> + do not need to necesarily agree with destination. */
> + unsigned int speculative : 1;
> };
>
> #define CGRAPH_FREQ_BASE 1000
> @@ -629,7 +647,7 @@ struct cgraph_node * cgraph_add_thunk (s
> HOST_WIDE_INT, tree, tree);
> struct cgraph_node *cgraph_node_for_asm (tree);
> struct cgraph_edge *cgraph_edge (struct cgraph_node *, gimple);
> -void cgraph_set_call_stmt (struct cgraph_edge *, gimple);
> +void cgraph_set_call_stmt (struct cgraph_edge *, gimple, bool update_speculative = true);
> void cgraph_update_edges_for_call_stmt (gimple, tree, gimple);
> struct cgraph_local_info *cgraph_local_info (tree);
> struct cgraph_global_info *cgraph_global_info (tree);
> @@ -641,7 +659,7 @@ void cgraph_call_edge_duplication_hooks
> struct cgraph_edge *);
>
> void cgraph_redirect_edge_callee (struct cgraph_edge *, struct cgraph_node *);
> -void cgraph_make_edge_direct (struct cgraph_edge *, struct cgraph_node *);
> +struct cgraph_edge *cgraph_make_edge_direct (struct cgraph_edge *, struct cgraph_node *);
> bool cgraph_only_called_directly_p (struct cgraph_node *);
>
> bool cgraph_function_possibly_inlined_p (tree);
> @@ -702,6 +720,14 @@ bool cgraph_propagate_frequency (struct
> struct cgraph_node * cgraph_function_node (struct cgraph_node *,
> enum availability *avail = NULL);
> bool cgraph_get_body (struct cgraph_node *node);
> +void
> +cgraph_turn_edge_to_speculative (struct cgraph_edge *,
> + struct cgraph_node *,
> + gcov_type, int);
> +void cgraph_speculative_call_info (struct cgraph_edge *,
> + struct cgraph_edge *&,
> + struct cgraph_edge *&,
> + struct ipa_ref *&);
>
> /* In cgraphunit.c */
> struct asm_node *add_asm_node (tree);
> @@ -735,7 +761,8 @@ struct cgraph_node * cgraph_create_virtu
> const char *clone_name);
> struct cgraph_node *cgraph_find_replacement_node (struct cgraph_node *);
> bool cgraph_remove_node_and_inline_clones (struct cgraph_node *, struct cgraph_node *);
> -void cgraph_set_call_stmt_including_clones (struct cgraph_node *, gimple, gimple);
> +void cgraph_set_call_stmt_including_clones (struct cgraph_node *, gimple, gimple,
> + bool update_speculative = true);
> void cgraph_create_edge_including_clones (struct cgraph_node *,
> struct cgraph_node *,
> gimple, gimple, gcov_type, int,
> Index: value-prof.c
> ===================================================================
> --- value-prof.c (revision 201568)
> +++ value-prof.c (working copy)
> @@ -1248,7 +1248,7 @@ check_ic_target (gimple call_stmt, struc
> old call
> */
>
> -static gimple
> +gimple
> gimple_ic (gimple icall_stmt, struct cgraph_node *direct_call,
> int prob, gcov_type count, gcov_type all)
> {
> Index: value-prof.h
> ===================================================================
> --- value-prof.h (revision 201568)
> +++ value-prof.h (working copy)
> @@ -86,6 +86,8 @@ void gimple_move_stmt_histograms (struct
> void verify_histograms (void);
> void free_histograms (void);
> void stringop_block_profile (gimple, unsigned int *, HOST_WIDE_INT *);
> +gimple gimple_ic (gimple, struct cgraph_node *, int, gcov_type, gcov_type);
> +
>
> /* In tree-profile.c. */
> extern void gimple_init_edge_profiler (void);
> Index: ipa-cp.c
> ===================================================================
> --- ipa-cp.c (revision 201568)
> +++ ipa-cp.c (working copy)
> @@ -2278,14 +2278,15 @@ ipcp_discover_new_direct_edges (struct c
> aggvals);
> if (target)
> {
> + bool agg_contents = ie->indirect_info->agg_contents;
> + bool polymorphic = ie->indirect_info->polymorphic;
> + bool param_index = ie->indirect_info->param_index;
> struct cgraph_edge *cs = ipa_make_edge_direct_to_target (ie, target);
> found = true;
>
> - if (cs && !ie->indirect_info->agg_contents
> - && !ie->indirect_info->polymorphic)
> + if (cs && !agg_contents && !polymorphic)
> {
> struct ipa_node_params *info = IPA_NODE_REF (node);
> - int param_index = ie->indirect_info->param_index;
> int c = ipa_get_controlled_uses (info, param_index);
> if (c != IPA_UNDESCRIBED_USE)
> {
> @@ -2299,7 +2300,7 @@ ipcp_discover_new_direct_edges (struct c
> if (c == 0
> && (to_del = ipa_find_reference ((symtab_node) node,
> (symtab_node) cs->callee,
> - NULL)))
> + NULL, 0)))
> {
> if (dump_file && (dump_flags & TDF_DETAILS))
> fprintf (dump_file, " and even removing its "
> Index: ipa-ref.c
> ===================================================================
> --- ipa-ref.c (revision 201601)
> +++ ipa-ref.c (working copy)
> @@ -38,7 +38,7 @@ ipa_record_reference (symtab_node referr
> symtab_node referred_node,
> enum ipa_ref_use use_type, gimple stmt)
> {
> - struct ipa_ref *ref;
> + struct ipa_ref *ref, *ref2;
> struct ipa_ref_list *list, *list2;
> ipa_ref_t *old_references;
>
> @@ -56,14 +56,16 @@ ipa_record_reference (symtab_node referr
> ref->referring = referring_node;
> ref->referred = referred_node;
> ref->stmt = stmt;
> + ref->lto_stmt_uid = 0;
> ref->use = use_type;
> + ref->speculative = 0;
>
> /* If vector was moved in memory, update pointers. */
> if (old_references != list->references->address ())
> {
> int i;
> - for (i = 0; ipa_ref_list_reference_iterate (list, i, ref); i++)
> - ipa_ref_referred_ref_list (ref)->referring[ref->referred_index] = ref;
> + for (i = 0; ipa_ref_list_reference_iterate (list, i, ref2); i++)
> + ipa_ref_referred_ref_list (ref2)->referring[ref2->referred_index] = ref2;
> }
> return ref;
> }
> @@ -155,6 +157,8 @@ ipa_dump_references (FILE * file, struct
> symtab_node_asm_name (ref->referred),
> ref->referred->symbol.order,
> ipa_ref_use_name [ref->use]);
> + if (ref->speculative)
> + fprintf (file, " (speculative)");
> }
> fprintf (file, "\n");
> }
> @@ -172,22 +176,50 @@ ipa_dump_referring (FILE * file, struct
> symtab_node_asm_name (ref->referring),
> ref->referring->symbol.order,
> ipa_ref_use_name [ref->use]);
> + if (ref->speculative)
> + fprintf (file, " (speculative)");
> }
> fprintf (file, "\n");
> }
>
> +/* Clone reference REF to DEST_NODE and set its stmt to STMT. */
> +
> +struct ipa_ref *
> +ipa_clone_ref (struct ipa_ref *ref,
> + symtab_node dest_node,
> + gimple stmt)
> +{
> + bool speculative = ref->speculative;
> + unsigned int stmt_uid = ref->lto_stmt_uid;
> + struct ipa_ref *ref2;
> +
> + ref2 = ipa_record_reference (dest_node,
> + ref->referred,
> + ref->use, stmt);
> + ref2->speculative = speculative;
> + ref2->lto_stmt_uid = stmt_uid;
> + return ref2;
> +}
> +
> /* Clone all references from SRC to DEST_NODE or DEST_VARPOOL_NODE. */
>
> void
> ipa_clone_references (symtab_node dest_node,
> struct ipa_ref_list *src)
> {
> - struct ipa_ref *ref;
> + struct ipa_ref *ref, *ref2;
> int i;
> for (i = 0; ipa_ref_list_reference_iterate (src, i, ref); i++)
> - ipa_record_reference (dest_node,
> - ref->referred,
> - ref->use, ref->stmt);
> + {
> + bool speculative = ref->speculative;
> + unsigned int stmt_uid = ref->lto_stmt_uid;
> +
> + ref2 = ipa_record_reference (dest_node,
> + ref->referred,
> + ref->use, ref->stmt);
> + ref2->speculative = speculative;
> + ref2->lto_stmt_uid = stmt_uid;
> + }
> }
>
> /* Clone all referring from SRC to DEST_NODE or DEST_VARPOOL_NODE. */
> @@ -196,12 +228,19 @@ void
> ipa_clone_referring (symtab_node dest_node,
> struct ipa_ref_list *src)
> {
> - struct ipa_ref *ref;
> + struct ipa_ref *ref, *ref2;
> int i;
> for (i = 0; ipa_ref_list_referring_iterate (src, i, ref); i++)
> - ipa_record_reference (ref->referring,
> - dest_node,
> - ref->use, ref->stmt);
> + {
> + bool speculative = ref->speculative;
> + unsigned int stmt_uid = ref->lto_stmt_uid;
> +
> + ref2 = ipa_record_reference (ref->referring,
> + dest_node,
> + ref->use, ref->stmt);
> + ref2->speculative = speculative;
> + ref2->lto_stmt_uid = stmt_uid;
> + }
> }
>
> /* Return true when execution of REF can lead to return from
> @@ -230,14 +269,17 @@ ipa_ref_has_aliases_p (struct ipa_ref_li
>
> struct ipa_ref *
> ipa_find_reference (symtab_node referring_node, symtab_node referred_node,
> - gimple stmt)
> + gimple stmt, unsigned int lto_stmt_uid)
> {
> struct ipa_ref *r = NULL;
> int i;
>
> for (i = 0; ipa_ref_list_reference_iterate (&referring_node->symbol.ref_list, i, r); i++)
> if (r->referred == referred_node
> - && (in_lto_p || r->stmt == stmt))
> + && !r->speculative
> + && ((stmt && r->stmt == stmt)
> + || (lto_stmt_uid && r->lto_stmt_uid == lto_stmt_uid)
> + || (!stmt && !lto_stmt_uid && !r->stmt && !r->lto_stmt_uid)))
> return r;
> return NULL;
> }
> @@ -257,7 +299,9 @@ ipa_remove_stmt_references (symtab_node
> }
>
> /* Remove all stmt references in non-speculative references.
> - Those are not maintained during inlining & clonning. */
> + Those are not maintained during inlining & clonning.
> + The exception are speculative references that are updated along
> + with callgraph edges associated with them. */
>
> void
> ipa_clear_stmts_in_references (symtab_node referring_node)
> @@ -266,5 +310,6 @@ ipa_clear_stmts_in_references (symtab_no
> int i;
>
> for (i = 0; ipa_ref_list_reference_iterate (&referring_node->symbol.ref_list, i, r); i++)
> - r->stmt = NULL;
> + if (!r->speculative)
> + r->stmt = NULL;
> }
> Index: ipa-ref.h
> ===================================================================
> --- ipa-ref.h (revision 201601)
> +++ ipa-ref.h (working copy)
> @@ -40,8 +40,10 @@ struct GTY(()) ipa_ref
> symtab_node referring;
> symtab_node referred;
> gimple stmt;
> + unsigned int lto_stmt_uid;
> unsigned int referred_index;
> ENUM_BITFIELD (ipa_ref_use) use:2;
> + unsigned int speculative:1;
> };
>
> typedef struct ipa_ref ipa_ref_t;
> @@ -71,8 +73,9 @@ void ipa_dump_references (FILE *, struct
> void ipa_dump_referring (FILE *, struct ipa_ref_list *);
> void ipa_clone_references (symtab_node, struct ipa_ref_list *);
> void ipa_clone_referring (symtab_node, struct ipa_ref_list *);
> +struct ipa_ref * ipa_clone_ref (struct ipa_ref *, symtab_node, gimple);
> bool ipa_ref_cannot_lead_to_return (struct ipa_ref *);
> bool ipa_ref_has_aliases_p (struct ipa_ref_list *);
> -struct ipa_ref * ipa_find_reference (symtab_node, symtab_node, gimple);
> +struct ipa_ref * ipa_find_reference (symtab_node, symtab_node, gimple, unsigned int);
> void ipa_remove_stmt_references (symtab_node, gimple);
> void ipa_clear_stmts_in_references (symtab_node);
> Index: lto-cgraph.c
> ===================================================================
> --- lto-cgraph.c (revision 201568)
> +++ lto-cgraph.c (working copy)
> @@ -273,12 +273,13 @@ lto_output_edge (struct lto_simple_outpu
>
> bp = bitpack_create (ob->main_stream);
> uid = (!gimple_has_body_p (edge->caller->symbol.decl)
> - ? edge->lto_stmt_uid : gimple_uid (edge->call_stmt));
> + ? edge->lto_stmt_uid : gimple_uid (edge->call_stmt) + 1);
> bp_pack_enum (&bp, cgraph_inline_failed_enum,
> CIF_N_REASONS, edge->inline_failed);
> bp_pack_var_len_unsigned (&bp, uid);
> bp_pack_var_len_unsigned (&bp, edge->frequency);
> bp_pack_value (&bp, edge->indirect_inlining_edge, 1);
> + bp_pack_value (&bp, edge->speculative, 1);
> bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1);
> bp_pack_value (&bp, edge->can_throw_external, 1);
> if (edge->indirect_unknown_callee)
> @@ -589,13 +590,24 @@ lto_output_ref (struct lto_simple_output
> {
> struct bitpack_d bp;
> int nref;
> + int uid = ref->lto_stmt_uid;
> + struct cgraph_node *node;
>
> bp = bitpack_create (ob->main_stream);
> bp_pack_value (&bp, ref->use, 2);
> + bp_pack_value (&bp, ref->speculative, 1);
> streamer_write_bitpack (&bp);
> nref = lto_symtab_encoder_lookup (encoder, ref->referred);
> gcc_assert (nref != LCC_NOT_FOUND);
> streamer_write_hwi_stream (ob->main_stream, nref);
> +
> + node = dyn_cast <cgraph_node> (ref->referring);
> + if (node)
> + {
> + if (ref->stmt)
> + uid = gimple_uid (ref->stmt) + 1;
> + streamer_write_hwi_stream (ob->main_stream, uid);
> + }
> }
>
> /* Stream out profile_summary to OB. */
> @@ -1116,11 +1128,17 @@ input_ref (struct lto_input_block *ib,
> symtab_node node = NULL;
> struct bitpack_d bp;
> enum ipa_ref_use use;
> + bool speculative;
> + struct ipa_ref *ref;
>
> bp = streamer_read_bitpack (ib);
> use = (enum ipa_ref_use) bp_unpack_value (&bp, 2);
> + speculative = (enum ipa_ref_use) bp_unpack_value (&bp, 1);
> node = nodes[streamer_read_hwi (ib)];
> - ipa_record_reference (referring_node, node, use, NULL);
> + ref = ipa_record_reference (referring_node, node, use, NULL);
> + ref->speculative = speculative;
> + if (is_a <cgraph_node> (referring_node))
> + ref->lto_stmt_uid = streamer_read_hwi (ib);
> }
>
> /* Read an edge from IB. NODES points to a vector of previously read nodes for
> @@ -1167,6 +1185,7 @@ input_edge (struct lto_input_block *ib,
> edge = cgraph_create_edge (caller, callee, NULL, count, freq);
>
> edge->indirect_inlining_edge = bp_unpack_value (&bp, 1);
> + edge->speculative = bp_unpack_value (&bp, 1);
> edge->lto_stmt_uid = stmt_id;
> edge->inline_failed = inline_failed;
> edge->call_stmt_cannot_inline_p = bp_unpack_value (&bp, 1);
> Index: cgraphclones.c
> ===================================================================
> --- cgraphclones.c (revision 201601)
> +++ cgraphclones.c (working copy)
> @@ -147,6 +147,7 @@ cgraph_clone_edge (struct cgraph_edge *e
> /* Clone flags that depend on call_stmt availability manually. */
> new_edge->can_throw_external = e->can_throw_external;
> new_edge->call_stmt_cannot_inline_p = e->call_stmt_cannot_inline_p;
> + new_edge->speculative = e->speculative;
> if (update_original)
> {
> e->count -= new_edge->count;
> @@ -474,17 +475,21 @@ cgraph_find_replacement_node (struct cgr
> }
>
> /* Like cgraph_set_call_stmt but walk the clone tree and update all
> - clones sharing the same function body. */
> + clones sharing the same function body.
> + When WHOLE_SPECULATIVE_EDGES is true, all three components of
> + speculative edge gets updated. Otherwise we update only direct
> + call. */
>
> void
> cgraph_set_call_stmt_including_clones (struct cgraph_node *orig,
> - gimple old_stmt, gimple new_stmt)
> + gimple old_stmt, gimple new_stmt,
> + bool update_speculative)
> {
> struct cgraph_node *node;
> struct cgraph_edge *edge = cgraph_edge (orig, old_stmt);
>
> if (edge)
> - cgraph_set_call_stmt (edge, new_stmt);
> + cgraph_set_call_stmt (edge, new_stmt, update_speculative);
>
> node = orig->clones;
> if (node)
> @@ -492,7 +497,23 @@ cgraph_set_call_stmt_including_clones (s
> {
> struct cgraph_edge *edge = cgraph_edge (node, old_stmt);
> if (edge)
> - cgraph_set_call_stmt (edge, new_stmt);
> + {
> + cgraph_set_call_stmt (edge, new_stmt, update_speculative);
> + /* If UPDATE_SPECULATIVE is false, it means that we are turning
> + speculative call into a real code sequence. Update the
> + callgraph edges. */
> + if (edge->speculative && !update_speculative)
> + {
> + struct cgraph_edge *direct, *indirect;
> + struct ipa_ref *ref;
> +
> + gcc_assert (!edge->indirect_unknown_callee);
> + cgraph_speculative_call_info (edge, direct, indirect, ref);
> + direct->speculative = false;
> + indirect->speculative = false;
> + ref->speculative = false;
> + }
> + }
> if (node->clones)
> node = node->clones;
> else if (node->next_sibling_clone)
> @@ -811,6 +832,7 @@ cgraph_materialize_all_clones (void)
> {
> struct cgraph_node *node;
> bool stabilized = false;
> +
>
> if (cgraph_dump_file)
> fprintf (cgraph_dump_file, "Materializing clones\n");
> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c (revision 201601)
> +++ ipa-inline.c (working copy)
> @@ -1388,6 +1388,17 @@ add_new_edges_to_heap (fibheap_t heap, v
> }
> }
>
> +/* Remove EDGE from the fibheap. */
> +
> +static void
> +heap_edge_removal_hook (struct cgraph_edge *e, void *data)
> +{
> + if (e->aux)
> + {
> + fibheap_delete_node ((fibheap_t)data, (fibnode_t)e->aux);
> + e->aux = NULL;
> + }
> +}
>
> /* We use greedy algorithm for inlining of small functions:
> All inline candidates are put into prioritized heap ordered in
> @@ -1406,10 +1417,14 @@ inline_small_functions (void)
> vec<cgraph_edge_p> new_indirect_edges = vNULL;
> int initial_size = 0;
> struct cgraph_node **order = XCNEWVEC (struct cgraph_node *, cgraph_n_nodes);
> + struct cgraph_edge_hook_list *edge_removal_hook_holder;
>
> if (flag_indirect_inlining)
> new_indirect_edges.create (8);
>
> + edge_removal_hook_holder
> + = cgraph_add_edge_removal_hook (&heap_edge_removal_hook, edge_heap);
> +
> /* Compute overall unit size and other global parameters used by badness
> metrics. */
>
> @@ -1663,6 +1678,7 @@ inline_small_functions (void)
> initial_size, overall_size,
> initial_size ? overall_size * 100 / (initial_size) - 100: 0);
> BITMAP_FREE (updated_nodes);
> + cgraph_remove_edge_removal_hook (edge_removal_hook_holder);
> }
>
> /* Flatten NODE. Performed both during early inlining and
> Index: lto-streamer-in.c
> ===================================================================
> --- lto-streamer-in.c (revision 201627)
> +++ lto-streamer-in.c (working copy)
> @@ -755,30 +755,60 @@ input_ssa_names (struct lto_input_block
> so they point to STMTS. */
>
> static void
> -fixup_call_stmt_edges_1 (struct cgraph_node *node, gimple *stmts)
> +fixup_call_stmt_edges_1 (struct cgraph_node *node, gimple *stmts,
> + struct function *fn)
> {
> struct cgraph_edge *cedge;
> + struct ipa_ref *ref;
> + unsigned int i;
> +
> for (cedge = node->callees; cedge; cedge = cedge->next_callee)
> - cedge->call_stmt = stmts[cedge->lto_stmt_uid];
> + {
> + if (gimple_stmt_max_uid (fn) < cedge->lto_stmt_uid)
> + fatal_error ("Cgraph edge statement index out of range");
> + cedge->call_stmt = stmts[cedge->lto_stmt_uid - 1];
> + if (!cedge->call_stmt)
> + fatal_error ("Cgraph edge statement index not found");
> + }
> for (cedge = node->indirect_calls; cedge; cedge = cedge->next_callee)
> - cedge->call_stmt = stmts[cedge->lto_stmt_uid];
> + {
> + if (gimple_stmt_max_uid (fn) < cedge->lto_stmt_uid)
> + fatal_error ("Cgraph edge statement index out of range");
> + cedge->call_stmt = stmts[cedge->lto_stmt_uid - 1];
> + if (!cedge->call_stmt)
> + fatal_error ("Cgraph edge statement index not found");
> + }
> + for (i = 0;
> + ipa_ref_list_reference_iterate (&node->symbol.ref_list, i, ref);
> + i++)
> + if (ref->lto_stmt_uid)
> + {
> + if (gimple_stmt_max_uid (fn) < ref->lto_stmt_uid)
> + fatal_error ("Reference statement index out of range");
> + ref->stmt = stmts[ref->lto_stmt_uid - 1];
> + if (!ref->stmt)
> + fatal_error ("Reference statement index not found");
> + }
> }
>
> +
> /* Fixup call_stmt pointers in NODE and all clones. */
>
> static void
> fixup_call_stmt_edges (struct cgraph_node *orig, gimple *stmts)
> {
> struct cgraph_node *node;
> + struct function *fn;
>
> while (orig->clone_of)
> orig = orig->clone_of;
> + fn = DECL_STRUCT_FUNCTION (orig->symbol.decl);
>
> - fixup_call_stmt_edges_1 (orig, stmts);
> + fixup_call_stmt_edges_1 (orig, stmts, fn);
> if (orig->clones)
> for (node = orig->clones; node != orig;)
> {
> - fixup_call_stmt_edges_1 (node, stmts);
> + fixup_call_stmt_edges_1 (node, stmts, fn);
> if (node->clones)
> node = node->clones;
> else if (node->next_sibling_clone)
> Index: ipa-prop.c
> ===================================================================
> --- ipa-prop.c (revision 201568)
> +++ ipa-prop.c (working copy)
> @@ -2313,12 +2313,6 @@ ipa_make_edge_direct_to_target (struct c
> the cgraph node too early. */
> gcc_assert (!callee->global.inlined_to);
>
> - cgraph_make_edge_direct (ie, callee);
> - es = inline_edge_summary (ie);
> - es->call_stmt_size -= (eni_size_weights.indirect_call_cost
> - - eni_size_weights.call_cost);
> - es->call_stmt_time -= (eni_time_weights.indirect_call_cost
> - - eni_time_weights.call_cost);
> if (dump_file && !unreachable)
> {
> fprintf (dump_file, "ipa-prop: Discovered %s call to a known target "
> @@ -2326,14 +2320,19 @@ ipa_make_edge_direct_to_target (struct c
> ie->indirect_info->polymorphic ? "a virtual" : "an indirect",
> xstrdup (cgraph_node_name (ie->caller)),
> ie->caller->symbol.order,
> - xstrdup (cgraph_node_name (ie->callee)),
> - ie->callee->symbol.order);
> + xstrdup (cgraph_node_name (callee)),
> + callee->symbol.order);
> if (ie->call_stmt)
> print_gimple_stmt (dump_file, ie->call_stmt, 2, TDF_SLIM);
> else
> fprintf (dump_file, "with uid %i\n", ie->lto_stmt_uid);
> - }
> - callee = cgraph_function_or_thunk_node (callee, NULL);
> + }
> + ie = cgraph_make_edge_direct (ie, callee);
> + es = inline_edge_summary (ie);
> + es->call_stmt_size -= (eni_size_weights.indirect_call_cost
> + - eni_size_weights.call_cost);
> + es->call_stmt_time -= (eni_time_weights.indirect_call_cost
> + - eni_time_weights.call_cost);
>
> return ie;
> }
> @@ -2374,7 +2373,7 @@ remove_described_reference (symtab_node
>
> origin = rdesc->cs;
> to_del = ipa_find_reference ((symtab_node) origin->caller, symbol,
> - origin->call_stmt);
> + origin->call_stmt, origin->lto_stmt_uid);
> gcc_assert (to_del);
> ipa_remove_reference (to_del);
> if (dump_file)
> @@ -2408,9 +2407,11 @@ try_make_edge_direct_simple_call (struct
> struct ipa_jump_func *jfunc,
> struct ipa_node_params *new_root_info)
> {
> - struct ipa_cst_ref_desc *rdesc;
> struct cgraph_edge *cs;
> tree target;
> + bool agg_contents = ie->indirect_info->agg_contents;
> + bool speculative = ie->speculative;
> + struct ipa_cst_ref_desc *rdesc;
>
> if (ie->indirect_info->agg_contents)
> target = ipa_find_agg_cst_for_param (&jfunc->agg,
> @@ -2422,7 +2423,8 @@ try_make_edge_direct_simple_call (struct
> return NULL;
> cs = ipa_make_edge_direct_to_target (ie, target);
>
> - if (cs && !ie->indirect_info->agg_contents
> + /* FIXME: speculative edges can be handled. */
> + if (cs && !agg_contents && !speculative
> && jfunc->type == IPA_JF_CONST
> && (rdesc = jfunc_rdesc_usable (jfunc))
> && --rdesc->refcount == 0)
> @@ -2521,7 +2523,14 @@ update_indirect_edges_after_inlining (st
> else
> new_direct_edge = try_make_edge_direct_simple_call (ie, jfunc,
> new_root_info);
> - if (new_direct_edge)
> + /* If speculation was removed, then we need to do nothing. */
> + if (new_direct_edge && new_direct_edge != ie)
> + {
> + new_direct_edge->indirect_inlining_edge = 1;
> + top = IPA_EDGE_REF (cs);
> + res = true;
> + }
> + else if (new_direct_edge)
> {
> new_direct_edge->indirect_inlining_edge = 1;
> if (new_direct_edge->call_stmt)
> @@ -2532,9 +2541,9 @@ update_indirect_edges_after_inlining (st
> if (new_edges)
> {
> new_edges->safe_push (new_direct_edge);
> - top = IPA_EDGE_REF (cs);
> res = true;
> }
> + top = IPA_EDGE_REF (cs);
> }
> else if (jfunc->type == IPA_JF_PASS_THROUGH
> && ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR)
> @@ -2645,7 +2654,7 @@ propagate_controlled_uses (struct cgraph
> && TREE_CODE (TREE_OPERAND (t, 0)) == FUNCTION_DECL
> && (n = cgraph_get_node (TREE_OPERAND (t, 0)))
> && (ref = ipa_find_reference ((symtab_node) new_root,
> - (symtab_node) n, NULL)))
> + (symtab_node) n, NULL, 0)))
> {
> if (dump_file)
> fprintf (dump_file, "ipa-prop: Removing cloning-created "
> @@ -2683,7 +2692,7 @@ propagate_controlled_uses (struct cgraph
> {
> struct ipa_ref *ref;
> ref = ipa_find_reference ((symtab_node) clone,
> - (symtab_node) n, NULL);
> + (symtab_node) n, NULL, 0);
> if (ref)
> {
> if (dump_file)
> Index: tree-inline.c
> ===================================================================
> --- tree-inline.c (revision 201568)
> +++ tree-inline.c (working copy)
> @@ -1676,6 +1676,8 @@ copy_bb (copy_body_data *id, basic_block
> if (edge)
> {
> int edge_freq = edge->frequency;
> + int new_freq;
> + struct cgraph_edge *old_edge = edge;
> edge = cgraph_clone_edge (edge, id->dst_node, stmt,
> gimple_uid (stmt),
> REG_BR_PROB_BASE, CGRAPH_FREQ_BASE,
> @@ -1683,25 +1685,52 @@ copy_bb (copy_body_data *id, basic_block
> /* We could also just rescale the frequency, but
> doing so would introduce roundoff errors and make
> verifier unhappy. */
> - edge->frequency
> - = compute_call_stmt_bb_frequency (id->dst_node->symbol.decl,
> - copy_basic_block);
> - if (dump_file
> - && profile_status_for_function (cfun) != PROFILE_ABSENT
> - && (edge_freq > edge->frequency + 10
> - || edge_freq < edge->frequency - 10))
> + new_freq = compute_call_stmt_bb_frequency (id->dst_node->symbol.decl,
> + copy_basic_block);
> +
> + /* Speculative calls consist of two edges - direct and indirect.
> + Duplicate the whole thing and distribute frequencies accordingly. */
> + if (edge->speculative)
> {
> - fprintf (dump_file, "Edge frequency estimated by "
> - "cgraph %i diverge from inliner's estimate %i\n",
> - edge_freq,
> - edge->frequency);
> - fprintf (dump_file,
> - "Orig bb: %i, orig bb freq %i, new bb freq %i\n",
> - bb->index,
> - bb->frequency,
> - copy_basic_block->frequency);
> + struct cgraph_edge *direct, *indirect;
> + struct ipa_ref *ref;
> +
> + gcc_assert (!edge->indirect_unknown_callee);
> + cgraph_speculative_call_info (old_edge, direct, indirect, ref);
> + indirect = cgraph_clone_edge (indirect, id->dst_node, stmt,
> + gimple_uid (stmt),
> + REG_BR_PROB_BASE, CGRAPH_FREQ_BASE,
> + true);
> + if (old_edge->frequency + indirect->frequency)
> + {
> + edge->frequency = MIN (RDIV ((gcov_type)new_freq * old_edge->frequency,
> + (old_edge->frequency + indirect->frequency)),
> + CGRAPH_FREQ_MAX);
> + indirect->frequency = MIN (RDIV ((gcov_type)new_freq * indirect->frequency,
> + (old_edge->frequency + indirect->frequency)),
> + CGRAPH_FREQ_MAX);
> + }
> + ipa_clone_ref (ref, (symtab_node)id->dst_node, stmt);
> + }
> + else
> + {
> + edge->frequency = new_freq;
> + if (dump_file
> + && profile_status_for_function (cfun) != PROFILE_ABSENT
> + && (edge_freq > edge->frequency + 10
> + || edge_freq < edge->frequency - 10))
> + {
> + fprintf (dump_file, "Edge frequency estimated by "
> + "cgraph %i diverge from inliner's estimate %i\n",
> + edge_freq,
> + edge->frequency);
> + fprintf (dump_file,
> + "Orig bb: %i, orig bb freq %i, new bb freq %i\n",
> + bb->index,
> + bb->frequency,
> + copy_basic_block->frequency);
> + }
> }
> - stmt = cgraph_redirect_edge_call_stmt_to_callee (edge);
> }
> break;
>
> @@ -2232,6 +2261,23 @@ copy_loops (bitmap blocks_to_copy,
> }
> }
>
> +/* Call cgraph_redirect_edge_call_stmt_to_callee on all calls in BB */
> +
> +void
> +redirect_all_calls (copy_body_data * id, basic_block bb)
> +{
> + gimple_stmt_iterator si;
> + for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
> + {
> + if (is_gimple_call (gsi_stmt (si)))
> + {
> + struct cgraph_edge *edge = cgraph_edge (id->dst_node, gsi_stmt (si));
> + if (edge)
> + cgraph_redirect_edge_call_stmt_to_callee (edge);
> + }
> + }
> +}
> +
> /* Make a copy of the body of FN so that it can be inserted inline in
> another function. Walks FN via CFG, returns new fndecl. */
>
> @@ -2356,6 +2402,10 @@ copy_cfg_body (copy_body_data * id, gcov
> && bb->index != ENTRY_BLOCK
> && bb->index != EXIT_BLOCK)
> maybe_move_debug_stmts_to_successors (id, (basic_block) bb->aux);
> + /* Update call edge destinations. This can not be done before loop
> + info is updated, because we may split basic blocks. */
> + if (id->transform_call_graph_edges == CB_CGE_DUPLICATE)
> + redirect_all_calls (id, (basic_block)bb->aux);
> ((basic_block)bb->aux)->aux = NULL;
> bb->aux = NULL;
> }
> @@ -2367,6 +2417,10 @@ copy_cfg_body (copy_body_data * id, gcov
> if (need_debug_cleanup)
> maybe_move_debug_stmts_to_successors (id, BASIC_BLOCK (last));
> BASIC_BLOCK (last)->aux = NULL;
> + /* Update call edge destinations. This can not be done before loop
> + info is updated, because we may split basic blocks. */
> + if (id->transform_call_graph_edges == CB_CGE_DUPLICATE)
> + redirect_all_calls (id, BASIC_BLOCK (last));
> }
> entry_block_map->aux = NULL;
> exit_block_map->aux = NULL;
> @@ -4941,43 +4995,47 @@ delete_unreachable_blocks_update_callgra
> gimple_stmt_iterator bsi;
>
> for (bsi = gsi_start_bb (b); !gsi_end_p (bsi); gsi_next (&bsi))
> - if (gimple_code (gsi_stmt (bsi)) == GIMPLE_CALL)
> - {
> - struct cgraph_edge *e;
> - struct cgraph_node *node;
> + {
> + struct cgraph_edge *e;
> + struct cgraph_node *node;
>
> - if ((e = cgraph_edge (id->dst_node, gsi_stmt (bsi))) != NULL)
> + ipa_remove_stmt_references ((symtab_node)id->dst_node, gsi_stmt (bsi));
> +
> + if (gimple_code (gsi_stmt (bsi)) == GIMPLE_CALL
> + &&(e = cgraph_edge (id->dst_node, gsi_stmt (bsi))) != NULL)
> + {
> + if (!e->inline_failed)
> + cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
> + else
> + cgraph_remove_edge (e);
> + }
> + if (id->transform_call_graph_edges == CB_CGE_MOVE_CLONES
> + && id->dst_node->clones)
> + for (node = id->dst_node->clones; node != id->dst_node;)
> {
> - if (!e->inline_failed)
> - cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
> + ipa_remove_stmt_references ((symtab_node)node, gsi_stmt (bsi));
> + if (gimple_code (gsi_stmt (bsi)) == GIMPLE_CALL
> + && (e = cgraph_edge (node, gsi_stmt (bsi))) != NULL)
> + {
> + if (!e->inline_failed)
> + cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
> + else
> + cgraph_remove_edge (e);
> + }
> +
> + if (node->clones)
> + node = node->clones;
> + else if (node->next_sibling_clone)
> + node = node->next_sibling_clone;
> else
> - cgraph_remove_edge (e);
> + {
> + while (node != id->dst_node && !node->next_sibling_clone)
> + node = node->clone_of;
> + if (node != id->dst_node)
> + node = node->next_sibling_clone;
> + }
> }
> - if (id->transform_call_graph_edges == CB_CGE_MOVE_CLONES
> - && id->dst_node->clones)
> - for (node = id->dst_node->clones; node != id->dst_node;)
> - {
> - if ((e = cgraph_edge (node, gsi_stmt (bsi))) != NULL)
> - {
> - if (!e->inline_failed)
> - cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
> - else
> - cgraph_remove_edge (e);
> - }
> -
> - if (node->clones)
> - node = node->clones;
> - else if (node->next_sibling_clone)
> - node = node->next_sibling_clone;
> - else
> - {
> - while (node != id->dst_node && !node->next_sibling_clone)
> - node = node->clone_of;
> - if (node != id->dst_node)
> - node = node->next_sibling_clone;
> - }
> - }
> - }
> + }
> delete_basic_block (b);
> changed = true;
> }