This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: LTO/WHOPR streaming of varpool
- From: Richard Guenther <rguenther at suse dot de>
- To: Jan Hubicka <hubicka at ucw dot cz>
- Cc: gcc-patches at gcc dot gnu dot org, dnovillo at redhat dot com
- Date: Thu, 29 Apr 2010 10:47:10 +0200 (CEST)
- Subject: Re: LTO/WHOPR streaming of varpool
- References: <20100428180912.GF9094@kam.mff.cuni.cz>
On Wed, 28 Apr 2010, Jan Hubicka wrote:
> Hi,
> this patch implements streaming of varpool and fixes several correctness issues
> with WHOPR (and makes LTO correct wrt unused variable removal too).
> The main issue with WHOPR was that it shipped variable declarations into multiple units
> and turned them into static vars, creating large binaries (and invalid programs
> too). This is now fixed and WHOPR produces pretty much the same size of GCC binary
> as LTO.
>
> Most of the patch is unexciting cloning of what we already do for cgraph nodes.
> I removed the original logic from lto.c that figured out whether a variable can be accessed
> by another partition, since it cannot work well with WPA anyway (function bodies
> are not around and thus we don't know if inlining a function will carry a reference
> to the variable into another unit). This is also what caused link errors on several SPEC
> programs and consequently led me to temporarily disable unused variable removal,
> which results in the current explosion of WHOPR binary size (since all variables
> are copied to every unit).
>
> There is room for cleanups left, especially in WPA, but I will do them incrementally.
> Also we should start streaming variable initializers in separate sections, as we do
> for function bodies, as well as a list of references for each cgraph node/varpool node
> to drive unused variable/function removal.
>
> Bootstrapped/regtested x86_64-linux, OK?
>
> * lto-symtab.c (lto_symtab_entry_def): Add vnode.
> (lto_varpool_replace_node): New.
> (lto_symtab_resolve_symbols): Resolve varpool nodes.
> (lto_symtab_merge_decls_1): Prefer decls with varpool node.
> (lto_symtab_merge_cgraph_nodes_1): Merge varpools.
> * cgraph.h (varpool_node_ptr): New type.
> (varpool_node_ptr): New vector.
> (varpool_node_set_def): New structure.
> (varpool_node_set): New type.
> (varpool_node_set): New vector.
> (varpool_node_set_element_def): New structure.
> (varpool_node_set_element, const_varpool_node_set_element): New types.
> (varpool_node_set_iterator): New type.
> (varpool_node): Add prev pointers, add used_from_other_partition,
> in_other_partition.
> (varpool_node_set_new, varpool_node_set_find, varpool_node_set_add,
> varpool_node_set_remove, dump_varpool_node_set, debug_varpool_node_set,
> varpool_get_node, varpool_remove_node): Declare.
> (vsi_end_p, vsi_next, vsi_node, vsi_start, varpool_node_in_set_p,
> varpool_node_set_size): New inlines.
> * tree-pass.h (varpool_node_set_def): Forward declare.
> (ipa_opt_pass_d): Summary writing takes vnode sets too.
> (ipa_write_optimization_summaries): Update prototype.
> * ipa-cp.c (ipcp_write_summary): Update.
> * ipa-reference.c (ipa_reference_write_summary): Update.
> * lto-cgraph.c (lto_output_varpool_node): New static function.
> (output_varpool): New function.
> (input_varpool_node): New static function.
> (input_varpool_1): New function.
> (input_cgraph): Input varpool.
> * ipa-pure-const.c (pure_const_write_summary): Update.
> * lto-streamer-out.c (lto_output): Update, output varpool too.
> (write_global_stream): Kill WPA hack.
> (produce_asm_for_decls): Update.
> * ipa-inline.c (inline_write_summary): Update.
> * lto-streamer-in.c (lto_input_tree_ref, lto_input_tree): Do not build cgraph.
> * lto-section-in.c (lto_section_name): Add varpool and jump funcs.
> * ipa.c (hash_varpool_node_set_element, eq_varpool_node_set_element,
> varpool_node_set_new, varpool_node_set_add,
> varpool_node_set_remove, varpool_node_set_find, dump_varpool_node_set,
> debug_varpool_node_set): New functions.
> * passes.c (rest_of_decl_compilation): When in LTO do not finalize.
> (execute_one_pass): Process new decls too.
> (ipa_write_summaries_2): Pass around vsets.
> (ipa_write_summaries_1): Likewise.
> (ipa_write_summaries): Build vset; be more selective about cgraph nodes
> to add.
> (ipa_write_optimization_summaries_1): Pass around vsets.
> (ipa_write_optimization_summaries): Likewise.
> * varpool.c (varpool_get_node): New.
> (varpool_node): Update doubly linked lists.
> (varpool_remove_node): New.
> (dump_varpool_node): More dumping.
> (varpool_enqueue_needed_node): Update doubly linked lists.
> (decide_is_variable_needed): Kill ltrans hack.
> (varpool_finalize_decl): Kill lto hack.
> (varpool_assemble_decl): Skip decls in other partitions.
> (varpool_assemble_pending_decls): Update doubly linked lists.
> (varpool_empty_needed_queue): Likewise.
> (varpool_extra_name_alias): Likewise.
> * lto-streamer.c (lto_get_section_name): Add vars section.
> * lto-streamer.h (lto_section_type): Update.
> (output_varpool, input_varpool): Declare.
>
> * lto.c (lto_varpool_node_sets): New.
> (lto_1_to_1_map): Partition varpool too.
> (globalize_context_t, globalize_cross_file_statics,
> lto_scan_statics_in_ref_table, lto_scan_statics_in_cgraph_node,
> lto_scan_statics_in_remaining_global_vars): Remove.
> (lto_promote_cross_file_statics): Rewrite.
> (get_filename_for_set): Take vset argument.
> (lto_wpa_write_files): Pass around vsets.
>
> Index: lto-symtab.c
> ===================================================================
> --- lto-symtab.c (revision 158825)
> +++ lto-symtab.c (working copy)
> @@ -44,6 +44,9 @@ struct GTY(()) lto_symtab_entry_def
> /* The cgraph node if decl is a function decl. Filled in during the
> merging process. */
> struct cgraph_node *node;
> + /* The varpool node if decl is a variable decl. Filled in during the
> + merging process. */
> + struct varpool_node *vnode;
> /* LTO file-data and symbol resolution for this decl. */
> struct lto_file_decl_data * GTY((skip (""))) file_data;
> enum ld_plugin_symbol_resolution resolution;
> @@ -244,6 +247,23 @@ lto_cgraph_replace_node (struct cgraph_n
> cgraph_remove_node (node);
> }
>
> +/* Replace the varpool node NODE with PREVAILING_NODE in the varpool, merging
> + node flags and removing the old node. */
> +
> +static void
> +lto_varpool_replace_node (struct varpool_node *node,
> + struct varpool_node *prevailing_node)
> +{
> + /* Merge node flags. */
> + if (node->needed)
> + varpool_mark_needed_node (prevailing_node);
> + gcc_assert (!node->finalized || prevailing_node->finalized);
> + gcc_assert (!node->analyzed || prevailing_node->analyzed);
> +
> + /* Finally remove the replaced node. */
> + varpool_remove_node (node);
> +}
> +
> /* Merge two variable or function symbol table entries PREVAILING and ENTRY.
> Return false if the symbols are not fully compatible and a diagnostic
> should be emitted. */
> @@ -406,6 +426,8 @@ lto_symtab_resolve_symbols (void **slot)
> {
> if (TREE_CODE (e->decl) == FUNCTION_DECL)
> e->node = cgraph_get_node (e->decl);
> + else if (TREE_CODE (e->decl) == VAR_DECL)
> + e->vnode = varpool_get_node (e->decl);
> }
>
> e = (lto_symtab_entry_t) *slot;
> @@ -559,6 +581,10 @@ lto_symtab_merge_decls_1 (void **slot, v
> while (!prevailing->node
> && prevailing->next)
> prevailing = prevailing->next;
> + if (TREE_CODE (prevailing->decl) == VAR_DECL)
> + while (!prevailing->vnode
> + && prevailing->next)
> + prevailing = prevailing->next;
Err, instead of duplicating the loop please just make the existing
one unconditional.
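
One possible reading of that request, as a standalone sketch rather than the actual lto-symtab.c code: drop the TREE_CODE guard and let a single loop skip entries that have neither kind of node. The types sym_entry, fn_node and var_node and the function pick_entry_with_node are hypothetical stand-ins for lto_symtab_entry_def, cgraph_node and varpool_node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's cgraph_node and varpool_node.  */
struct fn_node { int id; };
struct var_node { int id; };

/* Hypothetical stand-in for lto_symtab_entry_def: at most one of NODE
   and VNODE is set, depending on whether the decl is a function or a
   variable.  */
struct sym_entry
{
  struct fn_node *node;
  struct var_node *vnode;
  struct sym_entry *next;
};

/* Walk the chain starting at PREVAILING and return the first entry that
   has a cgraph node or a varpool node.  If no entry has one, the last
   entry of the chain is returned, just as the guarded loops would do.  */
static struct sym_entry *
pick_entry_with_node (struct sym_entry *prevailing)
{
  assert (prevailing != NULL);
  while (!prevailing->node
         && !prevailing->vnode
         && prevailing->next)
    prevailing = prevailing->next;
  return prevailing;
}
```

Because a function entry never carries a vnode and a variable entry never carries a node, the combined condition behaves like the two guarded loops without testing the decl code.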
> /* We do not stream varpool nodes, so the first decl has to
> be good enough for now.
> ??? For QOI choose a variable with readonly initializer
And update this comment - well, remove the following grok
and instead make vars with an initializer not replaceable?
> @@ -672,6 +698,8 @@ lto_symtab_merge_cgraph_nodes_1 (void **
> }
> lto_cgraph_replace_node (e->node, prevailing->node);
> }
> + if (e->vnode != NULL)
> + lto_varpool_replace_node (e->vnode, prevailing->vnode);
Update the comment before the loop.
The rest looks good. Thanks for working on this.
I agree we should have some testing coverage here - it should
be easy to add a testcase that we'd previously miscompile, no?
Thanks,
Richard.
> }
>
> /* Drop all but the prevailing decl from the symtab. */
> Index: cgraph.h
> ===================================================================
> --- cgraph.h (revision 158825)
> +++ cgraph.h (working copy)
> @@ -302,12 +302,33 @@ struct GTY(()) cgraph_node_set_def
> PTR GTY ((skip)) aux;
> };
>
> +typedef struct varpool_node *varpool_node_ptr;
> +
> +DEF_VEC_P(varpool_node_ptr);
> +DEF_VEC_ALLOC_P(varpool_node_ptr,heap);
> +DEF_VEC_ALLOC_P(varpool_node_ptr,gc);
> +
> +/* A varpool node set is a collection of varpool nodes. A varpool node
> + can appear in multiple sets. */
> +struct GTY(()) varpool_node_set_def
> +{
> + htab_t GTY((param_is (struct varpool_node_set_element_def))) hashtab;
> + VEC(varpool_node_ptr, gc) *nodes;
> + PTR GTY ((skip)) aux;
> +};
> +
> typedef struct cgraph_node_set_def *cgraph_node_set;
>
> DEF_VEC_P(cgraph_node_set);
> DEF_VEC_ALLOC_P(cgraph_node_set,gc);
> DEF_VEC_ALLOC_P(cgraph_node_set,heap);
>
> +typedef struct varpool_node_set_def *varpool_node_set;
> +
> +DEF_VEC_P(varpool_node_set);
> +DEF_VEC_ALLOC_P(varpool_node_set,gc);
> +DEF_VEC_ALLOC_P(varpool_node_set,heap);
> +
> /* A cgraph node set element contains an index in the vector of nodes in
> the set. */
> struct GTY(()) cgraph_node_set_element_def
> @@ -326,6 +347,24 @@ typedef struct
> unsigned index;
> } cgraph_node_set_iterator;
>
> +/* A varpool node set element contains an index in the vector of nodes in
> + the set. */
> +struct GTY(()) varpool_node_set_element_def
> +{
> + struct varpool_node *node;
> + HOST_WIDE_INT index;
> +};
> +
> +typedef struct varpool_node_set_element_def *varpool_node_set_element;
> +typedef const struct varpool_node_set_element_def *const_varpool_node_set_element;
> +
> +/* Iterator structure for varpool node sets. */
> +typedef struct
> +{
> + varpool_node_set set;
> + unsigned index;
> +} varpool_node_set_iterator;
> +
> #define DEFCIFCODE(code, string) CIF_ ## code,
> /* Reasons for inlining failures. */
> typedef enum {
> @@ -380,9 +419,9 @@ DEF_VEC_ALLOC_P(cgraph_edge_p,heap);
> struct GTY((chain_next ("%h.next"))) varpool_node {
> tree decl;
> /* Pointer to the next function in varpool_nodes. */
> - struct varpool_node *next;
> + struct varpool_node *next, *prev;
> /* Pointer to the next function in varpool_nodes_queue. */
> - struct varpool_node *next_needed;
> + struct varpool_node *next_needed, *prev_needed;
> /* For normal nodes a pointer to the first extra name alias. For alias
> nodes a pointer to the normal node. */
> struct varpool_node *extra_name;
> @@ -407,6 +446,10 @@ struct GTY((chain_next ("%h.next"))) var
> /* Set for aliases once they got through assemble_alias. Also set for
> extra name aliases in varpool_extra_name_alias. */
> unsigned alias : 1;
> + /* Set when variable is used from other LTRANS partition. */
> + unsigned used_from_other_partition : 1;
> + /* Set when variable is available in the other LTO partition. */
> + unsigned in_other_partition : 1;
> };
>
> /* Every top level asm statement is put into a cgraph_asm_node. */
> @@ -574,6 +617,13 @@ void cgraph_node_set_remove (cgraph_node
> void dump_cgraph_node_set (FILE *, cgraph_node_set);
> void debug_cgraph_node_set (cgraph_node_set);
>
> +varpool_node_set varpool_node_set_new (void);
> +varpool_node_set_iterator varpool_node_set_find (varpool_node_set,
> + struct varpool_node *);
> +void varpool_node_set_add (varpool_node_set, struct varpool_node *);
> +void varpool_node_set_remove (varpool_node_set, struct varpool_node *);
> +void dump_varpool_node_set (FILE *, varpool_node_set);
> +void debug_varpool_node_set (varpool_node_set);
>
> /* In predict.c */
> bool cgraph_maybe_hot_edge_p (struct cgraph_edge *e);
> @@ -596,6 +646,9 @@ void cgraph_make_decl_local (tree);
> void cgraph_make_node_local (struct cgraph_node *);
> bool cgraph_node_can_be_local_p (struct cgraph_node *);
>
> +
> +struct varpool_node * varpool_get_node (tree decl);
> +void varpool_remove_node (struct varpool_node *node);
> bool varpool_assemble_pending_decls (void);
> bool varpool_assemble_decl (struct varpool_node *node);
> bool varpool_analyze_pending_decls (void);
> @@ -713,6 +766,54 @@ cgraph_node_set_size (cgraph_node_set se
> return htab_elements (set->hashtab);
> }
>
> +/* Return true if iterator VSI points to nothing. */
> +static inline bool
> +vsi_end_p (varpool_node_set_iterator vsi)
> +{
> + return vsi.index >= VEC_length (varpool_node_ptr, vsi.set->nodes);
> +}
> +
> +/* Advance iterator VSI. */
> +static inline void
> +vsi_next (varpool_node_set_iterator *vsi)
> +{
> + vsi->index++;
> +}
> +
> +/* Return the node pointed to by VSI. */
> +static inline struct varpool_node *
> +vsi_node (varpool_node_set_iterator vsi)
> +{
> + return VEC_index (varpool_node_ptr, vsi.set->nodes, vsi.index);
> +}
> +
> +/* Return an iterator to the first node in SET. */
> +static inline varpool_node_set_iterator
> +vsi_start (varpool_node_set set)
> +{
> + varpool_node_set_iterator vsi;
> +
> + vsi.set = set;
> + vsi.index = 0;
> + return vsi;
> +}
> +
> +/* Return true if SET contains NODE. */
> +static inline bool
> +varpool_node_in_set_p (struct varpool_node *node, varpool_node_set set)
> +{
> + varpool_node_set_iterator vsi;
> + vsi = varpool_node_set_find (set, node);
> + return !vsi_end_p (vsi);
> +}
> +
> +/* Return number of nodes in SET. */
> +static inline size_t
> +varpool_node_set_size (varpool_node_set set)
> +{
> + return htab_elements (set->hashtab);
> +}
> +
> /* Uniquize all constants that appear in memory.
> Each constant in memory thus far output is recorded
> in `const_desc_table'. */
> Index: tree-pass.h
> ===================================================================
> --- tree-pass.h (revision 158825)
> +++ tree-pass.h (working copy)
> @@ -168,6 +168,7 @@ struct rtl_opt_pass
> struct varpool_node;
> struct cgraph_node;
> struct cgraph_node_set_def;
> +struct varpool_node_set_def;
>
> /* Description of IPA pass with generate summary, write, execute, read and
> transform stages. */
> @@ -180,13 +181,15 @@ struct ipa_opt_pass_d
> void (*generate_summary) (void);
>
> /* This hook is used to serialize IPA summaries on disk. */
> - void (*write_summary) (struct cgraph_node_set_def *);
> + void (*write_summary) (struct cgraph_node_set_def *,
> + struct varpool_node_set_def *);
>
> /* This hook is used to deserialize IPA summaries from disk. */
> void (*read_summary) (void);
>
> /* This hook is used to serialize IPA optimization summaries on disk. */
> - void (*write_optimization_summary) (struct cgraph_node_set_def *);
> + void (*write_optimization_summary) (struct cgraph_node_set_def *,
> + struct varpool_node_set_def *);
>
> /* This hook is used to deserialize IPA summaries from disk. */
> void (*read_optimization_summary) (void);
> @@ -607,7 +610,8 @@ extern const char *get_current_pass_name
> extern void print_current_pass (FILE *);
> extern void debug_pass (void);
> extern void ipa_write_summaries (void);
> -extern void ipa_write_optimization_summaries (struct cgraph_node_set_def *);
> +extern void ipa_write_optimization_summaries (struct cgraph_node_set_def *,
> + struct varpool_node_set_def *);
> extern void ipa_read_summaries (void);
> extern void ipa_read_optimization_summaries (void);
> extern void register_one_dump_file (struct opt_pass *);
> Index: ipa-cp.c
> ===================================================================
> --- ipa-cp.c (revision 158825)
> +++ ipa-cp.c (working copy)
> @@ -1304,7 +1304,8 @@ ipcp_generate_summary (void)
>
> /* Write ipcp summary for nodes in SET. */
> static void
> -ipcp_write_summary (cgraph_node_set set)
> +ipcp_write_summary (cgraph_node_set set,
> + varpool_node_set vset ATTRIBUTE_UNUSED)
> {
> ipa_prop_write_jump_functions (set);
> }
> Index: ipa-reference.c
> ===================================================================
> --- ipa-reference.c (revision 158825)
> +++ ipa-reference.c (working copy)
> @@ -1040,7 +1040,8 @@ write_node_summary_p (struct cgraph_node
> /* Serialize the ipa info for lto. */
>
> static void
> -ipa_reference_write_summary (cgraph_node_set set)
> +ipa_reference_write_summary (cgraph_node_set set,
> + varpool_node_set vset ATTRIBUTE_UNUSED)
> {
> struct cgraph_node *node;
> struct lto_simple_output_block *ob
> Index: lto-cgraph.c
> ===================================================================
> --- lto-cgraph.c (revision 158825)
> +++ lto-cgraph.c (working copy)
> @@ -372,6 +372,50 @@ lto_output_node (struct lto_simple_outpu
> lto_output_uleb128_stream (ob->main_stream, 0);
> }
>
> +/* Output the cgraph NODE to OB. ENCODER is used to find the
> + reference number of NODE->inlined_to. SET is the set of nodes we
> + are writing to the current file. If NODE is not in SET, then NODE
> + is a boundary of a cgraph_node_set and we pretend NODE just has a
> + decl and no callees. WRITTEN_DECLS is the set of FUNCTION_DECLs
> + that have had their callgraph node written so far. This is used to
> + determine if NODE is a clone of a previously written node. */
> +
> +static void
> +lto_output_varpool_node (struct lto_simple_output_block *ob, struct varpool_node *node,
> + varpool_node_set set)
> +{
> + bool boundary_p = !varpool_node_in_set_p (node, set) && node->analyzed;
> + struct bitpack_d *bp;
> + struct varpool_node *alias;
> + int count = 0;
> +
> + lto_output_var_decl_index (ob->decl_state, ob->main_stream, node->decl);
> + bp = bitpack_create ();
> + bp_pack_value (bp, node->externally_visible, 1);
> + bp_pack_value (bp, node->force_output, 1);
> + bp_pack_value (bp, node->finalized, 1);
> + gcc_assert (node->finalized || !node->analyzed);
> + gcc_assert (node->needed);
> + gcc_assert (!node->alias);
> + /* FIXME: We have no idea how we move references around. For moment assume that
> + everything is used externally. */
> + bp_pack_value (bp, flag_wpa, 1); /* used_from_other_partition. */
> + bp_pack_value (bp, boundary_p, 1); /* in_other_partition. */
> + /* Also emit any extra name aliases. */
> + for (alias = node->extra_name; alias; alias = alias->next)
> + count++;
> + bp_pack_value (bp, count != 0, 1);
> + lto_output_bitpack (ob->main_stream, bp);
> + bitpack_delete (bp);
> +
> + if (count)
> + {
> + lto_output_uleb128_stream (ob->main_stream, count);
> + for (alias = node->extra_name; alias; alias = alias->next)
> + lto_output_var_decl_index (ob->decl_state, ob->main_stream, alias->decl);
> + }
> +}
> +
> /* Stream out profile_summary to OB. */
>
> static void
> @@ -548,6 +592,32 @@ input_overwrite_node (struct lto_file_de
> node->frequency = (enum node_frequency)bp_unpack_value (bp, 2);
> }
>
> +/* Output the part of the cgraph in SET. */
> +
> +void
> +output_varpool (varpool_node_set set)
> +{
> + struct varpool_node *node;
> + struct lto_simple_output_block *ob;
> + int len = 0;
> +
> + ob = lto_create_simple_output_block (LTO_section_varpool);
> +
> + for (node = varpool_nodes; node; node = node->next)
> + if (node->needed && node->analyzed)
> + len++;
> +
> + lto_output_uleb128_stream (ob->main_stream, len);
> +
> + /* Write out the nodes. We must first output a node and then its clones,
> + otherwise at a time reading back the node there would be nothing to clone
> + from. */
> + for (node = varpool_nodes; node; node = node->next)
> + if (node->needed && node->analyzed)
> + lto_output_varpool_node (ob, node, set);
> +
> + lto_destroy_simple_output_block (ob);
> +}
>
> /* Read a node from input_block IB. TAG is the node's tag just read.
> Return the node read or overwriten. */
> @@ -667,6 +737,48 @@ input_node (struct lto_file_decl_data *f
> return node;
> }
>
> +/* Read a node from input_block IB. TAG is the node's tag just read.
> + Return the node read or overwriten. */
> +
> +static struct varpool_node *
> +input_varpool_node (struct lto_file_decl_data *file_data,
> + struct lto_input_block *ib)
> +{
> + int decl_index;
> + tree var_decl;
> + struct varpool_node *node;
> + struct bitpack_d *bp;
> + bool aliases_p;
> + int count;
> +
> + decl_index = lto_input_uleb128 (ib);
> + var_decl = lto_file_decl_data_get_var_decl (file_data, decl_index);
> + node = varpool_node (var_decl);
> +
> + bp = lto_input_bitpack (ib);
> + node->externally_visible = bp_unpack_value (bp, 1);
> + node->force_output = bp_unpack_value (bp, 1);
> + node->finalized = bp_unpack_value (bp, 1);
> + node->analyzed = 1;
> + node->used_from_other_partition = bp_unpack_value (bp, 1);
> + node->in_other_partition = bp_unpack_value (bp, 1);
> + aliases_p = bp_unpack_value (bp, 1);
> + if (node->finalized)
> + varpool_mark_needed_node (node);
> + bitpack_delete (bp);
> + if (aliases_p)
> + {
> + count = lto_input_uleb128 (ib);
> + for (; count > 0; count --)
> + {
> + tree decl = lto_file_decl_data_get_var_decl (file_data,
> + lto_input_uleb128 (ib));
> + varpool_extra_name_alias (decl, var_decl);
> + }
> + }
> + return node;
> +}
> +
>
> /* Read an edge from IB. NODES points to a vector of previously read
> nodes for decoding caller and callee of the edge to be read. */
> @@ -782,6 +894,22 @@ input_cgraph_1 (struct lto_file_decl_dat
> VEC_free (cgraph_node_ptr, heap, nodes);
> }
>
> +/* Read a varpool from IB using the info in FILE_DATA. */
> +
> +static void
> +input_varpool_1 (struct lto_file_decl_data *file_data,
> + struct lto_input_block *ib)
> +{
> + unsigned HOST_WIDE_INT len;
> +
> + len = lto_input_uleb128 (ib);
> + while (len)
> + {
> + input_varpool_node (file_data, ib);
> + len--;
> + }
> +}
> +
> static struct gcov_ctr_summary lto_gcov_summary;
>
> /* Input profile_info from IB. */
> @@ -837,6 +965,12 @@ input_cgraph (void)
> lto_destroy_simple_input_block (file_data, LTO_section_cgraph,
> ib, data, len);
>
> + ib = lto_create_simple_input_block (file_data, LTO_section_varpool,
> + &data, &len);
> + input_varpool_1 (file_data, ib);
> + lto_destroy_simple_input_block (file_data, LTO_section_varpool,
> + ib, data, len);
> +
> /* Assume that every file read needs to be processed by LTRANS. */
> if (flag_wpa)
> lto_mark_file_for_ltrans (file_data);
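
The varpool writer and reader in the lto-cgraph.c hunks above rely on packing and unpacking the per-node flag bits in exactly the same order. The following is a minimal standalone sketch of that round trip, not GCC's bitpack implementation: sketch_bitpack, sketch_pack, sketch_unpack, var_flags, write_var_flags and read_var_flags are all hypothetical names, and the 64-bit word layout is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature bit packer: values are appended to WORD
   starting at bit POS, lowest bits first.  */
struct sketch_bitpack
{
  uint64_t word;
  unsigned pos;
};

static void
sketch_pack (struct sketch_bitpack *bp, uint64_t val, unsigned nbits)
{
  assert (nbits > 0 && nbits < 64 && bp->pos + nbits <= 64);
  assert (val < (UINT64_C (1) << nbits));
  bp->word |= val << bp->pos;
  bp->pos += nbits;
}

static uint64_t
sketch_unpack (struct sketch_bitpack *bp, unsigned nbits)
{
  uint64_t val;

  assert (nbits > 0 && nbits < 64 && bp->pos + nbits <= 64);
  val = (bp->word >> bp->pos) & ((UINT64_C (1) << nbits) - 1);
  bp->pos += nbits;
  return val;
}

/* The six one-bit flags streamed per varpool node in the patch.  */
struct var_flags
{
  unsigned externally_visible : 1;
  unsigned force_output : 1;
  unsigned finalized : 1;
  unsigned used_from_other_partition : 1;
  unsigned in_other_partition : 1;
  unsigned has_aliases : 1;
};

/* Pack the flags in the order the writer emits them.  */
static uint64_t
write_var_flags (const struct var_flags *f)
{
  struct sketch_bitpack bp = { 0, 0 };

  sketch_pack (&bp, f->externally_visible, 1);
  sketch_pack (&bp, f->force_output, 1);
  sketch_pack (&bp, f->finalized, 1);
  sketch_pack (&bp, f->used_from_other_partition, 1);
  sketch_pack (&bp, f->in_other_partition, 1);
  sketch_pack (&bp, f->has_aliases, 1);
  return bp.word;
}

/* Unpack in the same order; any mismatch with the writer would
   silently assign a bit to the wrong flag.  */
static struct var_flags
read_var_flags (uint64_t word)
{
  struct sketch_bitpack bp = { word, 0 };
  struct var_flags f;

  f.externally_visible = sketch_unpack (&bp, 1);
  f.force_output = sketch_unpack (&bp, 1);
  f.finalized = sketch_unpack (&bp, 1);
  f.used_from_other_partition = sketch_unpack (&bp, 1);
  f.in_other_partition = sketch_unpack (&bp, 1);
  f.has_aliases = sketch_unpack (&bp, 1);
  return f;
}
```

Nothing in the stream names the fields, so the reader's unpack sequence is the only description of the format; adding a flag means touching both functions together.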
> Index: ipa-pure-const.c
> ===================================================================
> --- ipa-pure-const.c (revision 158825)
> +++ ipa-pure-const.c (working copy)
> @@ -771,7 +771,8 @@ generate_summary (void)
> /* Serialize the ipa info for lto. */
>
> static void
> -pure_const_write_summary (cgraph_node_set set)
> +pure_const_write_summary (cgraph_node_set set,
> + varpool_node_set vset ATTRIBUTE_UNUSED)
> {
> struct cgraph_node *node;
> struct lto_simple_output_block *ob
> Index: lto-streamer-out.c
> ===================================================================
> --- lto-streamer-out.c (revision 158825)
> +++ lto-streamer-out.c (working copy)
> @@ -2089,7 +2089,7 @@ lto_writer_init (void)
> /* Main entry point from the pass manager. */
>
> static void
> -lto_output (cgraph_node_set set)
> +lto_output (cgraph_node_set set, varpool_node_set vset)
> {
> struct cgraph_node *node;
> struct lto_out_decl_state *decl_state;
> @@ -2122,6 +2122,7 @@ lto_output (cgraph_node_set set)
> have been renumbered so that edges can be associated with call
> statements using the statement UIDs. */
> output_cgraph (set);
> + output_varpool (vset);
>
> lto_bitmap_free (output);
> }
> @@ -2176,20 +2177,6 @@ write_global_stream (struct output_block
> t = lto_tree_ref_encoder_get_tree (encoder, index);
> if (!lto_streamer_cache_lookup (ob->writer_cache, t, NULL))
> lto_output_tree (ob, t, false);
> -
> - if (flag_wpa)
> - {
> - /* In WPA we should not emit multiple definitions of the
> - same symbol to all the files in the link set. If
> - T had already been emitted as the pervailing definition
> - in one file, do not emit it in the others. */
> - /* FIXME lto. We should check if T belongs to the
> - file we are writing to. */
> - if (TREE_CODE (t) == VAR_DECL
> - && TREE_PUBLIC (t)
> - && !DECL_EXTERNAL (t))
> - TREE_ASM_WRITTEN (t) = 1;
> - }
> }
> }
>
> @@ -2442,7 +2429,7 @@ produce_symtab (struct lto_streamer_cach
> recover these on other side. */
>
> static void
> -produce_asm_for_decls (cgraph_node_set set)
> +produce_asm_for_decls (cgraph_node_set set, varpool_node_set vset ATTRIBUTE_UNUSED)
> {
> struct lto_out_decl_state *out_state;
> struct lto_out_decl_state *fn_out_state;
> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c (revision 158825)
> +++ ipa-inline.c (working copy)
> @@ -2095,7 +2095,8 @@ inline_read_summary (void)
> active, we don't need to write them twice. */
>
> static void
> -inline_write_summary (cgraph_node_set set)
> +inline_write_summary (cgraph_node_set set,
> + varpool_node_set vset ATTRIBUTE_UNUSED)
> {
> if (flag_indirect_inlining && !flag_ipa_cp)
> ipa_prop_write_jump_functions (set);
> Index: lto-streamer-in.c
> ===================================================================
> --- lto-streamer-in.c (revision 158825)
> +++ lto-streamer-in.c (working copy)
> @@ -358,8 +358,6 @@ lto_input_tree_ref (struct lto_input_blo
> case LTO_label_decl_ref:
> ix_u = lto_input_uleb128 (ib);
> result = lto_file_decl_data_get_var_decl (data_in->file_data, ix_u);
> - if (TREE_CODE (result) == VAR_DECL)
> - varpool_mark_needed_node (varpool_node (result));
> break;
>
> default:
> @@ -2732,7 +2730,6 @@ lto_input_tree (struct lto_input_block *
> result = lto_file_decl_data_get_var_decl (data_in->file_data, ix);
> ix = lto_input_uleb128 (ib);
> target = lto_file_decl_data_get_var_decl (data_in->file_data, ix);
> - varpool_extra_name_alias (result, target);
> }
> else if (tag == lto_tree_code_to_tag (INTEGER_CST))
> {
> Index: lto-section-in.c
> ===================================================================
> --- lto-section-in.c (revision 158825)
> +++ lto-section-in.c (working copy)
> @@ -52,6 +52,8 @@ const char *lto_section_name[LTO_N_SECTI
> "function_body",
> "static_initializer",
> "cgraph",
> + "varpool",
> + "jump_funcs",
> "ipa_pure_const",
> "ipa_reference",
> "symtab",
> Index: ipa.c
> ===================================================================
> --- ipa.c (revision 158825)
> +++ ipa.c (working copy)
> @@ -762,6 +762,164 @@ debug_cgraph_node_set (cgraph_node_set s
> dump_cgraph_node_set (stderr, set);
> }
>
> +/* Hash a varpool node set element. */
> +
> +static hashval_t
> +hash_varpool_node_set_element (const void *p)
> +{
> + const_varpool_node_set_element element = (const_varpool_node_set_element) p;
> + return htab_hash_pointer (element->node);
> +}
> +
> +/* Compare two varpool node set elements. */
> +
> +static int
> +eq_varpool_node_set_element (const void *p1, const void *p2)
> +{
> + const_varpool_node_set_element e1 = (const_varpool_node_set_element) p1;
> + const_varpool_node_set_element e2 = (const_varpool_node_set_element) p2;
> +
> + return e1->node == e2->node;
> +}
> +
> +/* Create a new varpool node set. */
> +
> +varpool_node_set
> +varpool_node_set_new (void)
> +{
> + varpool_node_set new_node_set;
> +
> + new_node_set = GGC_NEW (struct varpool_node_set_def);
> + new_node_set->hashtab = htab_create_ggc (10,
> + hash_varpool_node_set_element,
> + eq_varpool_node_set_element,
> + NULL);
> + new_node_set->nodes = NULL;
> + return new_node_set;
> +}
> +
> +/* Add varpool_node NODE to varpool_node_set SET. */
> +
> +void
> +varpool_node_set_add (varpool_node_set set, struct varpool_node *node)
> +{
> + void **slot;
> + varpool_node_set_element element;
> + struct varpool_node_set_element_def dummy;
> +
> + dummy.node = node;
> + slot = htab_find_slot (set->hashtab, &dummy, INSERT);
> +
> + if (*slot != HTAB_EMPTY_ENTRY)
> + {
> + element = (varpool_node_set_element) *slot;
> + gcc_assert (node == element->node
> + && (VEC_index (varpool_node_ptr, set->nodes, element->index)
> + == node));
> + return;
> + }
> +
> + /* Insert node into hash table. */
> + element =
> + (varpool_node_set_element) GGC_NEW (struct varpool_node_set_element_def);
> + element->node = node;
> + element->index = VEC_length (varpool_node_ptr, set->nodes);
> + *slot = element;
> +
> + /* Insert into node vector. */
> + VEC_safe_push (varpool_node_ptr, gc, set->nodes, node);
> +}
> +
> +/* Remove varpool_node NODE from varpool_node_set SET. */
> +
> +void
> +varpool_node_set_remove (varpool_node_set set, struct varpool_node *node)
> +{
> + void **slot, **last_slot;
> + varpool_node_set_element element, last_element;
> + struct varpool_node *last_node;
> + struct varpool_node_set_element_def dummy;
> +
> + dummy.node = node;
> + slot = htab_find_slot (set->hashtab, &dummy, NO_INSERT);
> + if (slot == NULL)
> + return;
> +
> + element = (varpool_node_set_element) *slot;
> + gcc_assert (VEC_index (varpool_node_ptr, set->nodes, element->index)
> + == node);
> +
> + /* Remove from vector. We do this by swapping node with the last element
> + of the vector. */
> + last_node = VEC_pop (varpool_node_ptr, set->nodes);
> + if (last_node != node)
> + {
> + dummy.node = last_node;
> + last_slot = htab_find_slot (set->hashtab, &dummy, NO_INSERT);
> + last_element = (varpool_node_set_element) *last_slot;
> + gcc_assert (last_element);
> +
> + /* Move the last element to the original spot of NODE. */
> + last_element->index = element->index;
> + VEC_replace (varpool_node_ptr, set->nodes, last_element->index,
> + last_node);
> + }
> +
> + /* Remove element from hash table. */
> + htab_clear_slot (set->hashtab, slot);
> + ggc_free (element);
> +}
> +
> +/* Find NODE in SET and return an iterator to it if found. A null iterator
> + is returned if NODE is not in SET. */
> +
> +varpool_node_set_iterator
> +varpool_node_set_find (varpool_node_set set, struct varpool_node *node)
> +{
> + void **slot;
> + struct varpool_node_set_element_def dummy;
> + varpool_node_set_element element;
> + varpool_node_set_iterator vsi;
> +
> + dummy.node = node;
> + slot = htab_find_slot (set->hashtab, &dummy, NO_INSERT);
> + if (slot == NULL)
> + vsi.index = (unsigned) ~0;
> + else
> + {
> + element = (varpool_node_set_element) *slot;
> + gcc_assert (VEC_index (varpool_node_ptr, set->nodes, element->index)
> + == node);
> + vsi.index = element->index;
> + }
> + vsi.set = set;
> +
> + return vsi;
> +}
> +
> +/* Dump content of SET to file F. */
> +
> +void
> +dump_varpool_node_set (FILE *f, varpool_node_set set)
> +{
> + varpool_node_set_iterator iter;
> +
> + for (iter = vsi_start (set); !vsi_end_p (iter); vsi_next (&iter))
> + {
> + struct varpool_node *node = vsi_node (iter);
> + dump_varpool_node (f, node);
> + }
> +}
> +
> +/* Dump content of SET to stderr. */
> +
> +void
> +debug_varpool_node_set (varpool_node_set set)
> +{
> + dump_varpool_node_set (stderr, set);
> +}
> +
> +
> /* Simple ipa profile pass propagating frequencies across the callgraph. */
>
> static unsigned int
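
The varpool node set in the ipa.c hunk above pairs a lookup table with a vector of nodes, and removes an element by moving the last vector entry into the vacated slot. Below is a simplified standalone sketch of that technique, not the GCC code: nodes are small integer ids, a fixed array index_of stands in for the hash table, and the names node_set, node_set_init, node_set_find, node_set_add, node_set_remove, MAX_NODES and NOT_IN_SET are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 64
#define NOT_IN_SET ((size_t) -1)

/* NODES holds the members in insertion order (modulo removals);
   INDEX_OF maps a node id to its position in NODES and stands in for
   the hash table of the real structure.  */
struct node_set
{
  size_t index_of[MAX_NODES];
  int nodes[MAX_NODES];
  size_t length;
};

static void
node_set_init (struct node_set *set)
{
  size_t i;

  for (i = 0; i < MAX_NODES; i++)
    set->index_of[i] = NOT_IN_SET;
  set->length = 0;
}

/* Return the position of NODE in SET, or NOT_IN_SET.  As in the patch,
   "not found" is an index past the end of the vector.  */
static size_t
node_set_find (const struct node_set *set, int node)
{
  assert (node >= 0 && node < MAX_NODES);
  return set->index_of[node];
}

/* Add NODE to SET; adding a member twice is a no-op.  */
static void
node_set_add (struct node_set *set, int node)
{
  if (node_set_find (set, node) != NOT_IN_SET)
    return;
  set->index_of[node] = set->length;
  set->nodes[set->length++] = node;
}

/* Remove NODE from SET in constant time by popping the last vector
   element and, if it is not NODE itself, moving it into NODE's slot.  */
static void
node_set_remove (struct node_set *set, int node)
{
  size_t pos = node_set_find (set, node);
  int last;

  if (pos == NOT_IN_SET)
    return;
  last = set->nodes[--set->length];
  if (last != node)
    {
      set->nodes[pos] = last;
      set->index_of[last] = pos;
    }
  set->index_of[node] = NOT_IN_SET;
}
```

The swap keeps the vector dense, so iteration stays a plain index walk, at the cost of not preserving insertion order after a removal.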
> Index: lto/lto.c
> ===================================================================
> --- lto/lto.c (revision 158825)
> +++ lto/lto.c (working copy)
> @@ -523,6 +523,7 @@ free_section_data (struct lto_file_decl_
>
> /* Vector of all cgraph node sets. */
> static GTY (()) VEC(cgraph_node_set, gc) *lto_cgraph_node_sets;
> +static GTY (()) VEC(varpool_node_set, gc) *lto_varpool_node_sets;
>
>
> /* Group cgrah nodes by input files. This is used mainly for testing
> @@ -532,25 +533,21 @@ static void
> lto_1_to_1_map (void)
> {
> struct cgraph_node *node;
> + struct varpool_node *vnode;
> struct lto_file_decl_data *file_data;
> struct pointer_map_t *pmap;
> + struct pointer_map_t *vpmap;
> cgraph_node_set set;
> + varpool_node_set vset;
> void **slot;
>
> timevar_push (TV_WHOPR_WPA);
>
> lto_cgraph_node_sets = VEC_alloc (cgraph_node_set, gc, 1);
> -
> - /* If the cgraph is empty, create one cgraph node set so that there is still
> - an output file for any variables that need to be exported in a DSO. */
> - if (!cgraph_nodes)
> - {
> - set = cgraph_node_set_new ();
> - VEC_safe_push (cgraph_node_set, gc, lto_cgraph_node_sets, set);
> - goto finish;
> - }
> + lto_varpool_node_sets = VEC_alloc (varpool_node_set, gc, 1);
>
> pmap = pointer_map_create ();
> + vpmap = pointer_map_create ();
>
> for (node = cgraph_nodes; node; node = node->next)
> {
> @@ -558,6 +555,9 @@ lto_1_to_1_map (void)
> cloned from. */
> if (node->global.inlined_to || node->clone_of)
> continue;
> + /* Nodes without a body do not need partitioning. */
> + if (!node->analyzed)
> + continue;
> /* We only need to partition the nodes that we read from the
> gimple bytecode files. */
> file_data = node->local.lto_file_data;
> @@ -573,14 +573,50 @@ lto_1_to_1_map (void)
> slot = pointer_map_insert (pmap, file_data);
> *slot = set;
> VEC_safe_push (cgraph_node_set, gc, lto_cgraph_node_sets, set);
> + vset = varpool_node_set_new ();
> + slot = pointer_map_insert (vpmap, file_data);
> + *slot = vset;
> + VEC_safe_push (varpool_node_set, gc, lto_varpool_node_sets, vset);
> }
>
> cgraph_node_set_add (set, node);
> }
>
> + for (vnode = varpool_nodes; vnode; vnode = vnode->next)
> + {
> + if (vnode->alias)
> + continue;
> + slot = pointer_map_contains (vpmap, file_data);
> + if (slot)
> + vset = (varpool_node_set) *slot;
> + else
> + {
> + set = cgraph_node_set_new ();
> + slot = pointer_map_insert (pmap, file_data);
> + *slot = set;
> + VEC_safe_push (cgraph_node_set, gc, lto_cgraph_node_sets, set);
> + vset = varpool_node_set_new ();
> + slot = pointer_map_insert (vpmap, file_data);
> + *slot = vset;
> + VEC_safe_push (varpool_node_set, gc, lto_varpool_node_sets, vset);
> + }
> +
> + varpool_node_set_add (vset, vnode);
> + }
> +
> + /* If the cgraph is empty, create one cgraph node set so that there is still
> + an output file for any variables that need to be exported in a DSO. */
> + if (!VEC_length (cgraph_node_set, lto_cgraph_node_sets))
> + {
> + set = cgraph_node_set_new ();
> + VEC_safe_push (cgraph_node_set, gc, lto_cgraph_node_sets, set);
> + vset = varpool_node_set_new ();
> + VEC_safe_push (varpool_node_set, gc, lto_varpool_node_sets, vset);
> + }
> +
> pointer_map_destroy (pmap);
> + pointer_map_destroy (vpmap);
>
> -finish:
> timevar_pop (TV_WHOPR_WPA);
>
> lto_stats.num_cgraph_partitions += VEC_length (cgraph_node_set,
> @@ -672,174 +708,6 @@ lto_add_all_inlinees (cgraph_node_set se
> lto_bitmap_free (original_decls);
> }
>
> -/* Owing to inlining, we may need to promote a file-scope variable
> - to a global variable. Consider this case:
> -
> - a.c:
> - static int var;
> -
> - void
> - foo (void)
> - {
> - var++;
> - }
> -
> - b.c:
> -
> - extern void foo (void);
> -
> - void
> - bar (void)
> - {
> - foo ();
> - }
> -
> - If WPA inlines FOO inside BAR, then the static variable VAR needs to
> - be promoted to global because BAR and VAR may be in different LTRANS
> - files. */
> -
> -/* This struct keeps track of states used in globalization. */
> -
> -typedef struct
> -{
> - /* Current cgraph node set. */
> - cgraph_node_set set;
> -
> - /* Function DECLs of cgraph nodes seen. */
> - bitmap seen_node_decls;
> -
> - /* Use in walk_tree to avoid multiple visits of a node. */
> - struct pointer_set_t *visited;
> -
> - /* static vars in this set. */
> - bitmap static_vars_in_set;
> -
> - /* static vars in all previous set. */
> - bitmap all_static_vars;
> -
> - /* all vars in all previous set. */
> - bitmap all_vars;
> -} globalize_context_t;
> -
> -/* Callback for walk_tree. Examine the tree pointer to by TP and see if
> - if its a file-scope static variable of function that need to be turned
> - into a global. */
> -
> -static tree
> -globalize_cross_file_statics (tree *tp, int *walk_subtrees ATTRIBUTE_UNUSED,
> - void *data)
> -{
> - globalize_context_t *context = (globalize_context_t *) data;
> - tree t = *tp;
> -
> - if (t == NULL_TREE)
> - return NULL;
> -
> - /* The logic for globalization of VAR_DECLs and FUNCTION_DECLs are
> - different. For functions, we can simply look at the cgraph node sets
> - to tell if there are references to static functions outside the set.
> - The cgraph node sets do not keep track of vars, we need to traverse
> - the trees to determine what vars need to be globalized. */
> - if (TREE_CODE (t) == VAR_DECL)
> - {
> - if (!TREE_PUBLIC (t))
> - {
> - /* This file-scope static variable is reachable from more
> - that one set. Make it global but with hidden visibility
> - so that we do not export it in dynamic linking. */
> - if (bitmap_bit_p (context->all_static_vars, DECL_UID (t)))
> - {
> - TREE_PUBLIC (t) = 1;
> - DECL_VISIBILITY (t) = VISIBILITY_HIDDEN;
> - }
> - bitmap_set_bit (context->static_vars_in_set, DECL_UID (t));
> - }
> - bitmap_set_bit (context->all_vars, DECL_UID (t));
> - walk_tree (&DECL_INITIAL (t), globalize_cross_file_statics, context,
> - context->visited);
> - }
> - else if (TREE_CODE (t) == FUNCTION_DECL && !TREE_PUBLIC (t))
> - {
> - if (!cgraph_node_in_set_p (cgraph_node (t), context->set)
> - || cgraph_node (t)->address_taken)
> - {
> - /* This file-scope static function is reachable from a set
> - which does not contain the function DECL. Make it global
> - but with hidden visibility. */
> - TREE_PUBLIC (t) = 1;
> - DECL_VISIBILITY (t) = VISIBILITY_HIDDEN;
> - }
> - }
> -
> - return NULL;
> -}
> -
> -/* Helper of lto_scan_statics_in_cgraph_node below. Scan TABLE for
> - static decls that may be used in more than one LTRANS file.
> - CONTEXT is a globalize_context_t for storing scanning states. */
> -
> -static void
> -lto_scan_statics_in_ref_table (struct lto_tree_ref_table *table,
> - globalize_context_t *context)
> -{
> - unsigned i;
> -
> - for (i = 0; i < table->size; i++)
> - walk_tree (&table->trees[i], globalize_cross_file_statics, context,
> - context->visited);
> -}
> -
> -/* Promote file-scope decl reachable from NODE if necessary to global.
> - CONTEXT is a globalize_context_t storing scanning states. */
> -
> -static void
> -lto_scan_statics_in_cgraph_node (struct cgraph_node *node,
> - globalize_context_t *context)
> -{
> - struct lto_in_decl_state *state;
> -
> - /* Do nothing if NODE has no function body. */
> - if (!node->analyzed)
> - return;
> -
> - /* Return if the DECL of nodes has been visited before. */
> - if (bitmap_bit_p (context->seen_node_decls, DECL_UID (node->decl)))
> - return;
> -
> - bitmap_set_bit (context->seen_node_decls, DECL_UID (node->decl));
> -
> - state = lto_get_function_in_decl_state (node->local.lto_file_data,
> - node->decl);
> - gcc_assert (state);
> -
> - lto_scan_statics_in_ref_table (&state->streams[LTO_DECL_STREAM_VAR_DECL],
> - context);
> - lto_scan_statics_in_ref_table (&state->streams[LTO_DECL_STREAM_FN_DECL],
> - context);
> -}
> -
> -/* Scan all global variables that we have not yet seen so far. CONTEXT
> - is a globalize_context_t storing scanning states. */
> -
> -static void
> -lto_scan_statics_in_remaining_global_vars (globalize_context_t *context)
> -{
> - tree var, var_context;
> - struct varpool_node *vnode;
> -
> - FOR_EACH_STATIC_VARIABLE (vnode)
> - {
> - var = vnode->decl;
> - var_context = DECL_CONTEXT (var);
> - if (TREE_STATIC (var)
> - && TREE_PUBLIC (var)
> - && (!var_context || TREE_CODE (var_context) != FUNCTION_DECL)
> - && !bitmap_bit_p (context->all_vars, DECL_UID (var)))
> - walk_tree (&var, globalize_cross_file_statics, context,
> - context->visited);
> - }
> -}
> -
> /* Find out all static decls that need to be promoted to global because
> of cross file sharing. This function must be run in the WPA mode after
> all inlinees are added. */
> @@ -847,39 +715,49 @@ lto_scan_statics_in_remaining_global_var
> static void
> lto_promote_cross_file_statics (void)
> {
> + struct varpool_node *vnode;
> unsigned i, n_sets;
> cgraph_node_set set;
> cgraph_node_set_iterator csi;
> - globalize_context_t context;
>
> - memset (&context, 0, sizeof (context));
> - context.all_vars = lto_bitmap_alloc ();
> - context.all_static_vars = lto_bitmap_alloc ();
> + gcc_assert (flag_wpa);
>
> + /* At the moment we make no attempt to figure out who is referring to the
> + variables, so all must become global. */
> + for (vnode = varpool_nodes; vnode; vnode = vnode->next)
> + if (!vnode->externally_visible)
> + {
> + TREE_PUBLIC (vnode->decl) = 1;
> + DECL_VISIBILITY (vnode->decl) = VISIBILITY_HIDDEN;
> + }
> n_sets = VEC_length (cgraph_node_set, lto_cgraph_node_sets);
> for (i = 0; i < n_sets; i++)
> {
> set = VEC_index (cgraph_node_set, lto_cgraph_node_sets, i);
> - context.set = set;
> - context.visited = pointer_set_create ();
> - context.static_vars_in_set = lto_bitmap_alloc ();
> - context.seen_node_decls = lto_bitmap_alloc ();
>
> + /* If the node either has its address taken (and we have no clue from where)
> + or is called from another partition, it needs to be globalized. */
> for (csi = csi_start (set); !csi_end_p (csi); csi_next (&csi))
> - lto_scan_statics_in_cgraph_node (csi_node (csi), &context);
> -
> - if (i == n_sets - 1)
> - lto_scan_statics_in_remaining_global_vars (&context);
> -
> - bitmap_ior_into (context.all_static_vars, context.static_vars_in_set);
> + {
> + struct cgraph_node *node = csi_node (csi);
> + bool globalize = node->address_taken;
> + struct cgraph_edge *e;
> + for (e = node->callers; e && !globalize; e = e->next_caller)
> + {
> + struct cgraph_node *caller = e->caller;
> + if (caller->global.inlined_to)
> + caller = caller->global.inlined_to;
> + if (!cgraph_node_in_set_p (caller, set))
> + globalize = true;
> + }
> + if (globalize)
> + {
> + TREE_PUBLIC (node->decl) = 1;
> + DECL_VISIBILITY (node->decl) = VISIBILITY_HIDDEN;
> + }
> + }
>
> - pointer_set_destroy (context.visited);
> - lto_bitmap_free (context.static_vars_in_set);
> - lto_bitmap_free (context.seen_node_decls);
> }
> -
> - lto_bitmap_free (context.all_vars);
> - lto_bitmap_free (context.all_static_vars);
> }
>
>
> @@ -918,7 +796,7 @@ strip_extension (const char *fname)
> the same input file. */
>
> static char *
> -get_filename_for_set (cgraph_node_set set)
> +get_filename_for_set (cgraph_node_set set, varpool_node_set vset ATTRIBUTE_UNUSED)
> {
> char *fname = NULL;
> static const size_t max_fname_len = 100;
> @@ -999,6 +877,7 @@ lto_wpa_write_files (void)
> unsigned i, n_sets, last_out_file_ix, num_out_files;
> lto_file *file;
> cgraph_node_set set;
> + varpool_node_set vset;
>
> timevar_push (TV_WHOPR_WPA);
>
> @@ -1034,7 +913,8 @@ lto_wpa_write_files (void)
> char *temp_filename;
>
> set = VEC_index (cgraph_node_set, lto_cgraph_node_sets, i);
> - temp_filename = get_filename_for_set (set);
> + vset = VEC_index (varpool_node_set, lto_varpool_node_sets, i);
> + temp_filename = get_filename_for_set (set, vset);
> output_files[i] = temp_filename;
>
> if (cgraph_node_set_needs_ltrans_p (set))
> @@ -1046,7 +926,7 @@ lto_wpa_write_files (void)
>
> lto_set_current_out_file (file);
>
> - ipa_write_optimization_summaries (set);
> + ipa_write_optimization_summaries (set, vset);
>
> lto_set_current_out_file (NULL);
> lto_obj_file_close (file);
> Index: passes.c
> ===================================================================
> --- passes.c (revision 158825)
> +++ passes.c (working copy)
> @@ -191,7 +191,11 @@ rest_of_decl_compilation (tree decl,
> || DECL_INITIAL (decl))
> && !DECL_EXTERNAL (decl))
> {
> - if (TREE_CODE (decl) != FUNCTION_DECL)
> + /* When reading an LTO unit, we also read the varpool, so do not
> + rebuild it. */
> + if (in_lto_p && !cgraph_function_flags_ready)
> + ;
> + else if (TREE_CODE (decl) != FUNCTION_DECL)
> varpool_finalize_decl (decl);
> else
> assemble_variable (decl, top_level, at_end, 0);
> @@ -218,7 +222,9 @@ rest_of_decl_compilation (tree decl,
> }
>
> /* Let cgraph know about the existence of variables. */
> - if (TREE_CODE (decl) == VAR_DECL && !DECL_EXTERNAL (decl))
> + if (in_lto_p && !cgraph_function_flags_ready)
> + ;
> + else if (TREE_CODE (decl) == VAR_DECL && !DECL_EXTERNAL (decl))
> varpool_node (decl);
> }
>
> @@ -1616,7 +1622,10 @@ execute_one_pass (struct opt_pass *pass)
> }
>
> if (!current_function_decl)
> - cgraph_process_new_functions ();
> + {
> + cgraph_process_new_functions ();
> + varpool_analyze_pending_decls ();
> + }
>
> pass_fini_dump_file (pass);
>
> @@ -1649,6 +1658,7 @@ execute_pass_list (struct opt_pass *pass
>
> static void
> ipa_write_summaries_2 (struct opt_pass *pass, cgraph_node_set set,
> + varpool_node_set vset,
> struct lto_out_decl_state *state)
> {
> while (pass)
> @@ -1665,7 +1675,7 @@ ipa_write_summaries_2 (struct opt_pass *
> if (pass->tv_id)
> timevar_push (pass->tv_id);
>
> - ipa_pass->write_summary (set);
> + ipa_pass->write_summary (set, vset);
>
> /* If a timevar is present, start it. */
> if (pass->tv_id)
> @@ -1673,7 +1683,7 @@ ipa_write_summaries_2 (struct opt_pass *
> }
>
> if (pass->sub && pass->sub->type != GIMPLE_PASS)
> - ipa_write_summaries_2 (pass->sub, set, state);
> + ipa_write_summaries_2 (pass->sub, set, vset, state);
>
> pass = pass->next;
> }
> @@ -1684,14 +1694,14 @@ ipa_write_summaries_2 (struct opt_pass *
> summaries. SET is the set of nodes to be written. */
>
> static void
> -ipa_write_summaries_1 (cgraph_node_set set)
> +ipa_write_summaries_1 (cgraph_node_set set, varpool_node_set vset)
> {
> struct lto_out_decl_state *state = lto_new_out_decl_state ();
> lto_push_out_decl_state (state);
>
> gcc_assert (!flag_wpa);
> - ipa_write_summaries_2 (all_regular_ipa_passes, set, state);
> - ipa_write_summaries_2 (all_lto_gen_passes, set, state);
> + ipa_write_summaries_2 (all_regular_ipa_passes, set, vset, state);
> + ipa_write_summaries_2 (all_lto_gen_passes, set, vset, state);
>
> gcc_assert (lto_get_out_decl_state () == state);
> lto_pop_out_decl_state ();
> @@ -1704,7 +1714,9 @@ void
> ipa_write_summaries (void)
> {
> cgraph_node_set set;
> + varpool_node_set vset;
> struct cgraph_node **order;
> + struct varpool_node *vnode;
> int i, order_pos;
>
> if (!flag_generate_lto || errorcount || sorrycount)
> @@ -1736,13 +1748,20 @@ ipa_write_summaries (void)
> renumber_gimple_stmt_uids ();
> pop_cfun ();
> }
> - cgraph_node_set_add (set, node);
> + if (node->needed || node->reachable || node->address_taken)
> + cgraph_node_set_add (set, node);
> }
> + vset = varpool_node_set_new ();
> +
> + for (vnode = varpool_nodes; vnode; vnode = vnode->next)
> + if (vnode->needed && !vnode->alias)
> + varpool_node_set_add (vset, vnode);
>
> - ipa_write_summaries_1 (set);
> + ipa_write_summaries_1 (set, vset);
>
> free (order);
> ggc_free (set);
> + ggc_free (vset);
> }
>
> /* Same as execute_pass_list but assume that subpasses of IPA passes
> @@ -1751,6 +1770,7 @@ ipa_write_summaries (void)
>
> static void
> ipa_write_optimization_summaries_1 (struct opt_pass *pass, cgraph_node_set set,
> + varpool_node_set vset,
> struct lto_out_decl_state *state)
> {
> while (pass)
> @@ -1767,7 +1787,7 @@ ipa_write_optimization_summaries_1 (stru
> if (pass->tv_id)
> timevar_push (pass->tv_id);
>
> - ipa_pass->write_optimization_summary (set);
> + ipa_pass->write_optimization_summary (set, vset);
>
> /* If a timevar is present, start it. */
> if (pass->tv_id)
> @@ -1775,7 +1795,7 @@ ipa_write_optimization_summaries_1 (stru
> }
>
> if (pass->sub && pass->sub->type != GIMPLE_PASS)
> - ipa_write_optimization_summaries_1 (pass->sub, set, state);
> + ipa_write_optimization_summaries_1 (pass->sub, set, vset, state);
>
> pass = pass->next;
> }
> @@ -1785,14 +1805,14 @@ ipa_write_optimization_summaries_1 (stru
> NULL, write out all summaries of all nodes. */
>
> void
> -ipa_write_optimization_summaries (cgraph_node_set set)
> +ipa_write_optimization_summaries (cgraph_node_set set, varpool_node_set vset)
> {
> struct lto_out_decl_state *state = lto_new_out_decl_state ();
> lto_push_out_decl_state (state);
>
> gcc_assert (flag_wpa);
> - ipa_write_optimization_summaries_1 (all_regular_ipa_passes, set, state);
> - ipa_write_optimization_summaries_1 (all_lto_gen_passes, set, state);
> + ipa_write_optimization_summaries_1 (all_regular_ipa_passes, set, vset, state);
> + ipa_write_optimization_summaries_1 (all_lto_gen_passes, set, vset, state);
>
> gcc_assert (lto_get_out_decl_state () == state);
> lto_pop_out_decl_state ();
> @@ -1917,6 +1937,7 @@ execute_ipa_pass_list (struct opt_pass *
> }
> gcc_assert (!current_function_decl);
> cgraph_process_new_functions ();
> + varpool_analyze_pending_decls ();
> pass = pass->next;
> }
> while (pass);
> Index: varpool.c
> ===================================================================
> --- varpool.c (revision 158825)
> +++ varpool.c (working copy)
> @@ -105,6 +105,22 @@ eq_varpool_node (const void *p1, const v
> return DECL_UID (n1->decl) == DECL_UID (n2->decl);
> }
>
> +/* Return varpool node assigned to DECL without creating new one. */
> +struct varpool_node *
> +varpool_get_node (tree decl)
> +{
> + struct varpool_node key, **slot;
> +
> + gcc_assert (DECL_P (decl) && TREE_CODE (decl) != FUNCTION_DECL);
> +
> + if (!varpool_hash)
> + return NULL;
> + key.decl = decl;
> + slot = (struct varpool_node **)
> + htab_find_slot (varpool_hash, &key, NO_INSERT);
> + return slot ? *slot : NULL;
> +}
> +
> /* Return varpool node assigned to DECL. Create new one when needed. */
> struct varpool_node *
> varpool_node (tree decl)
> @@ -125,11 +141,51 @@ varpool_node (tree decl)
> node->decl = decl;
> node->order = cgraph_order++;
> node->next = varpool_nodes;
> + if (varpool_nodes)
> + varpool_nodes->prev = node;
> varpool_nodes = node;
> *slot = node;
> return node;
> }
>
> +/* Remove node from the varpool. */
> +void
> +varpool_remove_node (struct varpool_node *node)
> +{
> + void **slot;
> + slot = htab_find_slot (varpool_hash, node, NO_INSERT);
> + gcc_assert (*slot == node);
> + htab_clear_slot (varpool_hash, slot);
> + *slot = NULL;
> + gcc_assert (!varpool_assembled_nodes_queue);
> + if (node->next)
> + node->next->prev = node->prev;
> + if (node->prev)
> + node->prev->next = node->next;
> + else
> + {
> + gcc_assert (varpool_nodes == node);
> + varpool_nodes = node->next;
> + }
> + if (varpool_first_unanalyzed_node == node)
> + varpool_first_unanalyzed_node = node->next_needed;
> + if (node->next_needed)
> + node->next_needed->prev_needed = node->prev_needed;
> + else if (node->prev_needed)
> + {
> + gcc_assert (varpool_last_needed_node);
> + varpool_last_needed_node = node->prev_needed;
> + }
> + if (node->prev_needed)
> + node->prev_needed->next_needed = node->next_needed;
> + else if (node->next_needed)
> + {
> + gcc_assert (varpool_nodes_queue == node);
> + varpool_nodes_queue = node->next_needed;
> + }
> + ggc_free (node);
> +}
> +
> /* Dump given cgraph node. */
> void
> dump_varpool_node (FILE *f, struct varpool_node *node)
> @@ -139,8 +195,12 @@ dump_varpool_node (FILE *f, struct varpo
> cgraph_function_flags_ready
> ? cgraph_availability_names[cgraph_variable_initializer_availability (node)]
> : "not-ready");
> + if (DECL_ASSEMBLER_NAME_SET_P (node->decl))
> + fprintf (f, " (asm: %s)", IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (node->decl)));
> if (DECL_INITIAL (node->decl))
> fprintf (f, " initialized");
> + if (TREE_ASM_WRITTEN (node->decl))
> + fprintf (f, " (asm written)");
> if (node->needed)
> fprintf (f, " needed");
> if (node->analyzed)
> @@ -151,6 +211,10 @@ dump_varpool_node (FILE *f, struct varpo
> fprintf (f, " output");
> if (node->externally_visible)
> fprintf (f, " externally_visible");
> + if (node->in_other_partition)
> + fprintf (f, " in_other_partition");
> + else if (node->used_from_other_partition)
> + fprintf (f, " used_from_other_partition");
> fprintf (f, "\n");
> }
>
> @@ -192,7 +256,10 @@ static void
> varpool_enqueue_needed_node (struct varpool_node *node)
> {
> if (varpool_last_needed_node)
> - varpool_last_needed_node->next_needed = node;
> + {
> + varpool_last_needed_node->next_needed = node;
> + node->prev_needed = varpool_last_needed_node;
> + }
> varpool_last_needed_node = node;
> node->next_needed = NULL;
> if (!varpool_nodes_queue)
> @@ -230,11 +297,7 @@ varpool_reset_queue (void)
> bool
> decide_is_variable_needed (struct varpool_node *node, tree decl)
> {
> - /* We do not track variable references at all and thus have no idea if the
> - variable was referenced in some other partition or not.
> - FIXME: We really need address taken edges in callgraph and varpool to
> - drive WPA and decide whether other partition might reference it or not. */
> - if (flag_ltrans)
> + if (node->used_from_other_partition)
> return true;
> /* If the user told us it is used, then it must be so. */
> if ((node->externally_visible && !DECL_COMDAT (decl))
> @@ -288,17 +351,6 @@ varpool_finalize_decl (tree decl)
> {
> struct varpool_node *node = varpool_node (decl);
>
> - /* FIXME: We don't really stream varpool datastructure and instead rebuild it
> - by varpool_finalize_decl. This is not quite correct since this way we can't
> - attach any info to varpool. Eventually we will want to stream varpool nodes
> - and the flags.
> -
> - For the moment just prevent analysis of varpool nodes to happen again, so
> - we will re-try to compute "address_taken" flag of varpool that breaks
> - in presence of clones. */
> - if (in_lto_p)
> - node->analyzed = true;
> -
> /* The first declaration of a variable that comes through this function
> decides whether it is global (in C, has external linkage)
> or local (in C, has internal linkage). So do nothing more
> @@ -385,6 +437,7 @@ varpool_assemble_decl (struct varpool_no
>
> if (!TREE_ASM_WRITTEN (decl)
> && !node->alias
> + && !node->in_other_partition
> && !DECL_EXTERNAL (decl)
> && (TREE_CODE (decl) != VAR_DECL || !DECL_HAS_VALUE_EXPR_P (decl)))
> {
> @@ -394,6 +447,9 @@ varpool_assemble_decl (struct varpool_no
> struct varpool_node *alias;
>
> node->next_needed = varpool_assembled_nodes_queue;
> + node->prev_needed = NULL;
> + if (varpool_assembled_nodes_queue)
> + varpool_assembled_nodes_queue->prev_needed = node;
> varpool_assembled_nodes_queue = node;
> node->finalized = 1;
>
> @@ -476,7 +532,10 @@ varpool_assemble_pending_decls (void)
> if (varpool_assemble_decl (node))
> changed = true;
> else
> - node->next_needed = NULL;
> + {
> + node->prev_needed = NULL;
> + node->next_needed = NULL;
> + }
> }
> /* varpool_nodes_queue is now empty, clear the pointer to the last element
> in the queue. */
> @@ -498,6 +557,7 @@ varpool_empty_needed_queue (void)
> struct varpool_node *node = varpool_nodes_queue;
> varpool_nodes_queue = varpool_nodes_queue->next_needed;
> node->next_needed = NULL;
> + node->prev_needed = NULL;
> }
> /* varpool_nodes_queue is now empty, clear the pointer to the last element
> in the queue. */
> @@ -559,6 +619,8 @@ varpool_extra_name_alias (tree alias, tr
> alias_node->alias = 1;
> alias_node->extra_name = decl_node;
> alias_node->next = decl_node->extra_name;
> + if (decl_node->extra_name)
> + decl_node->extra_name->prev = alias_node;
> decl_node->extra_name = alias_node;
> *slot = alias_node;
> return true;
> Index: lto-streamer.c
> ===================================================================
> --- lto-streamer.c (revision 158825)
> +++ lto-streamer.c (working copy)
> @@ -160,6 +160,9 @@ lto_get_section_name (int section_type,
> case LTO_section_cgraph:
> return concat (LTO_SECTION_NAME_PREFIX, ".cgraph", NULL);
>
> + case LTO_section_varpool:
> + return concat (LTO_SECTION_NAME_PREFIX, ".vars", NULL);
> +
> case LTO_section_jump_functions:
> return concat (LTO_SECTION_NAME_PREFIX, ".jmpfuncs", NULL);
>
> Index: lto-streamer.h
> ===================================================================
> --- lto-streamer.h (revision 158825)
> +++ lto-streamer.h (working copy)
> @@ -259,11 +259,11 @@ enum lto_section_type
> LTO_section_function_body,
> LTO_section_static_initializer,
> LTO_section_cgraph,
> + LTO_section_varpool,
> LTO_section_jump_functions,
> LTO_section_ipa_pure_const,
> LTO_section_ipa_reference,
> LTO_section_symtab,
> - LTO_section_wpa_fixup,
> LTO_section_opts,
> LTO_N_SECTION_TYPES /* Must be last. */
> };
> @@ -834,6 +834,8 @@ int lto_cgraph_encoder_encode (lto_cgrap
> void lto_cgraph_encoder_delete (lto_cgraph_encoder_t encoder);
> void output_cgraph (cgraph_node_set);
> void input_cgraph (void);
> +void output_varpool (varpool_node_set);
> +void input_varpool (void);
>
>
> /* In lto-symtab.c. */
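
For reference, the unlink step in varpool_remove_node follows the usual
pattern for removing a node from a doubly linked list with a separate head
pointer. The following is only a minimal standalone sketch of that pattern,
not the GCC code: struct node, list_push and list_unlink are hypothetical
names made up for illustration. It shows that the head pointer has to be
updated whenever the removed node has no predecessor, including when it is
the only element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a node kept on a doubly linked list.  */
struct node
{
  struct node *next;
  struct node *prev;
};

/* Push NODE at the head of the list *HEAD, maintaining back pointers.  */
static void
list_push (struct node **head, struct node *node)
{
  node->prev = NULL;
  node->next = *head;
  if (*head)
    (*head)->prev = node;
  *head = node;
}

/* Unlink NODE from the list *HEAD.  */
static void
list_unlink (struct node **head, struct node *node)
{
  if (node->next)
    node->next->prev = node->prev;
  if (node->prev)
    node->prev->next = node->next;
  else
    {
      /* NODE has no predecessor, so it must be the head.  This branch
         also covers the single-element list, where next is NULL.  */
      assert (*head == node);
      *head = node->next;
    }
  node->next = NULL;
  node->prev = NULL;
}
```

The same shape applies to a second pair of links threaded through the same
nodes (such as next_needed/prev_needed), as long as each pair is updated
through its own pointers and its own head.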
>
>
--
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex