This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: LTO/WHOPR streaming of varpool


On Wed, 28 Apr 2010, Jan Hubicka wrote:

> Hi,
> this patch implements streaming of varpool and fixes several correctness issues
> with WHOPR (and makes LTO correct wrt unused variable removal too).
> The main issue with WHOPR was that it shipped variable declarations into multiple units
> and turned them into static vars, creating large binaries (and invalid programs
> too).  This is now fixed and WHOPR produces pretty much the same size of GCC binary
> as LTO.
> 
> Most of the patch is unexciting cloning of what we already do for cgraph nodes.
> I removed the original logic from lto.c that figures out whether a variable can be accessed
> by another partition, since it cannot work well with WPA anyway (function bodies
> are not around and thus we don't know if inlining a function will carry a reference
> to the variable to another unit). This is also what caused link errors on several SPEC
> programs and consequently led me to temporarily disable unused variable removal,
> which results in the current explosion of WHOPR binary size (since all variables
> are copied to every unit).
> 
> There is room for cleanups left, especially in WPA, but I will do them incrementally.
> Also we should start streaming variable initializers in separate sections, as we do
> for function bodies, as well as a list of references for each cgraph node/varpool node
> to drive unused variable/function removal.
> 
> Bootstrapped/regtested x86_64-linux, OK?
> 
> 	* lto-symtab.c (lto_symtab_entry_def): Add vnode.
> 	(lto_varpool_replace_node): New.
> 	(lto_symtab_resolve_symbols): Resolve varpool nodes.
> 	(lto_symtab_merge_decls_1): Prefer decls with varpool node.
> 	(lto_symtab_merge_cgraph_nodes_1): Merge varpools.
> 	* cgraph.h (varpool_node_ptr): New type.
> 	(varpool_node_ptr): New vector.
> 	(varpool_node_set_def): New structure.
> 	(varpool_node_set): New type.
> 	(varpool_node_set): New vector.
> 	(varpool_node_set_element_def): New structure.
> 	(varpool_node_set_element, const_varpool_node_set_element): New types.
> 	(varpool_node_set_iterator): New type.
> 	(varpool_node): Add prev pointers, add used_from_other_partition,
> 	in_other_partition.
> 	(varpool_node_set_new, varpool_node_set_find, varpool_node_set_add,
> 	varpool_node_set_remove, dump_varpool_node_set, debug_varpool_node_set,
> 	varpool_get_node, varpool_remove_node): Declare.
> 	(vsi_end_p, vsi_next, vsi_node, vsi_start, varpool_node_in_set_p,
> 	varpool_node_set_size): New inlines.
> 	* tree-pass.h (varpool_node_set_def): Forward declare.
> 	(ipa_opt_pass_d): Summary writing takes vnode sets too.
> 	(ipa_write_optimization_summaries): Update prototype.
> 	* ipa-cp.c (ipcp_write_summary): Update.
> 	* ipa-reference.c (ipa_reference_write_summary): Update.
> 	* lto-cgraph.c (lto_output_varpool_node): New static function.
> 	(output_varpool): New function.
> 	(input_varpool_node): New static function.
> 	(input_varpool_1): New function.
> 	(input_cgraph): Input varpool.
> 	* ipa-pure-const.c (pure_const_write_summary): Update.
> 	* lto-streamer-out.c (lto_output): Update, output varpool too.
> 	(write_global_stream): Kill WPA hack.
> 	(produce_asm_for_decls): Update.
> 	* ipa-inline.c (inline_write_summary): Update.
> 	* lto-streamer-in.c (lto_input_tree_ref, lto_input_tree): Do not build cgraph.
> 	* lto-section-in.c (lto_section_name): Add varpool and jump funcs.
> 	* ipa.c (hash_varpool_node_set_element, eq_varpool_node_set_element,
> 	varpool_node_set_new, varpool_node_set_add,
> 	varpool_node_set_remove, varpool_node_set_find, dump_varpool_node_set,
> 	debug_varpool_node_set): New functions.
> 	* passes.c (rest_of_decl_compilation): When in LTO do not finalize.
> 	(execute_one_pass): Process new decls too.
> 	(ipa_write_summaries_2): Pass around vsets.
> 	(ipa_write_summaries_1): Likewise.
> 	(ipa_write_summaries): Build vset; be more selective about cgraph nodes
> 	to add.
> 	(ipa_write_optimization_summaries_1): Pass around vsets.
> 	(ipa_write_optimization_summaries): Likewise.
> 	* varpool.c (varpool_get_node): New.
> 	(varpool_node): Update doubly linked lists.
> 	(varpool_remove_node): New.
> 	(dump_varpool_node): More dumping.
> 	(varpool_enqueue_needed_node): Update doubly linked lists.
> 	(decide_is_variable_needed): Kill ltrans hack.
> 	(varpool_finalize_decl): Kill lto hack.
> 	(varpool_assemble_decl): Skip decls in other partitions.
> 	(varpool_assemble_pending_decls): Update doubly linked lists.
> 	(varpool_empty_needed_queue): Likewise.
> 	(varpool_extra_name_alias): Likewise.
> 	* lto-streamer.c (lto_get_section_name): Add vars section.
> 	* lto-streamer.h (lto_section_type): Update.
> 	(output_varpool, input_varpool): Declare.
> 
> 	* lto.c (lto_varpool_node_sets): New.
> 	(lto_1_to_1_map): Partition varpool too.
> 	(globalize_context_t, globalize_cross_file_statics,
> 	lto_scan_statics_in_ref_table, lto_scan_statics_in_cgraph_node,
> 	lto_scan_statics_in_remaining_global_vars): Remove.
> 	(lto_promote_cross_file_statics): Rewrite.
> 	(get_filename_for_set): Take vset argument.
> 	(lto_wpa_write_files): Pass around vsets.
> 
> Index: lto-symtab.c
> ===================================================================
> --- lto-symtab.c	(revision 158825)
> +++ lto-symtab.c	(working copy)
> @@ -44,6 +44,9 @@ struct GTY(()) lto_symtab_entry_def
>    /* The cgraph node if decl is a function decl.  Filled in during the
>       merging process.  */
>    struct cgraph_node *node;
> +  /* The varpool node if decl is a variable decl.  Filled in during the
> +     merging process.  */
> +  struct varpool_node *vnode;
>    /* LTO file-data and symbol resolution for this decl.  */
>    struct lto_file_decl_data * GTY((skip (""))) file_data;
>    enum ld_plugin_symbol_resolution resolution;
> @@ -244,6 +247,23 @@ lto_cgraph_replace_node (struct cgraph_n
>    cgraph_remove_node (node);
>  }
>  
> +/* Replace the varpool node NODE with PREVAILING_NODE in the varpool, merging
> +   the node flags and removing the old node.  */
> +
> +static void
> +lto_varpool_replace_node (struct varpool_node *node,
> +			  struct varpool_node *prevailing_node)
> +{
> +  /* Merge node flags.  */
> +  if (node->needed)
> +    varpool_mark_needed_node (prevailing_node);
> +  gcc_assert (!node->finalized || prevailing_node->finalized);
> +  gcc_assert (!node->analyzed || prevailing_node->analyzed);
> +
> +  /* Finally remove the replaced node.  */
> +  varpool_remove_node (node);
> +}
> +
>  /* Merge two variable or function symbol table entries PREVAILING and ENTRY.
>     Return false if the symbols are not fully compatible and a diagnostic
>     should be emitted.  */
> @@ -406,6 +426,8 @@ lto_symtab_resolve_symbols (void **slot)
>      {
>        if (TREE_CODE (e->decl) == FUNCTION_DECL)
>  	e->node = cgraph_get_node (e->decl);
> +      else if (TREE_CODE (e->decl) == VAR_DECL)
> +	e->vnode = varpool_get_node (e->decl);
>      }
>  
>    e = (lto_symtab_entry_t) *slot;
> @@ -559,6 +581,10 @@ lto_symtab_merge_decls_1 (void **slot, v
>  	while (!prevailing->node
>  	       && prevailing->next)
>  	  prevailing = prevailing->next;
> +      if (TREE_CODE (prevailing->decl) == VAR_DECL)
> +	while (!prevailing->vnode
> +	       && prevailing->next)
> +	  prevailing = prevailing->next;

Err, instead of duplicating the loop please just make the existing
one unconditional.
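
One way to read that suggestion, as a minimal sketch only: drop the TREE_CODE test and let a single loop skip entries that have neither kind of node. The struct entry type and first_entry_with_node below are hypothetical stand-ins for illustration, not code from lto-symtab.c, and the two int fields stand in for non-NULL e->node and e->vnode pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for lto_symtab_entry_def: only the
   fields the loop looks at.  The real structure holds pointers to
   cgraph and varpool nodes rather than flags.  */
struct entry
{
  int has_node;		/* Stands in for a non-NULL e->node.  */
  int has_vnode;	/* Stands in for a non-NULL e->vnode.  */
  struct entry *next;
};

/* Walk the chain until an entry with either kind of node is found, or
   the chain ends.  No TREE_CODE check is made: the sketch assumes a
   function decl never carries a vnode and a variable decl never
   carries a node, so testing both fields is harmless.  */
static struct entry *
first_entry_with_node (struct entry *prevailing)
{
  while (!prevailing->has_node
	 && !prevailing->has_vnode
	 && prevailing->next)
    prevailing = prevailing->next;
  return prevailing;
}
```

If no entry in the chain has a node, the loop stops at the last entry, which matches the behaviour of the two separate loops in the hunk above.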

>        /* We do not stream varpool nodes, so the first decl has to
>  	 be good enough for now.
>  	 ???  For QOI choose a variable with readonly initializer

And update this comment - well, remove the following grok
and instead make vars with an initializer not replaceable?

> @@ -672,6 +698,8 @@ lto_symtab_merge_cgraph_nodes_1 (void **
>  	    }
>  	  lto_cgraph_replace_node (e->node, prevailing->node);
>  	}
> +      if (e->vnode != NULL)
> +	lto_varpool_replace_node (e->vnode, prevailing->vnode);

Update the comment before the loop.

The rest looks good.  Thanks for working on this.

I agree we should have some testing coverage here - it should
be easy to add a testcase that we'd previously miscompile, no?

Thanks,
Richard.

>      }
>  
>    /* Drop all but the prevailing decl from the symtab.  */
> Index: cgraph.h
> ===================================================================
> --- cgraph.h	(revision 158825)
> +++ cgraph.h	(working copy)
> @@ -302,12 +302,33 @@ struct GTY(()) cgraph_node_set_def
>    PTR GTY ((skip)) aux;
>  };
>  
> +typedef struct varpool_node *varpool_node_ptr;
> +
> +DEF_VEC_P(varpool_node_ptr);
> +DEF_VEC_ALLOC_P(varpool_node_ptr,heap);
> +DEF_VEC_ALLOC_P(varpool_node_ptr,gc);
> +
> +/* A varpool node set is a collection of varpool nodes.  A varpool node
> +   can appear in multiple sets.  */
> +struct GTY(()) varpool_node_set_def
> +{
> +  htab_t GTY((param_is (struct varpool_node_set_element_def))) hashtab;
> +  VEC(varpool_node_ptr, gc) *nodes;
> +  PTR GTY ((skip)) aux;
> +};
> +
>  typedef struct cgraph_node_set_def *cgraph_node_set;
>  
>  DEF_VEC_P(cgraph_node_set);
>  DEF_VEC_ALLOC_P(cgraph_node_set,gc);
>  DEF_VEC_ALLOC_P(cgraph_node_set,heap);
>  
> +typedef struct varpool_node_set_def *varpool_node_set;
> +
> +DEF_VEC_P(varpool_node_set);
> +DEF_VEC_ALLOC_P(varpool_node_set,gc);
> +DEF_VEC_ALLOC_P(varpool_node_set,heap);
> +
>  /* A cgraph node set element contains an index in the vector of nodes in
>     the set.  */
>  struct GTY(()) cgraph_node_set_element_def
> @@ -326,6 +347,24 @@ typedef struct
>    unsigned index;
>  } cgraph_node_set_iterator;
>  
> +/* A varpool node set element contains an index in the vector of nodes in
> +   the set.  */
> +struct GTY(()) varpool_node_set_element_def
> +{
> +  struct varpool_node *node;
> +  HOST_WIDE_INT index;
> +};
> +
> +typedef struct varpool_node_set_element_def *varpool_node_set_element;
> +typedef const struct varpool_node_set_element_def *const_varpool_node_set_element;
> +
> +/* Iterator structure for varpool node sets.  */
> +typedef struct
> +{
> +  varpool_node_set set;
> +  unsigned index;
> +} varpool_node_set_iterator;
> +
>  #define DEFCIFCODE(code, string)	CIF_ ## code,
>  /* Reasons for inlining failures.  */
>  typedef enum {
> @@ -380,9 +419,9 @@ DEF_VEC_ALLOC_P(cgraph_edge_p,heap);
>  struct GTY((chain_next ("%h.next"))) varpool_node {
>    tree decl;
>    /* Pointer to the next function in varpool_nodes.  */
> -  struct varpool_node *next;
> +  struct varpool_node *next, *prev;
>    /* Pointer to the next function in varpool_nodes_queue.  */
> -  struct varpool_node *next_needed;
> +  struct varpool_node *next_needed, *prev_needed;
>    /* For normal nodes a pointer to the first extra name alias.  For alias
>       nodes a pointer to the normal node.  */
>    struct varpool_node *extra_name;
> @@ -407,6 +446,10 @@ struct GTY((chain_next ("%h.next"))) var
>    /* Set for aliases once they got through assemble_alias.  Also set for
>       extra name aliases in varpool_extra_name_alias.  */
>    unsigned alias : 1;
> +  /* Set when variable is used from other LTRANS partition.  */
> +  unsigned used_from_other_partition : 1;
> +  /* Set when variable is available in the other LTO partition.  */
> +  unsigned in_other_partition : 1;
>  };
>  
>  /* Every top level asm statement is put into a cgraph_asm_node.  */
> @@ -574,6 +617,13 @@ void cgraph_node_set_remove (cgraph_node
>  void dump_cgraph_node_set (FILE *, cgraph_node_set);
>  void debug_cgraph_node_set (cgraph_node_set);
>  
> +varpool_node_set varpool_node_set_new (void);
> +varpool_node_set_iterator varpool_node_set_find (varpool_node_set,
> +					       struct varpool_node *);
> +void varpool_node_set_add (varpool_node_set, struct varpool_node *);
> +void varpool_node_set_remove (varpool_node_set, struct varpool_node *);
> +void dump_varpool_node_set (FILE *, varpool_node_set);
> +void debug_varpool_node_set (varpool_node_set);
>  
>  /* In predict.c  */
>  bool cgraph_maybe_hot_edge_p (struct cgraph_edge *e);
> @@ -596,6 +646,9 @@ void cgraph_make_decl_local (tree);
>  void cgraph_make_node_local (struct cgraph_node *);
>  bool cgraph_node_can_be_local_p (struct cgraph_node *);
>  
> +
> +struct varpool_node * varpool_get_node (tree decl);
> +void varpool_remove_node (struct varpool_node *node);
>  bool varpool_assemble_pending_decls (void);
>  bool varpool_assemble_decl (struct varpool_node *node);
>  bool varpool_analyze_pending_decls (void);
> @@ -713,6 +766,54 @@ cgraph_node_set_size (cgraph_node_set se
>    return htab_elements (set->hashtab);
>  }
>  
> +/* Return true if iterator VSI points to nothing.  */
> +static inline bool
> +vsi_end_p (varpool_node_set_iterator vsi)
> +{
> +  return vsi.index >= VEC_length (varpool_node_ptr, vsi.set->nodes);
> +}
> +
> +/* Advance iterator VSI.  */
> +static inline void
> +vsi_next (varpool_node_set_iterator *vsi)
> +{
> +  vsi->index++;
> +}
> +
> +/* Return the node pointed to by VSI.  */
> +static inline struct varpool_node *
> +vsi_node (varpool_node_set_iterator vsi)
> +{
> +  return VEC_index (varpool_node_ptr, vsi.set->nodes, vsi.index);
> +}
> +
> +/* Return an iterator to the first node in SET.  */
> +static inline varpool_node_set_iterator
> +vsi_start (varpool_node_set set)
> +{
> +  varpool_node_set_iterator vsi;
> +
> +  vsi.set = set;
> +  vsi.index = 0;
> +  return vsi;
> +}
> +
> +/* Return true if SET contains NODE.  */
> +static inline bool
> +varpool_node_in_set_p (struct varpool_node *node, varpool_node_set set)
> +{
> +  varpool_node_set_iterator vsi;
> +  vsi = varpool_node_set_find (set, node);
> +  return !vsi_end_p (vsi);
> +}
> +
> +/* Return number of nodes in SET.  */
> +static inline size_t
> +varpool_node_set_size (varpool_node_set set)
> +{
> +  return htab_elements (set->hashtab);
> +}
> +
>  /* Uniquize all constants that appear in memory.
>     Each constant in memory thus far output is recorded
>     in `const_desc_table'.  */
> Index: tree-pass.h
> ===================================================================
> --- tree-pass.h	(revision 158825)
> +++ tree-pass.h	(working copy)
> @@ -168,6 +168,7 @@ struct rtl_opt_pass
>  struct varpool_node;
>  struct cgraph_node;
>  struct cgraph_node_set_def;
> +struct varpool_node_set_def;
>  
>  /* Description of IPA pass with generate summary, write, execute, read and
>     transform stages.  */
> @@ -180,13 +181,15 @@ struct ipa_opt_pass_d
>    void (*generate_summary) (void);
>  
>    /* This hook is used to serialize IPA summaries on disk.  */
> -  void (*write_summary) (struct cgraph_node_set_def *);
> +  void (*write_summary) (struct cgraph_node_set_def *,
> +			 struct varpool_node_set_def *);
>  
>    /* This hook is used to deserialize IPA summaries from disk.  */
>    void (*read_summary) (void);
>  
>    /* This hook is used to serialize IPA optimization summaries on disk.  */
> -  void (*write_optimization_summary) (struct cgraph_node_set_def *);
> +  void (*write_optimization_summary) (struct cgraph_node_set_def *,
> +				      struct varpool_node_set_def *);
>  
>    /* This hook is used to deserialize IPA summaries from disk.  */
>    void (*read_optimization_summary) (void);
> @@ -607,7 +610,8 @@ extern const char *get_current_pass_name
>  extern void print_current_pass (FILE *);
>  extern void debug_pass (void);
>  extern void ipa_write_summaries (void);
> -extern void ipa_write_optimization_summaries (struct cgraph_node_set_def *);
> +extern void ipa_write_optimization_summaries (struct cgraph_node_set_def *,
> +					      struct varpool_node_set_def *);
>  extern void ipa_read_summaries (void);
>  extern void ipa_read_optimization_summaries (void);
>  extern void register_one_dump_file (struct opt_pass *);
> Index: ipa-cp.c
> ===================================================================
> --- ipa-cp.c	(revision 158825)
> +++ ipa-cp.c	(working copy)
> @@ -1304,7 +1304,8 @@ ipcp_generate_summary (void)
>  
>  /* Write ipcp summary for nodes in SET.  */
>  static void
> -ipcp_write_summary (cgraph_node_set set)
> +ipcp_write_summary (cgraph_node_set set,
> +		    varpool_node_set vset ATTRIBUTE_UNUSED)
>  {
>    ipa_prop_write_jump_functions (set);
>  }
> Index: ipa-reference.c
> ===================================================================
> --- ipa-reference.c	(revision 158825)
> +++ ipa-reference.c	(working copy)
> @@ -1040,7 +1040,8 @@ write_node_summary_p (struct cgraph_node
>  /* Serialize the ipa info for lto.  */
>  
>  static void
> -ipa_reference_write_summary (cgraph_node_set set)
> +ipa_reference_write_summary (cgraph_node_set set,
> +			     varpool_node_set vset ATTRIBUTE_UNUSED)
>  {
>    struct cgraph_node *node;
>    struct lto_simple_output_block *ob
> Index: lto-cgraph.c
> ===================================================================
> --- lto-cgraph.c	(revision 158825)
> +++ lto-cgraph.c	(working copy)
> @@ -372,6 +372,50 @@ lto_output_node (struct lto_simple_outpu
>      lto_output_uleb128_stream (ob->main_stream, 0);
>  }
>  
> +/* Output the varpool NODE to OB.  SET is the set of varpool nodes we
> +   are writing to the current file.  If NODE is analyzed but not in
> +   SET, then NODE is on the boundary of the varpool_node_set: it is
> +   flagged as in_other_partition so the reader knows its definition
> +   lives elsewhere.  The record consists of the decl index, a bitpack
> +   of flags and, when NODE has extra name aliases, their count followed
> +   by the decl index of each alias.  */
> +
> +static void
> +lto_output_varpool_node (struct lto_simple_output_block *ob, struct varpool_node *node,
> +		         varpool_node_set set)
> +{
> +  bool boundary_p = !varpool_node_in_set_p (node, set) && node->analyzed;
> +  struct bitpack_d *bp;
> +  struct varpool_node *alias;
> +  int count = 0;
> +
> +  lto_output_var_decl_index (ob->decl_state, ob->main_stream, node->decl);
> +  bp = bitpack_create ();
> +  bp_pack_value (bp, node->externally_visible, 1);
> +  bp_pack_value (bp, node->force_output, 1);
> +  bp_pack_value (bp, node->finalized, 1);
> +  gcc_assert (node->finalized || !node->analyzed);
> +  gcc_assert (node->needed);
> +  gcc_assert (!node->alias);
> +  /* FIXME: We have no idea how we move references around.  For the moment assume that
> +     everything is used externally.  */
> +  bp_pack_value (bp, flag_wpa, 1);  /* used_from_other_partition.  */
> +  bp_pack_value (bp, boundary_p, 1);  /* in_other_partition.  */
> +  /* Also emit any extra name aliases.  */
> +  for (alias = node->extra_name; alias; alias = alias->next)
> +    count++;
> +  bp_pack_value (bp, count != 0, 1);
> +  lto_output_bitpack (ob->main_stream, bp);
> +  bitpack_delete (bp);
> +
> +  if (count)
> +    {
> +      lto_output_uleb128_stream (ob->main_stream, count);
> +      for (alias = node->extra_name; alias; alias = alias->next)
> +	lto_output_var_decl_index (ob->decl_state, ob->main_stream, alias->decl);
> +    }
> +}
> +
>  /* Stream out profile_summary to OB.  */
>  
>  static void
> @@ -548,6 +592,32 @@ input_overwrite_node (struct lto_file_de
>    node->frequency = (enum node_frequency)bp_unpack_value (bp, 2);
>  }
>  
> +/* Output the varpool.  Nodes not in SET are streamed as boundary nodes.  */
> +
> +void
> +output_varpool (varpool_node_set set)
> +{
> +  struct varpool_node *node;
> +  struct lto_simple_output_block *ob;
> +  int len = 0;
> +
> +  ob = lto_create_simple_output_block (LTO_section_varpool);
> +
> +  for (node = varpool_nodes; node; node = node->next)
> +    if (node->needed && node->analyzed)
> +      len++;
> +
> +  lto_output_uleb128_stream (ob->main_stream, len);
> +
> +  /* Write out the nodes.  Every needed and analyzed node is streamed, so
> +     the count written above must be computed with the same condition as
> +     this loop.  */
> +  for (node = varpool_nodes; node; node = node->next)
> +    if (node->needed && node->analyzed)
> +      lto_output_varpool_node (ob, node, set);
> +
> +  lto_destroy_simple_output_block (ob);
> +}
>  
>  /* Read a node from input_block IB.  TAG is the node's tag just read.
>     Return the node read or overwriten.  */
> @@ -667,6 +737,48 @@ input_node (struct lto_file_decl_data *f
>    return node;
>  }
>  
> +/* Read a varpool node from input_block IB using the info in FILE_DATA.
> +   Return the node read or overwritten.  */
> +
> +static struct varpool_node *
> +input_varpool_node (struct lto_file_decl_data *file_data,
> +		    struct lto_input_block *ib)
> +{
> +  int decl_index;
> +  tree var_decl;
> +  struct varpool_node *node;
> +  struct bitpack_d *bp;
> +  bool aliases_p;
> +  int count;
> +
> +  decl_index = lto_input_uleb128 (ib);
> +  var_decl = lto_file_decl_data_get_var_decl (file_data, decl_index);
> +  node = varpool_node (var_decl);
> +
> +  bp = lto_input_bitpack (ib);
> +  node->externally_visible = bp_unpack_value (bp, 1);
> +  node->force_output = bp_unpack_value (bp, 1);
> +  node->finalized = bp_unpack_value (bp, 1);
> +  node->analyzed = 1; 
> +  node->used_from_other_partition = bp_unpack_value (bp, 1);
> +  node->in_other_partition = bp_unpack_value (bp, 1);
> +  aliases_p = bp_unpack_value (bp, 1);
> +  if (node->finalized)
> +    varpool_mark_needed_node (node);
> +  bitpack_delete (bp);
> +  if (aliases_p)
> +    {
> +      count = lto_input_uleb128 (ib);
> +      for (; count > 0; count --)
> +	{
> +	  tree decl = lto_file_decl_data_get_var_decl (file_data,
> +						       lto_input_uleb128 (ib));
> +	  varpool_extra_name_alias (decl, var_decl);
> +	}
> +    }
> +  return node;
> +}
> +
>  
>  /* Read an edge from IB.  NODES points to a vector of previously read
>     nodes for decoding caller and callee of the edge to be read.  */
> @@ -782,6 +894,22 @@ input_cgraph_1 (struct lto_file_decl_dat
>    VEC_free (cgraph_node_ptr, heap, nodes);
>  }
>  
> +/* Read a varpool from IB using the info in FILE_DATA.  */
> +
> +static void
> +input_varpool_1 (struct lto_file_decl_data *file_data,
> +		struct lto_input_block *ib)
> +{
> +  unsigned HOST_WIDE_INT len;
> +
> +  len = lto_input_uleb128 (ib);
> +  while (len)
> +    {
> +      input_varpool_node (file_data, ib);
> +      len--;
> +    }
> +}
> +
>  static struct gcov_ctr_summary lto_gcov_summary;
>  
>  /* Input profile_info from IB.  */
> @@ -837,6 +965,12 @@ input_cgraph (void)
>        lto_destroy_simple_input_block (file_data, LTO_section_cgraph,
>  				      ib, data, len);
>  
> +      ib = lto_create_simple_input_block (file_data, LTO_section_varpool,
> +					  &data, &len);
> +      input_varpool_1 (file_data, ib);
> +      lto_destroy_simple_input_block (file_data, LTO_section_varpool,
> +				      ib, data, len);
> +
>        /* Assume that every file read needs to be processed by LTRANS.  */
>        if (flag_wpa)
>  	lto_mark_file_for_ltrans (file_data);
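
The record layout written by lto_output_varpool_node and read back by input_varpool_node (decl index, one-bit flags, optional alias count) can be illustrated with a small standalone round trip. This is a sketch under stated assumptions, not the streamer's code: struct var_flags, pack_flags, unpack_flags, uleb128_encode and uleb128_decode are hypothetical names, and the bit order (first flag in the lowest bit) is an assumption about the bitpack, not something the patch states. The ULEB128 encoding itself is the standard one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the one-bit flags, in the order the patch
   packs them.  */
struct var_flags
{
  unsigned externally_visible;
  unsigned force_output;
  unsigned finalized;
  unsigned used_from_other_partition;
  unsigned in_other_partition;
  unsigned has_aliases;
};

/* Pack the flags into one word, first flag in bit 0 (assumed order).  */
static unsigned
pack_flags (const struct var_flags *f)
{
  return ((f->externally_visible & 1)
	  | ((f->force_output & 1) << 1)
	  | ((f->finalized & 1) << 2)
	  | ((f->used_from_other_partition & 1) << 3)
	  | ((f->in_other_partition & 1) << 4)
	  | ((f->has_aliases & 1) << 5));
}

/* Unpack in exactly the order used by pack_flags.  */
static struct var_flags
unpack_flags (unsigned bits)
{
  struct var_flags f;
  f.externally_visible = bits & 1;
  f.force_output = (bits >> 1) & 1;
  f.finalized = (bits >> 2) & 1;
  f.used_from_other_partition = (bits >> 3) & 1;
  f.in_other_partition = (bits >> 4) & 1;
  f.has_aliases = (bits >> 5) & 1;
  return f;
}

/* Write VALUE to BUF as ULEB128; return the number of bytes written.  */
static size_t
uleb128_encode (uint64_t value, unsigned char *buf)
{
  size_t n = 0;
  do
    {
      unsigned char byte = value & 0x7f;
      value >>= 7;
      if (value != 0)
	byte |= 0x80;	/* More bytes follow.  */
      buf[n++] = byte;
    }
  while (value != 0);
  return n;
}

/* Decode one ULEB128 value from BUF, advancing *POS past it.  */
static uint64_t
uleb128_decode (const unsigned char *buf, size_t *pos)
{
  uint64_t result = 0;
  unsigned shift = 0;
  unsigned char byte;
  do
    {
      byte = buf[(*pos)++];
      result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  return result;
}
```

The point of the sketch is that writer and reader must agree on the order of every field: a flag added on the output side without a matching unpack on the input side shifts every later field.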
> Index: ipa-pure-const.c
> ===================================================================
> --- ipa-pure-const.c	(revision 158825)
> +++ ipa-pure-const.c	(working copy)
> @@ -771,7 +771,8 @@ generate_summary (void)
>  /* Serialize the ipa info for lto.  */
>  
>  static void
> -pure_const_write_summary (cgraph_node_set set)
> +pure_const_write_summary (cgraph_node_set set,
> +			  varpool_node_set vset ATTRIBUTE_UNUSED)
>  {
>    struct cgraph_node *node;
>    struct lto_simple_output_block *ob
> Index: lto-streamer-out.c
> ===================================================================
> --- lto-streamer-out.c	(revision 158825)
> +++ lto-streamer-out.c	(working copy)
> @@ -2089,7 +2089,7 @@ lto_writer_init (void)
>  /* Main entry point from the pass manager.  */
>  
>  static void
> -lto_output (cgraph_node_set set)
> +lto_output (cgraph_node_set set, varpool_node_set vset)
>  {
>    struct cgraph_node *node;
>    struct lto_out_decl_state *decl_state;
> @@ -2122,6 +2122,7 @@ lto_output (cgraph_node_set set)
>       have been renumbered so that edges can be associated with call
>       statements using the statement UIDs.  */
>    output_cgraph (set);
> +  output_varpool (vset);
>  
>    lto_bitmap_free (output);
>  }
> @@ -2176,20 +2177,6 @@ write_global_stream (struct output_block
>        t = lto_tree_ref_encoder_get_tree (encoder, index);
>        if (!lto_streamer_cache_lookup (ob->writer_cache, t, NULL))
>  	lto_output_tree (ob, t, false);
> -
> -      if (flag_wpa)
> -	{
> -	  /* In WPA we should not emit multiple definitions of the
> -	     same symbol to all the files in the link set.  If
> -	     T had already been emitted as the pervailing definition
> -	     in one file, do not emit it in the others.  */
> -	  /* FIXME lto.  We should check if T belongs to the
> -	     file we are writing to.  */
> -	  if (TREE_CODE (t) == VAR_DECL
> -	      && TREE_PUBLIC (t)
> -	      && !DECL_EXTERNAL (t))
> -	    TREE_ASM_WRITTEN (t) = 1;
> -	}
>      }
>  }
>  
> @@ -2442,7 +2429,7 @@ produce_symtab (struct lto_streamer_cach
>     recover these on other side.  */
>  
>  static void
> -produce_asm_for_decls (cgraph_node_set set)
> +produce_asm_for_decls (cgraph_node_set set, varpool_node_set vset ATTRIBUTE_UNUSED)
>  {
>    struct lto_out_decl_state *out_state;
>    struct lto_out_decl_state *fn_out_state;
> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c	(revision 158825)
> +++ ipa-inline.c	(working copy)
> @@ -2095,7 +2095,8 @@ inline_read_summary (void)
>     active, we don't need to write them twice.  */
>  
>  static void
> -inline_write_summary (cgraph_node_set set)
> +inline_write_summary (cgraph_node_set set,
> +		      varpool_node_set vset ATTRIBUTE_UNUSED)
>  {
>    if (flag_indirect_inlining && !flag_ipa_cp)
>      ipa_prop_write_jump_functions (set);
> Index: lto-streamer-in.c
> ===================================================================
> --- lto-streamer-in.c	(revision 158825)
> +++ lto-streamer-in.c	(working copy)
> @@ -358,8 +358,6 @@ lto_input_tree_ref (struct lto_input_blo
>      case LTO_label_decl_ref:
>        ix_u = lto_input_uleb128 (ib);
>        result = lto_file_decl_data_get_var_decl (data_in->file_data, ix_u);
> -      if (TREE_CODE (result) == VAR_DECL)
> -	varpool_mark_needed_node (varpool_node (result));
>        break;
>  
>      default:
> @@ -2732,7 +2730,6 @@ lto_input_tree (struct lto_input_block *
>        result = lto_file_decl_data_get_var_decl (data_in->file_data, ix);
>        ix = lto_input_uleb128 (ib);
>        target = lto_file_decl_data_get_var_decl (data_in->file_data, ix);
> -      varpool_extra_name_alias (result, target);
>      }
>    else if (tag == lto_tree_code_to_tag (INTEGER_CST))
>      {
> Index: lto-section-in.c
> ===================================================================
> --- lto-section-in.c	(revision 158825)
> +++ lto-section-in.c	(working copy)
> @@ -52,6 +52,8 @@ const char *lto_section_name[LTO_N_SECTI
>    "function_body",
>    "static_initializer",
>    "cgraph",
> +  "varpool",
> +  "jump_funcs",
>    "ipa_pure_const",
>    "ipa_reference",
>    "symtab",
> Index: ipa.c
> ===================================================================
> --- ipa.c	(revision 158825)
> +++ ipa.c	(working copy)
> @@ -762,6 +762,164 @@ debug_cgraph_node_set (cgraph_node_set s
>    dump_cgraph_node_set (stderr, set);
>  }
>  
> +/* Hash a varpool node set element.  */
> +
> +static hashval_t
> +hash_varpool_node_set_element (const void *p)
> +{
> +  const_varpool_node_set_element element = (const_varpool_node_set_element) p;
> +  return htab_hash_pointer (element->node);
> +}
> +
> +/* Compare two varpool node set elements.  */
> +
> +static int
> +eq_varpool_node_set_element (const void *p1, const void *p2)
> +{
> +  const_varpool_node_set_element e1 = (const_varpool_node_set_element) p1;
> +  const_varpool_node_set_element e2 = (const_varpool_node_set_element) p2;
> +
> +  return e1->node == e2->node;
> +}
> +
> +/* Create a new varpool node set.  */
> +
> +varpool_node_set
> +varpool_node_set_new (void)
> +{
> +  varpool_node_set new_node_set;
> +
> +  new_node_set = GGC_NEW (struct varpool_node_set_def);
> +  new_node_set->hashtab = htab_create_ggc (10,
> +					   hash_varpool_node_set_element,
> +					   eq_varpool_node_set_element,
> +					   NULL);
> +  new_node_set->nodes = NULL;
> +  return new_node_set;
> +}
> +
> +/* Add varpool_node NODE to varpool_node_set SET.  */
> +
> +void
> +varpool_node_set_add (varpool_node_set set, struct varpool_node *node)
> +{
> +  void **slot;
> +  varpool_node_set_element element;
> +  struct varpool_node_set_element_def dummy;
> +
> +  dummy.node = node;
> +  slot = htab_find_slot (set->hashtab, &dummy, INSERT);
> +
> +  if (*slot != HTAB_EMPTY_ENTRY)
> +    {
> +      element = (varpool_node_set_element) *slot;
> +      gcc_assert (node == element->node
> +		  && (VEC_index (varpool_node_ptr, set->nodes, element->index)
> +		      == node));
> +      return;
> +    }
> +
> +  /* Insert node into hash table.  */
> +  element =
> +    (varpool_node_set_element) GGC_NEW (struct varpool_node_set_element_def);
> +  element->node = node;
> +  element->index = VEC_length (varpool_node_ptr, set->nodes);
> +  *slot = element;
> +
> +  /* Insert into node vector.  */
> +  VEC_safe_push (varpool_node_ptr, gc, set->nodes, node);
> +}
> +
> +/* Remove varpool_node NODE from varpool_node_set SET.  */
> +
> +void
> +varpool_node_set_remove (varpool_node_set set, struct varpool_node *node)
> +{
> +  void **slot, **last_slot;
> +  varpool_node_set_element element, last_element;
> +  struct varpool_node *last_node;
> +  struct varpool_node_set_element_def dummy;
> +
> +  dummy.node = node;
> +  slot = htab_find_slot (set->hashtab, &dummy, NO_INSERT);
> +  if (slot == NULL)
> +    return;
> +
> +  element = (varpool_node_set_element) *slot;
> +  gcc_assert (VEC_index (varpool_node_ptr, set->nodes, element->index)
> +	      == node);
> +
> +  /* Remove from vector. We do this by swapping node with the last element
> +     of the vector.  */
> +  last_node = VEC_pop (varpool_node_ptr, set->nodes);
> +  if (last_node != node)
> +    {
> +      dummy.node = last_node;
> +      last_slot = htab_find_slot (set->hashtab, &dummy, NO_INSERT);
> +      last_element = (varpool_node_set_element) *last_slot;
> +      gcc_assert (last_element);
> +
> +      /* Move the last element to the original spot of NODE.  */
> +      last_element->index = element->index;
> +      VEC_replace (varpool_node_ptr, set->nodes, last_element->index,
> +		   last_node);
> +    }
> +
> +  /* Remove element from hash table.  */
> +  htab_clear_slot (set->hashtab, slot);
> +  ggc_free (element);
> +}
> +
> +/* Find NODE in SET and return an iterator to it if found.  A null iterator
> +   is returned if NODE is not in SET.  */
> +
> +varpool_node_set_iterator
> +varpool_node_set_find (varpool_node_set set, struct varpool_node *node)
> +{
> +  void **slot;
> +  struct varpool_node_set_element_def dummy;
> +  varpool_node_set_element element;
> +  varpool_node_set_iterator vsi;
> +
> +  dummy.node = node;
> +  slot = htab_find_slot (set->hashtab, &dummy, NO_INSERT);
> +  if (slot == NULL)
> +    vsi.index = (unsigned) ~0;
> +  else
> +    {
> +      element = (varpool_node_set_element) *slot;
> +      gcc_assert (VEC_index (varpool_node_ptr, set->nodes, element->index)
> +		  == node);
> +      vsi.index = element->index;
> +    }
> +  vsi.set = set;
> +
> +  return vsi;
> +}
> +
> +/* Dump content of SET to file F.  */
> +
> +void
> +dump_varpool_node_set (FILE *f, varpool_node_set set)
> +{
> +  varpool_node_set_iterator iter;
> +
> +  for (iter = vsi_start (set); !vsi_end_p (iter); vsi_next (&iter))
> +    {
> +      struct varpool_node *node = vsi_node (iter);
> +      dump_varpool_node (f, node);
> +    }
> +}
> +
> +/* Dump content of SET to stderr.  */
> +
> +void
> +debug_varpool_node_set (varpool_node_set set)
> +{
> +  dump_varpool_node_set (stderr, set);
> +}
> +
> +
>  /* Simple ipa profile pass propagating frequencies across the callgraph.  */
>  
>  static unsigned int
> Index: lto/lto.c
> ===================================================================
> --- lto/lto.c	(revision 158825)
> +++ lto/lto.c	(working copy)
> @@ -523,6 +523,7 @@ free_section_data (struct lto_file_decl_
>  
>  /* Vector of all cgraph node sets. */
>  static GTY (()) VEC(cgraph_node_set, gc) *lto_cgraph_node_sets;
> +static GTY (()) VEC(varpool_node_set, gc) *lto_varpool_node_sets;
>  
>  
>  /* Group cgrah nodes by input files.  This is used mainly for testing
> @@ -532,25 +533,21 @@ static void
>  lto_1_to_1_map (void)
>  {
>    struct cgraph_node *node;
> +  struct varpool_node *vnode;
>    struct lto_file_decl_data *file_data;
>    struct pointer_map_t *pmap;
> +  struct pointer_map_t *vpmap;
>    cgraph_node_set set;
> +  varpool_node_set vset;
>    void **slot;
>  
>    timevar_push (TV_WHOPR_WPA);
>  
>    lto_cgraph_node_sets = VEC_alloc (cgraph_node_set, gc, 1);
> -
> -  /* If the cgraph is empty, create one cgraph node set so that there is still
> -     an output file for any variables that need to be exported in a DSO.  */
> -  if (!cgraph_nodes)
> -    {
> -      set = cgraph_node_set_new ();
> -      VEC_safe_push (cgraph_node_set, gc, lto_cgraph_node_sets, set);
> -      goto finish;
> -    }
> +  lto_varpool_node_sets = VEC_alloc (varpool_node_set, gc, 1);
>  
>    pmap = pointer_map_create ();
> +  vpmap = pointer_map_create ();
>  
>    for (node = cgraph_nodes; node; node = node->next)
>      {
> @@ -558,6 +555,9 @@ lto_1_to_1_map (void)
>  	 cloned from.  */
>        if (node->global.inlined_to || node->clone_of)
>  	continue;
> +      /* Nodes without a body do not need partitioning.  */
> +      if (!node->analyzed)
> +	continue;
>        /* We only need to partition the nodes that we read from the
>  	 gimple bytecode files.  */
>        file_data = node->local.lto_file_data;
> @@ -573,14 +573,50 @@ lto_1_to_1_map (void)
>  	  slot = pointer_map_insert (pmap, file_data);
>  	  *slot = set;
>  	  VEC_safe_push (cgraph_node_set, gc, lto_cgraph_node_sets, set);
> +	  vset = varpool_node_set_new ();
> +	  slot = pointer_map_insert (vpmap, file_data);
> +	  *slot = vset;
> +	  VEC_safe_push (varpool_node_set, gc, lto_varpool_node_sets, vset);
>  	}
>  
>        cgraph_node_set_add (set, node);
>      }
>  
> +  for (vnode = varpool_nodes; vnode; vnode = vnode->next)
> +    {
> +      if (vnode->alias)
> +	continue;
> +      slot = pointer_map_contains (vpmap, file_data);
> +      if (slot)
> +	vset = (varpool_node_set) *slot;
> +      else
> +	{
> +	  set = cgraph_node_set_new ();
> +	  slot = pointer_map_insert (pmap, file_data);
> +	  *slot = set;
> +	  VEC_safe_push (cgraph_node_set, gc, lto_cgraph_node_sets, set);
> +	  vset = varpool_node_set_new ();
> +	  slot = pointer_map_insert (vpmap, file_data);
> +	  *slot = vset;
> +	  VEC_safe_push (varpool_node_set, gc, lto_varpool_node_sets, vset);
> +	}
> +
> +      varpool_node_set_add (vset, vnode);
> +    }
> +
> +  /* If the cgraph is empty, create one cgraph node set so that there is still
> +     an output file for any variables that need to be exported in a DSO.  */
> +  if (!lto_cgraph_node_sets)
> +    {
> +      set = cgraph_node_set_new ();
> +      VEC_safe_push (cgraph_node_set, gc, lto_cgraph_node_sets, set);
> +      vset = varpool_node_set_new ();
> +      VEC_safe_push (varpool_node_set, gc, lto_varpool_node_sets, vset);
> +    }
> +
>    pointer_map_destroy (pmap);
> +  pointer_map_destroy (vpmap);
>  
> -finish:
>    timevar_pop (TV_WHOPR_WPA);
>  
>    lto_stats.num_cgraph_partitions += VEC_length (cgraph_node_set, 
> @@ -672,174 +708,6 @@ lto_add_all_inlinees (cgraph_node_set se
>    lto_bitmap_free (original_decls);
>  }
>  
> -/* Owing to inlining, we may need to promote a file-scope variable
> -   to a global variable.  Consider this case:
> -
> -   a.c:
> -   static int var;
> -
> -   void
> -   foo (void)
> -   {
> -     var++;
> -   }
> -
> -   b.c:
> -
> -   extern void foo (void);
> -
> -   void
> -   bar (void)
> -   {
> -     foo ();
> -   }
> -
> -   If WPA inlines FOO inside BAR, then the static variable VAR needs to
> -   be promoted to global because BAR and VAR may be in different LTRANS
> -   files. */
> -
> -/* This struct keeps track of states used in globalization.  */
> -
> -typedef struct
> -{
> -  /* Current cgraph node set.  */  
> -  cgraph_node_set set;
> -
> -  /* Function DECLs of cgraph nodes seen.  */
> -  bitmap seen_node_decls;
> -
> -  /* Use in walk_tree to avoid multiple visits of a node.  */
> -  struct pointer_set_t *visited;
> -
> -  /* static vars in this set.  */
> -  bitmap static_vars_in_set;
> -
> -  /* static vars in all previous set.  */
> -  bitmap all_static_vars;
> -
> -  /* all vars in all previous set.  */
> -  bitmap all_vars;
> -} globalize_context_t;
> -
> -/* Callback for walk_tree.  Examine the tree pointer to by TP and see if
> -   if its a file-scope static variable of function that need to be turned
> -   into a global.  */
> -
> -static tree
> -globalize_cross_file_statics (tree *tp, int *walk_subtrees ATTRIBUTE_UNUSED,
> -			      void *data)
> -{
> -  globalize_context_t *context = (globalize_context_t *) data;
> -  tree t = *tp;
> -
> -  if (t == NULL_TREE)
> -    return NULL;
> -
> -  /* The logic for globalization of VAR_DECLs and FUNCTION_DECLs are
> -     different.  For functions, we can simply look at the cgraph node sets
> -     to tell if there are references to static functions outside the set.
> -     The cgraph node sets do not keep track of vars, we need to traverse
> -     the trees to determine what vars need to be globalized.  */
> -  if (TREE_CODE (t) == VAR_DECL)
> -    {
> -      if (!TREE_PUBLIC (t))
> -	{
> -	  /* This file-scope static variable is reachable from more
> -	     that one set.  Make it global but with hidden visibility
> -	     so that we do not export it in dynamic linking.  */
> -	  if (bitmap_bit_p (context->all_static_vars, DECL_UID (t)))
> -	    {
> -	      TREE_PUBLIC (t) = 1;
> -	      DECL_VISIBILITY (t) = VISIBILITY_HIDDEN;
> -	    }
> -	  bitmap_set_bit (context->static_vars_in_set, DECL_UID (t));
> -	}
> -      bitmap_set_bit (context->all_vars, DECL_UID (t));
> -      walk_tree (&DECL_INITIAL (t), globalize_cross_file_statics, context,
> -		 context->visited);
> -    }
> -  else if (TREE_CODE (t) == FUNCTION_DECL && !TREE_PUBLIC (t))
> -    {
> -      if (!cgraph_node_in_set_p (cgraph_node (t), context->set)
> -	  || cgraph_node (t)->address_taken)
> -	{
> -	  /* This file-scope static function is reachable from a set
> -	     which does not contain the function DECL.  Make it global
> -	     but with hidden visibility.  */
> -	  TREE_PUBLIC (t) = 1;
> -	  DECL_VISIBILITY (t) = VISIBILITY_HIDDEN;
> -	}
> -    }
> -
> -  return NULL; 
> -}
> -
> -/* Helper of lto_scan_statics_in_cgraph_node below.  Scan TABLE for
> -   static decls that may be used in more than one LTRANS file.
> -   CONTEXT is a globalize_context_t for storing scanning states.  */
> -
> -static void
> -lto_scan_statics_in_ref_table (struct lto_tree_ref_table *table,
> -			       globalize_context_t *context)
> -{
> -  unsigned i;
> -
> -  for (i = 0; i < table->size; i++)
> -    walk_tree (&table->trees[i], globalize_cross_file_statics, context,
> -	       context->visited);
> -}
> -
> -/* Promote file-scope decl reachable from NODE if necessary to global.
> -   CONTEXT is a globalize_context_t storing scanning states.  */
> -
> -static void
> -lto_scan_statics_in_cgraph_node (struct cgraph_node *node,
> -				 globalize_context_t *context)
> -{
> -  struct lto_in_decl_state *state;
> -  
> -  /* Do nothing if NODE has no function body.  */
> -  if (!node->analyzed)
> -    return;
> -  
> -  /* Return if the DECL of nodes has been visited before.  */
> -  if (bitmap_bit_p (context->seen_node_decls, DECL_UID (node->decl)))
> -    return;
> -
> -  bitmap_set_bit (context->seen_node_decls, DECL_UID (node->decl));
> -
> -  state = lto_get_function_in_decl_state (node->local.lto_file_data,
> -					  node->decl);
> -  gcc_assert (state);
> -
> -  lto_scan_statics_in_ref_table (&state->streams[LTO_DECL_STREAM_VAR_DECL],
> -				 context);
> -  lto_scan_statics_in_ref_table (&state->streams[LTO_DECL_STREAM_FN_DECL],
> -				 context);
> -}
> -
> -/* Scan all global variables that we have not yet seen so far.  CONTEXT
> -   is a globalize_context_t storing scanning states.  */
> -
> -static void
> -lto_scan_statics_in_remaining_global_vars (globalize_context_t *context)
> -{
> -  tree var, var_context;
> -  struct varpool_node *vnode;
> -
> -  FOR_EACH_STATIC_VARIABLE (vnode)
> -    {
> -      var = vnode->decl;
> -      var_context = DECL_CONTEXT (var);
> -      if (TREE_STATIC (var)
> -	  && TREE_PUBLIC (var)
> -          && (!var_context || TREE_CODE (var_context) != FUNCTION_DECL)
> -          && !bitmap_bit_p (context->all_vars, DECL_UID (var)))
> -	walk_tree (&var, globalize_cross_file_statics, context,
> -		   context->visited);
> -    }
> -}
> -
>  /* Find out all static decls that need to be promoted to global because
>     of cross file sharing.  This function must be run in the WPA mode after
>     all inlinees are added.  */
> @@ -847,39 +715,49 @@ lto_scan_statics_in_remaining_global_var
>  static void
>  lto_promote_cross_file_statics (void)
>  {
> +  struct varpool_node *vnode;
>    unsigned i, n_sets;
>    cgraph_node_set set;
>    cgraph_node_set_iterator csi;
> -  globalize_context_t context;
>  
> -  memset (&context, 0, sizeof (context));
> -  context.all_vars = lto_bitmap_alloc ();
> -  context.all_static_vars = lto_bitmap_alloc ();
> +  gcc_assert (flag_wpa);
>  
> +  /* At the moment we make no attempt to figure out who refers to the
> +     variables, so all must become global.  */
> +  for (vnode = varpool_nodes; vnode; vnode = vnode->next)
> +    if (!vnode->externally_visible)
> +      {
> +	TREE_PUBLIC (vnode->decl) = 1;
> +	DECL_VISIBILITY (vnode->decl) = VISIBILITY_HIDDEN;
> +      }
>    n_sets = VEC_length (cgraph_node_set, lto_cgraph_node_sets);
>    for (i = 0; i < n_sets; i++)
>      {
>        set = VEC_index (cgraph_node_set, lto_cgraph_node_sets, i);
> -      context.set = set;
> -      context.visited = pointer_set_create ();
> -      context.static_vars_in_set = lto_bitmap_alloc ();
> -      context.seen_node_decls = lto_bitmap_alloc ();
>  
> +      /* If the node either has its address taken (and we have no clue from
> +	 where) or is called from another partition, it needs to be
> +	 globalized.  */
>        for (csi = csi_start (set); !csi_end_p (csi); csi_next (&csi))
> -	lto_scan_statics_in_cgraph_node (csi_node (csi), &context);
> -
> -      if (i == n_sets - 1)
> -        lto_scan_statics_in_remaining_global_vars (&context);
> -
> -      bitmap_ior_into (context.all_static_vars, context.static_vars_in_set);
> +	{
> +	  struct cgraph_node *node = csi_node (csi);
> +	  bool globalize = node->address_taken;
> +	  struct cgraph_edge *e;
> +	  for (e = node->callers; e && !globalize; e = e->next_caller)
> +	    {
> +	      struct cgraph_node *caller = e->caller;
> +	      if (caller->global.inlined_to)
> +		caller = caller->global.inlined_to;
> +	      if (!cgraph_node_in_set_p (caller, set))
> +		globalize = true;
> +	    }
> +	  if (globalize)
> +	    {
> +	      TREE_PUBLIC (node->decl) = 1;
> +	      DECL_VISIBILITY (node->decl) = VISIBILITY_HIDDEN;
> +	    }
> +	}
>  
> -      pointer_set_destroy (context.visited);
> -      lto_bitmap_free (context.static_vars_in_set);
> -      lto_bitmap_free (context.seen_node_decls);
>      }
> -
> -  lto_bitmap_free (context.all_vars);
> -  lto_bitmap_free (context.all_static_vars);
>  }
>  
>  
> @@ -918,7 +796,7 @@ strip_extension (const char *fname)
>     the same input file.  */
>  
>  static char *
> -get_filename_for_set (cgraph_node_set set)
> +get_filename_for_set (cgraph_node_set set, varpool_node_set vset ATTRIBUTE_UNUSED)
>  {
>    char *fname = NULL;
>    static const size_t max_fname_len = 100;
> @@ -999,6 +877,7 @@ lto_wpa_write_files (void)
>    unsigned i, n_sets, last_out_file_ix, num_out_files;
>    lto_file *file;
>    cgraph_node_set set;
> +  varpool_node_set vset;
>  
>    timevar_push (TV_WHOPR_WPA);
>  
> @@ -1034,7 +913,8 @@ lto_wpa_write_files (void)
>        char *temp_filename;
>  
>        set = VEC_index (cgraph_node_set, lto_cgraph_node_sets, i);
> -      temp_filename = get_filename_for_set (set);
> +      vset = VEC_index (varpool_node_set, lto_varpool_node_sets, i);
> +      temp_filename = get_filename_for_set (set, vset);
>        output_files[i] = temp_filename;
>  
>        if (cgraph_node_set_needs_ltrans_p (set))
> @@ -1046,7 +926,7 @@ lto_wpa_write_files (void)
>  
>  	  lto_set_current_out_file (file);
>  
> -	  ipa_write_optimization_summaries (set);
> +	  ipa_write_optimization_summaries (set, vset);
>  
>  	  lto_set_current_out_file (NULL);
>  	  lto_obj_file_close (file);
> Index: passes.c
> ===================================================================
> --- passes.c	(revision 158825)
> +++ passes.c	(working copy)
> @@ -191,7 +191,11 @@ rest_of_decl_compilation (tree decl,
>  	   || DECL_INITIAL (decl))
>  	  && !DECL_EXTERNAL (decl))
>  	{
> -	  if (TREE_CODE (decl) != FUNCTION_DECL)
> +	  /* When reading an LTO unit, we also read the varpool, so do not
> +	     rebuild it.  */
> +	  if (in_lto_p && !cgraph_function_flags_ready)
> +	    ;
> +	  else if (TREE_CODE (decl) != FUNCTION_DECL)
>  	    varpool_finalize_decl (decl);
>  	  else
>  	    assemble_variable (decl, top_level, at_end, 0);
> @@ -218,7 +222,9 @@ rest_of_decl_compilation (tree decl,
>      }
>  
>    /* Let cgraph know about the existence of variables.  */
> -  if (TREE_CODE (decl) == VAR_DECL && !DECL_EXTERNAL (decl))
> +  if (in_lto_p && !cgraph_function_flags_ready)
> +    ;
> +  else if (TREE_CODE (decl) == VAR_DECL && !DECL_EXTERNAL (decl))
>      varpool_node (decl);
>  }
>  
> @@ -1616,7 +1622,10 @@ execute_one_pass (struct opt_pass *pass)
>      }
>  
>    if (!current_function_decl)
> -    cgraph_process_new_functions ();
> +    {
> +      cgraph_process_new_functions ();
> +      varpool_analyze_pending_decls ();
> +    }
>  
>    pass_fini_dump_file (pass);
>  
> @@ -1649,6 +1658,7 @@ execute_pass_list (struct opt_pass *pass
>  
>  static void
>  ipa_write_summaries_2 (struct opt_pass *pass, cgraph_node_set set,
> +		       varpool_node_set vset,
>  		       struct lto_out_decl_state *state)
>  {
>    while (pass)
> @@ -1665,7 +1675,7 @@ ipa_write_summaries_2 (struct opt_pass *
>  	  if (pass->tv_id)
>  	    timevar_push (pass->tv_id);
>  
> -	  ipa_pass->write_summary (set);
> +	  ipa_pass->write_summary (set, vset);
>  
>  	  /* If a timevar is present, start it.  */
>  	  if (pass->tv_id)
> @@ -1673,7 +1683,7 @@ ipa_write_summaries_2 (struct opt_pass *
>  	}
>  
>        if (pass->sub && pass->sub->type != GIMPLE_PASS)
> -	ipa_write_summaries_2 (pass->sub, set, state);
> +	ipa_write_summaries_2 (pass->sub, set, vset, state);
>  
>        pass = pass->next;
>      }
> @@ -1684,14 +1694,14 @@ ipa_write_summaries_2 (struct opt_pass *
>     summaries.  SET is the set of nodes to be written.  */
>  
>  static void
> -ipa_write_summaries_1 (cgraph_node_set set)
> +ipa_write_summaries_1 (cgraph_node_set set, varpool_node_set vset)
>  {
>    struct lto_out_decl_state *state = lto_new_out_decl_state ();
>    lto_push_out_decl_state (state);
>  
>    gcc_assert (!flag_wpa);
> -  ipa_write_summaries_2 (all_regular_ipa_passes, set, state);
> -  ipa_write_summaries_2 (all_lto_gen_passes, set, state);
> +  ipa_write_summaries_2 (all_regular_ipa_passes, set, vset, state);
> +  ipa_write_summaries_2 (all_lto_gen_passes, set, vset, state);
>  
>    gcc_assert (lto_get_out_decl_state () == state);
>    lto_pop_out_decl_state ();
> @@ -1704,7 +1714,9 @@ void
>  ipa_write_summaries (void)
>  {
>    cgraph_node_set set;
> +  varpool_node_set vset;
>    struct cgraph_node **order;
> +  struct varpool_node *vnode;
>    int i, order_pos;
>  
>    if (!flag_generate_lto || errorcount || sorrycount)
> @@ -1736,13 +1748,20 @@ ipa_write_summaries (void)
>  	  renumber_gimple_stmt_uids ();
>  	  pop_cfun ();
>  	}
> -      cgraph_node_set_add (set, node);
> +      if (node->needed || node->reachable || node->address_taken)
> +	cgraph_node_set_add (set, node);
>      }
> +  vset = varpool_node_set_new ();
> +
> +  for (vnode = varpool_nodes; vnode; vnode = vnode->next)
> +    if (vnode->needed && !vnode->alias)
> +      varpool_node_set_add (vset, vnode);
>  
> -  ipa_write_summaries_1 (set);
> +  ipa_write_summaries_1 (set, vset);
>  
>    free (order);
>    ggc_free (set);
> +  ggc_free (vset);
>  }
>  
>  /* Same as execute_pass_list but assume that subpasses of IPA passes
> @@ -1751,6 +1770,7 @@ ipa_write_summaries (void)
>  
>  static void
>  ipa_write_optimization_summaries_1 (struct opt_pass *pass, cgraph_node_set set,
> +		       varpool_node_set vset,
>  		       struct lto_out_decl_state *state)
>  {
>    while (pass)
> @@ -1767,7 +1787,7 @@ ipa_write_optimization_summaries_1 (stru
>  	  if (pass->tv_id)
>  	    timevar_push (pass->tv_id);
>  
> -	  ipa_pass->write_optimization_summary (set);
> +	  ipa_pass->write_optimization_summary (set, vset);
>  
>  	  /* If a timevar is present, start it.  */
>  	  if (pass->tv_id)
> @@ -1775,7 +1795,7 @@ ipa_write_optimization_summaries_1 (stru
>  	}
>  
>        if (pass->sub && pass->sub->type != GIMPLE_PASS)
> -	ipa_write_optimization_summaries_1 (pass->sub, set, state);
> +	ipa_write_optimization_summaries_1 (pass->sub, set, vset, state);
>  
>        pass = pass->next;
>      }
> @@ -1785,14 +1805,14 @@ ipa_write_optimization_summaries_1 (stru
>     NULL, write out all summaries of all nodes. */
>  
>  void
> -ipa_write_optimization_summaries (cgraph_node_set set)
> +ipa_write_optimization_summaries (cgraph_node_set set, varpool_node_set vset)
>  {
>    struct lto_out_decl_state *state = lto_new_out_decl_state ();
>    lto_push_out_decl_state (state);
>  
>    gcc_assert (flag_wpa);
> -  ipa_write_optimization_summaries_1 (all_regular_ipa_passes, set, state);
> -  ipa_write_optimization_summaries_1 (all_lto_gen_passes, set, state);
> +  ipa_write_optimization_summaries_1 (all_regular_ipa_passes, set, vset, state);
> +  ipa_write_optimization_summaries_1 (all_lto_gen_passes, set, vset, state);
>  
>    gcc_assert (lto_get_out_decl_state () == state);
>    lto_pop_out_decl_state ();
> @@ -1917,6 +1937,7 @@ execute_ipa_pass_list (struct opt_pass *
>  	}
>        gcc_assert (!current_function_decl);
>        cgraph_process_new_functions ();
> +      varpool_analyze_pending_decls ();
>        pass = pass->next;
>      }
>    while (pass);
> Index: varpool.c
> ===================================================================
> --- varpool.c	(revision 158825)
> +++ varpool.c	(working copy)
> @@ -105,6 +105,22 @@ eq_varpool_node (const void *p1, const v
>    return DECL_UID (n1->decl) == DECL_UID (n2->decl);
>  }
>  
> +/* Return varpool node assigned to DECL without creating new one.  */
> +struct varpool_node *
> +varpool_get_node (tree decl)
> +{
> +  struct varpool_node key, **slot;
> +
> +  gcc_assert (DECL_P (decl) && TREE_CODE (decl) != FUNCTION_DECL);
> +
> +  if (!varpool_hash)
> +    return NULL;
> +  key.decl = decl;
> +  slot = (struct varpool_node **)
> +    htab_find_slot (varpool_hash, &key, NO_INSERT);
> +  if (!slot)
> +    return NULL;
> +  return *slot;
> +}
> +
>  /* Return varpool node assigned to DECL.  Create new one when needed.  */
>  struct varpool_node *
>  varpool_node (tree decl)
> @@ -125,11 +141,51 @@ varpool_node (tree decl)
>    node->decl = decl;
>    node->order = cgraph_order++;
>    node->next = varpool_nodes;
> +  if (varpool_nodes)
> +    varpool_nodes->prev = node;
>    varpool_nodes = node;
>    *slot = node;
>    return node;
>  }
>  
> +/* Remove node from the varpool.  */
> +void
> +varpool_remove_node (struct varpool_node *node)
> +{
> +  void **slot;
> +  slot = htab_find_slot (varpool_hash, node, NO_INSERT);
> +  gcc_assert (*slot == node);
> +  htab_clear_slot (varpool_hash, slot);
> +  gcc_assert (!varpool_assembled_nodes_queue);
> +  if (node->next)
> +    node->next->prev = node->prev;
> +  if (node->prev)
> +    node->prev->next = node->next;
> +  else if (node->next)
> +    {
> +      gcc_assert (varpool_nodes == node);
> +      varpool_nodes = node->next;
> +    }
> +  if (varpool_first_unanalyzed_node == node)
> +    varpool_first_unanalyzed_node = node->next_needed;
> +  if (node->next_needed)
> +    node->next_needed->prev_needed = node->prev_needed;
> +  else if (node->prev_needed)
> +    {
> +      gcc_assert (varpool_last_needed_node);
> +      varpool_last_needed_node = node->prev_needed;
> +    }
> +  if (node->prev_needed)
> +    node->prev_needed->next_needed = node->next_needed;
> +  else if (node->next_needed)
> +    {
> +      gcc_assert (varpool_nodes_queue == node);
> +      varpool_nodes_queue = node->next_needed;
> +    }
> +  ggc_free (node);
> +}
> +
>  /* Dump given cgraph node.  */
>  void
>  dump_varpool_node (FILE *f, struct varpool_node *node)
> @@ -139,8 +195,12 @@ dump_varpool_node (FILE *f, struct varpo
>  	   cgraph_function_flags_ready
>  	   ? cgraph_availability_names[cgraph_variable_initializer_availability (node)]
>  	   : "not-ready");
> +  if (DECL_ASSEMBLER_NAME_SET_P (node->decl))
> +    fprintf (f, " (asm: %s)", IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (node->decl)));
>    if (DECL_INITIAL (node->decl))
>      fprintf (f, " initialized");
> +  if (TREE_ASM_WRITTEN (node->decl))
> +    fprintf (f, " (asm written)");
>    if (node->needed)
>      fprintf (f, " needed");
>    if (node->analyzed)
> @@ -151,6 +211,10 @@ dump_varpool_node (FILE *f, struct varpo
>      fprintf (f, " output");
>    if (node->externally_visible)
>      fprintf (f, " externally_visible");
> +  if (node->in_other_partition)
> +    fprintf (f, " in_other_partition");
> +  else if (node->used_from_other_partition)
> +    fprintf (f, " used_from_other_partition");
>    fprintf (f, "\n");
>  }
>  
> @@ -192,7 +256,10 @@ static void
>  varpool_enqueue_needed_node (struct varpool_node *node)
>  {
>    if (varpool_last_needed_node)
> -    varpool_last_needed_node->next_needed = node;
> +    {
> +      varpool_last_needed_node->next_needed = node;
> +      node->prev_needed = varpool_last_needed_node;
> +    }
>    varpool_last_needed_node = node;
>    node->next_needed = NULL;
>    if (!varpool_nodes_queue)
> @@ -230,11 +297,7 @@ varpool_reset_queue (void)
>  bool
>  decide_is_variable_needed (struct varpool_node *node, tree decl)
>  {
> -  /* We do not track variable references at all and thus have no idea if the
> -     variable was referenced in some other partition or not.  
> -     FIXME: We really need address taken edges in callgraph and varpool to
> -     drive WPA and decide whether other partition might reference it or not.  */
> -  if (flag_ltrans)
> +  if (node->used_from_other_partition)
>      return true;
>    /* If the user told us it is used, then it must be so.  */
>    if ((node->externally_visible && !DECL_COMDAT (decl))
> @@ -288,17 +351,6 @@ varpool_finalize_decl (tree decl)
>  {
>    struct varpool_node *node = varpool_node (decl);
>  
> -  /* FIXME: We don't really stream varpool datastructure and instead rebuild it
> -     by varpool_finalize_decl.  This is not quite correct since this way we can't
> -     attach any info to varpool.  Eventually we will want to stream varpool nodes
> -     and the flags.
> -
> -     For the moment just prevent analysis of varpool nodes to happen again, so
> -     we will re-try to compute "address_taken" flag of varpool that breaks
> -     in presence of clones.  */
> -  if (in_lto_p)
> -    node->analyzed = true;
> -
>    /* The first declaration of a variable that comes through this function
>       decides whether it is global (in C, has external linkage)
>       or local (in C, has internal linkage).  So do nothing more
> @@ -385,6 +437,7 @@ varpool_assemble_decl (struct varpool_no
>  
>    if (!TREE_ASM_WRITTEN (decl)
>        && !node->alias
> +      && !node->in_other_partition
>        && !DECL_EXTERNAL (decl)
>        && (TREE_CODE (decl) != VAR_DECL || !DECL_HAS_VALUE_EXPR_P (decl)))
>      {
> @@ -394,6 +447,9 @@ varpool_assemble_decl (struct varpool_no
>  	  struct varpool_node *alias;
>  
>  	  node->next_needed = varpool_assembled_nodes_queue;
> +	  node->prev_needed = NULL;
> +	  if (varpool_assembled_nodes_queue)
> +	    varpool_assembled_nodes_queue->prev_needed = node;
>  	  varpool_assembled_nodes_queue = node;
>  	  node->finalized = 1;
>  
> @@ -476,7 +532,10 @@ varpool_assemble_pending_decls (void)
>        if (varpool_assemble_decl (node))
>  	changed = true;
>        else
> -        node->next_needed = NULL;
> +	{
> +	  node->prev_needed = NULL;
> +	  node->next_needed = NULL;
> +	}
>      }
>    /* varpool_nodes_queue is now empty, clear the pointer to the last element
>       in the queue.  */
> @@ -498,6 +557,7 @@ varpool_empty_needed_queue (void)
>        struct varpool_node *node = varpool_nodes_queue;
>        varpool_nodes_queue = varpool_nodes_queue->next_needed;
>        node->next_needed = NULL;
> +      node->prev_needed = NULL;
>      }
>    /* varpool_nodes_queue is now empty, clear the pointer to the last element
>       in the queue.  */
> @@ -559,6 +619,7 @@ varpool_extra_name_alias (tree alias, tr
>    alias_node->alias = 1;
>    alias_node->extra_name = decl_node;
>    alias_node->next = decl_node->extra_name;
> +  if (decl_node->extra_name)
> +    decl_node->extra_name->prev = alias_node;
>    decl_node->extra_name = alias_node;
>    *slot = alias_node;
>    return true;
> Index: lto-streamer.c
> ===================================================================
> --- lto-streamer.c	(revision 158825)
> +++ lto-streamer.c	(working copy)
> @@ -160,6 +160,9 @@ lto_get_section_name (int section_type, 
>      case LTO_section_cgraph:
>        return concat (LTO_SECTION_NAME_PREFIX, ".cgraph", NULL);
>  
> +    case LTO_section_varpool:
> +      return concat (LTO_SECTION_NAME_PREFIX, ".vars", NULL);
> +
>      case LTO_section_jump_functions:
>        return concat (LTO_SECTION_NAME_PREFIX, ".jmpfuncs", NULL);
>  
> Index: lto-streamer.h
> ===================================================================
> --- lto-streamer.h	(revision 158825)
> +++ lto-streamer.h	(working copy)
> @@ -259,11 +259,11 @@ enum lto_section_type
>    LTO_section_function_body,
>    LTO_section_static_initializer,
>    LTO_section_cgraph,
> +  LTO_section_varpool,
>    LTO_section_jump_functions,
>    LTO_section_ipa_pure_const,
>    LTO_section_ipa_reference,
>    LTO_section_symtab,
> -  LTO_section_wpa_fixup,
>    LTO_section_opts,
>    LTO_N_SECTION_TYPES		/* Must be last.  */
>  };
> @@ -834,6 +834,8 @@ int lto_cgraph_encoder_encode (lto_cgrap
>  void lto_cgraph_encoder_delete (lto_cgraph_encoder_t encoder);
>  void output_cgraph (cgraph_node_set);
>  void input_cgraph (void);
> +void output_varpool (varpool_node_set);
> +void input_varpool (void);
>  
>  
>  /* In lto-symtab.c.  */
> 
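For reference, the removal logic in the node set (pop the last vector entry, move it into the hole, fix up its recorded index) can be sketched in isolation. This is only an illustration under simplifying assumptions: nodes are small integers, and a plain array `index_of` stands in for the hash table. The names `node_set`, `node_set_add`, `node_set_remove` and `node_set_find` are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16
#define NOT_IN_SET ((unsigned) ~0)

/* Hypothetical stand-in for varpool_node_set: NODES is the dense vector,
   INDEX_OF plays the role of the hash table mapping a node to its slot.  */
struct node_set
{
  int nodes[MAX_NODES];
  unsigned index_of[MAX_NODES];
  unsigned length;
};

static void
node_set_init (struct node_set *set)
{
  unsigned i;

  set->length = 0;
  for (i = 0; i < MAX_NODES; i++)
    set->index_of[i] = NOT_IN_SET;
}

/* Add NODE to SET unless it is already present.  */
static void
node_set_add (struct node_set *set, int node)
{
  assert (node >= 0 && node < MAX_NODES);
  if (set->index_of[node] != NOT_IN_SET)
    return;
  set->index_of[node] = set->length;
  set->nodes[set->length++] = node;
}

/* Remove NODE from SET in constant time.  */
static void
node_set_remove (struct node_set *set, int node)
{
  unsigned index;
  int last_node;

  assert (node >= 0 && node < MAX_NODES);
  index = set->index_of[node];
  if (index == NOT_IN_SET)
    return;

  /* Pop the last element; if it is not NODE, move it to NODE's spot.  */
  last_node = set->nodes[--set->length];
  if (last_node != node)
    {
      set->nodes[index] = last_node;
      set->index_of[last_node] = index;
    }
  set->index_of[node] = NOT_IN_SET;
}

/* Return the index of NODE in SET, or NOT_IN_SET.  */
static unsigned
node_set_find (const struct node_set *set, int node)
{
  assert (node >= 0 && node < MAX_NODES);
  return set->index_of[node];
}
```

The point of the swap is that the vector stays dense, so iteration never has to skip holes, at the cost of not preserving insertion order.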
> 
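Since varpool_remove_node now has to keep the next_needed/prev_needed links consistent, here is a minimal standalone sketch of unlinking from a doubly-linked queue with head and tail pointers. It is not the varpool code: `struct qnode`, `struct queue`, `queue_push` and `queue_unlink` are made-up names, and the sketch assumes the node being unlinked is actually on the queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue element; the field names mirror the patch.  */
struct qnode
{
  struct qnode *next_needed;
  struct qnode *prev_needed;
};

/* Hypothetical queue with explicit first and last pointers.  */
struct queue
{
  struct qnode *first;
  struct qnode *last;
};

/* Append N at the end of Q.  */
static void
queue_push (struct queue *q, struct qnode *n)
{
  n->next_needed = NULL;
  n->prev_needed = q->last;
  if (q->last)
    q->last->next_needed = n;
  else
    q->first = n;
  q->last = n;
}

/* Unlink N from Q.  N must currently be on Q.  */
static void
queue_unlink (struct queue *q, struct qnode *n)
{
  /* The successor (or the tail pointer) now points back past N.  */
  if (n->next_needed)
    n->next_needed->prev_needed = n->prev_needed;
  else
    q->last = n->prev_needed;

  /* The predecessor (or the head pointer) now points forward past N.  */
  if (n->prev_needed)
    n->prev_needed->next_needed = n->next_needed;
  else
    q->first = n->next_needed;

  n->next_needed = NULL;
  n->prev_needed = NULL;
}
```

Each neighbour is reached through the same pair of links that is being updated, and the head and tail are updated unconditionally when the corresponding neighbour is missing, which also covers the single-element case.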

-- 
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex

