This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Fix -fwhole-program on LTO


Hi,
this is the first part of the changes needed to get cgraph/varpool right in
LTO and WHOPR.  The patch also makes -fwhole-program effective when it is
passed to lto1 (so the patch I sent earlier today is needed).

The patch went through a significant snowballing effect, but I don't think
I can decompose it into incremental parts.  There are several things we get
wrong, and fixing one reveals more problems.

Following are the main changes:

  1) The needed flag was wrong WRT -fwhole-program.
     The node->needed flag marks functions that need to be output to the
     final program for reasons other than direct calls from other functions
     output to the final program.  The reasons can be external visibility,
     the fact that the address was taken, the used attribute, or many other
     corner cases.  Once a node becomes needed it stays so forever: the
     reason it became needed is not recorded, so the flag can never be
     safely cleared.

     This does not work well with -fwhole-program: the switch is ignored
     at compile time when functions are marked needed based on visibility.
     So even if -fwhole-program is on at link time, everything is already
     needed and there is nothing to do.

     The patch solves the problem by making cgraph_decide_is_function_needed
     behave as if in whole-program mode when LTO/WHOPR is active.  This
     decision is later revisited by the new whole-program pass, scheduled as
     the first IPA pass run after LTO read-in.

     I am working toward removing the need for this flag and replacing it
     with a detailed reason why a node is needed.  address_taken already
     marks nodes whose address is taken, and I have now added two new
     predicates.  cgraph_only_called_directly_p is used by IPA passes that
     originally tested the "needed" flag to decide whether they can assume
     that all calls to a function are seen.
     cgraph_can_remove_if_no_direct_calls_p is used by the inliner and dead
     function removal to work out whether a function can be removed.

     The predicates differ for COMDAT functions: these can be removed if
     they are not needed for other reasons, but they cannot be assumed to be
     only called directly, because if they stay they might be called from
     elsewhere.

  2) externally_visible was wrong WRT -fwhole-program.
     This is a similar problem to the needed flag.  We decide on external
     visibility at compile time and re-run the visibility pass at link time.
     However, at that point everything is already externally visible (and
     must be so, to keep the compile-time small IPA passes from making overly
     aggressive assumptions about externally visible things), so
     -fwhole-program has no effect.

     I've moved all decisions on external visibility to the visibility pass,
     which is now run twice, the second time early as the whole-program pass.
     The first time we compute external visibility as needed for compile
     time; later we revisit it and bring things local at link time.

  3) Inline clones were messed up.
     In WHOPR we read back the inline clones and confuse them with real
     functions, doing random things like marking them as address taken or
     as needed.  I've added a couple of asserts for this and modified
     varpool to not re-analyze initializers when reading back.
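
To make the COMDAT difference from 1) concrete, here is a minimal standalone
sketch of the two predicates.  struct toy_node and the toy_* names are
hypothetical stand-ins invented for illustration; the real predicates in the
patch operate on struct cgraph_node and DECL_COMDAT.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct cgraph_node; only the
   three flags the predicates look at are modeled.  */
struct toy_node
{
  bool needed;              /* models node->needed */
  bool externally_visible;  /* models node->local.externally_visible */
  bool comdat;              /* models DECL_COMDAT (node->decl) */
};

/* Models cgraph_only_called_directly_p: true when every call to the
   function is visible in the callgraph.  */
static inline bool
toy_only_called_directly_p (const struct toy_node *node)
{
  return !node->needed && !node->externally_visible;
}

/* Models cgraph_can_remove_if_no_direct_calls_p: true when the body may
   be dropped once the last direct call is gone.  */
static inline bool
toy_can_remove_if_no_direct_calls_p (const struct toy_node *node)
{
  return !node->needed && (node->comdat || !node->externally_visible);
}
```

For an externally visible COMDAT function that is not otherwise needed, the
first predicate is false and the second is true; for all other flag
combinations in this model the two agree.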

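The two-stage visibility decision from 2) can be sketched as a standalone
model.  struct toy_fn and the toy_* functions are hypothetical names used
only for illustration; the logic follows cgraph_externally_visible_p and
local_function_and_variable_visibility from the patch, with tree accessors
replaced by plain booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the properties the visibility pass reads.  */
struct toy_fn
{
  bool is_public;       /* models TREE_PUBLIC */
  bool is_external;     /* models DECL_EXTERNAL */
  bool comdat;          /* models DECL_COMDAT */
  bool address_taken;   /* models node->address_taken */
  bool analyzed;        /* models node->analyzed */
  bool is_main;         /* models MAIN_NAME_P (DECL_NAME (decl)) */
  bool has_externally_visible_attr;  /* "externally_visible" attribute */
};

/* Models cgraph_externally_visible_p from the patch.  */
static inline bool
toy_externally_visible_p (const struct toy_fn *fn, bool whole_program)
{
  if (!fn->comdat && (!fn->is_public || fn->is_external))
    return false;
  if (!whole_program)
    return true;
  /* A COMDAT function must stay shared only if its address is taken
     (or its body is not available).  */
  if (fn->comdat && (fn->address_taken || !fn->analyzed))
    return true;
  if (fn->is_main)
    return true;
  return fn->has_externally_visible_attr;
}

/* Models the argument the compile-time visibility pass hands down:
   -fwhole-program is ignored there when LTO or WHOPR is active.  */
static inline bool
toy_compile_time_whole_program (bool flag_whole_program, bool flag_lto,
                                bool flag_whopr)
{
  return flag_whole_program && !flag_lto && !flag_whopr;
}
```

In this model a public, non-COMDAT helper is externally visible in the first
run (whole_program false) and becomes local in the second run (whole_program
true), while main stays visible in both.
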
There are several issues I am aware of that are still wrong.  I've added FIXMEs
for those and intend to look into them incrementally.  They primarily affect
WHOPR: there we pass the wrong callgraph to the ltrans stage and mess up the
info about what is needed.  As a result lto/lto.c re-decides what is needed, but
it cannot do so correctly.  Also, the optimization queue is restarted from the
wrong point, so we re-run IPA passes in ltrans and mess up summaries.

Also, we make no attempt to properly store the varpool; we simply dump the
nodes and rebuild it.  This won't work for WHOPR.

Bootstrapped/regtested on x86_64-linux; I will commit it tomorrow if there are
no complaints.

Honza

	* lto-symtab.c (lto_cgraph_replace_node): Assert that inline clones have
	no address taken.
	* cgraph.c (cgraph_mark_needed_node): Assert that inline clones are
	never needed.
	(cgraph_clone_node): Clear externally_visible flag for clones.
	* cgraph.h (cgraph_only_called_directly_p,
	cgraph_can_remove_if_no_direct_calls_p): New predicates.
	* tree-pass.h (pass_ipa_whole_program_visibility): Declare.
	* ipa-cp.c (ipcp_cloning_candidate_p): Use new predicate.
	(ipcp_initialize_node_lattices, ipcp_estimate_growth,
	ipcp_insert_stage): Likewise.
	* cgraphunit.c (cgraph_decide_is_function_needed): Do not compute
	externally_visible flag.
	(verify_cgraph_node): Verify that inline clones look right.
	(process_function_and_variable_attributes): Do not set
	externally_visible flags.
	(ipa_passes): Avoid executing small_ipa_passes at LTO stage; they've
	been already run.
	* lto-cgraph.c (lto_output_node): Assert that inline clones are not
	boundaries.
	* ipa-inline.c (cgraph_clone_inlined_nodes): Use new predicates;
	clear externally_visible when turning into inline clones.
	(cgraph_mark_inline_edge): Use new predicates.
	(cgraph_estimate_growth): Likewise.
	(cgraph_decide_inlining): Likewise.
	* ipa.c (cgraph_postorder): Likewise.
	(cgraph_remove_unreachable_nodes): Likewise; sanity check
	that inline clones are not needed.
	(cgraph_externally_visible_p): New predicate.
	(function_and_variable_visibility): Add whole_program parameter;
	always set externally_visible flag; handle COMDAT function
	privatization.
	(local_function_and_variable_visibility): New function.
	(gate_whole_program_function_and_variable_visibility): New function.
	(whole_program_function_and_variable_visibility): New function.
	(pass_ipa_whole_program_visibility): New pass.
	* passes.c  (init_optimization_passes): Add whole program visibility
	pass.
	(do_per_function_toporder, function_called_by_processed_nodes_p): Do
	not care about needed/reachable flags.
	* varpool.c: Include flags.h.
	(decide_is_variable_needed): When doing LTO assume whole-program mode.
	(varpool_finalize_decl): When we are in LTO read-back, all variables
	are analyzed.
	(varpool_analyze_pending_decls): Skip analysis of analyzed vars.

	* lto/lto.c (read_cgraph_and_symbols): Mark functions necessary only at
	ltrans stage; explain why this is needed and should not be.
Index: lto-symtab.c
===================================================================
*** lto-symtab.c	(revision 152496)
--- lto-symtab.c	(working copy)
*************** lto_cgraph_replace_node (struct cgraph_n
*** 388,394 ****
    if (old_node->reachable)
      cgraph_mark_reachable_node (new_node);
    if (old_node->address_taken)
!     cgraph_mark_address_taken_node (new_node);
  
    /* Redirect all incoming edges.  */
    for (e = old_node->callers; e; e = next)
--- 388,397 ----
    if (old_node->reachable)
      cgraph_mark_reachable_node (new_node);
    if (old_node->address_taken)
!     {
!       gcc_assert (!new_node->global.inlined_to);
!       cgraph_mark_address_taken_node (new_node);
!     }
  
    /* Redirect all incoming edges.  */
    for (e = old_node->callers; e; e = next)
Index: cgraph.c
===================================================================
*** cgraph.c	(revision 152496)
--- cgraph.c	(working copy)
*************** void
*** 1359,1364 ****
--- 1359,1365 ----
  cgraph_mark_needed_node (struct cgraph_node *node)
  {
    node->needed = 1;
+   gcc_assert (!node->global.inlined_to);
    cgraph_mark_reachable_node (node);
  }
  
*************** cgraph_clone_node (struct cgraph_node *n
*** 1682,1687 ****
--- 1683,1689 ----
      }
    new_node->analyzed = n->analyzed;
    new_node->local = n->local;
+   new_node->local.externally_visible = false;
    new_node->global = n->global;
    new_node->rtl = n->rtl;
    new_node->count = count;
Index: cgraph.h
===================================================================
*** cgraph.h	(revision 152496)
--- cgraph.h	(working copy)
*************** struct GTY(()) constant_descriptor_tree 
*** 658,663 ****
--- 658,683 ----
    hashval_t hash;
  };
  
+ /* Return true when function NODE is only called directly.
+    i.e. it is not externally visible, address was not taken and
+    it is not used in any other non-standard way.  */
+ 
+ static inline bool
+ cgraph_only_called_directly_p (struct cgraph_node *node)
+ {
+   return !node->needed && !node->local.externally_visible;
+ }
+ 
+ /* Return true when function NODE can be removed from callgraph
+    if all direct calls are eliminated.  */
+ 
+ static inline bool
+ cgraph_can_remove_if_no_direct_calls_p (struct cgraph_node *node)
+ {
+   return (!node->needed
+   	  && (DECL_COMDAT (node->decl) || !node->local.externally_visible));
+ }
+ 
  /* Constant pool accessor function.  */
  htab_t constant_pool_htab (void);
  
Index: tree-pass.h
===================================================================
*** tree-pass.h	(revision 152496)
--- tree-pass.h	(working copy)
*************** extern struct simple_ipa_opt_pass pass_i
*** 437,442 ****
--- 437,443 ----
  
  extern struct simple_ipa_opt_pass pass_early_local_passes;
  
+ extern struct ipa_opt_pass_d pass_ipa_whole_program_visibility;
  extern struct ipa_opt_pass_d pass_ipa_lto_gimple_out;
  extern struct simple_ipa_opt_pass pass_ipa_increase_alignment;
  extern struct simple_ipa_opt_pass pass_ipa_matrix_reorg;
Index: ipa-cp.c
===================================================================
*** ipa-cp.c	(revision 152496)
--- ipa-cp.c	(working copy)
*************** ipcp_cloning_candidate_p (struct cgraph_
*** 442,448 ****
       FIXME: in future we should clone such functions when they are called with
       different constants, but current ipcp implementation is not good on this.
       */
!   if (!node->needed || !node->analyzed)
      return false;
  
    if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
--- 442,448 ----
       FIXME: in future we should clone such functions when they are called with
       different constants, but current ipcp implementation is not good on this.
       */
!   if (cgraph_only_called_directly_p (node) || !node->analyzed)
      return false;
  
    if (cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
*************** ipcp_initialize_node_lattices (struct cg
*** 536,542 ****
  
    if (ipa_is_called_with_var_arguments (info))
      type = IPA_BOTTOM;
!   else if (!node->needed)
      type = IPA_TOP;
    /* When cloning is allowed, we can assume that externally visible functions
       are not called.  We will compensate this by cloning later.  */
--- 536,542 ----
  
    if (ipa_is_called_with_var_arguments (info))
      type = IPA_BOTTOM;
!   else if (cgraph_only_called_directly_p (node))
      type = IPA_TOP;
    /* When cloning is allowed, we can assume that externally visible functions
       are not called.  We will compensate this by cloning later.  */
*************** ipcp_estimate_growth (struct cgraph_node
*** 954,960 ****
    struct cgraph_edge *cs;
    int redirectable_node_callers = 0;
    int removable_args = 0;
!   bool need_original = node->needed;
    struct ipa_node_params *info;
    int i, count;
    int growth;
--- 954,960 ----
    struct cgraph_edge *cs;
    int redirectable_node_callers = 0;
    int removable_args = 0;
!   bool need_original = !cgraph_only_called_directly_p (node);
    struct ipa_node_params *info;
    int i, count;
    int growth;
*************** ipcp_insert_stage (void)
*** 1143,1149 ****
        for (cs = node->callers; cs != NULL; cs = cs->next_caller)
  	if (cs->caller == node || ipcp_need_redirect_p (cs))
  	  break;
!       if (!cs && !node->needed)
  	bitmap_set_bit (dead_nodes, node->uid);
  
        info = IPA_NODE_REF (node);
--- 1143,1149 ----
        for (cs = node->callers; cs != NULL; cs = cs->next_caller)
  	if (cs->caller == node || ipcp_need_redirect_p (cs))
  	  break;
!       if (!cs && cgraph_only_called_directly_p (node))
  	bitmap_set_bit (dead_nodes, node->uid);
  
        info = IPA_NODE_REF (node);
Index: cgraphunit.c
===================================================================
*** cgraphunit.c	(revision 152496)
--- cgraphunit.c	(working copy)
*************** cgraph_build_cdtor_fns (void)
*** 316,328 ****
  bool
  cgraph_decide_is_function_needed (struct cgraph_node *node, tree decl)
  {
-   if (MAIN_NAME_P (DECL_NAME (decl))
-       && TREE_PUBLIC (decl))
-     {
-       node->local.externally_visible = true;
-       return true;
-     }
- 
    /* If the user told us it is used, then it must be so.  */
    if (node->local.externally_visible)
      return true;
--- 316,321 ----
*************** cgraph_decide_is_function_needed (struct
*** 360,366 ****
  	|| (!optimize && !node->local.disregard_inline_limits
  	    && !DECL_DECLARED_INLINE_P (decl)
  	    && !node->origin))
!       && !flag_whole_program)
        && !DECL_COMDAT (decl) && !DECL_EXTERNAL (decl))
      return true;
  
--- 353,361 ----
  	|| (!optimize && !node->local.disregard_inline_limits
  	    && !DECL_DECLARED_INLINE_P (decl)
  	    && !node->origin))
!        && !flag_whole_program
!        && !flag_lto
!        && !flag_whopr)
        && !DECL_COMDAT (decl) && !DECL_EXTERNAL (decl))
      return true;
  
*************** verify_cgraph_node (struct cgraph_node *
*** 593,598 ****
--- 588,608 ----
        error ("Execution count is negative");
        error_found = true;
      }
+   if (node->global.inlined_to && node->local.externally_visible)
+     {
+       error ("Externally visible inline clone");
+       error_found = true;
+     }
+   if (node->global.inlined_to && node->address_taken)
+     {
+       error ("Inline clone with address taken");
+       error_found = true;
+     }
+   if (node->global.inlined_to && node->needed)
+     {
+       error ("Inline clone is needed");
+       error_found = true;
+     }
    for (e = node->callers; e; e = e->next_caller)
      {
        if (e->count < 0)
*************** process_function_and_variable_attributes
*** 864,875 ****
  	    warning_at (DECL_SOURCE_LOCATION (node->decl), OPT_Wattributes,
  			"%<externally_visible%>"
  			" attribute have effect only on public objects");
! 	  else
! 	    {
! 	      if (node->local.finalized)
! 		cgraph_mark_needed_node (node);
! 	      node->local.externally_visible = true;
! 	    }
  	}
      }
    for (vnode = varpool_nodes; vnode != first_var; vnode = vnode->next)
--- 874,881 ----
  	    warning_at (DECL_SOURCE_LOCATION (node->decl), OPT_Wattributes,
  			"%<externally_visible%>"
  			" attribute have effect only on public objects");
! 	  else if (node->local.finalized)
! 	     cgraph_mark_needed_node (node);
  	}
      }
    for (vnode = varpool_nodes; vnode != first_var; vnode = vnode->next)
*************** process_function_and_variable_attributes
*** 887,898 ****
  	    warning_at (DECL_SOURCE_LOCATION (vnode->decl), OPT_Wattributes,
  			"%<externally_visible%>"
  			" attribute have effect only on public objects");
! 	  else
! 	    {
! 	      if (vnode->finalized)
! 		varpool_mark_needed_node (vnode);
! 	      vnode->externally_visible = true;
! 	    }
  	}
      }
  }
--- 893,900 ----
  	    warning_at (DECL_SOURCE_LOCATION (vnode->decl), OPT_Wattributes,
  			"%<externally_visible%>"
  			" attribute have effect only on public objects");
! 	  else if (vnode->finalized)
! 	    varpool_mark_needed_node (vnode);
  	}
      }
  }
*************** ipa_passes (void)
*** 1355,1361 ****
    current_function_decl = NULL;
    gimple_register_cfg_hooks ();
    bitmap_obstack_initialize (NULL);
!   execute_ipa_pass_list (all_small_ipa_passes);
  
    /* If pass_all_early_optimizations was not scheduled, the state of
       the cgraph will not be properly updated.  Update it now.  */
--- 1357,1365 ----
    current_function_decl = NULL;
    gimple_register_cfg_hooks ();
    bitmap_obstack_initialize (NULL);
! 
!   if (!in_lto_p)
!     execute_ipa_pass_list (all_small_ipa_passes);
  
    /* If pass_all_early_optimizations was not scheduled, the state of
       the cgraph will not be properly updated.  Update it now.  */
Index: lto-cgraph.c
===================================================================
*** lto-cgraph.c	(revision 152496)
--- lto-cgraph.c	(working copy)
*************** lto_output_node (struct lto_simple_outpu
*** 227,232 ****
--- 227,234 ----
       local static nodes to prevent clashes with other local statics.  */
    if (boundary_p)
      {
+       /* Inline clones can not be part of boundary.  */
+       gcc_assert (!node->global.inlined_to);
        local = 0;
        externally_visible = 1;
        inlinable = 0;
Index: ipa-inline.c
===================================================================
*** ipa-inline.c	(revision 152496)
--- ipa-inline.c	(working copy)
*************** cgraph_clone_inlined_nodes (struct cgrap
*** 223,229 ****
        /* We may eliminate the need for out-of-line copy to be output.
  	 In that case just go ahead and re-use it.  */
        if (!e->callee->callers->next_caller
! 	  && !e->callee->needed
  	  && !cgraph_new_nodes)
  	{
  	  gcc_assert (!e->callee->global.inlined_to);
--- 223,229 ----
        /* We may eliminate the need for out-of-line copy to be output.
  	 In that case just go ahead and re-use it.  */
        if (!e->callee->callers->next_caller
! 	  && cgraph_can_remove_if_no_direct_calls_p (e->callee)
  	  && !cgraph_new_nodes)
  	{
  	  gcc_assert (!e->callee->global.inlined_to);
*************** cgraph_clone_inlined_nodes (struct cgrap
*** 233,238 ****
--- 233,239 ----
  	      nfunctions_inlined++;
  	    }
  	  duplicate = false;
+ 	  e->callee->local.externally_visible = false;
  	}
        else
  	{
*************** cgraph_mark_inline_edge (struct cgraph_e
*** 286,292 ****
    e->callee->global.inlined = true;
  
    if (e->callee->callers->next_caller
!       || e->callee->needed)
      duplicate = true;
    cgraph_clone_inlined_nodes (e, true, update_original);
  
--- 287,293 ----
    e->callee->global.inlined = true;
  
    if (e->callee->callers->next_caller
!       || !cgraph_can_remove_if_no_direct_calls_p (e->callee))
      duplicate = true;
    cgraph_clone_inlined_nodes (e, true, update_original);
  
*************** cgraph_estimate_growth (struct cgraph_no
*** 368,374 ****
       we decide to not inline for different reasons, but it is not big deal
       as in that case we will keep the body around, but we will also avoid
       some inlining.  */
!   if (!node->needed && !DECL_EXTERNAL (node->decl) && !self_recursive)
      growth -= node->global.size;
  
    node->global.estimated_growth = growth;
--- 369,376 ----
       we decide to not inline for different reasons, but it is not big deal
       as in that case we will keep the body around, but we will also avoid
       some inlining.  */
!   if (cgraph_only_called_directly_p (node)
!       && !DECL_EXTERNAL (node->decl) && !self_recursive)
      growth -= node->global.size;
  
    node->global.estimated_growth = growth;
*************** cgraph_decide_inlining (void)
*** 1226,1232 ****
  
  	  if (node->callers
  	      && !node->callers->next_caller
! 	      && !node->needed
  	      && node->local.inlinable
  	      && node->callers->inline_failed
  	      && node->callers->caller != node
--- 1228,1234 ----
  
  	  if (node->callers
  	      && !node->callers->next_caller
! 	      && cgraph_only_called_directly_p (node)
  	      && node->local.inlinable
  	      && node->callers->inline_failed
  	      && node->callers->caller != node
Index: ipa.c
===================================================================
*** ipa.c	(revision 152496)
--- ipa.c	(working copy)
*************** cgraph_postorder (struct cgraph_node **o
*** 52,58 ****
    for (pass = 0; pass < 2; pass++)
      for (node = cgraph_nodes; node; node = node->next)
        if (!node->aux
! 	  && (pass || (node->needed && !node->address_taken)))
  	{
  	  node2 = node;
  	  if (!node->callers)
--- 52,60 ----
    for (pass = 0; pass < 2; pass++)
      for (node = cgraph_nodes; node; node = node->next)
        if (!node->aux
! 	  && (pass
! 	      || (!cgraph_only_called_directly_p (node)
! 	  	  && !node->address_taken)))
  	{
  	  node2 = node;
  	  if (!node->callers)
*************** cgraph_remove_unreachable_nodes (bool be
*** 132,142 ****
      gcc_assert (!node->aux);
  #endif
    for (node = cgraph_nodes; node; node = node->next)
!     if (node->needed && !node->global.inlined_to
  	&& ((!DECL_EXTERNAL (node->decl)) 
              || !node->analyzed
              || before_inlining_p))
        {
  	node->aux = first;
  	first = node;
        }
--- 134,145 ----
      gcc_assert (!node->aux);
  #endif
    for (node = cgraph_nodes; node; node = node->next)
!     if (!cgraph_can_remove_if_no_direct_calls_p (node)
  	&& ((!DECL_EXTERNAL (node->decl)) 
              || !node->analyzed
              || before_inlining_p))
        {
+         gcc_assert (!node->global.inlined_to);
  	node->aux = first;
  	first = node;
        }
*************** cgraph_remove_unreachable_nodes (bool be
*** 248,253 ****
--- 251,276 ----
    return changed;
  }
  
+ static bool
+ cgraph_externally_visible_p (struct cgraph_node *node, bool whole_program)
+ {
+   if (!DECL_COMDAT (node->decl)
+       && (!TREE_PUBLIC (node->decl) || DECL_EXTERNAL (node->decl)))
+     return false;
+   if (!whole_program)
+     return true;
+   /* COMDAT functions must be shared only if they have address taken,
+      otherwise we can produce our own private implementation with
+      -fwhole-program.  */
+   if (DECL_COMDAT (node->decl) && (node->address_taken || !node->analyzed))
+     return true;
+   if (MAIN_NAME_P (DECL_NAME (node->decl)))
+     return true;
+   if (lookup_attribute ("externally_visible", DECL_ATTRIBUTES (node->decl)))
+     return true;
+   return false;
+ }
+ 
  /* Mark visibility of all functions.
  
     A local function is one whose calls can occur only in the current
*************** cgraph_remove_unreachable_nodes (bool be
*** 260,284 ****
     via visibilities for the backend point of view.  */
  
  static unsigned int
! function_and_variable_visibility (void)
  {
    struct cgraph_node *node;
    struct varpool_node *vnode;
  
    for (node = cgraph_nodes; node; node = node->next)
      {
!       if (node->reachable
! 	  && (DECL_COMDAT (node->decl)
! 	      || (!flag_whole_program
! 		  && TREE_PUBLIC (node->decl) && !DECL_EXTERNAL (node->decl))))
! 	node->local.externally_visible = true;
        if (!node->local.externally_visible && node->analyzed
  	  && !DECL_EXTERNAL (node->decl))
  	{
! 	  gcc_assert (flag_whole_program || !TREE_PUBLIC (node->decl));
  	  TREE_PUBLIC (node->decl) = 0;
  	}
!       node->local.local = (!node->needed
  			   && node->analyzed
  			   && !DECL_EXTERNAL (node->decl)
  			   && !node->local.externally_visible);
--- 283,311 ----
     via visibilities for the backend point of view.  */
  
  static unsigned int
! function_and_variable_visibility (bool whole_program)
  {
    struct cgraph_node *node;
    struct varpool_node *vnode;
  
    for (node = cgraph_nodes; node; node = node->next)
      {
!       if (cgraph_externally_visible_p (node, whole_program))
!         {
! 	  gcc_assert (!node->global.inlined_to);
! 	  node->local.externally_visible = true;
! 	}
!       else
! 	node->local.externally_visible = false;
        if (!node->local.externally_visible && node->analyzed
  	  && !DECL_EXTERNAL (node->decl))
  	{
! 	  gcc_assert (whole_program || !TREE_PUBLIC (node->decl));
  	  TREE_PUBLIC (node->decl) = 0;
+ 	  DECL_COMDAT (node->decl) = 0;
+ 	  DECL_WEAK (node->decl) = 0;
  	}
!       node->local.local = (cgraph_only_called_directly_p (node)
  			   && node->analyzed
  			   && !DECL_EXTERNAL (node->decl)
  			   && !node->local.externally_visible);
*************** function_and_variable_visibility (void)
*** 286,297 ****
    for (vnode = varpool_nodes_queue; vnode; vnode = vnode->next_needed)
      {
        if (vnode->needed
! 	  && !flag_whole_program
! 	  && (DECL_COMDAT (vnode->decl) || TREE_PUBLIC (vnode->decl)))
! 	vnode->externally_visible = 1;
        if (!vnode->externally_visible)
  	{
! 	  gcc_assert (flag_whole_program || !TREE_PUBLIC (vnode->decl));
  	  TREE_PUBLIC (vnode->decl) = 0;
  	}
       gcc_assert (TREE_STATIC (vnode->decl));
--- 313,328 ----
    for (vnode = varpool_nodes_queue; vnode; vnode = vnode->next_needed)
      {
        if (vnode->needed
! 	  && (DECL_COMDAT (vnode->decl) || TREE_PUBLIC (vnode->decl))
! 	  && (!whole_program
! 	      || lookup_attribute ("externally_visible",
! 				   DECL_ATTRIBUTES (vnode->decl))))
! 	vnode->externally_visible = true;
!       else
!         vnode->externally_visible = false;
        if (!vnode->externally_visible)
  	{
! 	  gcc_assert (whole_program || !TREE_PUBLIC (vnode->decl));
  	  TREE_PUBLIC (vnode->decl) = 0;
  	}
       gcc_assert (TREE_STATIC (vnode->decl));
*************** function_and_variable_visibility (void)
*** 314,326 ****
    return 0;
  }
  
  struct simple_ipa_opt_pass pass_ipa_function_and_variable_visibility = 
  {
   {
    SIMPLE_IPA_PASS,
    "visibility",				/* name */
    NULL,					/* gate */
!   function_and_variable_visibility,	/* execute */
    NULL,					/* sub */
    NULL,					/* next */
    0,					/* static_pass_number */
--- 345,366 ----
    return 0;
  }
  
+ /* Local function pass handling visibilities.  This happens before LTO streaming
+    so in particular -fwhole-program should be ignored at this level.  */
+ 
+ static unsigned int
+ local_function_and_variable_visibility (void)
+ {
+   return function_and_variable_visibility (flag_whole_program && !flag_lto && !flag_whopr);
+ }
+ 
  struct simple_ipa_opt_pass pass_ipa_function_and_variable_visibility = 
  {
   {
    SIMPLE_IPA_PASS,
    "visibility",				/* name */
    NULL,					/* gate */
!   local_function_and_variable_visibility,/* execute */
    NULL,					/* sub */
    NULL,					/* next */
    0,					/* static_pass_number */
*************** struct simple_ipa_opt_pass pass_ipa_func
*** 333,338 ****
--- 373,430 ----
   }
  };
  
+ /* Do not re-run on ltrans stage.  */
+ 
+ static bool
+ gate_whole_program_function_and_variable_visibility (void)
+ {
+   return !flag_ltrans;
+ }
+ 
+ /* Bring functions local at LTO time with -fwhole-program.  */
+ 
+ static unsigned int
+ whole_program_function_and_variable_visibility (void)
+ {
+   struct cgraph_node *node;
+   struct varpool_node *vnode;
+ 
+   function_and_variable_visibility (flag_whole_program);
+ 
+   for (node = cgraph_nodes; node; node = node->next)
+     if (node->local.externally_visible)
+       cgraph_mark_needed_node (node);
+   for (vnode = varpool_nodes_queue; vnode; vnode = vnode->next_needed)
+     if (vnode->externally_visible)
+       varpool_mark_needed_node (vnode);
+   return 0;
+ }
+ 
+ struct ipa_opt_pass_d pass_ipa_whole_program_visibility =
+ {
+  {
+   IPA_PASS,
+   "whole-program",			/* name */
+   gate_whole_program_function_and_variable_visibility,/* gate */
+   whole_program_function_and_variable_visibility,/* execute */
+   NULL,					/* sub */
+   NULL,					/* next */
+   0,					/* static_pass_number */
+   TV_CGRAPHOPT,				/* tv_id */
+   0,	                                /* properties_required */
+   0,					/* properties_provided */
+   0,					/* properties_destroyed */
+   0,					/* todo_flags_start */
+   TODO_dump_cgraph | TODO_remove_functions/* todo_flags_finish */
+  },
+  NULL,					/* generate_summary */
+  NULL,					/* write_summary */
+  NULL,					/* read_summary */
+  NULL,					/* function_read_summary */
+  0,					/* TODOs */
+  NULL,					/* function_transform */
+  NULL,					/* variable_transform */
+ };
  
  /* Hash a cgraph node set element.  */
  
Index: lto/lto.c
===================================================================
*** lto/lto.c	(revision 152496)
--- lto/lto.c	(working copy)
*************** read_cgraph_and_symbols (unsigned nfiles
*** 1824,1834 ****
    /* Merge global decls.  */
    lto_symtab_merge_decls ();
  
!   /* Mark cgraph nodes needed in the merged cgraph.
!      ???  Is this really necessary?  */
!   for (node = cgraph_nodes; node; node = node->next)
!     if (cgraph_decide_is_function_needed (node, node->decl))
!       cgraph_mark_needed_node (node);
  
    timevar_push (TV_IPA_LTO_DECL_IO);
  
--- 1824,1841 ----
    /* Merge global decls.  */
    lto_symtab_merge_decls ();
  
!   /* Mark cgraph nodes needed in the merged cgraph
!      This normally happens in whole-program pass, but for
!      ltrans the pass was already run at WPA phase.
!      
!      FIXME:  This is not valid way to do so; nodes can be needed
!      for non-obvious reasons.  We should stream the flags from WPA
!      phase. */
!   if (flag_ltrans)
!     for (node = cgraph_nodes; node; node = node->next)
!       if (!node->global.inlined_to
! 	  && cgraph_decide_is_function_needed (node, node->decl))
!         cgraph_mark_needed_node (node);
  
    timevar_push (TV_IPA_LTO_DECL_IO);
  
Index: passes.c
===================================================================
*** passes.c	(revision 152496)
--- passes.c	(working copy)
*************** init_optimization_passes (void)
*** 759,764 ****
--- 759,765 ----
    *p = NULL;
  
    p = &all_regular_ipa_passes;
+   NEXT_PASS (pass_ipa_whole_program_visibility);
    NEXT_PASS (pass_ipa_cp);
    NEXT_PASS (pass_ipa_inline);
    NEXT_PASS (pass_ipa_reference);
*************** do_per_function_toporder (void (*callbac
*** 1099,1105 ****
  	  /* Allow possibly removed nodes to be garbage collected.  */
  	  order[i] = NULL;
  	  node->process = 0;
! 	  if (node->analyzed && (node->needed || node->reachable))
  	    {
  	      push_cfun (DECL_STRUCT_FUNCTION (node->decl));
  	      current_function_decl = node->decl;
--- 1100,1106 ----
  	  /* Allow possibly removed nodes to be garbage collected.  */
  	  order[i] = NULL;
  	  node->process = 0;
! 	  if (node->analyzed)
  	    {
  	      push_cfun (DECL_STRUCT_FUNCTION (node->decl));
  	      current_function_decl = node->decl;
*************** function_called_by_processed_nodes_p (vo
*** 1783,1789 ****
      {
        if (e->caller->decl == current_function_decl)
          continue;
!       if (!e->caller->analyzed || (!e->caller->needed && !e->caller->reachable))
          continue;
        if (TREE_ASM_WRITTEN (e->caller->decl))
          continue;
--- 1784,1790 ----
      {
        if (e->caller->decl == current_function_decl)
          continue;
!       if (!e->caller->analyzed)
          continue;
        if (TREE_ASM_WRITTEN (e->caller->decl))
          continue;
Index: varpool.c
===================================================================
*** varpool.c	(revision 152496)
--- varpool.c	(working copy)
*************** along with GCC; see the file COPYING3.  
*** 35,40 ****
--- 35,41 ----
  #include "output.h"
  #include "gimple.h"
  #include "tree-flow.h"
+ #include "flags.h"
  
  /*  This file contains basic routines manipulating variable pool.
  
*************** decide_is_variable_needed (struct varpoo
*** 245,251 ****
  
    /* Externally visible variables must be output.  The exception is
       COMDAT variables that must be output only when they are needed.  */
!   if (TREE_PUBLIC (decl) && !flag_whole_program && !DECL_COMDAT (decl)
        && !DECL_EXTERNAL (decl))
      return true;
  
--- 246,256 ----
  
    /* Externally visible variables must be output.  The exception is
       COMDAT variables that must be output only when they are needed.  */
!   if (TREE_PUBLIC (decl)
!       && !flag_whole_program
!       && !flag_lto
!       && !flag_whopr
!       && !DECL_COMDAT (decl)
        && !DECL_EXTERNAL (decl))
      return true;
  
*************** varpool_finalize_decl (tree decl)
*** 279,284 ****
--- 284,300 ----
  {
    struct varpool_node *node = varpool_node (decl);
  
+   /* FIXME: We don't really stream varpool datastructure and instead rebuild it
+      by varpool_finalize_decl.  This is not quite correct since this way we can't
+      attach any info to varpool.  Eventually we will want to stream varpool nodes
+      and the flags.
+ 
+      For the moment just prevent analysis of varpool nodes from happening again,
+      so we will not re-try to compute the "address_taken" flag of varpool, which
+      breaks in presence of clones.  */
+   if (in_lto_p)
+     node->analyzed = true;
+ 
    /* The first declaration of a variable that comes through this function
       decides whether it is global (in C, has external linkage)
       or local (in C, has internal linkage).  So do nothing more
*************** varpool_analyze_pending_decls (void)
*** 333,349 ****
    while (varpool_first_unanalyzed_node)
      {
        tree decl = varpool_first_unanalyzed_node->decl;
  
        varpool_first_unanalyzed_node->analyzed = true;
  
        varpool_first_unanalyzed_node = varpool_first_unanalyzed_node->next_needed;
  
!       /* Compute the alignment early so function body expanders are
! 	 already informed about increased alignment.  */
!       align_variable (decl, 0);
  
!       if (DECL_INITIAL (decl))
! 	record_references_in_initializer (decl);
        changed = true;
      }
    timevar_pop (TV_CGRAPH);
--- 349,373 ----
    while (varpool_first_unanalyzed_node)
      {
        tree decl = varpool_first_unanalyzed_node->decl;
+       bool analyzed = varpool_first_unanalyzed_node->analyzed;
  
        varpool_first_unanalyzed_node->analyzed = true;
  
        varpool_first_unanalyzed_node = varpool_first_unanalyzed_node->next_needed;
  
!       /* When reading back varpool at LTO time, we re-construct the queue in order
!          to have "needed" list right by inserting all needed nodes into varpool.
! 	 We however don't want to re-analyze already analyzed nodes.  */
!       if (!analyzed)
! 	{
! 	  gcc_assert (!in_lto_p);
!           /* Compute the alignment early so function body expanders are
! 	     already informed about increased alignment.  */
!           align_variable (decl, 0);
  
!           if (DECL_INITIAL (decl))
! 	    record_references_in_initializer (decl);
! 	}
        changed = true;
      }
    timevar_pop (TV_CGRAPH);

