This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: LTO plugin and comdat symbols


On Wed, 6 Oct 2010, Jan Hubicka wrote:

> Hi,
> this patch fixes three problems:
>   1) Mozilla dynsym section is about 3 times bigger when mozilla is built
>      with LTO than when it is built without.  This cause noticeable
>      binary size growth and startup time problems.
>      Looking into the objdump -T, the extra symbols are all of the form:
> 
>      0000000000000000  w   D  *UND*  0000000000000000  Base        _ZNSaIPN7mozilla6layers15ShadowableLayerEED1Ev
> 
>      Normal Mozilla binary has no dynsym entry matching UND.*Base pattern.
>      Those are stale entries not really used anywhere in the binary later.
> 
>   2) When bisecting a problem, I noticed that linking part of mozilla
>      without LTO with other part with LTO leads to undefined symbols.
>      Again comdat inlines
> 
>   3) Mozilla -O3 performance with LTO is worse than one without LTO this is
>      due to bad inlining decisions.
> 
> The problem can be demonstreated at:
> 
> #include <stdio.h>
> extern void big (void);
> inline int
> test ()
> {
>   printf ("big\n");
>   printf ("big\n");
> }
> 
> int
> main (void)
> {
>   while (true)
>     {
>       test ();
>       test ();
>     }
> }
> 
> Now we have COMDAT function TEST that is not inlined during early optimization.
> At streaming time, we thus include it in symtab as comdat function (correctly).
> This is used by linker and plugin pass us resolution file as:
> 
>  2                                                                                                                                                                                  
> 85 faa3fa3d PREVAILING_DEF _Z4testv                                                                                                                                                 
> 91 faa3fa3d PREVAILING_DEF main
> 
> it says that both main and the comdat function are prevalining definition. According
> to linker plugin specification:
> 
> PREVAILING_DEF (this is the prevailing definition of the symbol, with references from regular objects) 
> 
> ....
> 
> Any symbol marked PREVAILING_DEF must be defined in one object file added to
> the link after WPA is done, or an undefined symbol error will result. Any
> symbol marked PREVAILING_DEF_IRONLY may be left undefined (provided all
> references to the symbol have been removed), and the linker will not issue an
> error. 
> 
> So GCC is now required to define test even if it inline it to main() leading to
> extra unnecesary function.
> 
> GCC does not exactly behave this way:
>   1) for hidden symbols it does honnor it with -flto.  It believes that external
>      symbol is binding to it and thus it is not optimizing section away.
>   2) for non-hidden symbols it looks if address was taken. As an optimization
>      at LTO we bring comdats without address taken as static.  This is becuase
>      we know we will end up at worst case with two copies of the same COMDAT.
>      This is different from single unit compilation where possibly many copies
>      would result from same transfrom.
>   3) In whopr mode, GCC might just forget about the function as it is not
>      partitioning COMDATs.
> 
> As a result we either get unnecesary function in binary (with -flto) or we get
> that aforementioned stale dynamic symbol table entry by violating plugin
> specification.
> 
> Now this patch attempts to improve situation by first making GCC to correctly
> obey PREVAILING_DEF and also by actually not inserting COMDAT symbols into
> the symbol table when their address is not taken.
> 
> It may seem odd to do so, but just obeying PREVAILING_DEF further increase
> sizes of binaries of C++ programs (28% in Mozilla) so it is not really acceptable.
> I believe not inserting the symbols is safe as we have two options
> 
>  1) address of the symbol is never taken in any of LTO modules.
> 
>     Than none of LTO modules exports the symbol and GCC will bring it static
>     because of the rule I described earlier. Thus linker will never care
> 
>  2) address of the symbol is taken in one of LTO modules.
> 
>     Then linker will see its definition and will either mark it as PREVAILING_DEF
>     or PREEMTED_DEF (it is defined in earlier non-lTO object file).
> 
>     Now GCC will avoid removing the symbol believing it is used from object files.
>     This lose when we manage out all uses of the symbol and also for PREEMTED_DEF
>     we don't really need to output the definition, but doing so is harmful.
> 
> As an followup I would like to make GCC to handle PREEMTED_DEF correctly too.
> 
> Now it would be nice if we solved the case when address of symbol is taken,
> the symbol is not used by actual object files but it is externally visible
> and it is optimized out.
> 
> The patch also fixes handling symbols with internal visibility that is equivalent
> to hidden here.
> 
> I think in this case it would make sense to change gold's behaviour by marking
> externally visible symbol that are not explicitely used by other object files
> at PREVAILING_IRONLY instead of PREVAILING_DEF. I always assumed gold behaving
> thisw way and it would give GCC freedom to take the symbol again.
> GCC by itself has all the info it needs to know if the symbol is exported from DSO
> so it will not remove non-COMDATs. Would that be possible?
> 
> Bootstrapped/regtested x86_64-linux, OK?

Changelog missing.

Otherwise ok.

It would be nice if we had a high-level overview of how the LTO - linker
plugin interaction works and what we do with regards to symbols at
WPA / LTRANS stage with/without -fwhole-program.

Thanks,
Richard.

> Index: lto-streamer-out.c
> ===================================================================
> --- lto-streamer-out.c	(revision 164995)
> +++ lto-streamer-out.c	(working copy)
> @@ -2477,8 +2487,11 @@
>    for (i = 0; i < lto_cgraph_encoder_size (encoder); i++)
>      {
>        node = lto_cgraph_encoder_deref (encoder, i);
> -      if (node->alias)
> +      if (DECL_COMDAT (node->decl)
> +	  && cgraph_can_remove_if_no_direct_calls_p (node))
>  	continue;
> +      if (node->alias || node->global.inlined_to)
> +	continue;
>        write_symbol (cache, &stream, node->decl, seen, false);
>        for (alias = node->same_body; alias; alias = alias->next)
>          write_symbol (cache, &stream, alias->decl, seen, true);
> Index: ipa.c
> ===================================================================
> --- ipa.c	(revision 164995)
> +++ ipa.c	(working copy)
> @@ -238,15 +238,20 @@
>  #endif
>    varpool_reset_queue ();
>    for (node = cgraph_nodes; node; node = node->next)
> -    if ((!cgraph_can_remove_if_no_direct_calls_and_refs_p (node)
> -	 /* Keep around virtual functions for possible devirtualization.  */
> -	 || (!before_inlining_p
> -	     && !node->global.inlined_to
> -	     && DECL_VIRTUAL_P (node->decl)
> -	     && (DECL_COMDAT (node->decl) || DECL_EXTERNAL (node->decl))))
> -	&& ((!DECL_EXTERNAL (node->decl))
> -            || before_inlining_p))
> +    if (!node->analyzed)
>        {
> +        gcc_assert (!node->aux);
> +	node->reachable = false;
> +      }
> +    else if ((!cgraph_can_remove_if_no_direct_calls_and_refs_p (node)
> +	      /* Keep around virtual functions for possible devirtualization.  */
> +	      || (!before_inlining_p
> +		  && !node->global.inlined_to
> +		  && DECL_VIRTUAL_P (node->decl)
> +		  && (DECL_COMDAT (node->decl) || DECL_EXTERNAL (node->decl))))
> +	     && ((!DECL_EXTERNAL (node->decl))
> +		 || before_inlining_p))
> +      {
>          gcc_assert (!node->global.inlined_to);
>  	enqueue_cgraph_node (node, &first);
>  	node->reachable = true;
> @@ -592,8 +597,13 @@
>    if (aliased)
>      return true;
>  
> +  /* If linker counts on us, we must preserve the function.  */
> +  if (cgraph_used_from_object_file_p (node))
> +    return true;
>    /* When doing link time optimizations, hidden symbols become local.  */
> -  if (in_lto_p && DECL_VISIBILITY (node->decl) == VISIBILITY_HIDDEN
> +  if (in_lto_p
> +      && (DECL_VISIBILITY (node->decl) == VISIBILITY_HIDDEN
> +	  || DECL_VISIBILITY (node->decl) == VISIBILITY_INTERNAL)
>        /* Be sure that node is defined in IR file, not in other object
>  	 file.  In that case we don't set used_from_other_object_file.  */
>        && node->analyzed)
> @@ -621,8 +631,6 @@
>  	      return true;
>  	}
>      }
> -  if (cgraph_used_from_object_file_p (node))
> -    return true;
>    if (DECL_PRESERVE_P (node->decl))
>      return true;
>    if (MAIN_NAME_P (DECL_NAME (node->decl)))
> @@ -794,7 +802,8 @@
>  	       /* When doing linktime optimizations, all hidden symbols will
>  		  become local.  */
>  	       && (!in_lto_p
> -		   || DECL_VISIBILITY (vnode->decl) != VISIBILITY_HIDDEN
> +		   || (DECL_VISIBILITY (vnode->decl) != VISIBILITY_HIDDEN
> +		       && DECL_VISIBILITY (vnode->decl) != VISIBILITY_INTERNAL)
>  		   /* We can get prevailing decision in other object file.
>  		      In this case we do not sed used_from_object_file.  */
>  		   || !vnode->finalized))
> Index: lto/lto.c
> ===================================================================
> --- lto/lto.c	(revision 164995)
> +++ lto/lto.c	(working copy)
> @@ -839,7 +839,8 @@
>      return false;
>    /* Extern inlines and comdat are always only in partitions they are needed.  */
>    if (DECL_EXTERNAL (node->decl)
> -      || DECL_COMDAT (node->decl))
> +      || (DECL_COMDAT (node->decl)
> +	  && !cgraph_used_from_object_file_p (node)))
>      return false;
>    return true;
>  }
> @@ -854,7 +855,8 @@
>      return false;
>    /* Constant pool and comdat are always only in partitions they are needed.  */
>    if (DECL_IN_CONSTANT_POOL (vnode->decl)
> -      || DECL_COMDAT (vnode->decl))
> +      || (DECL_COMDAT (vnode->decl)
> +	  && !varpool_used_from_object_file_p (vnode)))
>      return false;
>    return true;
>  }
> 
> 

-- 
Richard Guenther <rguenther@suse.de>
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]