This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: LTO plugin and comdat symbols


On Wed, Oct 6, 2010 at 5:46 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> this patch fixes three problems:
> ?1) Mozilla dynsym section is about 3 times bigger when mozilla is built
> ? ? with LTO than when it is built without. ?This cause noticeable
> ? ? binary size growth and startup time problems.
> ? ? Looking into the objdump -T, the extra symbols are all of the form:
>
> ? ? 0000000000000000 ?w ? D ?*UND* ?0000000000000000 ?Base ? ? ? ?_ZNSaIPN7mozilla6layers15ShadowableLayerEED1Ev
>
> ? ? Normal Mozilla binary has no dynsym entry matching UND.*Base pattern.
> ? ? Those are stale entries not really used anywhere in the binary later.
>
> ?2) When bisecting a problem, I noticed that linking part of mozilla
> ? ? without LTO with other part with LTO leads to undefined symbols.
> ? ? Again comdat inlines
>
> ?3) Mozilla -O3 performance with LTO is worse than one without LTO this is
> ? ? due to bad inlining decisions.
>
> The problem can be demonstreated at:
>
> #include <stdio.h>
> extern void big (void);
> inline int
> test ()
> {
> ?printf ("big\n");
> ?printf ("big\n");
> }
>
> int
> main (void)
> {
> ?while (true)
> ? ?{
> ? ? ?test ();
> ? ? ?test ();
> ? ?}
> }
>
> Now we have COMDAT function TEST that is not inlined during early optimization.
> At streaming time, we thus include it in symtab as comdat function (correctly).
> This is used by linker and plugin pass us resolution file as:
>
> ?2
> 85 faa3fa3d PREVAILING_DEF _Z4testv
> 91 faa3fa3d PREVAILING_DEF main
>
> it says that both main and the comdat function are prevalining definition. According
> to linker plugin specification:
>
> PREVAILING_DEF (this is the prevailing definition of the symbol, with references from regular objects)
>
> ....
>
> Any symbol marked PREVAILING_DEF must be defined in one object file added to
> the link after WPA is done, or an undefined symbol error will result. Any
> symbol marked PREVAILING_DEF_IRONLY may be left undefined (provided all
> references to the symbol have been removed), and the linker will not issue an
> error.
>
> So GCC is now required to define test even if it inline it to main() leading to
> extra unnecesary function.
>
> GCC does not exactly behave this way:
> ?1) for hidden symbols it does honnor it with -flto. ?It believes that external
> ? ? symbol is binding to it and thus it is not optimizing section away.
> ?2) for non-hidden symbols it looks if address was taken. As an optimization
> ? ? at LTO we bring comdats without address taken as static. ?This is becuase
> ? ? we know we will end up at worst case with two copies of the same COMDAT.
> ? ? This is different from single unit compilation where possibly many copies
> ? ? would result from same transfrom.
> ?3) In whopr mode, GCC might just forget about the function as it is not
> ? ? partitioning COMDATs.
>
> As a result we either get unnecesary function in binary (with -flto) or we get
> that aforementioned stale dynamic symbol table entry by violating plugin
> specification.
>
> Now this patch attempts to improve situation by first making GCC to correctly
> obey PREVAILING_DEF and also by actually not inserting COMDAT symbols into
> the symbol table when their address is not taken.
>
> It may seem odd to do so, but just obeying PREVAILING_DEF further increase
> sizes of binaries of C++ programs (28% in Mozilla) so it is not really acceptable.
> I believe not inserting the symbols is safe as we have two options
>
> ?1) address of the symbol is never taken in any of LTO modules.
>
> ? ?Than none of LTO modules exports the symbol and GCC will bring it static
> ? ?because of the rule I described earlier. Thus linker will never care
>
> ?2) address of the symbol is taken in one of LTO modules.
>
> ? ?Then linker will see its definition and will either mark it as PREVAILING_DEF
> ? ?or PREEMTED_DEF (it is defined in earlier non-lTO object file).
>
> ? ?Now GCC will avoid removing the symbol believing it is used from object files.
> ? ?This lose when we manage out all uses of the symbol and also for PREEMTED_DEF
> ? ?we don't really need to output the definition, but doing so is harmful.
>
> As an followup I would like to make GCC to handle PREEMTED_DEF correctly too.
>
> Now it would be nice if we solved the case when address of symbol is taken,
> the symbol is not used by actual object files but it is externally visible
> and it is optimized out.
>
> The patch also fixes handling symbols with internal visibility that is equivalent
> to hidden here.
>
> I think in this case it would make sense to change gold's behaviour by marking
> externally visible symbol that are not explicitely used by other object files
> at PREVAILING_IRONLY instead of PREVAILING_DEF. I always assumed gold behaving
> thisw way and it would give GCC freedom to take the symbol again.
> GCC by itself has all the info it needs to know if the symbol is exported from DSO
> so it will not remove non-COMDATs. Would that be possible?
>
> Bootstrapped/regtested x86_64-linux, OK?
>
>
> Index: lto-streamer-out.c
> ===================================================================
> --- lto-streamer-out.c ?(revision 164995)
> +++ lto-streamer-out.c ?(working copy)
> @@ -2477,8 +2487,11 @@
> ? for (i = 0; i < lto_cgraph_encoder_size (encoder); i++)
> ? ? {
> ? ? ? node = lto_cgraph_encoder_deref (encoder, i);
> - ? ? ?if (node->alias)
> + ? ? ?if (DECL_COMDAT (node->decl)
> + ? ? ? ? && cgraph_can_remove_if_no_direct_calls_p (node))
> ? ? ? ?continue;
> + ? ? ?if (node->alias || node->global.inlined_to)
> + ? ? ? continue;
> ? ? ? write_symbol (cache, &stream, node->decl, seen, false);
> ? ? ? for (alias = node->same_body; alias; alias = alias->next)
> ? ? ? ? write_symbol (cache, &stream, alias->decl, seen, true);
> Index: ipa.c
> ===================================================================
> --- ipa.c ? ? ? (revision 164995)
> +++ ipa.c ? ? ? (working copy)
> @@ -238,15 +238,20 @@
> ?#endif
> ? varpool_reset_queue ();
> ? for (node = cgraph_nodes; node; node = node->next)
> - ? ?if ((!cgraph_can_remove_if_no_direct_calls_and_refs_p (node)
> - ? ? ? ?/* Keep around virtual functions for possible devirtualization. ?*/
> - ? ? ? ?|| (!before_inlining_p
> - ? ? ? ? ? ?&& !node->global.inlined_to
> - ? ? ? ? ? ?&& DECL_VIRTUAL_P (node->decl)
> - ? ? ? ? ? ?&& (DECL_COMDAT (node->decl) || DECL_EXTERNAL (node->decl))))
> - ? ? ? && ((!DECL_EXTERNAL (node->decl))
> - ? ? ? ? ? ?|| before_inlining_p))
> + ? ?if (!node->analyzed)
> ? ? ? {
> + ? ? ? ?gcc_assert (!node->aux);
> + ? ? ? node->reachable = false;
> + ? ? ?}
> + ? ?else if ((!cgraph_can_remove_if_no_direct_calls_and_refs_p (node)
> + ? ? ? ? ? ? /* Keep around virtual functions for possible devirtualization. ?*/
> + ? ? ? ? ? ? || (!before_inlining_p
> + ? ? ? ? ? ? ? ? && !node->global.inlined_to
> + ? ? ? ? ? ? ? ? && DECL_VIRTUAL_P (node->decl)
> + ? ? ? ? ? ? ? ? && (DECL_COMDAT (node->decl) || DECL_EXTERNAL (node->decl))))
> + ? ? ? ? ? ?&& ((!DECL_EXTERNAL (node->decl))
> + ? ? ? ? ? ? ? ?|| before_inlining_p))
> + ? ? ?{
> ? ? ? ? gcc_assert (!node->global.inlined_to);
> ? ? ? ?enqueue_cgraph_node (node, &first);
> ? ? ? ?node->reachable = true;
> @@ -592,8 +597,13 @@
> ? if (aliased)
> ? ? return true;
>
> + ?/* If linker counts on us, we must preserve the function. ?*/
> + ?if (cgraph_used_from_object_file_p (node))
> + ? ?return true;
> ? /* When doing link time optimizations, hidden symbols become local. ?*/
> - ?if (in_lto_p && DECL_VISIBILITY (node->decl) == VISIBILITY_HIDDEN
> + ?if (in_lto_p
> + ? ? ?&& (DECL_VISIBILITY (node->decl) == VISIBILITY_HIDDEN
> + ? ? ? ? || DECL_VISIBILITY (node->decl) == VISIBILITY_INTERNAL)
> ? ? ? /* Be sure that node is defined in IR file, not in other object
> ? ? ? ? file. ?In that case we don't set used_from_other_object_file. ?*/
> ? ? ? && node->analyzed)
> @@ -621,8 +631,6 @@
> ? ? ? ? ? ? ?return true;
> ? ? ? ?}
> ? ? }
> - ?if (cgraph_used_from_object_file_p (node))
> - ? ?return true;
> ? if (DECL_PRESERVE_P (node->decl))
> ? ? return true;
> ? if (MAIN_NAME_P (DECL_NAME (node->decl)))
> @@ -794,7 +802,8 @@
> ? ? ? ? ? ? ? /* When doing linktime optimizations, all hidden symbols will
> ? ? ? ? ? ? ? ? ?become local. ?*/
> ? ? ? ? ? ? ? && (!in_lto_p
> - ? ? ? ? ? ? ? ? ?|| DECL_VISIBILITY (vnode->decl) != VISIBILITY_HIDDEN
> + ? ? ? ? ? ? ? ? ?|| (DECL_VISIBILITY (vnode->decl) != VISIBILITY_HIDDEN
> + ? ? ? ? ? ? ? ? ? ? ?&& DECL_VISIBILITY (vnode->decl) != VISIBILITY_INTERNAL)
> ? ? ? ? ? ? ? ? ? /* We can get prevailing decision in other object file.
> ? ? ? ? ? ? ? ? ? ? ?In this case we do not sed used_from_object_file. ?*/
> ? ? ? ? ? ? ? ? ? || !vnode->finalized))

The ipa.c changes caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45926


H.J.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]