This is the mail archive of the
mailing list for the GCC project.
LTO plugin and comdat symbols
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: ccoutant at google dot com, iant at google dot com, tglek at mozilla dot com, rguenther at suse dot de, gcc-patches at gcc dot gnu dot org, dnovillo at google dot com
- Date: Wed, 6 Oct 2010 14:46:20 +0200
- Subject: LTO plugin and comdat symbols
this patch fixes three problems:
1) Mozilla dynsym section is about 3 times bigger when mozilla is built
with LTO than when it is built without. This cause noticeable
binary size growth and startup time problems.
Looking into the objdump -T, the extra symbols are all of the form:
0000000000000000 w D *UND* 0000000000000000 Base _ZNSaIPN7mozilla6layers15ShadowableLayerEED1Ev
Normal Mozilla binary has no dynsym entry matching UND.*Base pattern.
Those are stale entries not really used anywhere in the binary later.
2) When bisecting a problem, I noticed that linking part of mozilla
without LTO with other part with LTO leads to undefined symbols.
Again comdat inlines
3) Mozilla -O3 performance with LTO is worse than one without LTO this is
due to bad inlining decisions.
The problem can be demonstreated at:
extern void big (void);
Now we have COMDAT function TEST that is not inlined during early optimization.
At streaming time, we thus include it in symtab as comdat function (correctly).
This is used by linker and plugin pass us resolution file as:
85 faa3fa3d PREVAILING_DEF _Z4testv
91 faa3fa3d PREVAILING_DEF main
it says that both main and the comdat function are prevalining definition. According
to linker plugin specification:
PREVAILING_DEF (this is the prevailing definition of the symbol, with references from regular objects)
Any symbol marked PREVAILING_DEF must be defined in one object file added to
the link after WPA is done, or an undefined symbol error will result. Any
symbol marked PREVAILING_DEF_IRONLY may be left undefined (provided all
references to the symbol have been removed), and the linker will not issue an
So GCC is now required to define test even if it inline it to main() leading to
extra unnecesary function.
GCC does not exactly behave this way:
1) for hidden symbols it does honnor it with -flto. It believes that external
symbol is binding to it and thus it is not optimizing section away.
2) for non-hidden symbols it looks if address was taken. As an optimization
at LTO we bring comdats without address taken as static. This is becuase
we know we will end up at worst case with two copies of the same COMDAT.
This is different from single unit compilation where possibly many copies
would result from same transfrom.
3) In whopr mode, GCC might just forget about the function as it is not
As a result we either get unnecesary function in binary (with -flto) or we get
that aforementioned stale dynamic symbol table entry by violating plugin
Now this patch attempts to improve situation by first making GCC to correctly
obey PREVAILING_DEF and also by actually not inserting COMDAT symbols into
the symbol table when their address is not taken.
It may seem odd to do so, but just obeying PREVAILING_DEF further increase
sizes of binaries of C++ programs (28% in Mozilla) so it is not really acceptable.
I believe not inserting the symbols is safe as we have two options
1) address of the symbol is never taken in any of LTO modules.
Than none of LTO modules exports the symbol and GCC will bring it static
because of the rule I described earlier. Thus linker will never care
2) address of the symbol is taken in one of LTO modules.
Then linker will see its definition and will either mark it as PREVAILING_DEF
or PREEMTED_DEF (it is defined in earlier non-lTO object file).
Now GCC will avoid removing the symbol believing it is used from object files.
This lose when we manage out all uses of the symbol and also for PREEMTED_DEF
we don't really need to output the definition, but doing so is harmful.
As an followup I would like to make GCC to handle PREEMTED_DEF correctly too.
Now it would be nice if we solved the case when address of symbol is taken,
the symbol is not used by actual object files but it is externally visible
and it is optimized out.
The patch also fixes handling symbols with internal visibility that is equivalent
to hidden here.
I think in this case it would make sense to change gold's behaviour by marking
externally visible symbol that are not explicitely used by other object files
at PREVAILING_IRONLY instead of PREVAILING_DEF. I always assumed gold behaving
thisw way and it would give GCC freedom to take the symbol again.
GCC by itself has all the info it needs to know if the symbol is exported from DSO
so it will not remove non-COMDATs. Would that be possible?
Bootstrapped/regtested x86_64-linux, OK?
--- lto-streamer-out.c (revision 164995)
+++ lto-streamer-out.c (working copy)
@@ -2477,8 +2487,11 @@
for (i = 0; i < lto_cgraph_encoder_size (encoder); i++)
node = lto_cgraph_encoder_deref (encoder, i);
- if (node->alias)
+ if (DECL_COMDAT (node->decl)
+ && cgraph_can_remove_if_no_direct_calls_p (node))
+ if (node->alias || node->global.inlined_to)
write_symbol (cache, &stream, node->decl, seen, false);
for (alias = node->same_body; alias; alias = alias->next)
write_symbol (cache, &stream, alias->decl, seen, true);
--- ipa.c (revision 164995)
+++ ipa.c (working copy)
@@ -238,15 +238,20 @@
for (node = cgraph_nodes; node; node = node->next)
- if ((!cgraph_can_remove_if_no_direct_calls_and_refs_p (node)
- /* Keep around virtual functions for possible devirtualization. */
- || (!before_inlining_p
- && !node->global.inlined_to
- && DECL_VIRTUAL_P (node->decl)
- && (DECL_COMDAT (node->decl) || DECL_EXTERNAL (node->decl))))
- && ((!DECL_EXTERNAL (node->decl))
- || before_inlining_p))
+ if (!node->analyzed)
+ gcc_assert (!node->aux);
+ node->reachable = false;
+ else if ((!cgraph_can_remove_if_no_direct_calls_and_refs_p (node)
+ /* Keep around virtual functions for possible devirtualization. */
+ || (!before_inlining_p
+ && !node->global.inlined_to
+ && DECL_VIRTUAL_P (node->decl)
+ && (DECL_COMDAT (node->decl) || DECL_EXTERNAL (node->decl))))
+ && ((!DECL_EXTERNAL (node->decl))
+ || before_inlining_p))
enqueue_cgraph_node (node, &first);
node->reachable = true;
@@ -592,8 +597,13 @@
+ /* If linker counts on us, we must preserve the function. */
+ if (cgraph_used_from_object_file_p (node))
+ return true;
/* When doing link time optimizations, hidden symbols become local. */
- if (in_lto_p && DECL_VISIBILITY (node->decl) == VISIBILITY_HIDDEN
+ if (in_lto_p
+ && (DECL_VISIBILITY (node->decl) == VISIBILITY_HIDDEN
+ || DECL_VISIBILITY (node->decl) == VISIBILITY_INTERNAL)
/* Be sure that node is defined in IR file, not in other object
file. In that case we don't set used_from_other_object_file. */
@@ -621,8 +631,6 @@
- if (cgraph_used_from_object_file_p (node))
- return true;
if (DECL_PRESERVE_P (node->decl))
if (MAIN_NAME_P (DECL_NAME (node->decl)))
@@ -794,7 +802,8 @@
/* When doing linktime optimizations, all hidden symbols will
become local. */
- || DECL_VISIBILITY (vnode->decl) != VISIBILITY_HIDDEN
+ || (DECL_VISIBILITY (vnode->decl) != VISIBILITY_HIDDEN
+ && DECL_VISIBILITY (vnode->decl) != VISIBILITY_INTERNAL)
/* We can get prevailing decision in other object file.
In this case we do not sed used_from_object_file. */
--- lto/lto.c (revision 164995)
+++ lto/lto.c (working copy)
@@ -839,7 +839,8 @@
/* Extern inlines and comdat are always only in partitions they are needed. */
if (DECL_EXTERNAL (node->decl)
- || DECL_COMDAT (node->decl))
+ || (DECL_COMDAT (node->decl)
+ && !cgraph_used_from_object_file_p (node)))
@@ -854,7 +855,8 @@
/* Constant pool and comdat are always only in partitions they are needed. */
if (DECL_IN_CONSTANT_POOL (vnode->decl)
- || DECL_COMDAT (vnode->decl))
+ || (DECL_COMDAT (vnode->decl)
+ && !varpool_used_from_object_file_p (vnode)))