This is the mail archive of the libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Re: Generic questions on inline functions and linking
- From: Jonathan Wakely <jwakely at redhat dot com>
- To: Tim Shen <timshen at google dot com>
- Cc: libstdc++ <libstdc++ at gcc dot gnu dot org>
- Date: Thu, 23 Jul 2015 11:08:12 +0100
- Subject: Re: Generic questions on inline functions and linking
On 22/07/15 21:36 -0700, Tim Shen wrote:
> My observation is: adding the inline specifier to a *template* member
> function definition outside of its class scope improves the
> performance and reduces the binary size.
>
> Taking trunk as an example, I added inline for each function in
> include/bits/regex_executor.tcc, and recompiled a file that is similar
> to our check-performance test file, then I observed a reduced binary
> size (with 3 fewer exported symbols) and slight performance
>
> I tried to read the standard for whether a template member function
> definition is implicitly inline, but failed :P. If it's true, then why
> do we have different generated code here? If it's false, what's the

Templates are not implicitly inline, but similar to inline functions
they have what is known as "vague linkage". Inline functions and
templates can both get emitted in multiple object files requiring the
linker to discard the duplicate symbols.
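
As a minimal sketch of the two cases being compared (the Accumulator
class and accumulate_demo helper are hypothetical names used only for
illustration, not anything from regex_executor.tcc), a member of a class
template can be defined outside the class body with or without the
'inline' specifier:

```cpp
#include <cassert>

// Hypothetical class template, for illustration only.
template<typename T>
struct Accumulator
{
  T total{};
  void add(T value);   // declared here, defined outside the class body
  T sum() const;       // likewise
};

// No 'inline' specifier: this is NOT an inline function in language
// terms. Because it is a member of a class template, the definition
// may still appear in a header that many translation units include.
template<typename T>
void Accumulator<T>::add(T value)
{ total += value; }

// With the 'inline' specifier: same kind of definition, but now it is
// also an inline function as far as the language is concerned.
template<typename T>
inline T Accumulator<T>::sum() const
{ return total; }

// Small helper that instantiates both members.
inline int accumulate_demo()
{
  Accumulator<int> acc;
  assert(acc.sum() == 0);
  acc.add(2);
  acc.add(3);
  return acc.sum();
}
```

Both definitions are valid in a header. Whether either one ends up as a
symbol in a given object file depends on the compiler and the
optimisation level.
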
I guess the difference you're seeing is due to the 'inline' keyword
affecting the compiler's decisions about which functions to actually
inline, and the fact that it doesn't need to emit the function
definition for an inline function if all calls to it get inlined.
Remember that there's a difference between what the C++ language
defines as an 'inline function' and which functions the compiler
actually chooses to inline into their callers.
What the C++ standard defines as an inline function is unambiguous
(either it's defined in the class body, or is marked 'inline' or
'constexpr') and affects whether it is allowed to be defined in more
than one translation unit. So as far as language semantics go, a
function is either inline or it isn't. There's no middle ground. For
the rest of this email I'll refer to these as Inline Functions.
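
The three ways of getting an Inline Function listed above can be
sketched as follows. All names here (Widget, square, cube,
check_widget) are hypothetical and exist only for the illustration:

```cpp
#include <cassert>

struct Widget
{
  // Defined in the class body: implicitly an Inline Function.
  int twice(int x) const { return 2 * x; }

  // Only declared here; its status depends on the definition below.
  int thrice(int x) const;
};

// Marked 'inline': an Inline Function, so this definition may appear
// in every translation unit that includes it.
inline int Widget::thrice(int x) const { return 3 * x; }

// 'constexpr' functions are implicitly Inline Functions too.
constexpr int square(int x) { return x * x; }

// Not an Inline Function: no 'inline', no 'constexpr', not defined in
// a class body. Putting this definition in a header included by two
// translation units of one program would break the one-definition rule.
int cube(int x) { return x * x * x; }

static_assert(square(4) == 16, "square is usable at compile time");

// Small helper exercising the member functions.
inline void check_widget()
{
  Widget w;
  assert(w.twice(2) == 4);
  assert(w.thrice(2) == 6);
}
```

None of this says anything about whether calls to these functions get
inlined; it only determines how many translation units may define them.
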
What the compiler decides to inline is far more fluid. It depends on
heuristics, optimisation level, and the context the code is called in
(the same function might get inlined in one case and not in another).
The compiler can also decide to inline functions that aren't 'Inline
Functions' in C++ language terms.
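
To illustrate that the two notions are independent, here is a sketch
using the GCC-specific noinline attribute. The attribute is an
assumption brought in for the example (it is not mentioned in this
thread), and the function names are hypothetical:

```cpp
#include <cassert>

// An Inline Function in language terms, but the GCC 'noinline'
// attribute asks the compiler not to inline calls to it.
__attribute__((noinline)) inline int add_one(int x) { return x + 1; }

// Not an Inline Function in language terms (no 'inline' specifier),
// yet the optimiser is free to inline calls to it, for example at -O2.
static int add_two(int x) { return x + 2; }

// Calls both functions; which calls actually get inlined is up to the
// compiler and is not observable from the result.
int call_both(int x) { return add_two(add_one(x)); }

// Small helper checking the observable behaviour.
inline void check_call_both()
{
  assert(call_both(1) == 4);
}
```

The result is the same either way. Only the generated code, and
possibly the set of emitted symbols, differs.
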
So by adding the 'inline' specifier to those functions you have
definitely made them Inline Functions, but that might also have
affected how the compiler optimises them.
Finally, when the compiler chooses to inline all calls to an Inline
Function from a given translation unit, it doesn't need to emit any
definition of that function as a symbol in the object file, because
nothing in that object file makes a call to that symbol (all calls
were inlined). Any callers in other object files will see a definition
of the Inline Function (that's required by [dcl.fct.spec]/4) so the
compiler will emit a definition in those other objects if needed. So
by adding 'inline' and turning them into Inline Functions you have let
the compiler know that it doesn't need to emit extern symbols for
those functions, which explains why you see three fewer symbols in the