Bug 60936 - [5 Regression] Binary code bloat with std::string
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++
Version: 4.9.0
Importance: P2 normal
Target Milestone: 6.0
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-23 11:10 UTC by __vic
Modified: 2017-10-10 14:57 UTC
CC List: 3 users

See Also:
Host: x86_64-linux
Target: x86_64-linux
Build: x86_64-linux
Known to work: 4.8.5, 6.4.1, 7.1.0
Known to fail: 4.9.4, 5.4.0, 5.5.0, 6.4.0
Last reconfirmed: 2017-09-18 00:00:00


Attachments
Using __throw_out_of_range (instead of __throw_out_of_range_fmt), if configured with --disable-libstdcxx-verbose (2.12 KB, patch)
2015-05-20 08:44 UTC, Markus Eisenmann
Lightweight __throw_out_of_range_fmt for non-verbose builds (2.34 KB, patch)
2015-05-20 11:31 UTC, Jonathan Wakely
Dirty patch for GCC 5/6 (1.48 KB, patch)
2016-04-21 08:03 UTC, __vic

Description __vic 2014-04-23 11:10:12 UTC
Test program:

#include<string>
int hello()
{
    std::string st("abc");
    return st.length();
}


Build:

$ g++ -shared -fPIC -static-libgcc -static-libstdc++ -Wl,-s -o $@ $?


Sizes of result:

gcc-4_7-string.so: 171744
gcc-4_8-string.so: 185808
gcc-4_9-string.so: 635960
Comment 1 __vic 2014-04-23 12:12:53 UTC
If we use iostream classes (without std::string) the difference isn't so dramatic:

4.7: 800320
4.8: 838944
4.9: 868664

Maybe it's connected with locales? Does std::string depend on them in 4.9?
Comment 2 __vic 2014-04-23 12:37:48 UTC
The non-stripped binary built with 4.9 has many symbols from locale; the one built with 4.8 doesn't.
How does std::string use locales?
Comment 3 __vic 2014-07-17 10:00:00 UTC
4.9.1 - same results
Comment 4 Jonathan Wakely 2014-07-19 09:57:36 UTC
The __throw_out_of_range_fmt function added to basic_string::at() depends on locales.

We could avoid that dependency when the library is configured with --disable-libstdcxx-verbose
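
For illustration only, a minimal sketch of the code path this comment refers to: calling at() with an out-of-range index throws std::out_of_range, and it is the construction of that exception's message that goes through the formatting helper. The names at_error_message and self_check are hypothetical, and the exact message text depends on the library version, so none is assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the what() text of the exception thrown by std::string::at()
// for an out-of-range index, or an empty string if nothing was thrown.
// (Hypothetical helper, not part of libstdc++.)
std::string at_error_message(const std::string& s, std::size_t pos)
{
    try {
        (void)s.at(pos);   // throws std::out_of_range when pos >= s.size()
    } catch (const std::out_of_range& e) {
        return e.what();   // message text is library-specific
    }
    return std::string();
}

// An in-range index produces no message; an out-of-range one does.
void self_check()
{
    assert(at_error_message("abc", 1).empty());
    assert(!at_error_message("abc", 3).empty());
}
```

A program that never calls at() still links the throwing helper when libstdc++ is linked statically, because the helper lives in the library's object files rather than in the caller.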
Comment 5 Jonathan Wakely 2014-07-19 10:32:50 UTC
Alternatively, we could just move the __int_to_char instantiations to a separate file from the locale facets.
Comment 6 __vic 2015-04-15 11:36:53 UTC
5.1-RC (gcc-5.1.0-RC-20150412) has the same problem. I suppose GCC 6 will too?
Comment 7 Jonathan Wakely 2015-04-15 11:38:38 UTC
Yes, because nothing has changed in this regard.
Comment 8 Jonathan Wakely 2015-04-15 11:40:33 UTC
I'll try to do something for 6.0
Comment 9 __vic 2015-04-15 13:40:58 UTC
For 4.9 this change was enough for me:

--- libstdc++-v3/src/c++11/functexcept.cc	2014-01-03 02:30:10.000000000 +0400
+++ libstdc++-v3/src/c++11/functexcept.cc	2014-11-06 18:40:20.000000000 +0300
@@ -89,6 +89,7 @@
   void
   __throw_out_of_range_fmt(const char* __fmt, ...)
   {
+    __throw_out_of_range(__fmt);/*
     const size_t __len = __builtin_strlen(__fmt);
     // We expect at most 2 numbers, and 1 short string. The additional
     // 512 bytes should provide more than enough space for expansion.
@@ -100,6 +101,7 @@
     __gnu_cxx::__snprintf_lite(__s, __alloca_size, __fmt, __ap);
     _GLIBCXX_THROW_OR_ABORT(out_of_range(_(__s)));
     va_end(__ap);  // Not reached.
+*/
   }
 
   void

But now this change alone no longer solves the problem.
Comment 10 __vic 2015-04-15 14:35:04 UTC
What introduces the new dependencies on locales?
Comment 11 __vic 2015-04-23 07:38:47 UTC
The main problem is in src/c++11/cow-string-inst.cc, here:

namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION

  // These came from c++98/misc-inst.cc, repeat them for COW string
  // string related to iostreams.
  template 
    basic_istream<char>& 
    operator>>(basic_istream<char>&, string&);
  template 
    basic_ostream<char>& 
    operator<<(basic_ostream<char>&, const string&);
  template 
    basic_istream<char>& 
    getline(basic_istream<char>&, string&, char);
  template 
    basic_istream<char>& 
    getline(basic_istream<char>&, string&);

_GLIBCXX_END_NAMESPACE_VERSION
} // namespace

It pulls in all of iostream and locale.
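
As a rough sketch of the mechanism (not libstdc++ code): an explicit instantiation definition forces the compiler to emit the instantiated function into the object file of the translation unit that contains it, whether or not anything there calls it. With a static library, the linker pulls in whole object files, so any other symbol needed from that object drags the instantiation and everything it references along. The namespace sketch and the template repeat are hypothetical stand-ins.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical library template standing in for operator<< / getline.
template<typename CharT>
std::basic_string<CharT>
repeat(const std::basic_string<CharT>& s, unsigned n)
{
    std::basic_string<CharT> out;
    for (unsigned i = 0; i < n; ++i)
        out += s;
    return out;
}

// Explicit instantiation definition: the code for repeat<char> is emitted
// into this object file even though nothing in this file calls it.
template std::basic_string<char>
repeat<char>(const std::basic_string<char>&, unsigned);

// The instantiation behaves like any other function when it is used.
inline void self_check()
{
    assert(repeat<char>(std::string("ab"), 3) == "ababab");
}

} // namespace sketch
```

Keeping such instantiations in a file of their own means the object is only linked when one of the instantiated functions is actually referenced.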

On the whole, dependencies between objects within libstdc++ are organized terribly. When I use just std::string, almost the whole library is linked into my binary! If I remove the snippet above and apply the fix from Comment 9, this set of objects is still linked:

condition_variable.o
cow-stdexcept.o
cow-string-inst.o
eh_throw.o
functexcept.o
functional.o
futex.o
future.o
ios_failure.o
regex.o
stdexcept.o
string-inst.o
system_error.o

WHY?! I just want std::string!
I use no futures, no regex, no ios::failure, etc... Is all this stuff really necessary for my trivial program?
Comment 12 Markus Eisenmann 2015-05-20 08:44:13 UTC
Created attachment 35573
Using __throw_out_of_range (instead of __throw_out_of_range_fmt), if configured with --disable-libstdcxx-verbose

My patch (file gcc-4.9-pr60936.patch) is a fix/workaround as suggested by Jonathan in Comment 4. Calls to __throw_out_of_range_fmt are replaced by the simpler function __throw_out_of_range() if the GCC build is configured with the option --disable-libstdcxx-verbose.

Note: I have used the previous call to __throw_out_of_range, as in GCC release 4.8.4. The patch may need to be applied with the option -p1 (or the patch file edited), because the paths begin with 'gcc-4.9.2/'.

The patch changes the following source files:
 [gcc-4.9.2/] libstdc++-v3/include/bits/basic_string.h
 [gcc-4.9.2/] libstdc++-v3/include/bits/functexcept.h
 [gcc-4.9.2/] libstdc++-v3/include/bits/stl_bvector.h
 [gcc-4.9.2/] libstdc++-v3/include/bits/stl_deque.h
 [gcc-4.9.2/] libstdc++-v3/include/bits/stl_vector.h
 [gcc-4.9.2/] libstdc++-v3/include/debug/array
 [gcc-4.9.2/] libstdc++-v3/include/experimental/string_view
 [gcc-4.9.2/] libstdc++-v3/include/ext/vstring.h
 [gcc-4.9.2/] libstdc++-v3/include/profile/array
 [gcc-4.9.2/] libstdc++-v3/include/std/array
 [gcc-4.9.2/] libstdc++-v3/include/std/bitset
 [gcc-4.9.2/] libstdc++-v3/src/c++11/functexcept.cc
 [gcc-4.9.2/] libstdc++-v3/testsuite/util/exception/safety.h

Best regards,
Markus
Comment 13 Jonathan Wakely 2015-05-20 11:31:28 UTC
Created attachment 35575
Lightweight __throw_out_of_range_fmt for non-verbose builds

This is what I had in mind.
Comment 14 Jakub Jelinek 2015-12-21 15:20:17 UTC
Has this been applied or is it going to be for GCC 6?
Comment 15 Jonathan Wakely 2015-12-21 15:58:36 UTC
It hasn't, but I'll do it for GCC 6 so people who care about the resulting size can use the non-verbose option
Comment 16 Jonathan Wakely 2016-04-16 13:02:37 UTC
For the record, I tried this and didn't see any change in code size, so didn't commit anything.
Comment 17 __vic 2016-04-21 08:03:14 UTC
Created attachment 38319
Dirty patch for GCC 5/6

This dirty patch, created for GCC 5, solves the problem for GCC 6 as well (out_of_range will not contain the pretty message).
Comment 18 Jakub Jelinek 2016-04-27 10:58:43 UTC
GCC 6.1 has been released.
Comment 19 __vic 2016-08-22 07:41:44 UTC
No plans for 6.2?
Comment 20 __vic 2016-08-23 10:08:53 UTC
The patch in attachment 38319 solves the problem for GCC 6.2 as well.
Comment 21 Jonathan Wakely 2016-08-23 11:37:31 UTC
Yes, but it removes functionality, and we don't want to do that unconditionally.
Comment 22 __vic 2016-08-23 11:40:14 UTC
Of course. It's a comment for people like me who need a solution right now (actually since 2014...)
Comment 23 __vic 2016-12-26 08:18:36 UTC
Jonathan, have you tried merging your patch with mine? Yours lacks the decoupling of string and iostream in c++11/cow-string-inst.o. I think that's the reason the code size was unaffected.
Comment 24 Jakub Jelinek 2017-02-01 16:06:02 UTC
Have you tried linking with -Wl,--gc-sections ?
Comment 25 Jonathan Wakely 2017-02-01 16:33:29 UTC
I have, and it doesn't make much difference:

    175800      gcc48.so
    608280      gcc49.so
   1159624      gcc5-cow.so
   1176296      gcc6-cow-gc.so
   1176296      gcc6-cow.so
   1180400      gcc6-gc.so
   1180400      gcc6.so
    258376      gcc7-cow-patched.so
   1188568      gcc7-cow-trunk.so
    258384      gcc7-patched.so
   1188552      gcc7-trunk.so

-cow means using -D_GLIBCXX_USE_CXX11_ABI=0

-gc means using -Wl,--gc-sections

-patched means I've split the explicit instantiations into separate files and changed __snprintf_lite to not use __int_to_char from locale-inst.o

The patch makes a big difference.
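
As a small aside, one way to see which std::string ABI a translation unit was built against is to test the _GLIBCXX_USE_CXX11_ABI macro that libstdc++ headers define. This is only a sketch: string_abi and self_check are hypothetical names, and "unknown" is returned when the macro is absent (for example with a different standard library).

```cpp
#include <cassert>
#include <string>   // libstdc++ headers define _GLIBCXX_USE_CXX11_ABI

// Reports which std::string implementation this translation unit uses:
// "cxx11" for the new ABI, "cow" for the old copy-on-write string
// (what the -cow rows above were built with), or "unknown" if the
// macro is not defined at all.
const char* string_abi()
{
#if defined(_GLIBCXX_USE_CXX11_ABI)
#  if _GLIBCXX_USE_CXX11_ABI
    return "cxx11";
#  else
    return "cow";
#  endif
#else
    return "unknown";
#endif
}

// The result is always one of the three labels above.
void self_check()
{
    const std::string abi = string_abi();
    assert(abi == "cxx11" || abi == "cow" || abi == "unknown");
}
```

Compiling the same file with and without -D_GLIBCXX_USE_CXX11_ABI=0 should switch the reported label between "cow" and "cxx11" on a libstdc++ that supports both.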
Comment 26 Jonathan Wakely 2017-02-02 16:52:54 UTC
(In reply to __vic from comment #11)
> condition_variable.o
> cow-stdexcept.o
> cow-string-inst.o
> eh_throw.o
> functexcept.o
> functional.o
> futex.o
> future.o
> ios_failure.o
> regex.o
> stdexcept.o
> string-inst.o
> system_error.o

These dependencies are because src/c++11/functexcept.o requires the destructors and vtables of all exception types. By moving __throw_bad_function_call and __throw_regex_error and __throw_future_error into separate files we can get the testcase smaller still:

233440 gcc7-splitexcept.so

And splitting up src/c++11/system_error.o gives:

225160 gcc7-splitexcept2.so
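
To make the coupling concrete, here is a hedged sketch of the out-of-line "throw helper" pattern. Everything in namespace sketch is hypothetical and only loosely modeled on the idea behind __throw_out_of_range; it is not the libstdc++ source. The point is that the helper's object file must reference the exception type it throws, so a single file holding helpers for many exception types references all of them.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

namespace sketch {

// Out-of-line helper: callers only need this one symbol, but the object
// file defining it references std::out_of_range (constructor, destructor,
// type information). If helpers for regex, future and system errors were
// defined in the same file, that object would reference those exception
// types as well, whether or not the program uses them.
[[noreturn]] void throw_out_of_range(const char* what)
{
    throw std::out_of_range(what);
}

// Bounds-checked access that funnels its error path through the helper.
char checked_at(const std::string& s, std::size_t pos)
{
    if (pos >= s.size())
        throw_out_of_range("checked_at: position out of range");
    return s[pos];
}

// In-range access returns the character; out-of-range access throws.
inline void self_check()
{
    assert(checked_at("abc", 1) == 'b');
    bool threw = false;
    try {
        (void)checked_at("abc", 3);
    } catch (const std::out_of_range&) {
        threw = true;
    }
    assert(threw);
}

} // namespace sketch
```

Placing each helper next to the definition of the exception type it throws, as described above, keeps an unused exception type out of the link.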
Comment 27 Jonathan Wakely 2017-02-03 18:59:37 UTC
Author: redi
Date: Fri Feb  3 18:59:05 2017
New Revision: 245162

URL: https://gcc.gnu.org/viewcvs?rev=245162&root=gcc&view=rev
Log:
PR libstdc++/60936 reduce coupling between objects in libstdc++.a

Move explicit instantiation definitions for string I/O functions into
their own files so that iostream and locale definitions are not needed
for uses of strings without I/O. Move functions for throwing C++11
exceptions into the individual files defining the exception types, so
that using any of the functions from functexcept.cc doesn't pull in
large pieces of the C++11 library. Finally, avoid using __int_to_char in
snprintf_lite.cc to avoid pulling in locale-inst.cc for one function.

	PR libstdc++/60936
	* src/c++11/Makefile.am: Add new files.
	* src/c++11/Makefile.in: Regenerate.
	* src/c++11/cow-string-inst.cc [!_GLIBCXX_USE_CXX11_ABI]
	(operator<<, operator>>, getline): Move explicit instantiations to ...
	* src/c++11/cow-string-io-inst.cc: ... new file.
	* src/c++11/cow-wstring-inst.cc [!_GLIBCXX_USE_CXX11_ABI]
	(operator<<, operator>>, getline): Move explicit instantiations to ...
	* src/c++11/cow-wstring-io-inst.cc: ... new file.
	* src/c++11/functexcept.cc (__throw_ios_failure, __throw_system_error)
	(__throw_future_error, __throw_bad_function_call):
	(__throw_regex_error): Move functions for C++11 exceptions to the
	files that define the exception types.
	* src/c++11/functional.cc (__throw_bad_function_call): Move here.
	* src/c++11/future.cc (__throw_future_error): Likewise.
	* src/c++11/ios.cc (__throw_ios_failure): Likewise.
	* src/c++11/regex.cc (__throw_regex_error): Likewise.
	* src/c++11/snprintf_lite.cc (__concat_size_t): Print decimal
	representation directly instead of calling __int_to_char.
	* src/c++11/sso_string.cc (__sso_string): New file for definition
	of __sso_string type.
	* src/c++11/string-io-inst.cc [_GLIBCXX_USE_CXX11_ABI]: New file for
	explicit instantiations of narrow string I/O functions.
	* src/c++11/system_error.cc (__throw_system_error): Move here.
	(__sso_string): Move to new file.
	* src/c++11/wstring-io-inst.cc [_GLIBCXX_USE_CXX11_ABI]: New file for
	explicit instantiations of wide string I/O functions.
	* src/c++98/misc-inst.cc [_GLIBCXX_USE_CXX11_ABI] (operator<<)
	(operator>>, getline): Remove explicit instantiations from here.

Added:
    trunk/libstdc++-v3/src/c++11/cow-string-io-inst.cc
      - copied, changed from r245159, trunk/libstdc++-v3/src/c++11/cow-wstring-inst.cc
    trunk/libstdc++-v3/src/c++11/cow-wstring-io-inst.cc
      - copied, changed from r245159, trunk/libstdc++-v3/src/c++11/cow-wstring-inst.cc
    trunk/libstdc++-v3/src/c++11/sso_string.cc
      - copied, changed from r245159, trunk/libstdc++-v3/src/c++11/system_error.cc
    trunk/libstdc++-v3/src/c++11/string-io-inst.cc
      - copied, changed from r245159, trunk/libstdc++-v3/src/c++11/functional.cc
    trunk/libstdc++-v3/src/c++11/wstring-io-inst.cc
      - copied, changed from r245159, trunk/libstdc++-v3/src/c++11/cow-wstring-inst.cc
Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/src/c++11/Makefile.am
    trunk/libstdc++-v3/src/c++11/Makefile.in
    trunk/libstdc++-v3/src/c++11/cow-string-inst.cc
    trunk/libstdc++-v3/src/c++11/cow-wstring-inst.cc
    trunk/libstdc++-v3/src/c++11/functexcept.cc
    trunk/libstdc++-v3/src/c++11/functional.cc
    trunk/libstdc++-v3/src/c++11/future.cc
    trunk/libstdc++-v3/src/c++11/ios.cc
    trunk/libstdc++-v3/src/c++11/regex.cc
    trunk/libstdc++-v3/src/c++11/snprintf_lite.cc
    trunk/libstdc++-v3/src/c++11/system_error.cc
    trunk/libstdc++-v3/src/c++98/misc-inst.cc
Comment 28 Markus Eisenmann 2017-02-10 11:30:28 UTC
Hi!

@Jonathan:
Do you have any plans to backport these changes to the GCC 5 and/or 6 branches, so that they are included in a future release?

An "official" fix would be much better (for C++ development on embedded targets) than waiting for a stable GCC 7 or maintaining a personally patched variant or workaround.

Best regards from Salzburg,
Markus
Comment 29 Jeffrey A. Law 2017-02-14 18:01:54 UTC
Addressed on the trunk.
Comment 30 __vic 2017-02-15 07:52:44 UTC
Excellent job, Jonathan! With gcc-7-20170212 the binaries are slim again.
Comment 31 Markus Eisenmann 2017-02-16 09:35:10 UTC
Hi!

There's a minor bug in the (patched) function __concat_size_t (in snprintf_lite.cc):

size_t __len = __out - __cs;

This calculates the number of remaining/unused characters in the buffer __cs, not the number written.
Therefore the resulting string is (mostly) malformed/truncated.

It should be:

size_t __len = __cs + __ilen - __out;

BR,
Markus
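
The length calculation described in this comment can be sketched in standalone form. The function below is a hypothetical re-creation of the idea (digits are written right to left into a scratch buffer, then copied out), not the libstdc++ source; the names concat_size_t, tmp, end and out are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Writes the decimal digits of val into buf (no terminating NUL).
// Returns the number of characters written, or -1 if buf is too small.
int concat_size_t(char* buf, std::size_t bufsize, std::size_t val)
{
    assert(buf != nullptr);

    char tmp[3 * sizeof(val)];            // long enough for the decimal digits
    char* const end = tmp + sizeof(tmp);  // one past the scratch buffer
    char* out = end;

    // Fill the scratch buffer from the right, least significant digit first.
    do {
        *--out = "0123456789"[val % 10];
        val /= 10;
    } while (val != 0);

    // The digits occupy [out, end), so the written length is end - out.
    // Using out - tmp instead would give the size of the unused prefix,
    // which is the mistake reported in this comment.
    const std::size_t len = static_cast<std::size_t>(end - out);
    if (bufsize < len)
        return -1;

    std::memcpy(buf, out, len);
    return static_cast<int>(len);
}
```

For example, formatting 12345 writes five digits, so the function returns 5; the unused prefix of the scratch buffer has a different, unrelated size.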
Comment 32 Jonathan Wakely 2017-02-16 11:00:04 UTC
Good catch, thanks
Comment 33 Jonathan Wakely 2017-02-16 12:07:01 UTC
Author: redi
Date: Thu Feb 16 12:06:28 2017
New Revision: 245505

URL: https://gcc.gnu.org/viewcvs?rev=245505&root=gcc&view=rev
Log:
PR libstdc++/60936 fix length calculation

	PR libstdc++/60936
	* src/c++11/snprintf_lite.cc (__concat_size_t): Calculate length
	written to buffer, not length remaining in buffer.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/src/c++11/snprintf_lite.cc
Comment 34 __vic 2017-08-21 10:37:26 UTC
Fixed in 7.1.
Shouldn't we close this bug?
Comment 35 Richard Biener 2017-08-21 10:41:33 UTC
We keep regression bugs open until all maintained branches are closed, so that known-to-fail can be set correctly.
Comment 36 Jonathan Wakely 2017-09-18 12:57:36 UTC
Author: redi
Date: Mon Sep 18 12:57:05 2017
New Revision: 252925

URL: https://gcc.gnu.org/viewcvs?rev=252925&root=gcc&view=rev
Log:
PR libstdc++/60936 reduce coupling between objects in libstdc++.a

Backport from mainline
2017-02-03  Jonathan Wakely  <jwakely@redhat.com>

	PR libstdc++/60936
	* src/c++11/Makefile.am: Add new files.
	* src/c++11/Makefile.in: Regenerate.
	* src/c++11/cow-string-inst.cc [!_GLIBCXX_USE_CXX11_ABI]
	(operator<<, operator>>, getline): Move explicit instantiations to ...
	* src/c++11/cow-string-io-inst.cc: ... new file.
	* src/c++11/cow-wstring-inst.cc [!_GLIBCXX_USE_CXX11_ABI]
	(operator<<, operator>>, getline): Move explicit instantiations to ...
	* src/c++11/cow-wstring-io-inst.cc: ... new file.
	* src/c++11/functexcept.cc (__throw_ios_failure, __throw_system_error)
	(__throw_future_error, __throw_bad_function_call):
	(__throw_regex_error): Move functions for C++11 exceptions to the
	files that define the exception types.
	* src/c++11/functional.cc (__throw_bad_function_call): Move here.
	* src/c++11/future.cc (__throw_future_error): Likewise.
	* src/c++11/ios.cc (__throw_ios_failure): Likewise.
	* src/c++11/regex.cc (__throw_regex_error): Likewise.
	* src/c++11/snprintf_lite.cc (__concat_size_t): Print decimal
	representation directly instead of calling __int_to_char.
	* src/c++11/sso_string.cc (__sso_string): New file for definition
	of __sso_string type.
	* src/c++11/string-io-inst.cc [_GLIBCXX_USE_CXX11_ABI]: New file for
	explicit instantiations of narrow string I/O functions.
	* src/c++11/system_error.cc (__throw_system_error): Move here.
	(__sso_string): Move to new file.
	* src/c++11/wstring-io-inst.cc [_GLIBCXX_USE_CXX11_ABI]: New file for
	explicit instantiations of wide string I/O functions.
	* src/c++98/misc-inst.cc [_GLIBCXX_USE_CXX11_ABI] (operator<<)
	(operator>>, getline): Remove explicit instantiations from here.

Added:
    branches/gcc-6-branch/libstdc++-v3/src/c++11/cow-string-io-inst.cc
      - copied, changed from r252920, branches/gcc-6-branch/libstdc++-v3/src/c++11/cow-wstring-inst.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/cow-wstring-io-inst.cc
      - copied, changed from r252920, branches/gcc-6-branch/libstdc++-v3/src/c++11/cow-wstring-inst.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/sso_string.cc
      - copied, changed from r252920, branches/gcc-6-branch/libstdc++-v3/src/c++11/system_error.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/string-io-inst.cc
      - copied, changed from r252920, branches/gcc-6-branch/libstdc++-v3/src/c++11/functional.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/wstring-io-inst.cc
      - copied, changed from r252920, branches/gcc-6-branch/libstdc++-v3/src/c++11/cow-wstring-inst.cc
Modified:
    branches/gcc-6-branch/libstdc++-v3/ChangeLog
    branches/gcc-6-branch/libstdc++-v3/src/c++11/Makefile.am
    branches/gcc-6-branch/libstdc++-v3/src/c++11/Makefile.in
    branches/gcc-6-branch/libstdc++-v3/src/c++11/cow-string-inst.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/cow-wstring-inst.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/functexcept.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/functional.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/future.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/ios.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/regex.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/snprintf_lite.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++11/system_error.cc
    branches/gcc-6-branch/libstdc++-v3/src/c++98/misc-inst.cc
Comment 37 Jonathan Wakely 2017-09-18 13:00:39 UTC
Fixed for 6.5 as well. I'm probably not going to change this on the gcc-5-branch though.
Comment 38 Markus Eisenmann 2017-09-18 13:32:39 UTC
Hi Jonathan!

It seems that the minor fix from comment #31 (rev. 245505) is currently missing from branches/gcc-6-branch/libstdc++-v3/src/c++11/snprintf_lite.cc; it also needs to be merged there.

Best regards from Salzburg,
Markus
Comment 39 Jonathan Wakely 2017-09-20 12:02:16 UTC
Author: redi
Date: Wed Sep 20 12:01:44 2017
New Revision: 253007

URL: https://gcc.gnu.org/viewcvs?rev=253007&root=gcc&view=rev
Log:
PR libstdc++/60936 fix length calculation

Backport from mainline
2017-02-16  Jonathan Wakely  <jwakely@redhat.com>

	PR libstdc++/60936
	* src/c++11/snprintf_lite.cc (__concat_size_t): Calculate length
	written to buffer, not length remaining in buffer.

Backport from mainline
2017-02-08  Gerald Pfeifer  <gerald@pfeifer.com>

	* src/c++11/snprintf_lite.cc (__err): Update bug reporting URL.

Modified:
    branches/gcc-6-branch/libstdc++-v3/ChangeLog
    branches/gcc-6-branch/libstdc++-v3/src/c++11/snprintf_lite.cc
Comment 40 Jakub Jelinek 2017-10-10 13:37:12 UTC
GCC 5 branch has been closed, should be fixed in GCC 6 and later.