This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.
Re: [PATCH] reduce size penalty for including C++11 <algorithm> on x86 systems
- From: Jonathan Wakely <jwakely at redhat dot com>
- To: Nathan Froyd <froydnj at mozilla dot com>
- Cc: gcc-patches at gcc dot gnu dot org, libstdc++ at gcc dot gnu dot org
- Date: Tue, 13 Oct 2015 19:27:16 +0100
- Subject: Re: [PATCH] reduce size penalty for including C++11 <algorithm> on x86 systems
- References: <1444787059-3050-1-git-send-email-froydnj at mozilla dot com>
On 13/10/15 21:44 -0400, Nathan Froyd wrote:
> Including <algorithm> in C++11 mode (typically done for
> std::{min,max,swap}) includes <random>, for
> std::uniform_int_distribution. On x86 platforms, <random> manages to
> drag in <x86intrin.h> through x86's opt_random.h header, and
> <x86intrin.h> has gotten rather large recently with the addition of AVX
> intrinsics. The comparison between C++03 mode and C++11 mode is not
> quite exact, but it gives an idea of the penalty we're talking about
> here:
>
> froydnj@thor:~/src$ echo '#include <algorithm>' | g++ -x c++ - -o - -E -std=c++11 | wc
> 53460 127553 1401268
> froydnj@thor:~/src$ echo '#include <algorithm>' | g++ -x c++ - -o - -E -std=c++03 | wc
> 9202 18933 218189
>
> That's approximately a 7x penalty in C++11 mode (granted, C++11 includes
> more than just <x86intrin.h>) with GCC 4.9.2 on a Debian system; current
> mainline is somewhat worse:
>
> froydnj@thor: gcc-build$ echo '#include <algorithm>' | xgcc [...] -std=c++11 | wc
> 84851 210475 2369616
> froydnj@thor: gcc-build$ echo '#include <algorithm>' | xgcc [...] -std=c++03 | wc
> 9383 19402 239676
>
> <x86intrin.h> itself clocks in at 1.3MB+ of preprocessed text.

Yep, that's been bothering me for a while.

> This patch aims to reduce that size penalty by recognizing that both of
> the places that #include <x86intrin.h> do not need the full set of x86
> intrinsics, but can get by with a smaller, more focused header in each
> case. <ext/random> needs only <emmintrin.h> to declare __m128i, while
> x86's opt_random.h must include <pmmintrin.h> for declarations of
> various intrinsic functions.
>
> The net result is that the size of mainline's <algorithm> is significantly reduced:
>
> froydnj@thor: gcc-build$ echo '#include <algorithm>' | xgcc [...] -std=c++11 | wc
> 39174 88538 1015281
>
> which seems like a win.

Indeed!

> Bootstrapped on x86_64-pc-linux-gnu with --enable-languages=c,c++,
> tested with check-target-libstdc++-v3, no regressions. Also verified
> that <algorithm> and <ext/random> pass -fsyntax-only with
> -march=native (on a recent Haswell chip); if an -march=native bootstrap
> is necessary, I am happy to do that if somebody instructs me in getting
> everything properly set up.
>
> OK?
OK, thanks.
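
For illustration only, here is a minimal sketch of the idea behind the patch: code that needs just __m128i and a few SSE2 intrinsics can include <emmintrin.h> directly instead of the umbrella <x86intrin.h>. The helper name low_lane_sum is hypothetical and is not part of libstdc++ or of the patch itself. The __SSE2__ guard and the scalar fallback are assumptions added so that the snippet also builds on targets without SSE2.

```cpp
// Sketch: use the narrow SSE2 header rather than <x86intrin.h>.
#include <cstdint>

#if defined(__SSE2__)
#include <emmintrin.h>  // declares __m128i and the SSE2 integer intrinsics
#endif

// Hypothetical helper (not from the patch): adds two 32-bit values,
// going through an SSE2 register when SSE2 is available.
std::uint32_t low_lane_sum(std::uint32_t a, std::uint32_t b)
{
#if defined(__SSE2__)
    // Broadcast each operand to all four 32-bit lanes.
    __m128i va = _mm_set1_epi32(static_cast<int>(a));
    __m128i vb = _mm_set1_epi32(static_cast<int>(b));
    // Lane-wise 32-bit addition (wraps on overflow).
    __m128i sum = _mm_add_epi32(va, vb);
    // Extract the lowest 32-bit lane.
    return static_cast<std::uint32_t>(_mm_cvtsi128_si32(sum));
#else
    // Assumed fallback for targets without SSE2: plain unsigned addition.
    return a + b;
#endif
}
```

Running a file like this through the same "g++ -E ... | wc" pipeline quoted above, once with <emmintrin.h> and once with <x86intrin.h>, should give a rough feel for the difference in preprocessed size; no specific numbers are claimed here.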