Bug 23425 - vector::clear should be manually inlined
Summary: vector::clear should be manually inlined
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 4.1.0
: P2 enhancement
Target Milestone: 4.1.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2005-08-16 18:24 UTC by Thomas Kho
Modified: 2005-11-02 10:29 UTC (History)
3 users (show)

See Also:
Host:
Target: i686-*-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-08-17 02:23:28


Attachments
change to vector::clear (263 bytes, patch)
2005-08-16 18:25 UTC, Thomas Kho
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Kho 2005-08-16 18:24:21 UTC
vector::clear calls erase(begin(), end()), which has an unnecessary call to
memmove, in addition to a function call overhead. The fix results in a
2-instruction clear for simple types.

Compiling the following (-O2 -save-temps) shows the difference:

#include <vector>

using std::vector;

int main() {
  vector<int> v;

  asm("#start1");
  v.clear();
  asm("#end1");

  asm("#start2");
  v.erase(v.begin(), v.end());
  asm("#end2");
}

This was a notable performance difference between libstdc++ stl and stlport I
found in profiling a large application.
Comment 1 Thomas Kho 2005-08-16 18:25:52 UTC
Created attachment 9503 [details]
change to vector::clear
Comment 2 Andrew Pinski 2005-08-16 18:44:13 UTC
Subject: Re:  New: vector::clear should be manually inlined

> 
> vector::clear calls erase(begin(), end()), which has an unnecessary call to
> memmove, in addition to a function call overhead. The fix results in a
> 2-instruction clear for simple types.
> 
> Compiling the following (-O2 -save-temps) shows the difference:
> 
> #include <vector>
> 
> using std::vector;
> 
> int main() {
>   vector<int> v;
> 
>   asm("#start1");
>   v.clear();
>   asm("#end1");
> 
>   asm("#start2");
>   v.erase(v.begin(), v.end());
>   asm("#end2");
> }

as I mentioned on IRC, just marking erase as inline or figuring out why erase is not being inlined into clear
will fix this issue.  To the point where we have an empty loop which is removed with -funsafe-loop-optimizations
and a couple extra instructions which should be fixed with the tree combiner (but that is for 4.2 and not for
4.1).

-- Pinski
Comment 3 Andrew Pinski 2005-08-17 02:23:25 UTC
Confirmed, there are two different issues here because it looks like the pool allocator injects code 
which initializes a variable, even though there is a different way of doing the initialization.  It should be 
doing the say initialization as the cout/cin/cerr initializes its streams.
This causes us when we call erase:
  if (*_ZGVZN9__gnu_cxx20__common_pool_policyINS_6__poolELb1EE11_S_get_poolEvE7_S_pool.62 == 
0) goto <L12>; else goto <L2>;

<L12>:;
  _S_pool.D.8154._M_options._M_align = 8;
  _S_pool.D.8154._M_options._M_max_bytes = 128;
  _S_pool.D.8154._M_options._M_min_bin = 8;
  _S_pool.D.8154._M_options._M_chunk_size = 4080;
  _S_pool.D.8154._M_options._M_max_threads = 4096;
  _S_pool.D.8154._M_options._M_freelist_headroom = 10;
  D.12526 = getenv (&"GLIBCXX_FORCE_NEW"[0]);
  _S_pool.D.8154._M_options._M_force_new = D.12526 != 0B;
  _S_pool.D.8154._M_binmap = 0B;
  _S_pool.D.8154._M_init = 0;
  _S_pool._M_bin = 0B;
  _S_pool._M_bin_size = 1;
  _S_pool._M_thread_freelist = 0B;
  _S_pool._M_once = 0;
  *_ZGVZN9__gnu_cxx20__common_pool_policyINS_6__poolELb1EE11_S_get_poolEvE7_S_pool.62 = 1;

which blocks a lot of the other optimizations because there is a function call and there is always a 
branch.  This is not a problem on PPC-darwin where we don't have this "junk".

Second we don't inline erase into clear for some reason.  I could not figure it out from the dumps 
maybe someone else will be able to.

The rest is for the reporter really (and any one who is writting missed optimization bugs)
A better way of seeing the different is:

#include <vector>
std::vector<int> v;
void clear1() { v.clear(); }
void clear2(){  v.erase(v.begin(), v.end()); }

As using asm("#start1"); can be moved around and for your example at -O3, we actually get an empty 
function as v is known to be unused.
Comment 4 James A. Morrison 2005-08-18 07:21:23 UTC
 I see absolutely no difference between the code generated for clear1() and
clear2() on powerpc-linux-gnu.  This includes the final_cleanup tree dump and
the assembly output using g++ -O2.
Comment 5 Paolo Carlini 2005-11-02 10:29:01 UTC
Fixed for 4.1.0.
Comment 6 paolo@gcc.gnu.org 2005-11-02 16:24:28 UTC
Subject: Bug 23425

Author: paolo
Date: Wed Nov  2 10:27:54 2005
New Revision: 106379

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=106379
Log:
2005-11-02  Thomas Kho  <tkho@ucla.edu>

	PR libstdc++/23425
	* include/bits/stl_vector.h (vector<>::clear): Open code
	in terms of _Destroy.

2005-11-02  Paolo Carlini  <pcarlini@suse.de>

	* include/bits/vector.tcc (vector<>::_M_fill_assign): Qualify fill_n.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/include/bits/stl_vector.h
    trunk/libstdc++-v3/include/bits/vector.tcc