vector::clear calls erase(begin(), end()), which has an unnecessary call to memmove, in addition to a function call overhead. The fix results in a 2-instruction clear for simple types. Compiling the following (-O2 -save-temps) shows the difference: #include <vector> using std::vector; int main() { vector<int> v; asm("#start1"); v.clear(); asm("#end1"); asm("#start2"); v.erase(v.begin(), v.end()); asm("#end2"); } This was a notable performance difference between libstdc++ stl and stlport I found in profiling a large application.
Created attachment 9503 [details] change to vector::clear
Subject: Re: New: vector::clear should be manually inlined > > vector::clear calls erase(begin(), end()), which has an unnecessary call to > memmove, in addition to a function call overhead. The fix results in a > 2-instruction clear for simple types. > > Compiling the following (-O2 -save-temps) shows the difference: > > #include <vector> > > using std::vector; > > int main() { > vector<int> v; > > asm("#start1"); > v.clear(); > asm("#end1"); > > asm("#start2"); > v.erase(v.begin(), v.end()); > asm("#end2"); > } as I mentioned on IRC, just marking erase as inline or figuring out why erase is not being inlined into clear will fix this issue. To the point where we have an empty loop which is removed with -funsafe-loop-optimizations and a couple extra instructions which should be fixed with the tree combiner (but that is for 4.2 and not for 4.1). -- Pinski
Confirmed, there are two different issues here because it looks like the pool allocator injects code which initializes a variable, even though there is a different way of doing the initialization. It should be doing the say initialization as the cout/cin/cerr initializes its streams. This causes us when we call erase: if (*_ZGVZN9__gnu_cxx20__common_pool_policyINS_6__poolELb1EE11_S_get_poolEvE7_S_pool.62 == 0) goto <L12>; else goto <L2>; <L12>:; _S_pool.D.8154._M_options._M_align = 8; _S_pool.D.8154._M_options._M_max_bytes = 128; _S_pool.D.8154._M_options._M_min_bin = 8; _S_pool.D.8154._M_options._M_chunk_size = 4080; _S_pool.D.8154._M_options._M_max_threads = 4096; _S_pool.D.8154._M_options._M_freelist_headroom = 10; D.12526 = getenv (&"GLIBCXX_FORCE_NEW"[0]); _S_pool.D.8154._M_options._M_force_new = D.12526 != 0B; _S_pool.D.8154._M_binmap = 0B; _S_pool.D.8154._M_init = 0; _S_pool._M_bin = 0B; _S_pool._M_bin_size = 1; _S_pool._M_thread_freelist = 0B; _S_pool._M_once = 0; *_ZGVZN9__gnu_cxx20__common_pool_policyINS_6__poolELb1EE11_S_get_poolEvE7_S_pool.62 = 1; which blocks a lot of the other optimizations because there is a function call and there is always a branch. This is not a problem on PPC-darwin where we don't have this "junk". Second we don't inline erase into clear for some reason. I could not figure it out from the dumps maybe someone else will be able to. The rest is for the reporter really (and any one who is writting missed optimization bugs) A better way of seeing the different is: #include <vector> std::vector<int> v; void clear1() { v.clear(); } void clear2(){ v.erase(v.begin(), v.end()); } As using asm("#start1"); can be moved around and for your example at -O3, we actually get an empty function as v is known to be unused.
I see absolutely no difference between the code generated for clear1() and clear2() on powerpc-linux-gnu. This includes the final_cleanup tree dump and the assembly output using g++ -O2.
Fixed for 4.1.0.
Subject: Bug 23425 Author: paolo Date: Wed Nov 2 10:27:54 2005 New Revision: 106379 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=106379 Log: 2005-11-02 Thomas Kho <tkho@ucla.edu> PR libstdc++/23425 * include/bits/stl_vector.h (vector<>::clear): Open code in terms of _Destroy. 2005-11-02 Paolo Carlini <pcarlini@suse.de> * include/bits/vector.tcc (vector<>::_M_fill_assign): Qualify fill_n. Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/include/bits/stl_vector.h trunk/libstdc++-v3/include/bits/vector.tcc