While we cannot make std::pair or std::tuple trivial for now for ABI reasons, it should still be safe to use memcpy-type optimizations for them when it is safe for each member. We probably don't want to lie to the user in is_trivial or is_trivially_copyable (?), but we could at least introduce an internal version of those traits, specialized for a few types like pair/tuple, and use it so that _GLIBCXX_MOVE_BACKWARD3 would use memmove on std::pair<int,int>, for instance.

One benchmark suggests (it wasn't testing exactly this) that this could change the average performance of insert/erase in boost::flat_map<,,std::vector<>> by a factor of 2. Of course, it won't affect the default flat_map, which uses boost's vector and traits, so it isn't a real solution, just a small band-aid.

The exact traits to specialize depend on PR 68350.

```
#include <vector>
#include <utility>

#ifdef FAST
struct A {
  int first, second;
  A(int a, int b) : first(a), second(b) {}
  A() = default;
};
#else
typedef std::pair<int,int> A;
#endif

typedef std::vector<A> V;

int main(int argc, char**) {
  V v;
  for (int i = 0; i < 100000; ++i) {
    v.insert(v.begin(), {i, i});
  }
  return v[argc].second;
}
```

At -O3, I get 3.41s for std::pair, 1.00s for the struct (FAST defined), and an intermediate 1.99s for the struct minus the default constructor.
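To make the proposal concrete, here is a rough sketch of what such an internal trait and its use might look like. The names `sketch::is_memcpy_safe`, `all_memcpy_safe` and `move_backward_sketch` are hypothetical and are not existing libstdc++ internals. The exact condition to test is an assumption too (see PR 68350), and a real version would also have to exclude cases such as const members. Note that memmove on a std::pair is something the library could do knowing its own layout; in user code it is outside what the standard guarantees.

```cpp
#include <cstddef>
#include <cstring>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {

// Hypothetical internal trait: true when copying the object representation
// is assumed to be equivalent to assignment. Defaults to the public trait.
template <typename T>
struct is_memcpy_safe : std::is_trivially_copyable<T> {};

// Helper: true when every type in the pack is memcpy-safe.
template <typename... Ts>
struct all_memcpy_safe : std::true_type {};

template <typename T, typename... Ts>
struct all_memcpy_safe<T, Ts...>
    : std::integral_constant<bool, is_memcpy_safe<T>::value &&
                                       all_memcpy_safe<Ts...>::value> {};

// pair and tuple are not trivially copyable, but are assumed safe here
// whenever each member is.
template <typename T, typename U>
struct is_memcpy_safe<std::pair<T, U>> : all_memcpy_safe<T, U> {};

template <typename... Ts>
struct is_memcpy_safe<std::tuple<Ts...>> : all_memcpy_safe<Ts...> {};

// Fast path: one memmove for the whole range (overlap is allowed).
template <typename T>
T* move_backward_impl(T* first, T* last, T* d_last, std::true_type) {
    const std::ptrdiff_t n = last - first;
    if (n > 0)
        std::memmove(static_cast<void*>(d_last - n),
                     static_cast<const void*>(first),
                     sizeof(T) * static_cast<std::size_t>(n));
    return d_last - n;
}

// Generic path: element-wise move assignment, back to front.
template <typename T>
T* move_backward_impl(T* first, T* last, T* d_last, std::false_type) {
    while (first != last)
        *--d_last = std::move(*--last);
    return d_last;
}

// Dispatch on the internal trait rather than is_trivially_copyable.
template <typename T>
T* move_backward_sketch(T* first, T* last, T* d_last) {
    return move_backward_impl(first, last, d_last, is_memcpy_safe<T>{});
}

}  // namespace sketch
```

With this, `sketch::move_backward_sketch` on a `std::pair<int,int>*` range takes the memmove path, while `std::is_trivially_copyable<std::pair<int,int>>` keeps reporting false to users.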
I suspect we could optimize this at the gimple level too:

```
<bb 9> [local count: 4971102460]:
# __last_57 = PHI <_47(9), _45(7)>
_47 = __last_57 + 18446744073709551608;
_50 = MEM[(int *)_47];
MEM[(int *)_47 + 8B] = _50;
_51 = MEM[(int *)_47 + 4B];
MEM[(int *)_47 + 12B] = _51;
if (v$_M_start_36 != _47)
  goto <bb 9>; [89.00%]
else
  goto <bb 8>; [11.00%]
```

We should detect this as a memmove.
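For readers less used to GIMPLE dumps, the loop above can be read back into source form roughly as follows. The constant 18446744073709551608 is -8 as an unsigned 64-bit value, so each iteration steps the pointer back by one 8-byte element and copies its two ints one slot up. The `Elem` type and both function names are illustrative stand-ins, not anything from libstdc++ or GCC; the second function is the memmove that the loop is equivalent to.

```cpp
#include <cstddef>
#include <cstring>

// Stand-in for the 8-byte element (two ints at offsets 0 and 4).
struct Elem {
    int first, second;
};

// Source-level shape of the loop in the dump: walk backward from `last`
// to `start`, copying each element one slot up. The loop is bottom-tested,
// so it requires start != last on entry.
inline void shift_up_loop(Elem* start, Elem* last) {
    do {
        --last;                          // _47 = __last_57 - 8 bytes
        last[1].first = last->first;     // MEM[_47 + 8B]  = MEM[_47]
        last[1].second = last->second;   // MEM[_47 + 12B] = MEM[_47 + 4B]
    } while (last != start);
}

// The same effect as a single overlapping block copy.
inline void shift_up_memmove(Elem* start, Elem* last) {
    std::memmove(start + 1, start,
                 static_cast<std::size_t>(last - start) * sizeof(Elem));
}
```

Both functions shift the range [start, last) up by one element and need one spare slot at `last`.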