[Bug libstdc++/87106] New: Group move and destruction of the source, where possible, for speed

glisse at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Sun Aug 26 10:23:00 GMT 2018


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87106

            Bug ID: 87106
           Summary: Group move and destruction of the source, where
                    possible, for speed
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: glisse at gcc dot gnu.org
  Target Milestone: ---

Just a random testcase so I can give numbers; I don't claim this is a good
testcase at all:

#include <string>
#include <vector>

__attribute__((flatten))
void f(){
  int n = 1024*1024;
  std::vector<std::string> v(n);
  v.resize(n+1);
}
int main(){
  for(int i=0;i<256;++i) f();
}

This runs in about 2.4s now. In _M_default_append, we have a first loop that copies
(moves) strings from old to new, and a second loop that destroys old. If I
comment out the destroying loop (not something we should do in general, this is
just for the numbers), the running time goes down to 2.0s. If I replace the 2
loops with a single loop that does both move and destroy, the running time is
now 1.6s. Move+destroy (aka destructive move, relocation, etc.) are 2 operations
that go well together and are likely to simplify when combined. Ideally the
compiler would merge the 2 loops (loop fusion) for us, but it doesn't. Doing the
operations in this order is only valid here because std::string can be
moved+destroyed nothrow: if a move could throw partway through, the sources
already destroyed by the fused loop could not be restored.
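
For illustration, the two variants could be sketched roughly as follows. This
is not the actual _M_default_append code: relocate_two_loops and
relocate_fused are hypothetical names, and raw pointers stand in for the
allocator-aware machinery the library really uses.

```cpp
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: the current shape, with two separate passes.
// Pass 1 move-constructs each element into uninitialized storage at dest;
// pass 2 destroys the moved-from sources.
template <class T>
void relocate_two_loops(T* first, T* last, T* dest) {
  T* d = dest;
  for (T* p = first; p != last; ++p, ++d)
    ::new (static_cast<void*>(d)) T(std::move(*p));
  for (T* p = first; p != last; ++p)
    p->~T();
}

// Hypothetical helper: the fused shape, one pass doing move + destroy.
// Only safe when neither operation can throw, since a source that has
// already been destroyed cannot be brought back for a rollback.
template <class T>
void relocate_fused(T* first, T* last, T* dest) {
  static_assert(std::is_nothrow_move_constructible<T>::value &&
                    std::is_nothrow_destructible<T>::value,
                "fused relocation needs nothrow move and nothrow destroy");
  for (; first != last; ++first, ++dest) {
    ::new (static_cast<void*>(dest)) T(std::move(*first));
    first->~T();
  }
}

// std::string satisfies the precondition used above.
static_assert(std::is_nothrow_move_constructible<std::string>::value &&
                  std::is_nothrow_destructible<std::string>::value,
              "std::string is nothrow move-constructible and destructible");
```

Both helpers assume that dest points to uninitialized storage large enough
for the range and that the two ranges do not overlap.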

I think it would be nice to introduce a special case for nothrow-relocatable
types in several functions for std::vector (_M_default_append is just one among
several, and probably not the most important one). If that makes the code
simpler, we could use if constexpr and limit the optimization to recent
standards. If one of the relocation papers ever makes it through the committee,
it will likely require this optimization (or at least make it an important QoI
point).
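
A minimal sketch of what such a special case might look like with if
constexpr (C++17). The trait is_nothrow_relocatable and the function
relocate_range are hypothetical names, not existing libstdc++ entities; the
trait is approximated here with the standard nothrow move-construct and
destroy traits, and allocator support is left out.

```cpp
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when move-construct + destroy cannot throw.
template <class T>
struct is_nothrow_relocatable
    : std::integral_constant<bool,
                             std::is_nothrow_move_constructible<T>::value &&
                                 std::is_nothrow_destructible<T>::value> {};

// Hypothetical helper: transfers [first, last) into uninitialized storage
// at dest and ends the lifetime of the sources.
template <class T>
void relocate_range(T* first, T* last, T* dest) {
  if constexpr (is_nothrow_relocatable<T>::value) {
    // Fast path: a single loop doing move + destroy per element.
    for (; first != last; ++first, ++dest) {
      ::new (static_cast<void*>(dest)) T(std::move(*first));
      first->~T();
    }
  } else {
    // General path: construct everything first (copying when the move
    // may throw), undo on failure, and destroy the sources only at the end.
    T* cur = dest;
    try {
      for (T* p = first; p != last; ++p, ++cur)
        ::new (static_cast<void*>(cur)) T(std::move_if_noexcept(*p));
    } catch (...) {
      std::destroy(dest, cur);
      throw;
    }
    std::destroy(first, last);
  }
}

// std::string takes the fast path under this approximation.
static_assert(is_nothrow_relocatable<std::string>::value,
              "std::string is expected to be nothrow-relocatable");
```

Keeping both branches in one function is the part that if constexpr makes
simpler; in older standards the same dispatch would need tag dispatching or
overloads.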

There are probably places outside of vector that could also benefit, but vector
looks like a good starting point.

