std::vector move assign patch
Jonathan Wakely
jwakely@redhat.com
Tue Apr 25 15:35:00 GMT 2017
On 25/04/17 17:23 +0200, Marc Glisse wrote:
>On Tue, 25 Apr 2017, Jonathan Wakely wrote:
>
>>On 24/04/17 22:10 +0200, Marc Glisse wrote:
>>>It seems that this patch had 2 consequences that may or may not
>>>have been planned. Consider this example (from PR64601)
>>>
>>>#include <vector>
>>>typedef std::vector<int> V;
>>>void f(V&v,V&w){ V(std::move(w)).swap(v); }
>>>void g(V&v,V&w){ v=std::move(w); }
>>>
>>>1) We generate shorter code for f than for g, probably since the
>>>fix for PR59738. g ends up zeroing v, copying w to v, and finally
>>>zeroing w, and for weird reasons (and because we swap the members
>>>one by one) the standard prevents us from assuming that v and w do
>>>not overlap in weird ways so we cannot optimize as much as one
>>>might expect.
>>
>>f has an additional precondition (that the allocators of the vectors
>>being swapped must propagate on swap or be equal) and so the swap code
>>doesn't have to worry about non-equal allocators.
>>
>>g has to be able to cope with the case where the allocator doesn't
>>propagate and isn't equal, and so is more complicated.
>>
>>However, the propagation trait is known at compile-time, and for the
>>common case so is the equality condition, so it's unfortunate if that
>>can't be simplified (I'm sure you've analysed it carefully already
>>though!)
>
>The code isn't horrible. With f, we get:
>
> movq (%rsi), %r8
> movq 8(%rsi), %rcx
> movq $0, (%rsi)
> movq $0, 8(%rsi)
> movq 16(%rsi), %rdx
> movq $0, 16(%rsi)
> movq (%rdi), %rax
> movq %rcx, 8(%rdi)
> movq %r8, (%rdi)
> movq %rdx, 16(%rdi)
> testq %rax, %rax
>
>which seems quite optimal: read each pointer from w, write them to v,
>write 0s in w, that's 9 memory operations, +1 to read the pointer from
>w and possibly call delete on it.
>
>With g:
>
> movq $0, 8(%rdi)
> movq (%rdi), %rax
> movq $0, 16(%rdi)
> movq $0, (%rdi)
> movq (%rsi), %rdx
> movq %rdx, (%rdi)
> movq 8(%rsi), %rcx
> movq $0, (%rsi)
> movq 8(%rdi), %rdx
> movq %rcx, 8(%rdi)
> movq 16(%rsi), %rcx
> movq %rdx, 8(%rsi)
> movq 16(%rdi), %rdx
> movq %rcx, 16(%rdi)
> movq %rdx, 16(%rsi)
> testq %rax, %rax
>
>That's only 5 more memory operations. If I tweak vector swapping to
>avoid calling swap on each member (which drops type-based aliasing
>information, that was the topic of PR64601)
I didn't really understand the discussion in the PR. I find that's
true of most TBAA discussions.
std::swap(T& x, T& y) is hard to optimise because we don't know that
the dynamic type of the thing at &x is the same type as T?
> void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT
> {
> pointer tmp;
>#define MARC(x,y) tmp=x; x=y; y=tmp
> MARC(_M_start, __x._M_start);
> MARC(_M_finish, __x._M_finish);
> MARC(_M_end_of_storage, __x._M_end_of_storage);
> }
>
>this gets down to 13, which is kind of sensible
>* 0 the elements of v -> 3 ops
>* read the elements of w -> 3 ops
>* write them to v -> 3 ops
>* 0 the elements of w -> 3 ops
>(+1 to get the pointer that we might call delete on)
>
>The first step of zeroing the elements of v is redundant
>* if v and w don't alias, we are going to overwrite those 0s in step 3
>without ever reading them
>* if v and w are the same, we are going to write those 0s in step 4 anyway
>
>but that's hard for the optimizers to notice.
>
>I didn't try hard to find a nice C++ way to get an equivalent of g
>that generates the optimal number of operations, but it would be a
>little ugly to write in operator=
>this->_M_impl._M_finish = x._M_impl._M_finish; x._M_impl._M_finish = 0;
>same for _M_end_of_storage and _M_start, and remembering to use the
>original this->_M_impl._M_start for delete.
I'm not opposed to writing it out by hand. Operations on std::vector
should be as fast as possible, and move-assignment should be cheap.
>>>2) g(v,v) seems to turn v into a nice empty vector,
>>
>>Yes.
>>
>>>while f(v,v) turns it into an invalid vector pointing at released memory.
>>
>>Does it?! I don't see that happening, and it's a bug if it does.
>
>Er, my fault, you are right. It shuffles things around and amounts to
>a NOP, v remains as it was before the call to f. Which could be
>considered less desirable than clearing the vector by some, but not as
>much as getting something invalid of course :-)
Phew :-)
More information about the Libstdc++
mailing list