This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.


Re: std::vector move assign patch

From: Jonathan Wakely <jwakely at redhat dot com>
To: libstdc++ at gcc dot gnu dot org
Cc: Jonathan Wakely <jwakely dot gcc at gmail dot com>, gcc-patches <gcc-patches at gcc dot gnu dot org>
Date: Tue, 25 Apr 2017 16:35:30 +0100
Subject: Re: std::vector move assign patch
References: <52BDC6AD.5030207@gmail.com> <CAMe9rOok7WqKip7MAB885B_ZJonx5H3UpvUyZopiHur6W-tk2w@mail.gmail.com> <CAH6eHdSuwbuXzs0fmy4iJ6z+AUiFaKXNe1qhzzmNACnsdNKXhQ@mail.gmail.com> <alpine.DEB.2.20.1704220003540.2493@stedding.saclay.inria.fr> <20170425125253.GE5109@redhat.com> <alpine.DEB.2.20.1704251536320.2317@stedding.saclay.inria.fr>

On 25/04/17 17:23 +0200, Marc Glisse wrote:

On Tue, 25 Apr 2017, Jonathan Wakely wrote:
On 24/04/17 22:10 +0200, Marc Glisse wrote:
It seems that this patch had 2 consequences that may or may not have been planned. Consider this example (from PR64601):
#include <utility>  // std::move
#include <vector>
typedef std::vector<int> V;
void f(V&v,V&w){ V(std::move(w)).swap(v); }
void g(V&v,V&w){ v=std::move(w); }
1) We generate shorter code for f than for g, probably since the fix for PR59738. g ends up zeroing v, copying w to v, and finally zeroing w, and for weird reasons (and because we swap the members one by one) the standard prevents us from assuming that v and w do not overlap in weird ways, so we cannot optimize as much as one might expect.
f has an additional precondition (that the allocators of the vectors
being swapped must propagate on swap or be equal) and so the swap code
doesn't have to worry about non-equal allocators.

g has to be able to cope with the case where the allocator doesn't
propagate and isn't equal, and so is more complicated.

However, the propagation trait is known at compile-time, and for the
common case so is the equality condition, so it's unfortunate if that
can't be simplified (I'm sure you've analysed it carefully already
though!)
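
A minimal sketch of the compile-time check described above, assuming C++17 for `is_always_equal`. The names `can_always_steal` and `TaggedAlloc` are hypothetical and exist only for this illustration; they are not part of libstdc++.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical helper: true when move assignment of a container using
// allocator A never needs the run-time "are the allocators equal?" branch.
template <typename A>
constexpr bool can_always_steal()
{
  using traits = std::allocator_traits<A>;
  return traits::propagate_on_container_move_assignment::value
      || traits::is_always_equal::value;
}

// Hypothetical stateful allocator: it does not propagate on move assignment
// and two instances may compare unequal, so the slow path must be kept.
template <typename T>
struct TaggedAlloc {
  using value_type = T;
  int tag = 0;
  T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
  void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <typename T>
bool operator==(const TaggedAlloc<T>& a, const TaggedAlloc<T>& b) { return a.tag == b.tag; }
template <typename T>
bool operator!=(const TaggedAlloc<T>& a, const TaggedAlloc<T>& b) { return !(a == b); }

// The common case: std::allocator can always take over the source's storage.
static_assert(can_always_steal<std::allocator<int>>(), "fast path only");
// The awkward case: the decision depends on a run-time comparison.
static_assert(!can_always_steal<TaggedAlloc<int>>(), "needs run-time check");
```

For `std::vector<int>` with the default allocator both traits are true, which is why one would hope the extra branch in g disappears entirely.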
The code isn't horrible. With f, we get:

       movq    (%rsi), %r8
       movq    8(%rsi), %rcx
       movq    $0, (%rsi)
       movq    $0, 8(%rsi)
       movq    16(%rsi), %rdx
       movq    $0, 16(%rsi)
       movq    (%rdi), %rax
       movq    %rcx, 8(%rdi)
       movq    %r8, (%rdi)
       movq    %rdx, 16(%rdi)
       testq   %rax, %rax
which seems quite optimal: read each pointer from w, write them to v, write 0s in w, that's 9 memory operations, +1 to read the pointer from w and possibly call delete on it.
With g:

       movq    $0, 8(%rdi)
       movq    (%rdi), %rax
       movq    $0, 16(%rdi)
       movq    $0, (%rdi)
       movq    (%rsi), %rdx
       movq    %rdx, (%rdi)
       movq    8(%rsi), %rcx
       movq    $0, (%rsi)
       movq    8(%rdi), %rdx
       movq    %rcx, 8(%rdi)
       movq    16(%rsi), %rcx
       movq    %rdx, 8(%rsi)
       movq    16(%rdi), %rdx
       movq    %rcx, 16(%rdi)
       movq    %rdx, 16(%rsi)
       testq   %rax, %rax
That's only 5 more memory operations. If I tweak vector swapping to avoid calling swap on each member (which drops type-based aliasing information, that was the topic of PR64601)


I didn't really understand the discussion in the PR. I find that's
true of most TBAA discussions.

std::swap(T& x, T& y) is hard to optimise because we don't know that
the dynamic type of the thing at &x is the same type as T?

       void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT
       {
         pointer tmp;
#define MARC(x,y) tmp=x; x=y; y=tmp
         MARC(_M_start, __x._M_start);
         MARC(_M_finish, __x._M_finish);
         MARC(_M_end_of_storage, __x._M_end_of_storage);
       }

this gets down to 13, which is kind of sensible
* 0 the elements of v -> 3 ops
* read the elements of w -> 3 ops
* write them to v -> 3 ops
* 0 the elements of w -> 3 ops
(+1 to get the pointer that we might call delete on)

The first step of zeroing the elements of v is redundant
* if v and w don't alias, we are going to overwrite those 0s in step 3 without ever reading them
* if v and w are the same, we are going to write those 0s in step 4 anyway

but that's hard for the optimizers to notice.
I didn't try hard to find a nice C++ way to get an equivalent of g that generates the optimal number of operations, but it would be a little ugly to write in operator=
this->_M_impl._M_finish = x._M_impl._M_finish; x._M_impl._M_finish = 0;
same for _M_end_of_storage and _M_start, and remembering to use the original this->_M_impl._M_start for delete.
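
The hand-written approach can be sketched on a toy type. This is only an illustration under stated assumptions: `Buf` and `steal` are hypothetical names, `Buf` stands in for the three-pointer representation of std::vector<int>, and allocators, element destruction and exception safety are all ignored.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the three pointers held by std::vector<int>.
struct Buf {
  int* start = nullptr;
  int* finish = nullptr;
  int* end_of_storage = nullptr;

  Buf() = default;
  explicit Buf(std::size_t n)
    : start(new int[n]()), finish(start + n), end_of_storage(start + n) {}
  Buf(const Buf&) = delete;
  Buf& operator=(const Buf&) = delete;
  ~Buf() { delete[] start; }

  std::size_t size() const { return static_cast<std::size_t>(finish - start); }
};

// Member-by-member "copy, then zero the source", keeping the destination's
// original start pointer so its storage can be released at the end.
void steal(Buf& dst, Buf& src)
{
  int* old = dst.start;  // the pointer we may have to delete
  dst.start = src.start;                   src.start = nullptr;
  dst.finish = src.finish;                 src.finish = nullptr;
  dst.end_of_storage = src.end_of_storage; src.end_of_storage = nullptr;
  delete[] old;
}
```

When dst and src are the same object, each pair of statements ends by writing a null pointer, so the object comes out empty and its old storage is freed exactly once. That matches the behaviour described for g(v,v).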


I'm not opposed to writing it out by hand. Operations on std::vector
should be as fast as possible, and move-assignment should be cheap.

2) g(v,v) seems to turn v into a nice empty vector,
Yes.
while f(v,v) turns it into an invalid vector pointing at released memory.
Does it?! I don't see that happening, and it's a bug if it does.
Er, my fault, you are right. It shuffles things around and amounts to a NOP, v remains as it was before the call to f. Which could be considered less desirable than clearing the vector by some, but not as much as getting something invalid of course :-)


Phew :-)
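
For anyone wanting to check the self-aliasing case, here is a small sketch reusing f and g from the example at the top. The helper `f_on_self_keeps_contents` is a hypothetical name added for illustration. Only the result for f is asserted, since the temporary receives v's elements and the swap hands them straight back. The state of v after g(v,v) is merely "valid"; with this patch libstdc++ is reported above to leave it empty, but that is not something to rely on in general.

```cpp
#include <cassert>
#include <utility>
#include <vector>

typedef std::vector<int> V;
void f(V& v, V& w) { V(std::move(w)).swap(v); }
void g(V& v, V& w) { v = std::move(w); }

// Hypothetical helper: returns true if f(v, v) leaves v's contents unchanged.
bool f_on_self_keeps_contents()
{
  V v{1, 2, 3};
  const V before = v;
  f(v, v);  // temporary takes v's elements, then swaps them back into v
  return v == before;
}
```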

