This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Re: [GSoC] __enable_shared_from_this_helper


On 28/04/15 22:16 -0700, Tim Shen wrote:
Just for background, maybe Jon's stackoverflow answer helps:
<http://stackoverflow.com/questions/13912286/intrusive-ptr-in-c11>

There are also design notes at
https://gcc.gnu.org/onlinedocs/libstdc++/manual/memory.html#std.util.memory.shared_ptr

On Tue, Apr 28, 2015 at 7:21 PM, Fan You <youfan.noey@gmail.com> wrote:
To make sure I understand the question.

The shared_count actually allocates memory of exactly the same size
as a single _Sp_cp_type object by doing this:

_Sp_cp_type* __mem = _Alloc_traits::allocate(__a2, 1);

This is correct.
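
As a minimal sketch of that single-object allocation (control_block,
make_block and destroy_block are hypothetical names for illustration,
not libstdc++ internals), the user's allocator is rebound to the
control block type and asked for exactly one object:

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a control block that stores the value in place.
template<typename T>
struct control_block {
    long use_count = 1;
    long weak_count = 1;
    T value;
    explicit control_block(const T& v) : value(v) {}
};

// Rebind the user's allocator and allocate room for exactly one block.
template<typename T, typename Alloc>
control_block<T>* make_block(const Alloc& a, const T& v) {
    using cb_alloc = typename std::allocator_traits<Alloc>::
        template rebind_alloc<control_block<T>>;
    using cb_traits = std::allocator_traits<cb_alloc>;
    cb_alloc a2(a);
    control_block<T>* mem = cb_traits::allocate(a2, 1);  // one object only
    cb_traits::construct(a2, mem, v);  // exception safety omitted for brevity
    return mem;
}

// Destroy the block and return its storage through the same rebound allocator.
template<typename T, typename Alloc>
void destroy_block(const Alloc& a, control_block<T>* mem) {
    using cb_alloc = typename std::allocator_traits<Alloc>::
        template rebind_alloc<control_block<T>>;
    using cb_traits = std::allocator_traits<cb_alloc>;
    cb_alloc a2(a);
    cb_traits::destroy(a2, mem);
    cb_traits::deallocate(a2, mem, 1);
}
```

Calling make_block(std::allocator<int>(), 5) yields one block holding
the value 5. The allocation count is fixed at 1, which is why this
path cannot simply be reused for an array of n elements.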

However, assume we have some usage like this:

auto sp1 = std::experimental::make_shared<int>(5);

auto sp2 = std::experimental::make_shared<int[5]>(5);


Both of them use std::allocator<int> as an allocator. They will have

That's what you need to fix :).

the same behavior when allocating memory (they allocate only a single
_Sp_cp_type object). And it's clearly not right to do something like:

_Sp_cp_type* __mem = _Alloc_traits::allocate(__a2, size);


So, what's the best method to allocate the right size for array type?

Looking at three derived classes of _Sp_counted_base:
1) _Sp_counted_ptr, a simple _M_ptr with ownership and default deleter
("delete _M_ptr").

2) _Sp_counted_deleter, a _M_ptr with user customized alloc and deleter.

3) _Sp_counted_ptr_inplace, a piece of *data* instead of a pointer,
which doesn't need to be deallocated separately.

The reason for 1) is less overhead; for 2) it is user needs; for 3)
it is fewer allocations.

Interestingly, I think _Sp_counted_ptr can be implemented in terms of
_Sp_counted_deleter, with default allocator and deleter, both of which
cost 0 bytes of storage after EBO. Maybe it's just for readability?

I think originally I didn't use EBO for the deleter, only the
allocator, so _Sp_counted_ptr had less overhead. Now EBO is used for
both, and potentially we could have removed _Sp_counted_ptr.
For array support it would be fine to combine _Sp_counted_ptr and
_Sp_counted_deleter into a single _Sp_counted_array type that manages
an array and stores a deleter and an allocator.
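
A small sketch of the EBO point (block_with_member and block_with_ebo
are hypothetical types, not the libstdc++ ones): storing an empty
deleter as a member costs space, while deriving from it does not.

```cpp
#include <cassert>
#include <memory>

// Deleter stored as a data member: even an empty deleter occupies storage.
template<typename T, typename Del>
struct block_with_member {
    T* ptr;
    Del del;
};

// Deleter stored as a base class: an empty deleter adds no bytes (EBO).
template<typename T, typename Del>
struct block_with_ebo : private Del {
    T* ptr;
    block_with_ebo(T* p, Del d) : Del(d), ptr(p) {}
    // Invoke the deleter on the owned pointer.
    void dispose() { static_cast<Del&>(*this)(ptr); ptr = nullptr; }
};

using int_deleter = std::default_delete<int>;

static_assert(sizeof(block_with_ebo<int, int_deleter>) == sizeof(int*),
              "empty deleter base should not add storage");
static_assert(sizeof(block_with_member<int, int_deleter>) > sizeof(int*),
              "empty deleter member still takes space");
```

With both the deleter and the allocator held this way, a block with
the default deleter is no larger than one holding just the pointer,
which is the reason the separate _Sp_counted_ptr type could
potentially go away.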

The question worth asking is: can these strategies be used in
shared_ptr for arrays? For each of them, if so, how?
For 1), use delete[].

For 2), the user will provide the allocator and deleter, but we still
need to maintain the length.

When the user allocates the array and passes it to the shared_ptr
constructor we don't need to store the length, because we won't be
deallocating it. The only deallocation is for the _Sp_counted_array
itself and we know the size of that.

For 3),

N.B. make_shared for arrays is *not* part of the Library Fundamentals
v1 TS. There is a proposal
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3939.html)
but it hasn't been approved or included in any working paper.

That means make_shared<T[]> and make_shared<T[N]> are not required for
this GSoC project.

It would be great if they could be included, but they can be left until
later. I'm fairly sure that the changes needed for make_shared to
support arrays will be largely independent of any other work and so
you don't need to consider it now while designing the
_Sp_counted_array type.

The user will provide an allocator, we maintain the length, let's say
n, and store the n objects *adjacent to the other data*, such as the
two counters in _Sp_counted_base and the length. This is possible in C
using malloc and a flexible array member, but to my knowledge it's
hard or impossible to implement with a C++ allocator?

Certainly not impossible. You don't even need a new type; you could
just reuse the same _Sp_counted_array type:

const size_t aligned_sp = sizeof(_Sp_counted_array<T>) + alignof(_Sp_counted_array<T>) - 1;

const size_t aligned_array = n * sizeof(T) + alignof(T) - 1;

const size_t aligned_len = sizeof(size_t) + alignof(size_t) - 1;

// "typename" and "template" are required because Alloc is a dependent type.
using char_alloc = typename allocator_traits<Alloc>::template rebind_alloc<char>;
using char_alloc_traits = allocator_traits<char_alloc>;

char_alloc ca(alloc);

auto p = char_alloc_traits::allocate(ca, aligned_sp
                                        + 1   // unsigned char storing the initial padding
                                        + aligned_len
                                        + aligned_array);

Now use std::align() and placement new to construct into that buffer,
with the following layout:

 Initial padding required to align _Sp_counted_array<T> correctly
 _Sp_counted_array<T>
 unsigned char storing size of initial padding
 padding necessary to align size_t correctly
 size_t storing length n
 padding necessary to align T correctly
 T[n]

The _Sp_counted_array::_M_ptr member would point to the start of the
T[] array. The length of the array is at a known offset to the start
of the _Sp_counted_array object.

To deallocate, simply read the unsigned char that stores the initial
padding and subtract that from the address of the _Sp_counted_array
object to find the original pointer, then recalculate the same
number of bytes, rebind the allocator to char again and deallocate.

This misses some optimization opportunities and wastes some space on
padding to be safe, but should work.
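
The layout and the deallocation steps above can be sketched roughly
as follows. counted_array is a hypothetical stand-in for
_Sp_counted_array<T>, and buffer_size, bump, create_block, length_of
and destroy_block are illustrative names only. The sketch assumes the
initial padding fits in an unsigned char and leaves out exception
safety.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical stand-in for _Sp_counted_array<T>: two counters plus _M_ptr.
template<typename T>
struct counted_array {
    long use_count = 1;
    long weak_count = 1;
    T* ptr = nullptr;
};

// Worst-case byte count: each part plus the padding it might need.
template<typename T>
constexpr std::size_t buffer_size(std::size_t n) {
    return sizeof(counted_array<T>) + alignof(counted_array<T>) - 1
         + 1
         + sizeof(std::size_t) + alignof(std::size_t) - 1
         + n * sizeof(T) + alignof(T) - 1;
}

// Step p past `bytes` bytes and shrink the remaining space to match.
inline void bump(void*& p, std::size_t& space, std::size_t bytes) {
    p = static_cast<char*>(p) + bytes;
    space -= bytes;
}

template<typename T, typename Alloc>
counted_array<T>* create_block(const Alloc& alloc, std::size_t n) {
    using char_alloc = typename std::allocator_traits<Alloc>::
        template rebind_alloc<char>;
    char_alloc ca(alloc);
    std::size_t space = buffer_size<T>(n);
    char* raw = std::allocator_traits<char_alloc>::allocate(ca, space);
    void* p = raw;

    // Control block, preceded by whatever padding std::align inserts.
    std::align(alignof(counted_array<T>), sizeof(counted_array<T>), p, space);
    const auto pad = static_cast<unsigned char>(static_cast<char*>(p) - raw);
    auto* cb = ::new (p) counted_array<T>();
    bump(p, space, sizeof(counted_array<T>));

    ::new (p) unsigned char(pad);             // size of the initial padding
    bump(p, space, 1);

    std::align(alignof(std::size_t), sizeof(std::size_t), p, space);
    ::new (p) std::size_t(n);                 // array length
    bump(p, space, sizeof(std::size_t));

    std::align(alignof(T), n * sizeof(T), p, space);
    cb->ptr = static_cast<T*>(p);
    for (std::size_t i = 0; i < n; ++i)
        ::new (static_cast<void*>(cb->ptr + i)) T();  // value-initialize
    return cb;
}

// Find the stored length by repeating the alignment step from the block.
template<typename T>
std::size_t length_of(counted_array<T>* cb) {
    void* p = reinterpret_cast<char*>(cb) + sizeof(counted_array<T>) + 1;
    std::size_t space = sizeof(std::size_t) + alignof(std::size_t) - 1;
    std::align(alignof(std::size_t), sizeof(std::size_t), p, space);
    return *static_cast<std::size_t*>(p);
}

template<typename T, typename Alloc>
void destroy_block(const Alloc& alloc, counted_array<T>* cb) {
    using char_alloc = typename std::allocator_traits<Alloc>::
        template rebind_alloc<char>;
    using block = counted_array<T>;
    const std::size_t n = length_of(cb);
    for (std::size_t i = n; i > 0; --i)
        cb->ptr[i - 1].~T();                  // destroy in reverse order
    // The byte right after the block records the initial padding.
    const unsigned char pad =
        *reinterpret_cast<unsigned char*>(reinterpret_cast<char*>(cb) + sizeof(block));
    char* raw = reinterpret_cast<char*>(cb) - pad;
    cb->~block();
    char_alloc ca(alloc);
    std::allocator_traits<char_alloc>::deallocate(ca, raw, buffer_size<T>(n));
}
```

For example, create_block<double>(std::allocator<double>(), 5)
returns a block whose ptr member addresses five zero-initialized
doubles, and destroy_block recovers the original pointer and byte
count from the block alone.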

A smarter implementation would use a different type instead of
_Sp_counted_array and rely on the non-standard extension of flexible
array members, e.g.

 template<typename T, typename Del, typename Alloc>
   struct _Sp_counted_array_inplace
   {
     struct Impl : ebo_helper<1, Del>, ebo_helper<2, Alloc>
     {
       size_t len; // array length
     };

     Impl impl;
     unsigned char initial_padding;
     T arr[];
   };

This takes care of the internal padding, so you only need to use
std::align to place the _Sp_counted_array_inplace object into the char
buffer (and record the offset in the initial_padding member).


It may be
possible for a small _Tp, so we just allocate a few more slots
for our extra data (and we also need to handle alignment well); but if
_Tp is large, then there's no way to do this?

The conclusion is: adopt 1) for potentially less overhead; adopt 2) for
user needs; drop 3), at least for now, since it's hard to implement.

Drop 3 for now because it's not required, maybe come back to it if
time permits.

Coming back to your question, I don't think you need to care about
make_shared optimization :)

Agreed.

But yeah, for implementing 2) you still need to propagate the length.

Why? Am I missing something?

The shared_ptr doesn't use the length in these examples:

 shared_ptr<int[]> p(new int[3]);

 shared_ptr<int[4]> q(new int[4]);

The shared_count only needs to allocate/deallocate a
_Sp_counted_array<int> in both cases, and that has a known size.
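
To illustrate why no length is needed on this path, here is a small
sketch using the C++11 spelling (a shared_ptr<T> with an array
deleter) rather than the TS shared_ptr<T[]>; tracked and
array_released_without_length are hypothetical names. The deleter
runs delete[] on the stored pointer, so shared_ptr itself never needs
to know how many elements there are.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Counts live objects so we can see that every element gets destroyed.
struct tracked {
    static int live;
    tracked() { ++live; }
    ~tracked() { --live; }
};
int tracked::live = 0;

// Returns true if n elements were alive while owned and none afterwards.
bool array_released_without_length(std::size_t n) {
    bool all_alive = false;
    {
        // The user allocates the array; shared_ptr only stores pointer + deleter.
        std::shared_ptr<tracked> p(new tracked[n],
                                   std::default_delete<tracked[]>());
        all_alive = (tracked::live == static_cast<int>(n));
    }   // last owner goes away: the deleter calls delete[] on the pointer
    return all_alive && tracked::live == 0;
}
```

The control block allocated here holds only the pointer and the
deleter, so its size is the same whether the array has 3 elements or
3000.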

