Created attachment 34807 [details] testcase Even in C++11 mode, even with the C++11 alignas keyword, element alignment isn't honored by std::vector. $ g++ vectoralign.cpp -o a --std=c++11 && ./a ctor 0x2422010 ctor 0x2422050 Tested with a freshly built GCC 4.9.2 on Ubuntu x86-64. Attaching testcase. Note: this is morphing bug 53900.
Homologous libc++ bug report: http://llvm.org/bugs/show_bug.cgi?id=22634
IMHO the only sensible solution is in this direction: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3396.htm I hope Clark is still working on this...
I'd be interested in an explanation of why the default STL allocator can't just honor the alignment of the value_type ? (The document linked in comment 2 seemed to assume that that goes without saying?)
(In reply to Benoit Jacob from comment #3) > I'd be interested in an explanation of why the default STL allocator can't > just honor the alignment of the value_type ? [allocator.members] "It is implementation-defined whether over-aligned types are supported" so we don't really have to. "the storage is obtained by calling ::operator new(std::size_t)" so we can't use posix_memalign (providing a separate allocator that does could be a good idea though). We could, when the type is over-aligned, do the usual trick of requesting too much memory so we have enough margin to find a suitably aligned region inside, and write a marker before so we remember where the real allocation started. That might be what was tried in PR55727 (I didn't check), the maintainers might be ok with this if someone posts a patch. But calling new T would still be broken. It seems better in the long term to add the aligned versions of operator new and make both new and std::allocator use those (in the standard). Note that there would be ABI issues switching from the trick in the previous paragraph to this.
So while the standard says that over-aligned types dont have to be supported, it also says in 3.11/9 in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf that: > If a request for a specific extended alignment in a specific context is not supported by an implementation, > the program is ill-formed. Additionally, a request for runtime allocation of dynamic storage for which the > requested alignment cannot be honored shall be treated as an allocation failure In my naive understanding, that sounds like if over-aligned allocation is not supported then it must be an allocation failure (i.e. not fail silently to honor alignment). That's relevant because failing all over-allocated allocations is probably not something that a compiler could do in the real world (that would break a lot of existing software) and so this 3.11/9 clause might then be de-facto forcing compilers to support over-allocated allocation. What do you think? How else would you interprete 3.11/9?
(In reply to Marc Glisse from comment #4) > "the storage is obtained by > calling ::operator new(std::size_t)" so we can't use posix_memalign Ouch. That's very unfortunate. I see. I would still be interested in how you understand 3.11/9 and how to reconcile it with that without breaking a lot of software.
(In reply to Benoit Jacob from comment #6) > I would still be interested in how you understand 3.11/9 I consider it a defect in the standard, so it needs fixing, not understanding...
If there is a defect in the standard, isn't it in the part that forces the compiler to not use the useful type information that it has, that is, the above-quoted "the storage is obtained by calling ::operator new(std::size_t)" ?
s/compiler/standard library
(In reply to Benoit Jacob from comment #6) > (In reply to Marc Glisse from comment #4) > > "the storage is obtained by > > calling ::operator new(std::size_t)" so we can't use posix_memalign > > Ouch. That's very unfortunate. I see. I would still be interested in how you > understand 3.11/9 and how to reconcile it with that without breaking a lot > of software. But ::operator new(std::size_t) could always return memory aligned for the most over-aligned type? Thus our default new implementation could use posix_memalign (..., MIN (size, BIGGEST_ALIGNMENT), size)? If the user provides its own ::new then he is on its own, of course (and doing that and using posix_memalign in it would be a workaround for this issue?!).
(In reply to Richard Biener from comment #10) > But ::operator new(std::size_t) could always return memory aligned for the > most over-aligned type? Thus our default new implementation could use > posix_memalign (..., MIN (size, BIGGEST_ALIGNMENT), size)? The problem is there are lots of use cases for really high alignment: AVX brings uses of 32-byte alignment, and cache lines are typically >= 64 byte aligned. So alignas has to support at least 64 or 128 byte alignment to support cache friendly code, and even without that, it would have to support at least 32 byte alignment for AVX vectorized code. Making all allocations that large would lead to substantial allocator slop. For example, jemalloc has a quantum of 8 or 16 byte depending on whether the arch is 32 or 64 bit, so increasing it to 32, 64 or 128 byte would be a big difference. > > If the user provides its own ::new then he is on its own, of course I agree that's how I would like things to be. Unfortunately, the spec quote in comment 4, "the storage is obtained by calling ::operator new(std::size_t)", goes in the opposite direction, by requiring allocators to use the type-blind ::new , thereby losing useful type information such as the type's alignment. That's why I think that this is the spec's weak spot that might be questioned. but you're the expert, not me :) > (and > doing that and using posix_memalign in it would be a workaround for this > issue?!). That could be a good compromise for many applications, that either don't care about minimizing memory usage, or don't do many tiny allocations. Unfortunately, my context here is a library (Eigen), so we can't make this decision for all our users.
(In reply to Benoit Jacob from comment #11) > (In reply to Richard Biener from comment #10) > > But ::operator new(std::size_t) could always return memory aligned for the > > most over-aligned type? Thus our default new implementation could use > > posix_memalign (..., MIN (size, BIGGEST_ALIGNMENT), size)? > > The problem is there are lots of use cases for really high alignment: AVX > brings uses of 32-byte alignment, and cache lines are typically >= 64 byte > aligned. So alignas has to support at least 64 or 128 byte alignment to > support cache friendly code, and even without that, it would have to support > at least 32 byte alignment for AVX vectorized code. > > Making all allocations that large would lead to substantial allocator slop. > For example, jemalloc has a quantum of 8 or 16 byte depending on whether the > arch is 32 or 64 bit, so increasing it to 32, 64 or 128 byte would be a big > difference. Though does it really matter in practice? Tiny allocations would not suffer because of the MIN (align, size), so the worst-case is max-align + 1 allocations. Btw, you could as well try MIN (align, size & -size), thus assume that the allocation size is N * alignof (). > > > > If the user provides its own ::new then he is on its own, of course > > I agree that's how I would like things to be. Unfortunately, the spec quote > in comment 4, "the storage is obtained by calling ::operator > new(std::size_t)", goes in the opposite direction, by requiring allocators > to use the type-blind ::new , thereby losing useful type information such as > the type's alignment. That's why I think that this is the spec's weak spot > that might be questioned. but you're the expert, not me :) > > > (and > > doing that and using posix_memalign in it would be a workaround for this > > issue?!). > > That could be a good compromise for many applications, that either don't > care about minimizing memory usage, or don't do many tiny allocations. > Unfortunately, my context here is a library (Eigen), so we can't make this > decision for all our users.
(In reply to Richard Biener from comment #12) > (In reply to Benoit Jacob from comment #11) > > (In reply to Richard Biener from comment #10) > > > But ::operator new(std::size_t) could always return memory aligned for the > > > most over-aligned type? Thus our default new implementation could use > > > posix_memalign (..., MIN (size, BIGGEST_ALIGNMENT), size)? > > > > The problem is there are lots of use cases for really high alignment: AVX > > brings uses of 32-byte alignment, and cache lines are typically >= 64 byte > > aligned. So alignas has to support at least 64 or 128 byte alignment to > > support cache friendly code, and even without that, it would have to support > > at least 32 byte alignment for AVX vectorized code. > > > > Making all allocations that large would lead to substantial allocator slop. > > For example, jemalloc has a quantum of 8 or 16 byte depending on whether the > > arch is 32 or 64 bit, so increasing it to 32, 64 or 128 byte would be a big > > difference. > > Though does it really matter in practice? Tiny allocations would not suffer > because of the MIN (align, size), so the worst-case is max-align + 1 > allocations. Btw, you could as well try MIN (align, size & -size), > thus assume that the allocation size is N * alignof (). Oh, I was missing the MIN (size, BIGGEST_ALIGNMENT) part of your proposal. So you're right, that nicely takes care of tiny allocations. Correct also on MIN (align, size & -size) (though I have to trust your bit wizardry here) so that neatly exploits the low bits of the size parameter to infer the alignof --- sounds very good to me!
(In reply to Benoit Jacob from comment #13) > (In reply to Richard Biener from comment #12) > > (In reply to Benoit Jacob from comment #11) > > > (In reply to Richard Biener from comment #10) > > > > But ::operator new(std::size_t) could always return memory aligned for the > > > > most over-aligned type? Thus our default new implementation could use > > > > posix_memalign (..., MIN (size, BIGGEST_ALIGNMENT), size)? > > > > > > The problem is there are lots of use cases for really high alignment: AVX > > > brings uses of 32-byte alignment, and cache lines are typically >= 64 byte > > > aligned. So alignas has to support at least 64 or 128 byte alignment to > > > support cache friendly code, and even without that, it would have to support > > > at least 32 byte alignment for AVX vectorized code. > > > > > > Making all allocations that large would lead to substantial allocator slop. > > > For example, jemalloc has a quantum of 8 or 16 byte depending on whether the > > > arch is 32 or 64 bit, so increasing it to 32, 64 or 128 byte would be a big > > > difference. > > > > Though does it really matter in practice? Tiny allocations would not suffer > > because of the MIN (align, size), so the worst-case is max-align + 1 > > allocations. Btw, you could as well try MIN (align, size & -size), > > thus assume that the allocation size is N * alignof (). > > Oh, I was missing the MIN (size, BIGGEST_ALIGNMENT) part of your proposal. > So you're right, that nicely takes care of tiny allocations. Correct also on > MIN (align, size & -size) (though I have to trust your bit wizardry here) > so that neatly exploits the low bits of the size parameter to infer the > alignof --- sounds very good to me! That trick of course requires you to not "optimize" your allocations manually to omit tail padding in aggregates (or manually compute allocation sizes). Or use struct { avx512vector x; char c; } __attribute__((packed,aligned(64))); which has alignment 64 and size 65... (of course arrays of such objects are ill-defined)
Oh, and I wonder if there is an aligned realloc (though C++ doesn't define sth like realloc and thus std::vector can't optimize the copy when reallocating?!). But for a C program using posix_memalign or any other such facility means realloc is out of the question as well.
I think this is a dup of PR 36159.
I think this is fixed for GCC 7 with -std=c++17 support.
(In reply to Andrew Pinski from comment #17) > I think this is fixed for GCC 7 with -std=c++17 support. Thanks for the update, that's great news!
(In reply to Andrew Pinski from comment #17) > I think this is fixed for GCC 7 with -std=c++17 support. No, it isn't. new T[10] will give suitably aligned memory, but not std::allocator<T>. Only the core part of alignment support was added in C++17, we are expecting a corresponding library part in C++20. On the other hand, providing it as an extension in C++17-mode (gnu++17?) might be ok (requires a discussion).
In C++17 std::allocator no longer says "It is implementation-defined whether over-aligned types are supported" and is no longer required to call operator new(size_t), it is supposed to use the over-aligned form as appropriate. We should do that when __cpp_aligned_new is defined.
Created attachment 39686 [details] Make new_allocator support types with new-extended alignment This works, but currently produces a warning due to PR 77742.
(In reply to Jonathan Wakely from comment #20) > In C++17 std::allocator no longer says "It is implementation-defined whether > over-aligned types are supported" and is no longer required to call operator > new(size_t), it is supposed to use the over-aligned form as appropriate. Ah, I had missed those last few lines of p0035r4 (they were not in r2). That's great news! I remember discussions and polls about the fact that the proposal wasn't handling the library part, is there still anything missing (postponed?) or did the last couple revisions add everything?
I think it's supposed to be cover everything, but I'm not sure.
Fixed for GCC 7
Author: redi Date: Fri Oct 14 12:03:47 2016 New Revision: 241158 URL: https://gcc.gnu.org/viewcvs?rev=241158&root=gcc&view=rev Log: PR65122 extended alignment support in allocators PR libstdc++/65122 * include/ext/malloc_allocator.h (malloc_allocator::allocate): Use aligned_alloc for types with extended alignment if available, otherwise throw bad_alloc if malloc doesn't return a suitable value. * include/ext/bitmap_allocator.h (bitmap_allocator::allocate) (bitmap_allocator::deallocate): Use aligned new/delete for types with extended alignment. * include/ext/mt_allocator.h (__mt_alloc::allocate) (__mt_alloc::deallocate): Likewise. * include/ext/new_allocator.h (new_allocator::allocate) (new_allocator::deallocate): Likewise. * include/ext/pool_allocator.h (__pool_alloc::allocate) (__pool_alloc::deallocate): Likewise. * testsuite/20_util/allocator/overaligned.cc: New test. * testsuite/ext/bitmap_allocator/overaligned.cc: New test. * testsuite/ext/malloc_allocator/overaligned.cc: New test. * testsuite/ext/mt_allocator/overaligned.cc: New test. * testsuite/ext/new_allocator/overaligned.cc: New test. * testsuite/ext/pool_allocator/overaligned.cc: New test. Added: trunk/libstdc++-v3/testsuite/20_util/allocator/overaligned.cc trunk/libstdc++-v3/testsuite/ext/bitmap_allocator/overaligned.cc trunk/libstdc++-v3/testsuite/ext/malloc_allocator/overaligned.cc trunk/libstdc++-v3/testsuite/ext/mt_allocator/overaligned.cc trunk/libstdc++-v3/testsuite/ext/new_allocator/overaligned.cc trunk/libstdc++-v3/testsuite/ext/pool_allocator/overaligned.cc Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/include/ext/bitmap_allocator.h trunk/libstdc++-v3/include/ext/malloc_allocator.h trunk/libstdc++-v3/include/ext/mt_allocator.h trunk/libstdc++-v3/include/ext/new_allocator.h trunk/libstdc++-v3/include/ext/pool_allocator.h