Bug 19495

Summary: basic_string::_M_rep() can produce an unnaturally aligned pointer to _Rep
Product: gcc Reporter: Ben Elliston <bje>
Component: libstdc++Assignee: Paolo Carlini <paolo.carlini>
Status: RESOLVED FIXED    
Severity: normal CC: bkoz, danglin, fang, gcc-bugs, gdr, janis, ncm-nospam
Priority: P2    
Version: 4.0.0   
Target Milestone: ---   
Host: powerpc*-linux Target: powerpc*-linux
Build: Known to work:
Known to fail: Last reconfirmed: 2005-12-23 00:07:02
Bug Depends on: 8670, 24882    
Bug Blocks: 21554    

Description Ben Elliston 2005-01-18 05:14:49 UTC
basic_string.h says:

  /*
   * [...]
   *  Where the _M_p points to the first character in the string, and
   *  you cast it to a pointer-to-_Rep and subtract 1 to get a
   *  pointer to the header.
   * [...]
   */

The test case `testsuite/ext/array_allocator/2.cc' illustrates the bug.  After
linking, the global variable `extern_array' is located at an odd-numbered
address (which is fine; it is a char array).

When _M_rep() is called by basic_string's destructor, a _Rep* is returned that
is unaligned for calling _Rep methods like dispose(), because an odd-numbered
address minus 1*sizeof (_Rep) is still odd-numbered:
      _Rep*
      _M_rep() const
      { return &((reinterpret_cast<_Rep*> (_M_data()))[-1]); }

This method call:
      ~basic_string()
      { _M_rep()->_M_dispose(this->get_allocator()); }

induces a bus error on architectures where misalignment triggers a bus error. 
Is this a libstdc++ bug or a documentation omission?
Comment 1 Paolo Carlini 2005-01-18 09:17:29 UTC
This is essentially a duplicate of libstdc++/8670: we'll try to look again into
it: perhaps we can fix it now, but __attribute__(align(K)) is still very weak,
see c++/19163 and c++/17743.

*** This bug has been marked as a duplicate of 8670 ***
Comment 2 Paolo Carlini 2005-01-18 09:35:21 UTC
Actually, sorry, but I'm not sure this is really the same of libstdc++/8670
(which is *not* about basic_string instantiated for plain char). The problem
here seems that the alignment requirements of basic_string are those of its
_Rep (for plain char, at least, not always, thus 8670), that is, higher than
one. Probably ext/array_allocator/2.cc should be fixed.
Comment 3 Paolo Carlini 2005-01-18 09:45:44 UTC
In other terms, as far as this PR is concerned, basic_string seems ok, just
you cannot create a basic_string<char> object in memory aligned one, as happens
in ext/array_allocator/2.cc, which therefore should be fixed.
Comment 4 Paolo Carlini 2005-01-18 09:53:12 UTC
Benjamin, can you please have a look?
Comment 5 bje@au1.ibm.com 2005-01-19 06:05:17 UTC
Subject: Re:  basic_string::_M_rep() can produce an unnaturally aligned pointer to _Rep

On Tue, Jan 18, 2005 at 09:45:48AM -0000, pcarlini at suse dot de wrote:

> In other terms, as far as this PR is concerned, basic_string seems
> ok, just you cannot create a basic_string<char> object in memory
> aligned one, as happens in ext/array_allocator/2.cc, which therefore
> should be fixed.

Does this behaviour seem a little bit unusual to you?  You said: "You
cannot create a basic_string<char> object in memory aligned one".
That is rather counter-intuitive to a user of libstdc++ who has no
understanding of the implementation details of basic_string.

Ben
Comment 6 Alan Modra 2005-01-19 06:28:02 UTC
The reason for the bus error is the __exchange_and_add decrement of _M_refcount.
 On powerpc, lwarx and stwcx. must have an aligned effective address.
Comment 7 Paolo Carlini 2005-01-19 09:09:21 UTC
> Does this behaviour seem a little bit unusual to you?  You said: "You
> cannot create a basic_string<char> object in memory aligned one".
> That is rather counter-intuitive to a user of libstdc++ who has no
> understanding of the implementation details of basic_string.

I don't think so. Memory provided by new, f.i., is never returned aligned
one. Quite to the contrary, *missing* a knowledge of the internal impl
details, you are supposed to pass to the string object an allocator
returning memory maximally aligned (like the default one does!)
Comment 8 Paolo Carlini 2005-01-19 10:05:56 UTC
Well, I can see that basic_string default allocator, std::allocator<CharT>
according to the standard shall return memory only aligned as CharT requests,
not more; whereas our implementation of std::allocator provides stronger
alignment guarantees. But I can also see that the general requirements in
the standard for allocators do *not* talk about alignment at all...

Therefore, probably, it would be nice if basic_string could make use of
memory only aligned as CharT wants, not more, but, AFAICS, this principle
is not present in the original design. I don't think we can implement it
now, for 4.0, without changing the ABI. I think we should just document
that for our current basic_string memory rerurned by the allocator should
be maximally aligned (in some cases less aligned is ok, but details become
tricky to spell out).
Comment 9 Paolo Carlini 2005-01-19 11:51:23 UTC
For 4.0, besides the documentation issue, I think we should change
ext/array_allocator to use tr1::type_traits::aligned_storage, finally
available. I will post a patch ASAP...
Comment 10 Gabriel Dos Reis 2005-01-19 15:57:18 UTC
Subject: Re:  basic_string::_M_rep() can produce an unnaturally aligned pointer to _Rep

"pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| > Does this behaviour seem a little bit unusual to you?  You said: "You
| > cannot create a basic_string<char> object in memory aligned one".
| > That is rather counter-intuitive to a user of libstdc++ who has no
| > understanding of the implementation details of basic_string.
| 
| I don't think so. Memory provided by new, f.i., is never returned aligned
| one. Quite to the contrary, *missing* a knowledge of the internal impl
| details, you are supposed to pass to the string object an allocator
| returning memory maximally aligned (like the default one does!)

I don't understand the reasoning here.
The allocator is that for char, which is required to provide 
storagte aligned only for char, i.e. in this particular case.
Therefore we have real issue with _Rep.
This is basically the same issue as the one filled by
James Kanze.

-- Gaby
Comment 11 Gabriel Dos Reis 2005-01-19 16:02:47 UTC
Subject: Re:  basic_string::_M_rep() can produce an unnaturally aligned pointer to _Rep

"pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| Well, I can see that basic_string default allocator, std::allocator<CharT>
| according to the standard shall return memory only aligned as CharT requests,
| not more; whereas our implementation of std::allocator provides stronger
| alignment guarantees. But I can also see that the general requirements in
| the standard for allocators do *not* talk about alignment at all...

well, I don't think we can go very far with that stretch :-)

| Therefore, probably, it would be nice if basic_string could make use of
| memory only aligned as CharT wants, not more, but, AFAICS, this principle

Yes.  Basically, we need to have tha aligned attribute work correctly.

| is not present in the original design. I don't think we can implement it
| now, for 4.0, without changing the ABI. I think we should just document
| that for our current basic_string memory rerurned by the allocator should
| be maximally aligned (in some cases less aligned is ok, but details become
| tricky to spell out).

I rather we fix it.  Remember, this is more an optimization issue
than a semantics issue.  An optimization issue that had causes us
more trouble than benefits I believe.  I don't believe it is wise 
for us to go that path down putting more an more restrictions.
With people playing with fancy allocator around, it is likely that
we're going to have more and more of this issue popping up.

In some sense, I'm happy that C++ finally gives progremmers ways
to crontrol storage allocation.

-- Gaby
Comment 12 Paolo Carlini 2005-01-20 00:34:53 UTC
Hi Gaby,

> Yes.  Basically, we need to have tha aligned attribute work correctly.

Agreed, in principle: indeed, we are filing together and taking care of
many PRs in this area.

> | is not present in the original design. I don't think we can implement it
> | now, for 4.0, without changing the ABI. I think we should just document
> | that for our current basic_string memory rerurned by the allocator should
> | be maximally aligned (in some cases less aligned is ok, but details become
> | tricky to spell out).
>
> I rather we fix it.  Remember, this is more an optimization issue
> than a semantics issue.  An optimization issue that had causes us
> more trouble than benefits I believe.  I don't believe it is wise 
> for us to go that path down putting more an more restrictions.
> With people playing with fancy allocator around, it is likely that
> we're going to have more and more of this issue popping up.

Again, in principle I agree, but remember that:
1- It's almost impossible (see messages from Mark) that the
__attribute__(aligned) machinery will be satisfactorily fixed for
4.0.
2- We do *not* want to break the ABI for 4.0, in particular, in
the basic_string area. Indeed, as you can see, we are always
monitoring check-abi and very careful about everything (see
constructor / assignment issue of yesterday)

This is way I'm proposing, *for 4.0, only for 4.0*, to document
a bit some weaknesses wrt alignment, which we always had, only
implicitly, and change a bit the ext/array_allocator things to
not trigger problems, currently (see patch posted on the v3 list)

Sounds reasonable?
Comment 13 Gabriel Dos Reis 2005-01-20 04:08:17 UTC
Subject: Re:  basic_string::_M_rep() can produce an unnaturally aligned pointer to _Rep

"pcarlini at suse dot de" <gcc-bugzilla@gcc.gnu.org> writes:

| This is way I'm proposing, *for 4.0, only for 4.0*, to document
| a bit some weaknesses wrt alignment, which we always had, only
| implicitly, and change a bit the ext/array_allocator things to
| not trigger problems, currently (see patch posted on the v3 list)
| 
| Sounds reasonable?

Works for me.


-- Gaby
Comment 14 ncm-nospam@cantrip.org 2005-01-21 15:22:39 UTC
This is a real bug, but easily fixed, and (I think) without breaking ABI.

The problem is in basic_string.h, where it says

  struct _Rep : _Rep_base
  {
    // Types:
    typedef typename _Alloc::template rebind<char>::other _Raw_bytes_alloc;

Technically this should be "rebind<_Rep>::other".  Of course then 
it's not really a raw-bytes allocator, but we have no need for a 
raw_bytes allocator.  We need an allocator of space for a _Rep with 
room for raw bytes at the end.  Remaining to fix is to change the name 
in the very few places where it's used, and to change the argument to 
allocate() (and deallocate) so that it ends up allocating the right 
number of bytes.

Probably a better fix would be to use "rebind<T>::other", where 
T is the most strictly-aligned of members of _Rep, i.e. size_type 
or _Atomic_word.  This is easily determined as the union of the 
two:

  union { _Atomic_word w; size_type s; } _Alloc_unit;
  typedef typename _Alloc::template rebind<_Alloc_unit>::other _Raw_alloc;

and then scale the argument to allocate() by the size of _Alloc_unit.
Fortunately, the required "/ sizeof(_Alloc_unit)" may (will?) be 
weakened to ">> 2" or ">> 4" on all real targets.

Either way, then, it's up to __gnu_cxx::array_allocator<char_type, 
array_type> to make sure its argument is aligned properly.  If it 
can't, that's a (more serious) bug in that template.  The test case 
2.cc had better leave enough room in the byte array for that 
alignment adjustment, which it does not quite do, I think, on 64-bit 
targets.

Of course all this means this bug is not really an "enhancement",
but bugzilla won't let me fix that.
Comment 15 ncm-nospam@cantrip.org 2005-01-21 15:37:21 UTC
Hmm, it's a little more complicated than I said, although it might be 
academic.  There's an implicit assumption in the code that any type 
on which basic_string<> might be instantiated has no stricter alignment
than _Rep itself.  But that's a different bug.  Probably the right way 
to fix it is to replace the first member of _Rep with an anonymous union:

  size_type               _M_length;

becomes

  union { size_type _M_length; _Some_big_aligned_type _M_alignment_dummy; };

I don't know if gcc supplies a built-in typedef for that.  Without
checking, I think tr1 must do.   I believe this can also be done
without breaking ABI on any actual target.
Comment 16 Paolo Carlini 2005-01-21 15:41:10 UTC
Thanks Nathan, I will implement what you are suggesting. The last issue, actually
is filed as libstdc++/8670 and in the audit trail we agreed to fix it using a
suited __attribute__(aligned), which however, doesn't work correctly at the
moment (we have already a few open PRs about this). Indeed, I'd rather prefer
fixing the two problems separately.
Comment 17 Paolo Carlini 2005-01-21 15:43:28 UTC
By the way, I understand that tr1/aligned_storage & co facilities cannot be 
directly used for implementing the current std, since we cannot pollute the
std namespace.
Comment 18 ncm-nospam@cantrip.org 2005-01-21 16:39:16 UTC
I agree that 8670 is a separate bug.

The referenced test 2.cc can be made to fail more reliably with
the following changes:

First, leave enough space for alignment adjustments, even on 128-bit
machines:

- typedef std::tr1::array<char_type, 32> array_type;
+ typedef std::tr1::array<char_type, 256> array_type;

Then, make sure extern_array itself is aligned more or less predictably.

- array_type extern_array;
+ union 
+ { 
+   array_type extern_array; size_t sdummy; double ddummy; char* pdummy;
+ };

Finally, make sure the storage passed to the allocator is misaligned:

-   allocator_type a(&extern_array);
+   allocator_type a(&extern_array[1]);

I don't know how many other tests should be adjusted similarly.
Comment 19 Paolo Carlini 2005-02-09 09:33:22 UTC
Will be trivially fixed in v7...
Comment 20 Andrew Pinski 2005-02-09 19:38:08 UTC
*** Bug 19867 has been marked as a duplicate of this bug. ***
Comment 21 Janis Johnson 2005-02-10 00:27:13 UTC
I've made execution of ext/array_allocator/2.cc XFAIL for powerpc*-*-linux*.
Comment 22 ncm 2005-04-01 13:24:00 UTC
Subject: Re:  basic_string::_M_rep() can produce an unnaturally aligned pointer to _Rep

On Fri, Apr 01, 2005 at 11:42:27AM -0000, pcarlini at suse dot de wrote:
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>            Severity|normal                      |enhancement
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19495

I don't see how this (or 8670, or the other one) is an enhancement 
request.  Users are absolutely allowed to make allocators that 
enforce only the alignment of the type they are instantiated on, 
and string is certainly using the wrong kind of allocator.

It's a fairly minor bug, but seems to me clearly a bug.

N
Comment 23 Paolo Carlini 2005-04-01 13:31:40 UTC
Ok, my change was only dictated by consistency, and the original idea of using
"enhancement" is not mine ;) Let's remove "enhancement" from both. By the way,
I really noticed yesterday for the first time that basic_string, which often has
requirements very similar to the other STL containers, differs in that, at
variance with the other STL containers, its allocator can have a value_type !=
CharT (in case, I would suggest not further discussing this side remark here)
Comment 24 GCC Commits 2005-05-18 22:11:35 UTC
Subject: Bug 19495

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	paolo@gcc.gnu.org	2005-05-18 22:11:24

Modified files:
	libstdc++-v3   : ChangeLog 
	libstdc++-v3/include/bits: basic_string.h basic_string.tcc 
	libstdc++-v3/src: mt_allocator.cc pool_allocator.cc 
	                  bitmap_allocator.cc 
	libstdc++-v3/include/ext: array_allocator.h 
	libstdc++-v3/testsuite/ext/array_allocator: 2.cc 

Log message:
	2005-05-18  Paolo Carlini  <pcarlini@suse.de>
	Nathan Myers  <ncm@cantrip.org>
	
	PR libstdc++/19495
	* include/bits/basic_string.h (_Raw_bytes_alloc): Rebind to
	size_type instead of char and rename to _Raw_alloc.
	* include/bits/basic_string.tcc (_Rep::_M_destroy, _Rep::_S_create):
	Use the above.
	* src/bitmap_allocator.cc: Add instantiation for size_type.
	* src/mt_allocator.cc: Likewise.
	* src/pool_allocator.cc: Likewise.
	* include/ext/array_allocator.h: Tweak slightly, avoid assuming
	the existence of an _Array::begin() and size() members.
	* testsuite/ext/array_allocator/2.cc: Tweak to use an allocator
	of size_type, instead of char, thus avoiding problems with
	rebinds, not treated correctly by array_allocator.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&r1=1.3003&r2=1.3004
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/bits/basic_string.h.diff?cvsroot=gcc&r1=1.75&r2=1.76
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/bits/basic_string.tcc.diff?cvsroot=gcc&r1=1.84&r2=1.85
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/src/mt_allocator.cc.diff?cvsroot=gcc&r1=1.8&r2=1.9
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/src/pool_allocator.cc.diff?cvsroot=gcc&r1=1.2&r2=1.3
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/src/bitmap_allocator.cc.diff?cvsroot=gcc&r1=1.6&r2=1.7
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/ext/array_allocator.h.diff?cvsroot=gcc&r1=1.8&r2=1.9
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/testsuite/ext/array_allocator/2.cc.diff?cvsroot=gcc&r1=1.4&r2=1.5

Comment 25 GCC Commits 2005-05-28 21:57:13 UTC
Subject: Bug 19495

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	paolo@gcc.gnu.org	2005-05-28 21:57:03

Modified files:
	libstdc++-v3   : ChangeLog 
	libstdc++-v3/include/bits: basic_string.h basic_string.tcc 
	libstdc++-v3/src: bitmap_allocator.cc mt_allocator.cc 
	                  pool_allocator.cc 
	libstdc++-v3/include/ext: array_allocator.h 
	libstdc++-v3/testsuite/ext/array_allocator: 2.cc 

Log message:
	2005-05-28  Paolo Carlini  <pcarlini@suse.de>
	
	Revert:
	2005-05-18  Paolo Carlini  <pcarlini@suse.de>
	Nathan Myers  <ncm@cantrip.org>
	
	PR libstdc++/19495
	* include/bits/basic_string.h (_Raw_bytes_alloc): Rebind to
	size_type instead of char and rename to _Raw_alloc.
	* include/bits/basic_string.tcc (_Rep::_M_destroy, _Rep::_S_create):
	Use the above.
	* src/bitmap_allocator.cc: Add instantiation for size_type.
	* src/mt_allocator.cc: Likewise.
	* src/pool_allocator.cc: Likewise.
	* include/ext/array_allocator.h: Tweak slightly, avoid assuming
	the existence of an _Array::begin() and size() members.
	* testsuite/ext/array_allocator/2.cc: Tweak to use an allocator
	of size_type, instead of char, thus avoiding problems with
	rebinds, not treated correctly by array_allocator.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&r1=1.3023&r2=1.3024
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/bits/basic_string.h.diff?cvsroot=gcc&r1=1.78&r2=1.79
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/bits/basic_string.tcc.diff?cvsroot=gcc&r1=1.85&r2=1.86
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/src/bitmap_allocator.cc.diff?cvsroot=gcc&r1=1.7&r2=1.8
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/src/mt_allocator.cc.diff?cvsroot=gcc&r1=1.9&r2=1.10
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/src/pool_allocator.cc.diff?cvsroot=gcc&r1=1.3&r2=1.4
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/ext/array_allocator.h.diff?cvsroot=gcc&r1=1.9&r2=1.10
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/testsuite/ext/array_allocator/2.cc.diff?cvsroot=gcc&r1=1.5&r2=1.6

Comment 26 Ben Elliston 2009-05-14 02:54:49 UTC
This test now passes on powerpc*-linux-gnu.
Comment 27 Jonathan Wakely 2023-11-08 12:32:54 UTC
(In reply to Ben Elliston from comment #26)
> This test now passes on powerpc*-linux-gnu.

I wonder how ... the "fix" got reverted, and we still use an allocator of char to create the storage for the COW string's _Rep, which has stricter alignment requirements than char.

The array_allocator was eventually removed in 2019, because it was broken. But the alignment problem still seems to be present in the COW std::string.
Comment 28 Jonathan Wakely 2023-11-08 12:35:36 UTC
(In reply to Jonathan Wakely from comment #27)
> But the alignment problem still seems to be present in the COW std::string.

But I think that's covered by PR 8670, which is still open.