Bug 107590 - __atomic_test_and_set broken on PowerPC
Summary: __atomic_test_and_set broken on PowerPC
Status: WAITING
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 11.3.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-11-09 15:33 UTC by Sergey Fedorov
Modified: 2024-06-13 10:10 UTC (History)
4 users (show)

See Also:
Host:
Target: powerpc-apple-darwin
Build:
Known to work:
Known to fail:
Last reconfirmed: 2022-11-09 00:00:00


Attachments
Preprocessed spinlock_test from 10.6 (4.88 KB, text/plain)
2022-11-10 18:14 UTC, Sergey Fedorov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Sergey Fedorov 2022-11-09 15:33:39 UTC
The bug was discovered while trying to build i2pd on Darwin PPC. Build itself succeeded, but the binary failed with Bus error. I thought it was caused by Boost, however it turned out that Boost was fine, while Bus error was a result of broken GCC builtin.

Details here: https://github.com/PurpleI2P/i2pd/issues/1726#issuecomment-1305536144

In particular, simple spin lock tests fail with Bus error when built with default usage of GCC atomics. Like this: https://github.com/boostorg/smart_ptr/blob/develop/test/spinlock_test.cpp

36-87:boost svacchanda$ /opt/local/bin/g++-mp-11 spinlock_test.cpp -I/opt/local/libexec/boost/1.76/include -L/opt/local/libexec/boost/1.76/lib -o spinlock_test
36-87:boost svacchanda$ /Users/svacchanda/Dev/boost/spinlock_test 
Bus error
36-87:boost svacchanda$ /opt/local/bin/g++-mp-11 spinlock_test.cpp -DBOOST_SP_USE_STD_ATOMIC -I/opt/local/libexec/boost/1.76/include -L/opt/local/libexec/boost/1.76/lib -o spinlock_test
36-87:boost svacchanda$ /Users/svacchanda/Dev/boost/spinlock_test

No error when Boost is forced to use its own atomics instead of GCC builtins.

Boost developer suggested that __atomic_test_and_set is broken on PPC, and GDB output proved that:

(gdb) run
Starting program: /Users/svacchanda/Dev/boost/a.out 
Reading symbols for shared libraries +++..... done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: 259 at address: 0x0000208d
0x00001b2c in boost::detail::spinlock::try_lock (this=0x208d) at spinlock_gcc_atomic.hpp:39
39	        return __atomic_test_and_set( &v_, __ATOMIC_ACQUIRE ) == 0;
(gdb) quit

Same failure confirmed on 10.5.8 and 10.6. gcc11 and gcc7 were tried, likely to apply to other versions.

GCC output from 10.5.8:

Using built-in specs.
COLLECT_GCC=/opt/local/bin/gcc-mp-11
COLLECT_LTO_WRAPPER=/opt/local/libexec/gcc/ppc-apple-darwin9/11.3.0/lto-wrapper
Target: ppc-apple-darwin9
Configured with: /opt/local/var/macports/build/_opt_PPCLeopardPorts_lang_gcc11/gcc11/work/gcc-11.3.0/configure --prefix=/opt/local --build=ppc-apple-darwin9 --enable-languages=c,c++,objc,obj-c++,lto,fortran --libdir=/opt/local/lib/gcc11 --includedir=/opt/local/include/gcc11 --infodir=/opt/local/share/info --mandir=/opt/local/share/man --datarootdir=/opt/local/share/gcc-11 --with-local-prefix=/opt/local --with-system-zlib --disable-nls --program-suffix=-mp-11 --with-gxx-include-dir=/opt/local/include/gcc11/c++/ --with-gmp=/opt/local --with-mpfr=/opt/local --with-mpc=/opt/local --with-isl=/opt/local --with-zstd=/opt/local --enable-stage1-checking --enable-lto --enable-libstdcxx-time --with-build-config=bootstrap-debug --with-bugurl=https://trac.macports.org/newticket --enable-host-shared --with-tune-cpu=G5 --disable-tls --with-as=/opt/local/bin/as --with-ld=/opt/local/bin/ld --with-ar=/opt/local/bin/ar --with-pkgversion='MacPorts gcc11 11.3.0_1+universal'
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.3.0 (MacPorts gcc11 11.3.0_1+universal)
Comment 1 Andrew Pinski 2022-11-09 15:42:17 UTC
This seems like an alignment issue.
That is GCC thinks the alignment of the variables are one thing but the reality is something different.

We need more information than this since the atomics are run part of the testsuite and we have not seen any issues related to them either. Plus libstdc++ uses the atomics too.
Comment 2 Andrew Pinski 2022-11-09 15:48:15 UTC
>Reason: 259 at address: 0x00003109

Yes that does seem like an alignment disagreement.

I suspect the code is broken for allocation and it is allocating unaligned structs.

Also inside gdb can you do the following:

disassemble $pc-0x10 $pc+0x10
info registers
Comment 3 Sergey Fedorov 2022-11-09 16:05:39 UTC
(In reply to Andrew Pinski from comment #2)
> >Reason: 259 at address: 0x00003109
> 
> Yes that does seem like an alignment disagreement.
> 
> I suspect the code is broken for allocation and it is allocating unaligned
> structs.

The code in Boost shared pointer tests looks pretty simple. Since the bus error happens on those, no need to look into i2pd case.
I don’t really know how alignment works there, but hopefully someone can look at that.

But the error was reproducible on several tests plus i2pd itself. All fixed by switching to Boost own atomics. I am not qualified to judge if the code is broken, but this looks like it is worth investigation.

> Also inside gdb can you do the following:
> 
> disassemble $pc-0x10 $pc+0x10
> info registers

I could try that tomorrow, provided an ancient GBD we have on PPC supports that.
Comment 4 Andrew Pinski 2022-11-09 16:36:13 UTC
(In reply to Sergey Fedorov from comment #3)
> > Also inside gdb can you do the following:
> > 
> > disassemble $pc-0x10 $pc+0x10
> > info registers
> 
> I could try that tomorrow, provided an ancient GBD we have on PPC supports
> that.

It should because the above commands have been there for a long time (over 20 years).
Comment 5 Iain Sandoe 2022-11-09 16:40:30 UTC
I'm away from my usual sources of information but I'd suggest exploring the possibility that someone has assumed that either the spinlock or a bool is 8bits; As far as my memory serves both are 32b on power darwin.
Comment 6 Andrew Pinski 2022-11-09 16:52:36 UTC
(In reply to Iain Sandoe from comment #5)
> I'm away from my usual sources of information but I'd suggest exploring the
> possibility that someone has assumed that either the spinlock or a bool is
> 8bits; As far as my memory serves both are 32b on power darwin.

Oh yes I forgot about bool being 32bit due to an ABI choice early on with Apple's GCC 2.95 because of not having a bool type but defining it to int.
So there might be still a mismatch there still ...
Comment 7 Peter Dimov 2022-11-10 13:19:41 UTC
The spinlock is indeed using an `unsigned char`:

https://github.com/boostorg/smart_ptr/blob/c577d68b0272fd0bddc88ea60a8db07219391589/include/boost/smart_ptr/detail/spinlock_gcc_atomic.hpp#L33

That's because `__atomic_test_and_set` is documented to work on either `bool` or `char`:

https://gcc.gnu.org/onlinedocs/gcc/extensions-to-the-c-language-family/built-in-functions-for-memory-model-aware-atomic-operations.html#_CPPv421__atomic_test_and_setPvi

"bool __atomic_test_and_set(void *ptr, int memorder)

This built-in function performs an atomic test-and-set operation on the byte at *ptr. The byte is set to some implementation defined nonzero ‘set’ value and the return value is true if and only if the previous contents were ‘set’. It should be only used for operands of type bool or char. For other types only part of the value may be set."

I don't see an alignment requirement being mentioned here.
Comment 8 Andrew Pinski 2022-11-10 16:56:36 UTC
(In reply to Peter Dimov from comment #7) 
> I don't see an alignment requirement being mentioned here.

I think you misunderstood the alignment issue. There might be no alignment requirement directly on __atomic_test_and_set but if there is a mismatch understanding of alignment of what GCC thinks the alignment of the address should be and what the address really is, then there will be an issue. now that could be still a GCC bug or it can be a bug in the source; usually suballocators providing wrong aligned addresses.


But there is not enough information here to figure out what exactly is going wrong.
No preprocessed source of the code going wrong.
Not even register state or instructions where the segfault is happening (this is would be very useful but not required).
Comment 9 Peter Dimov 2022-11-10 17:13:28 UTC
The easiest way to reproduce the issue is with the following test:

https://github.com/boostorg/smart_ptr/blob/c577d68b0272fd0bddc88ea60a8db07219391589/test/spinlock_test.cpp

This crashes because - presumably - sp2 is on an odd address.

The definition of the spinlock class is here:

https://github.com/boostorg/smart_ptr/blob/c577d68b0272fd0bddc88ea60a8db07219391589/include/boost/smart_ptr/detail/spinlock_gcc_atomic.hpp
Comment 10 Sergey Fedorov 2022-11-10 18:14:30 UTC
Created attachment 53876 [details]
Preprocessed spinlock_test from 10.6

36-172% /opt/local/bin/g++-mp-11 --save-temps -v spinlock_test.cpp -I/opt/local/libexec/boost/1.79/include -L/opt/local/libexec/boost/1.79/lib -o spinlock_test
Using built-in specs.
COLLECT_GCC=/opt/local/bin/g++-mp-11
COLLECT_LTO_WRAPPER=/opt/local/libexec/gcc/ppc-apple-darwin10/11.3.0/lto-wrapper
Target: ppc-apple-darwin10
Configured with: /opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_lang_gcc11/gcc11/work/gcc-11.3.0/configure --prefix=/opt/local --build=ppc-apple-darwin10 --enable-languages=c,c++,objc,obj-c++,lto,fortran,jit --libdir=/opt/local/lib/gcc11 --includedir=/opt/local/include/gcc11 --infodir=/opt/local/share/info --mandir=/opt/local/share/man --datarootdir=/opt/local/share/gcc-11 --with-local-prefix=/opt/local --with-system-zlib --disable-nls --program-suffix=-mp-11 --with-gxx-include-dir=/opt/local/include/gcc11/c++/ --with-gmp=/opt/local --with-mpfr=/opt/local --with-mpc=/opt/local --with-isl=/opt/local --with-zstd=/opt/local --enable-stage1-checking --disable-multilib --enable-lto --enable-libstdcxx-time --with-build-config=bootstrap-debug --with-bugurl=https://trac.macports.org/newticket --enable-host-shared --with-tune-cpu=G5 --disable-tls --with-as=/opt/local/bin/as --with-ld=/opt/local/bin/ld --with-ar=/opt/local/bin/ar --with-pkgversion='MacPorts gcc11 11.3.0_1'
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.3.0 (MacPorts gcc11 11.3.0_1) 
COLLECT_GCC_OPTIONS='-save-temps' '-v' '-I' '/opt/local/libexec/boost/1.79/include' '-L/opt/local/libexec/boost/1.79/lib' '-o' 'spinlock_test' '-mmacosx-version-min=10.6.0' '-asm_macosx_version_min=10.6' '-nodefaultexport' '-shared-libgcc'
 /opt/local/libexec/gcc/ppc-apple-darwin10/11.3.0/cc1plus -E -quiet -v -I /opt/local/libexec/boost/1.79/include -D__DYNAMIC__ spinlock_test.cpp -fPIC -mmacosx-version-min=10.6.0 -fpch-preprocess -o spinlock_test.ii
ignoring nonexistent directory "/opt/local/lib/gcc11/gcc/ppc-apple-darwin10/11.3.0/../../../../../ppc-apple-darwin10/include"
#include "..." search starts here:
#include <...> search starts here:
 /opt/local/libexec/boost/1.79/include
 /opt/local/include/gcc11/c++/
 /opt/local/include/gcc11/c++//ppc-apple-darwin10
 /opt/local/include/gcc11/c++//backward
 /opt/local/lib/gcc11/gcc/ppc-apple-darwin10/11.3.0/include
 /opt/local/include
 /opt/local/lib/gcc11/gcc/ppc-apple-darwin10/11.3.0/include-fixed
 /usr/include
 /System/Library/Frameworks
 /Library/Frameworks
End of search list.
COLLECT_GCC_OPTIONS='-save-temps' '-v' '-I' '/opt/local/libexec/boost/1.79/include' '-L/opt/local/libexec/boost/1.79/lib' '-o' 'spinlock_test' '-mmacosx-version-min=10.6.0' '-asm_macosx_version_min=10.6' '-nodefaultexport' '-shared-libgcc'
 /opt/local/libexec/gcc/ppc-apple-darwin10/11.3.0/cc1plus -fpreprocessed spinlock_test.ii -fPIC -quiet -dumpbase spinlock_test.cpp -dumpbase-ext .cpp -mmacosx-version-min=10.6.0 -version -o spinlock_test.s
GNU C++17 (MacPorts gcc11 11.3.0_1) version 11.3.0 (ppc-apple-darwin10)
	compiled by GNU C version 11.3.0, GMP version 6.2.1, MPFR version 4.1.0, MPC version 1.2.1, isl version isl-0.25-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C++17 (MacPorts gcc11 11.3.0_1) version 11.3.0 (ppc-apple-darwin10)
	compiled by GNU C version 11.3.0, GMP version 6.2.1, MPFR version 4.1.0, MPC version 1.2.1, isl version isl-0.25-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: bb9117971f7aa7e0fe2b40679b3b256c
COLLECT_GCC_OPTIONS='-save-temps' '-v' '-I' '/opt/local/libexec/boost/1.79/include' '-L/opt/local/libexec/boost/1.79/lib' '-o' 'spinlock_test' '-mmacosx-version-min=10.6.0'  '-nodefaultexport' '-shared-libgcc'
 /opt/local/bin/as -v -I /opt/local/libexec/boost/1.79/include -arch ppc -o spinlock_test.o spinlock_test.s
Apple Inc version cctools-localbuild, GNU assembler version 1.38
COMPILER_PATH=/opt/local/libexec/gcc/ppc-apple-darwin10/11.3.0/:/opt/local/libexec/gcc/ppc-apple-darwin10/11.3.0/:/opt/local/libexec/gcc/ppc-apple-darwin10/:/opt/local/lib/gcc11/gcc/ppc-apple-darwin10/11.3.0/:/opt/local/lib/gcc11/gcc/ppc-apple-darwin10/
LIBRARY_PATH=/opt/local/lib/gcc11/gcc/ppc-apple-darwin10/11.3.0/:/opt/local/lib/gcc11/gcc/ppc-apple-darwin10/11.3.0/../../../
COLLECT_GCC_OPTIONS='-save-temps' '-v' '-I' '/opt/local/libexec/boost/1.79/include' '-L/opt/local/libexec/boost/1.79/lib' '-o' 'spinlock_test' '-mmacosx-version-min=10.6.0'  '-nodefaultexport' '-shared-libgcc' '-dumpdir' 'spinlock_test.'
 /opt/local/libexec/gcc/ppc-apple-darwin10/11.3.0/collect2 -dynamic -arch ppc -macosx_version_min 10.6.0 -o spinlock_test -lcrt1.10.5.o -L/opt/local/libexec/boost/1.79/lib -L/opt/local/lib/gcc11/gcc/ppc-apple-darwin10/11.3.0 -L/opt/local/lib/gcc11/gcc/ppc-apple-darwin10/11.3.0/../../.. spinlock_test.o -lstdc++ -lgcc_s.1.1 -lgcc_s.10.5 -lgcc -lSystem -lef_ppc
36-172% /Users/svacchanda/Dev/boost/spinlock_test 
zsh: bus error  /Users/svacchanda/Dev/boost/spinlock_test
Comment 11 Sergey Fedorov 2022-11-10 18:16:47 UTC
(In reply to Andrew Pinski from comment #8)
> No preprocessed source of the code going wrong.
> Not even register state or instructions where the segfault is happening
> (this is would be very useful but not required).

I am on 10.6 PPC (10A190) now and cannot reboot into 10.5.8, so attaching .ii file from there, but the failure happens on Leopard just as well.
Comment 12 Sergey Fedorov 2022-11-10 18:24:16 UTC
(In reply to Andrew Pinski from comment #2)
> >Reason: 259 at address: 0x00003109
> 
> Yes that does seem like an alignment disagreement.
> 
> I suspect the code is broken for allocation and it is allocating unaligned
> structs.
> 
> Also inside gdb can you do the following:
> 
> disassemble $pc-0x10 $pc+0x10
> info registers

Here:

(gdb) run
Starting program: /Users/svacchanda/Dev/boost/a.out 
Reading symbols for shared libraries ++++.. done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: 259 at address: 0x0000208d
0x00001b38 in boost::detail::spinlock::try_lock (this=0x208d) at spinlock_gcc_atomic.hpp:39
39	        return __atomic_test_and_set( &v_, __ATOMIC_ACQUIRE ) == 0;
(gdb) disassemble $pc-0x10 $pc+0x10
Dump of assembler code from 0x1b28 to 0x1b48:
0x00001b28 <_ZN5boost6detail8spinlock8try_lockEv+8>:	mr      r30,r1
0x00001b2c <_ZN5boost6detail8spinlock8try_lockEv+12>:	stw     r3,72(r30)
0x00001b30 <_ZN5boost6detail8spinlock8try_lockEv+16>:	lwz     r2,72(r30)
0x00001b34 <_ZN5boost6detail8spinlock8try_lockEv+20>:	li      r9,1
0x00001b38 <_ZN5boost6detail8spinlock8try_lockEv+24>:	lwarx   r10,0,r2
0x00001b3c <_ZN5boost6detail8spinlock8try_lockEv+28>:	stwcx.  r9,0,r2
0x00001b40 <_ZN5boost6detail8spinlock8try_lockEv+32>:	bne-    0x1b38 <_ZN5boost6detail8spinlock8try_lockEv+24>
0x00001b44 <_ZN5boost6detail8spinlock8try_lockEv+36>:	isync
End of assembler dump.
(gdb) info registers
r0             0x19c4	6596
r1             0xbffff9f0	3221223920
r2             0x208d	8333
r3             0x208d	8333
r4             0xbffffb30	3221224240
r5             0xbffffb38	3221224248
r6             0xbffffb88	3221224328
r7             0xc	12
r8             0x0	0
r9             0x1	1
r10            0x0	0
r11            0x2048	8264
r12            0x1b20	6944
r13            0x0	0
r14            0x0	0
r15            0x0	0
r16            0x0	0
r17            0x0	0
r18            0x0	0
r19            0x0	0
r20            0x0	0
r21            0x0	0
r22            0x0	0
r23            0x0	0
r24            0x0	0
r25            0x0	0
r26            0xbffffb2c	3221224236
r27            0x8	8
r28            0x0	0
r29            0x0	0
r30            0xbffff9f0	3221223920
r31            0x19a8	6568
pc             0x1b38	6968
ps             0x100000000000f030	1152921504606908464
cr             0x24000002	603979778
lr             0x1ba4	7076
ctr            0x1b20	6944
xer            0x0	0
mq             0x0	0
fpscr          0x0	0
vscr           0x10000	65536
vrsave         0x0	0
Comment 13 Andrew Pinski 2022-11-10 22:20:26 UTC
Oh yes there could be a bug here when the byte (lbarx/stbcx.) and half-word (lharx/sthcx.) instruction support was added (which was for Power8; r0-123873-g4b02c96265fb52).

But that was 9 years ago.
Is powerpc-darwin really coming back in style or something?
Comment 14 Iain Sandoe 2022-11-10 23:02:03 UTC
(In reply to Andrew Pinski from comment #13)
> Oh yes there could be a bug here when the byte (lbarx/stbcx.) and half-word
> (lharx/sthcx.) instruction support was added (which was for Power8;
> r0-123873-g4b02c96265fb52).
> 
> But that was 9 years ago.
> Is powerpc-darwin really coming back in style or something?

hehe ... it's never actually gone out of style .. but has (very) limited time available, so sometimes gets broken for a while ..
.. right now we have some keen folks doing some testing on a wider range of packages.

sorry my comments are not more constructive right now - I'm fully tied up with WG21 meeting.
Comment 15 Sergey Fedorov 2022-11-10 23:24:59 UTC
(In reply to Andrew Pinski from comment #13)
> Oh yes there could be a bug here when the byte (lbarx/stbcx.) and half-word
> (lharx/sthcx.) instruction support was added (which was for Power8;
> r0-123873-g4b02c96265fb52).

There was some argument re appropriateness lwarx etc. ages ago here: https://lists.boost.org/Archives/boost//2005/04/83865.php
Though Peter Dimov said it is likely irrelevant.

> But that was 9 years ago.
> Is powerpc-darwin really coming back in style or something?

PPC is very much alive! :)
(Primarily thanks to Iain’s efforts with GCC – without a new GCC we could not have done much. But yes, we have fixed some forever-broken stuff for PPC recently, and process ongoing.)
Comment 16 Sergey Fedorov 2024-01-24 13:21:55 UTC
(In reply to Andrew Pinski from comment #13)
> Oh yes there could be a bug here when the byte (lbarx/stbcx.) and half-word
> (lharx/sthcx.) instruction support was added (which was for Power8;
> r0-123873-g4b02c96265fb52).

Any update on this by the way? It could be that we keep getting obscure failures due to this issue being unfixed.
Comment 17 Sergey Fedorov 2024-06-13 10:10:31 UTC
(In reply to Andrew Pinski from comment #13)
> Oh yes there could be a bug here when the byte (lbarx/stbcx.) and half-word
> (lharx/sthcx.) instruction support was added (which was for Power8;
> r0-123873-g4b02c96265fb52).
> 
> But that was 9 years ago.
> Is powerpc-darwin really coming back in style or something?

Andrew, could you suggest a way to check if that was the issue? (I could perhaps revert the whole commit, but that may be excessive, perhaps a more precise change can be done?)