[committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]
Jonathan Wakely
jwakely@redhat.com
Thu Oct 1 07:50:07 GMT 2020
On 01/10/20 09:30 +0200, Christophe Lyon via Libstdc++ wrote:
>On Wed, 30 Sep 2020 at 22:44, Jonathan Wakely <jwakely@redhat.com> wrote:
>>
>> On 30/09/20 16:03 +0100, Jonathan Wakely wrote:
>> >On 29/09/20 13:51 +0200, Christophe Lyon via Libstdc++ wrote:
>> >>On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches
>> >><gcc-patches@gcc.gnu.org> wrote:
>> >>>
>> >>>Glibc 2.32 adds a global variable that says whether the process is
>> >>>single-threaded. We can use this to decide whether to elide atomic
>> >>>operations, as a more precise and reliable indicator than
>> >>>__gthread_active_p.
>> >>>
>> >>>This means that guard variables for statics and reference counting in
>> >>>shared_ptr can use less expensive, non-atomic ops even in processes that
>> >>>are linked to libpthread, as long as no threads have been created yet.
>> >>>It also means that we switch to using atomics if libpthread gets loaded
>> >>>later via dlopen (this still isn't supported in general, for other
>> >>>reasons).
>> >>>
>> >>>We can't use __libc_single_threaded to replace __gthread_active_p
>> >>>everywhere. If we replaced the uses of __gthread_active_p in std::mutex
>> >>>then we would elide the pthread_mutex_lock in the code below, but not
>> >>>the pthread_mutex_unlock:
>> >>>
>> >>> std::mutex m;
>> >>> m.lock(); // pthread_mutex_lock
>> >>> std::thread t([]{}); // __libc_single_threaded = false
>> >>> t.join();
>> >>> m.unlock(); // pthread_mutex_unlock
>> >>>
>> >>>We need the lock and unlock to use the same "is threading enabled"
>> >>>predicate, and similarly for init/destroy pairs for mutexes and
>> >>>condition variables, so that we don't try to release resources that were
>> >>>never acquired.
>> >>>
>> >>>There are other places that could use __libc_single_threaded, such as
>> >>>_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
>> >>>they can be changed later.
>> >>>
>> >>>libstdc++-v3/ChangeLog:
>> >>>
>> >>> PR libstdc++/96817
>> >>> * include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
>> >>> New function wrapping __libc_single_threaded if available.
>> >>> (__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
>> >>> * libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
>> >>> (__cxa_guard_release): Likewise.
>> >>> * testsuite/18_support/96817.cc: New test.
>> >>>
>> >>>Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.
>> >>
>> >>Hi,
>> >>
>> >>This patch introduced regressions on armeb-linux-gnueabhf:
>> >>--target armeb-none-linux-gnueabihf --with-cpu cortex-a9
>> >> g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute
>> >> g++.dg/cpp2a/decomp1.C -std=gnu++14 execution test
>> >> g++.dg/cpp2a/decomp1.C -std=gnu++17 execution test
>> >> g++.dg/cpp2a/decomp1.C -std=gnu++2a execution test
>> >> g++.dg/init/init-ref2.C -std=c++14 execution test
>> >> g++.dg/init/init-ref2.C -std=c++17 execution test
>> >> g++.dg/init/init-ref2.C -std=c++2a execution test
>> >> g++.dg/init/init-ref2.C -std=c++98 execution test
>> >> g++.dg/init/ref15.C -std=c++14 execution test
>> >> g++.dg/init/ref15.C -std=c++17 execution test
>> >> g++.dg/init/ref15.C -std=c++2a execution test
>> >> g++.dg/init/ref15.C -std=c++98 execution test
>> >> g++.old-deja/g++.jason/pmf7.C -std=c++98 execution test
>> >> g++.old-deja/g++.mike/leak1.C -std=c++14 execution test
>> >> g++.old-deja/g++.mike/leak1.C -std=c++17 execution test
>> >> g++.old-deja/g++.mike/leak1.C -std=c++2a execution test
>> >> g++.old-deja/g++.mike/leak1.C -std=c++98 execution test
>> >> g++.old-deja/g++.other/init19.C -std=c++14 execution test
>> >> g++.old-deja/g++.other/init19.C -std=c++17 execution test
>> >> g++.old-deja/g++.other/init19.C -std=c++2a execution test
>> >> g++.old-deja/g++.other/init19.C -std=c++98 execution test
>> >>
>> >>and probably some (280) in libstdc++ tests: (I didn't bisect those):
>> >> 19_diagnostics/error_category/generic_category.cc execution test
>> >> 19_diagnostics/error_category/system_category.cc execution test
>> >> 20_util/scoped_allocator/1.cc execution test
>> >> 20_util/scoped_allocator/2.cc execution test
>> >> 20_util/scoped_allocator/construct_pair_c++2a.cc execution test
>> >> 20_util/to_address/debug.cc execution test
>> >> 20_util/variant/run.cc execution test
>> >
>> >I think this is a latent bug in the static initialization code for
>> >EABI that affects big endian. In libstdc++-v3/libsupc++/guard.cc we
>> >have:
>> >
>> ># ifndef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
>> >
>> >// Test the guard variable with a memory load with
>> >// acquire semantics.
>> >
>> >inline bool
>> >__test_and_acquire (__cxxabiv1::__guard *g)
>> >{
>> > unsigned char __c;
>> > unsigned char *__p = reinterpret_cast<unsigned char *>(g);
>> > __atomic_load (__p, &__c, __ATOMIC_ACQUIRE);
>> > (void) __p;
>> > return _GLIBCXX_GUARD_TEST(&__c);
>> >}
>> ># define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __test_and_acquire (G)
>> ># endif
>> >
>> >That inspects the first byte of the guard variable. But for EABI the
>> >"is initialized" bit is the least significant bit of the guard
>> >variable. For little endian that's fine, the least significant bit is
>> >in the first byte. But for big endian, it's not in the first byte, so
>> >we are looking in the wrong place. This means that the initial check
>> >in __cxa_guard_acquire is wrong:
>> >
>> > extern "C"
>> > int __cxa_guard_acquire (__guard *g)
>> > {
>> >#ifdef __GTHREADS
>> > // If the target can reorder loads, we need to insert a read memory
>> > // barrier so that accesses to the guarded variable happen after the
>> > // guard test.
>> > if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
>> > return 0;
>> >
>> >This will always be false for big endian EABI, which means we run the
>> >rest of the function even when the variable is initialized. Previously
>> >that still gave the right answer, but inefficiently. After my change
>> >it gives the wrong answer, because the new code assumes that the check
>> >right at the start of __cxa_guard_acquire ACTUALLY WORKS. Silly me.
>> >
>> >I think the attached should fix it. This should be backported to fix
>> >inefficient code for static init on big endian EABI targets.
>>
>>
>> Committed as r11-3572, but I haven't been able to test it on a big
>> endian EABI system, only little endian EABI.
>>
>Hi Jonathan,
>
>unfortunately, this breaks the build even for little-endian for me:
>9980570_1.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/arm-none-linux-gnueabi/libstdc++-v3/include
>-I/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++
>-D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra
>-Wwrite-strings -Wcast-qual -Wabi=2 -fdiagnostics-show-location=once
>-ffunction-sections -fdata-sections -frandom-seed=guard.lo -g -O2
>-D_GNU_SOURCE -c
>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc
> -fPIC -DPIC -D_GLIBCXX_SHARED -o guard.o
>In file included from
>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/cxxabi.h:50,
> from
>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:28:
>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:
>In function 'int
>__cxxabiv1::__cxa_guard_acquire(__cxxabiv1::__guard*)':
>/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:249:9:
>error: invalid type argument of unary '*' (have 'int')
> 249 | if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>make[4]: *** [Makefile:763: guard.lo] Error 1
>
>How did you test it? Maybe I'm missing something?
Apparently I didn't do a clean build and the guard.cc file didn't get
recompiled with the new header.
The attached works on armv7l-unknown-linux-gnueabihf, testing it now.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.txt
Type: text/x-patch
Size: 918 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20201001/413b05c8/attachment-0001.bin>
More information about the Gcc-patches
mailing list