[PATCH] PR libstdc++/67843 set shared_ptr lock policy at build-time

Jonathan Wakely jwakely@redhat.com
Wed Nov 28 11:35:00 GMT 2018


On 28/11/18 10:54 +0100, Christophe Lyon wrote:
>On Wed, 28 Nov 2018 at 00:25, Jonathan Wakely <jwakely@redhat.com> wrote:
>>
>> This resolves a longstanding issue where the lock policy for shared_ptr
>> reference counting depends on compilation options when the header is
>> included, so that different -march options can cause ABI changes. For
>> example, objects compiled with -march=armv7 will use atomics to
>> synchronize reference counts, and objects compiled with -march=armv5t
>> will use a mutex. That means the shared_ptr control block will have a
>> different layout in different objects, causing ODR violations and
>> undefined behaviour. This was the root cause of PR libstdc++/42734 as
>> well as PR libstdc++/67843.
>>
>> The solution is to decide on the lock policy at build time, when
>> libstdc++ is configured. The configure script checks for the
>> availability of the necessary atomic built-ins for the target and fixes
>> that choice permanently. Different -march flags used to compile user
>> code will not cause changes to the lock policy. This results in an ABI
>> change for certain compilations, but only where there was already an ABI
>> incompatibility between the libstdc++.so library and objects built with
>> an incompatible -march option. In general, this means a more stable ABI
>> that isn't silently altered when -march flags make addition atomic ops
>> available.
>>
>> To force a target to use "atomic" or "mutex" the new configure option
>> --with-libstdcxx-lock-policy can be used.
>>
>> In order to turn ODR violations into linker errors, the uses of
>> shared_ptr in filesystem directory iterators have been replaced
>> with __shared_ptr, and explicit instantiations are declared. This
>> ensures that object files using those types cannot link to libstdc++
>> libs unless they use the same lock policy.
>>
>>         PR libstdc++/67843
>>         * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
>>         that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
>>         * config.h.in: Regenerate.
>>         * configure: Regenerate.
>>         * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
>>         * doc/xml/manual/configure.xml: Document new configure option.
>>         * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
>>         instead of shared_ptr.
>>         (recursive_directory_iterator): Likewise.
>>         (__shared_ptr<_Dir>): Add explicit instantiation declaration.
>>         (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
>>         * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared):
>>         Add default template argument for _Lock_policy template parameter.
>>         * include/ext/concurrence.h (__default_lock_policy): Check macro
>>         _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
>>         target supports the builtins for compare-and-swap.
>>         * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
>>         instantiation definition.
>>         (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
>>         (directory_iterator, recursive_directory_iterator): Use __make_shared
>>         instead of make_shared.
>>
>> Tested x86_64-linux and aarch64-linux, committed to trunk.
>>
>> I would appreciate testing on ARM, especially Christophe's
>> -march=armv5t set up.
>>
>
>Hi Jonathan,
>
>Your patch passed my tests, thanks!

Thanks for checking it.

>directory_iterator.cc tests used to be killed by a timeout until last
>night where now they do PASS in my configurations involving
>-march=armv5t.
>
>That being said, I'm not sure I fully understand the fix. In my tests
>involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9,
>and I pass -march=armv5t as an override with the --target-board runtest option.

Yup, and that difference between the arch used to build libstdc++ and
the arch used to build the tests was what caused the FAILures. And
that's what my patch addresses, by ensuring that the arch used to
build libstdc++ is what dictates the type of synchronization for
shared_ptr.

>In these cases libstdc++ configure detects support for "atomic", so I
>suspect the tests pass only because I use QEMU with --cpu cortex-a9.
>I think if I used a different QEMU cpu (eg arm926), the tests would
>try to use atomics, and fail?

I don't know if a libstdc++ configured for cortex-a9 would run on an
arm296 QEMU. It might depend on cortex-a9 instructions (unrelated to
the use of atomics in my patch). See the end of this mail.

>The reason I'm still using cortex-a9 at QEMU level is that many tests
>override -mcpu/-march, and I used that as a compromise.

That means you're not fully testing armv5t, because you're still
linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9
instruction set. Usually that doesn't matter, because most libstdc++
code doesn't care about which specific instructions it gets compiled
to. That 

>However, I do have configurations using --with-cpu=arm10tdmi or
>--with-cpu=default and QEMU --cpu arm926.
>In these 2 cases, the tests do PASS, but they used to before your
>patch. libstdc++ configure does detect "mutex" in these cases.

What matters is that the tests and the libstdc++ libraries agree on the
lock policy. It doesn't matter if they both use "atomic" or they both
use "mutex", only that they agree. Previously the lock policy was
selected by the preprocessor in every translation unit. That made it
very easy to get incompatible sets of objects. Now the selection is
made once per build of GCC, and is not affected by -march options when
compiling user code.

>To be hopefully clearer, the subset of relevant configurations I use
>is as follows:
>GCC target / with-cpu / RUNTEST target-board / QEMU cpu
>1- arm-none-linux-gnueabi[hf]  / cortex-a9 / -march=armv5t / cortex-a9
>2- arm-none-linux-gnueabihf / arm10tdmi / "" / arm926
>3- arm-none-eabi / default / "" / arm926
>
>(1) uses atomics
>(2) and (3) use mutex
>
>All of them now say "PASS", but maybe I should give a try switch to
>arm926 for QEMU in (1) ?

That would be a useful test.

If I understand correctly, that will still work because the atomic ops
will use the kernel helper in libgcc. Previously it would have failed
the same way as (1) was already failing, because --with-cpu=cortex-a9
when libstdc++ was built chose atomics, and the -march=armv5t target
board chose mutexes, and so the testsuite files were incompatible with
the library.

Since my patch, only the --with-cpu option matters, and the testsuite
files will still use atomics. I think the arm926 cpu running under
QEMU will implement those atomics via libgcc's kernel helpers.

However, that test might fail for other reasons, if the libstdc++ libs
configured for cortex-a9 use other insns that are not supported by
arm296 then you'd get a SIGILL when running the tests.  The kernel
helpers only implement the atomic operations, not any cortex-a9
instructions that are not supported by arm296.



More information about the Gcc-patches mailing list