This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] PR libstdc++/67843 set shared_ptr lock policy at build-time


On 28/11/18 10:54 +0100, Christophe Lyon wrote:
On Wed, 28 Nov 2018 at 00:25, Jonathan Wakely <jwakely@redhat.com> wrote:

This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.

The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.

To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.

In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.

        PR libstdc++/67843
        * acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
        that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
        * config.h.in: Regenerate.
        * configure: Regenerate.
        * configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
        * doc/xml/manual/configure.xml: Document new configure option.
        * include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
        instead of shared_ptr.
        (recursive_directory_iterator): Likewise.
        (__shared_ptr<_Dir>): Add explicit instantiation declaration.
        (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
        * include/bits/shared_ptr_base.h (__allocate_shared, __make_shared):
        Add default template argument for _Lock_policy template parameter.
        * include/ext/concurrence.h (__default_lock_policy): Check macro
        _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
        target supports the builtins for compare-and-swap.
        * src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
        instantiation definition.
        (__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
        (directory_iterator, recursive_directory_iterator): Use __make_shared
        instead of make_shared.

Tested x86_64-linux and aarch64-linux, committed to trunk.

I would appreciate testing on ARM, especially Christophe's
-march=armv5t set up.


Hi Jonathan,

Your patch passed my tests, thanks!

Thanks for checking it.

directory_iterator.cc tests used to be killed by a timeout until last
night where now they do PASS in my configurations involving
-march=armv5t.

That being said, I'm not sure I fully understand the fix. In my tests
involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9,
and I pass -march=armv5t as an override with the --target-board runtest option.

Yup, and that difference between the arch used to build libstdc++ and
the arch used to build the tests was what caused the FAILures. And
that's what my patch addresses, by ensuring that the arch used to
build libstdc++ is what dictates the type of synchronization for
shared_ptr.

In these cases libstdc++ configure detects support for "atomic", so I
suspect the tests pass only because I use QEMU with --cpu cortex-a9.
I think if I used a different QEMU cpu (eg arm926), the tests would
try to use atomics, and fail?

I don't know if a libstdc++ configured for cortex-a9 would run on an
arm296 QEMU. It might depend on cortex-a9 instructions (unrelated to
the use of atomics in my patch). See the end of this mail.

The reason I'm still using cortex-a9 at QEMU level is that many tests
override -mcpu/-march, and I used that as a compromise.

That means you're not fully testing armv5t, because you're still
linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9
instruction set. Usually that doesn't matter, because most libstdc++
code doesn't care about which specific instructions it gets compiled
to. That
However, I do have configurations using --with-cpu=arm10tdmi or
--with-cpu=default and QEMU --cpu arm926.
In these 2 cases, the tests do PASS, but they used to before your
patch. libstdc++ configure does detect "mutex" in these cases.

What matters is that the tests and the libstdc++ libraries agree on the
lock policy. It doesn't matter if they both use "atomic" or they both
use "mutex", only that they agree. Previously the lock policy was
selected by the preprocessor in every translation unit. That made it
very easy to get incompatible sets of objects. Now the selection is
made once per build of GCC, and is not affected by -march options when
compiling user code.

To be hopefully clearer, the subset of relevant configurations I use
is as follows:
GCC target / with-cpu / RUNTEST target-board / QEMU cpu
1- arm-none-linux-gnueabi[hf]  / cortex-a9 / -march=armv5t / cortex-a9
2- arm-none-linux-gnueabihf / arm10tdmi / "" / arm926
3- arm-none-eabi / default / "" / arm926

(1) uses atomics
(2) and (3) use mutex

All of them now say "PASS", but maybe I should give a try switch to
arm926 for QEMU in (1) ?

That would be a useful test.

If I understand correctly, that will still work because the atomic ops
will use the kernel helper in libgcc. Previously it would have failed
the same way as (1) was already failing, because --with-cpu=cortex-a9
when libstdc++ was built chose atomics, and the -march=armv5t target
board chose mutexes, and so the testsuite files were incompatible with
the library.

Since my patch, only the --with-cpu option matters, and the testsuite
files will still use atomics. I think the arm926 cpu running under
QEMU will implement those atomics via libgcc's kernel helpers.

However, that test might fail for other reasons, if the libstdc++ libs
configured for cortex-a9 use other insns that are not supported by
arm296 then you'd get a SIGILL when running the tests.  The kernel
helpers only implement the atomic operations, not any cortex-a9
instructions that are not supported by arm296.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]