[PATCH] Avoid atomic for guard acquire when that is expensive

Bernd Edlinger bernd.edlinger@hotmail.de
Sat Dec 5 12:37:41 GMT 2020


On 12/2/20 7:57 PM, Jason Merrill wrote:
> On 12/1/20 1:28 PM, Bernd Edlinger wrote:
>> On 11/24/20 11:10 PM, Jason Merrill wrote:
>>> On 11/22/20 3:05 AM, Bernd Edlinger wrote:
>>>> Hi,
>>>>
>>>> this avoids the need to use -fno-threadsafe-statics on
>>>> arm-none-eabi or working around that problem by supplying
>>>> a dummy __sync_synchronize function which might
>>>> just lead to silent code failure of the worst kind
>>>> (non-reproducible, racy) at runtime, as was pointed out
>>>> on previous discussions here.
>>>>
>>>> When the atomic access involves a call to __sync_synchronize
>>>> it is better to call __cxa_guard_acquire unconditionally,
>>>> since it handles the atomics too, or is a non-threaded
>>>> implementation when there is no gthread support for this target.
>>>>
>>>> This also fixes a bug for the ARM EABI big-endian target:
>>>> previously the wrong bit was checked.
>>>
>>> Instead of a new target macro, can't you follow fold_builtin_atomic_always_lock_free/can_atomic_load_p?
>>
>> Yes, thanks, that should work too.
>> Would you like this better?
> 
>> +is_atomic_expensive_p (machine_mode mode)
>> +{
>> +  if (!flag_inline_atomics)
>> +    return false;
> 
> Why not true?
> 

Oops...
Yes, I ought to return true here.
I must have made a mistake when I tested the last version of this patch,
sorry for the confusion.

>> +  if (!can_compare_and_swap_p (mode, false) || !can_atomic_load_p (mode))
>> +    return false;
> 
> This also seems backwards; I'd think we want to return false if either of those tests are true.  Or maybe just can_atomic_load_p, and not bother about compare-and-swap.
> 


Yes, you are right.
Unfortunately can_atomic_load_p is too weak, since it does not cover
the memory barrier.

And can_compare_and_swap_p (..., false) is actually a bit too strong,
but if it returns true, we should be able to use any atomic without
the need for a library call.

>> +      tree type = targetm.cxx.guard_mask_bit ()
>> +          ? TREE_TYPE (guard) : char_type_node;
>> +
>> +      if (is_atomic_expensive_p (TYPE_MODE (type)))
>> +    guard = integer_zero_node;
>> +      else
>> +    guard = build_atomic_load_type (guard, MEMMODEL_ACQUIRE, type);
> 
> It should still work to load a single byte, it just needs to be the least-significant byte.  And this isn't an EABI issue; it looks like the non-EABI code is also broken for big-endian targets, both the atomic load and the normal load in get_guard_bits.
> 

I think the non-EABI code always uses bit 0 of the first byte,
via the endian-neutral #define _GLIBCXX_GUARD_BIT __guard_test_bit (0, 1).

Only ARM EABI uses bit 0 in byte 3 if big-endian and bit 0 in byte 0 otherwise.

For all other targets when _GLIBCXX_USE_FUTEX is defined,
__cxa_guard_XXX accesses the value as int* while the memory
is a 64-bit long, so I could imagine that is an aliasing violation.


But nothing that needs to be fixed immediately.


Attached is the corrected patch.

Tested again on arm-none-eabi with arm-sim.
Is it OK for trunk?

Thanks
Bernd.

> Jason
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Avoid-atomic-for-guard-acquire-when-that-is-expensiv.patch
Type: text/x-patch
Size: 3073 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20201205/4805ad5c/attachment.bin>

