[PATCH, ARM 2/7, ping3] Adapt atomic and exclusive load and store to ARMv8-M Baseline

Kyrill Tkachov kyrylo.tkachov@foss.arm.com
Mon Oct 24 16:40:00 GMT 2016


Hi Thomas,

On 24/10/16 09:04, Thomas Preudhomme wrote:
> Ping?
>
> Best regards,
>
> Thomas
>
> On 14/10/16 14:48, Thomas Preudhomme wrote:
>> Ping?
>>
>> Best regards,
>>
>> Thomas
>>
>> On 03/10/16 17:42, Thomas Preudhomme wrote:
>>> Ping?
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>> On 22/09/16 14:41, Thomas Preudhomme wrote:
>>>> Hi,
>>>>
>>>> This patch is part of a patch series to add support for atomic operations on
>>>> ARMv8-M Baseline targets in GCC. This specific patch adapts atomic and exclusive
>>>> load and store patterns to the constraints of ARMv8-M Baseline. It consists of
>>>> two sets of changes:
>>>>
>>>> - adding non predicated output templates because ARMv8-M Baseline does not have
>>>> IT instruction
>>>> - use low registers for ldr/str
>>>>
>>>> Together these changes require to create 2 new alternatives for atomic_load and
>>>> atomic_store: (i) one for relaxed, consume and release memory model (the new Pf
>>>> constraint) where ldr/str are used and thus low registers must be used and (ii)
>>>> another one for the other memory model where lda/stl are used. These are
>>>> separate from the constraint for 32bit targets whose output templates expect
>>>> predication.
>>>>
>>>> ChangeLog entry is as follows:
>>>>
>>>> *** gcc/ChangeLog ***
>>>>
>>>> 2016-07-05  Thomas Preud'homme <thomas.preudhomme@arm.com>
>>>>
>>>>         * config/arm/constraints.md (Q constraint): Document its use for
>>>>         Thumb-1.
>>>>         (Pf constraint): New constraint for relaxed, consume or relaxed memory
>>>>         models.
>>>>         * config/arm/sync.md (atomic_load<mode>): Add new ARMv8-M Baseline only
>>>>         alternatives to allow any register when memory model matches Pf and
>>>>         thus lda is used, but only low registers otherwise. Use unpredicated
>>>>         output template for Thumb-1 targets.
>>>>         (atomic_store<mode>): Likewise for stl.
>>>>         (arm_load_exclusive<mode>): Add new ARMv8-M Baseline only alternative
>>>>         whose output template does not have predication.
>>>>         (arm_load_acquire_exclusive<mode>): Likewise.
>>>>         (arm_load_exclusivesi): Likewise.
>>>>         (arm_load_acquire_exclusivesi): Likewise.
>>>>         (arm_store_release_exclusive<mode>): Likewise.
>>>>         (arm_store_exclusive<mode>): Use unpredicated output template for
>>>>         Thumb-1 targets.
>>>>
>>>>
>>>> Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
>>>> atomic and synchronization testcases in the testsuite [2]. Patchset was also
>>>> bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
>>>> optimization level -O1 and above [1] without any regression in the testsuite and
>>>> no code generation difference in libitm and libgomp.
>>>>
>>>> Code generation for ARMv8-M Baseline has been manually examined and compared
>>>> against ARMv8-A Thumb-2 for the following configuration without finding any
>>>> issue:
>>>>
>>>> gcc.dg/atomic-op-2.c at -Os
>>>> gcc.dg/atomic-compare-exchange-2.c at -Os
>>>> gcc.dg/atomic-compare-exchange-3.c at -O3
>>>>
>>>>
>>>> Is this ok for trunk?
>>>>
>>>> Best regards,
>>>>
>>>> Thomas
>>>>
>>>> [1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g", "-O3 -g" and
>>>> undefined ("-O2 -g")
>>>> [2] The exact list is:
>>>>
>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
>>>> gcc/testsuite/gcc.dg/atomic-exchange-1.c
>>>> gcc/testsuite/gcc.dg/atomic-exchange-2.c
>>>> gcc/testsuite/gcc.dg/atomic-exchange-3.c
>>>> gcc/testsuite/gcc.dg/atomic-fence.c
>>>> gcc/testsuite/gcc.dg/atomic-flag.c
>>>> gcc/testsuite/gcc.dg/atomic-generic.c
>>>> gcc/testsuite/gcc.dg/atomic-generic-aux.c
>>>> gcc/testsuite/gcc.dg/atomic-invalid-2.c
>>>> gcc/testsuite/gcc.dg/atomic-load-1.c
>>>> gcc/testsuite/gcc.dg/atomic-load-2.c
>>>> gcc/testsuite/gcc.dg/atomic-load-3.c
>>>> gcc/testsuite/gcc.dg/atomic-lockfree.c
>>>> gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
>>>> gcc/testsuite/gcc.dg/atomic-noinline.c
>>>> gcc/testsuite/gcc.dg/atomic-noinline-aux.c
>>>> gcc/testsuite/gcc.dg/atomic-op-1.c
>>>> gcc/testsuite/gcc.dg/atomic-op-2.c
>>>> gcc/testsuite/gcc.dg/atomic-op-3.c
>>>> gcc/testsuite/gcc.dg/atomic-op-6.c
>>>> gcc/testsuite/gcc.dg/atomic-store-1.c
>>>> gcc/testsuite/gcc.dg/atomic-store-2.c
>>>> gcc/testsuite/gcc.dg/atomic-store-3.c
>>>> gcc/testsuite/g++.dg/ext/atomic-1.C
>>>> gcc/testsuite/g++.dg/ext/atomic-2.C
>>>> gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-char.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-consume.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-int.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-release.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
>>>> gcc/testsuite/gcc.target/arm/atomic-op-short.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
>>>> gcc/testsuite/gcc.target/arm/sync-1.c
>>>> gcc/testsuite/gcc.target/arm/synchronize.c
>>>> gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
>>>> libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/64658.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/65147.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/70766.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/49445.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/constexpr.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/copy_list.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/default.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/direct_list.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/single_value.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/user_pod.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/51811.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_assignment.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_conversion.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/base_classes.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/compare_exchange_lowering.cc
>>>>
>>>>
>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/explicit_instantiation/1.cc
>>>>
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/1.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/56012.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/aggregate.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/default.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/standard_layout.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/trivial.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/60940.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/65147.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/constexpr.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/copy_list.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/default.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/direct_list.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/single_value.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/bitwise.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/decrement.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/increment.cc
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_assignment.cc
>>>>
>>>>
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_conversion.cc
>>>>
>>>>
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/standard_layout.cc
>>>>
>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/trivial.cc
>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/functions_std_c++0x.cc
>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/macros.cc
>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
<snip>

      else
-      return \"lda<sync_sfx>%?\\t%0, %1\";
+      {
+	if (TARGET_THUMB1)
+	  return \"lda<sync_sfx>\\t%0, %1\";
+	else
+	  return \"lda<sync_sfx>%?\\t%0, %1\";
+      }
    }
-  [(set_attr "predicable" "yes")
+  [(set_attr "arch" "32,v8mb,any")
+   (set_attr "predicable" "yes")
     (set_attr "predicable_short_it" "no")])


Please set the predicable attribute to "no" for the v8mb alternative.
It wouldn't change any functionality as the ifcvt pass for conditional execution
won't run for ARMv8-M Baseline but it's better to be explicit for documentation purposes.
Same for the other patterns where you add new v8mb alternatives.

Ok with that change.
Sorry for the delay,
Kyrill



More information about the Gcc-patches mailing list