[PATCH][libatomic] Add nvptx support

Tom de Vries tdevries@suse.de
Wed Sep 9 14:14:02 GMT 2020


On 9/9/20 3:15 PM, Tom de Vries wrote:
> On 9/9/20 2:36 PM, Tobias Burnus wrote:
>> Hi Tom,
>>
>> On 9/8/20 5:05 PM, Tobias Burnus wrote:
>>
>>> On 9/8/20 8:51 AM, Tom de Vries wrote:
>>>>     PR target/96964
>>>>     * config/nvptx/nvptx.md (define_expand "atomic_test_and_set"): New
>>>>     expansion.
>>>>     * sync-builtins.def (BUILT_IN_ATOMIC_TEST_AND_SET_1): New builtin.
>>
>> I have your patch applied on a current mainline powerpc64le-none-linux-gnu
>> + nvptx offloading build.
> 
> Thanks for trying this out.
> 
>> And I observe the following failures, which seem
>> to be new and related to your patch (but I have not confirmed that by
>> reverting your libatomic patch).
>>
> 
> Could you confirm that?
> 
> Meanwhile, I'll try to reproduce on x86_64.
> 
>> Options required to trigger the failure: "-O2 -ftracer";
>> hence, only the "-O3 ..." testsuite builds fail.
>> (-ftracer = "Perform tail duplication to enlarge superblock size.")
>>
>>
>> during RTL pass: mach
>> asyncwait-1.f90:19: internal compiler error: in nvptx_find_par, at config/nvptx/nvptx.c:3293
>> 0x10bf9f13 nvptx_find_par
>>         gcc/config/nvptx/nvptx.c:3293
>> 0x10bf9b97 nvptx_find_par
>>         gcc/config/nvptx/nvptx.c:3320
>> 0x10bf9b97 nvptx_find_par
>>         gcc/config/nvptx/nvptx.c:3320
>> ...
>>
>>
>> The ICE occurs for the second assert of:
>>         case CODE_FOR_nvptx_join:
>>           /* A loop tail.  Finish the current loop and return to
>>              parent.  */
>>           {
>>             unsigned mask = UINTVAL (XVECEXP (PATTERN (end), 0, 0));
>>
>>             gcc_assert (par->mask == mask);
>>             gcc_assert (par->join_block == NULL);
>>
>> gdb shows:
>> (gdb) p debug_bb(par->join_block )
>> (note 213 30 31 24 [bb 24] NOTE_INSN_BASIC_BLOCK)
>> (insn 31 213 204 24 (unspec_volatile:SI [
>>             (const_int 4 [0x4])
>>         ] UNSPECV_JOIN) "libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90":24:0 237 {nvptx_join}
>>      (nil))
>> (jump_insn 204 31 205 24 (set (pc)
>>         (label_ref 198)) 121 {jump}
>>      (nil)
>>  -> 198)
>>
> 
> Yep, code duplication works against the matching of fork/join; it's not
> the first time we've seen this.
> 
> Usually the fix is to make an optimization pass conservative with
> respect to these fork/join regions, but AFAICT, ftracer already has such
> code in ignore_bb_p that tests gimple_call_internal_unique_p.
> 
> So, perhaps the ftracer pass is the trigger, but not the pass that does
> the problematic transformation? Just a guess at this point.
> 

I can reproduce it, and it's indeed the ftracer pass that does the
duplication.  So the question is why ignore_bb_p doesn't prevent it.
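
To illustrate the failure mode, here is a minimal sketch of how a
duplicated join block trips a "join_block == NULL" check during a
depth-first fork/join pairing walk.  This is a simplified, hypothetical
model, not the actual nvptx_find_par code: struct block, struct par,
new_par, find_par, run_well_formed and run_duplicated are all names made
up for this example, and the walk returns an error code where the real
code asserts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the data used by the walk.  */
enum marker { MARK_NONE, MARK_FORK, MARK_JOIN };

struct block {
  int index;
  enum marker mark;       /* Marker carried by the block, if any.  */
  unsigned mask;          /* Partitioning mask of that marker.  */
  struct block *succ[2];  /* Up to two CFG successors.  */
  int visited;
};

struct par {
  struct par *parent;
  unsigned mask;
  struct block *join_block;
};

#define MAX_PARS 8
static struct par par_pool[MAX_PARS];
static int n_pars;

static struct par *new_par (struct par *parent, unsigned mask)
{
  assert (n_pars < MAX_PARS);
  struct par *p = &par_pool[n_pars++];
  p->parent = parent;
  p->mask = mask;
  p->join_block = NULL;
  return p;
}

/* Walk the CFG depth-first, pairing forks with joins.  Returns 0 on
   success, -1 on a mask mismatch, or the index of the block holding a
   second join for a region that already has one.  */
static int find_par (struct par *par, struct block *b)
{
  if (b == NULL || b->visited)
    return 0;
  b->visited = 1;

  if (b->mark == MARK_FORK)
    par = new_par (par, b->mask);
  else if (b->mark == MARK_JOIN)
    {
      if (par->parent == NULL || par->mask != b->mask)
        return -1;
      if (par->join_block != NULL)
        return b->index;    /* The modelled "second assert" failure.  */
      par->join_block = b;
      par = par->parent;
    }

  for (int i = 0; i < 2; i++)
    {
      int r = find_par (par, b->succ[i]);
      if (r != 0)
        return r;
    }
  return 0;
}

/* fork(1) -> body(2) -> join(3) -> exit(4): one join per region.  */
int run_well_formed (void)
{
  struct block b4 = { 4, MARK_NONE, 0, { NULL, NULL }, 0 };
  struct block b3 = { 3, MARK_JOIN, 4, { &b4, NULL }, 0 };
  struct block b2 = { 2, MARK_NONE, 0, { &b3, NULL }, 0 };
  struct block b1 = { 1, MARK_FORK, 4, { &b2, NULL }, 0 };
  n_pars = 0;
  return find_par (new_par (NULL, 0), &b1);
}

/* Same region, but the join block has been duplicated as block 5, so
   body(2) now branches to two distinct join blocks.  */
int run_duplicated (void)
{
  struct block b4 = { 4, MARK_NONE, 0, { NULL, NULL }, 0 };
  struct block b5 = { 5, MARK_JOIN, 4, { &b4, NULL }, 0 };
  struct block b3 = { 3, MARK_JOIN, 4, { &b4, NULL }, 0 };
  struct block b2 = { 2, MARK_NONE, 0, { &b3, &b5 }, 0 };
  struct block b1 = { 1, MARK_FORK, 4, { &b2, NULL }, 0 };
  n_pars = 0;
  return find_par (new_par (NULL, 0), &b1);
}
```

In this model run_well_formed () returns 0, while run_duplicated ()
returns 5: the walk reaches the copied join with the same region still
current, and that region already has block 3 recorded as its join block.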

Thanks,
- Tom

> 
>>
>> That affects the testcases:
>> libgomp.oacc-fortran/asyncwait-1.f90
>> libgomp.oacc-fortran/asyncwait-2.f90
>> libgomp.oacc-fortran/asyncwait-3.f90
>> libgomp.oacc-fortran/atomic_capture-1.f90
>> libgomp.oacc-fortran/atomic_update-1.f90
>> libgomp.oacc-fortran/classtypes-1.f95
>> libgomp.oacc-fortran/collapse-1.f90
>> libgomp.oacc-fortran/collapse-2.f90
>> libgomp.oacc-fortran/collapse-3.f90
>> libgomp.oacc-fortran/collapse-4.f90
>> libgomp.oacc-fortran/collapse-5.f90
>> libgomp.oacc-fortran/collapse-6.f90
>> libgomp.oacc-fortran/collapse-7.f90
>> libgomp.oacc-fortran/collapse-8.f90
>> libgomp.oacc-fortran/combined-directives-1.f90
>> libgomp.oacc-fortran/combined-reduction.f90
>> libgomp.oacc-fortran/common-block-1.f90
>> libgomp.oacc-fortran/common-block-2.f90
>> libgomp.oacc-fortran/common-block-3.f90
>> libgomp.oacc-fortran/deep-copy-1.f90
>> libgomp.oacc-fortran/deep-copy-3.f90
>> libgomp.oacc-fortran/deep-copy-4.f90
>> libgomp.oacc-fortran/deep-copy-5.f90
>> libgomp.oacc-fortran/deep-copy-6-no_finalize.F90
>> libgomp.oacc-fortran/deep-copy-6.f90
>> libgomp.oacc-fortran/deep-copy-7.f90
>> libgomp.oacc-fortran/deep-copy-8.f90
>> libgomp.oacc-fortran/derived-type-1.f90
>> libgomp.oacc-fortran/host_data-2.f90
>> libgomp.oacc-fortran/host_data-3.f
>> libgomp.oacc-fortran/host_data-4.f90
>> libgomp.oacc-fortran/implicit-firstprivate-ref.f90
>> libgomp.oacc-fortran/lib-14.f90
>> libgomp.oacc-fortran/map-1.f90
>> libgomp.oacc-fortran/nested-function-1.f90
>> libgomp.oacc-fortran/nested-function-2.f90
>> libgomp.oacc-fortran/nested-function-3.f90
>> libgomp.oacc-fortran/no_create-3.F90
>> libgomp.oacc-fortran/optional-data-copyin.f90
>> libgomp.oacc-fortran/optional-data-copyout.f90
>> libgomp.oacc-fortran/optional-data-enter-exit.f90
>> libgomp.oacc-fortran/optional-declare.f90
>> libgomp.oacc-fortran/optional-firstprivate.f90
>> libgomp.oacc-fortran/optional-reduction.f90
>> libgomp.oacc-fortran/optional-update-device.f90
>> libgomp.oacc-fortran/optional-update-host.f90
>> libgomp.oacc-fortran/parallel-dims.f90
>> libgomp.oacc-fortran/parallel-loop-1.f90
>> libgomp.oacc-fortran/pr81352.f90
>> libgomp.oacc-fortran/pr84028.f90
>> libgomp.oacc-fortran/reduction-1.f90
>> libgomp.oacc-fortran/reduction-2.f90
>> libgomp.oacc-fortran/reduction-3.f90
>> libgomp.oacc-fortran/reduction-4.f90
>> libgomp.oacc-fortran/reduction-5.f90
>> libgomp.oacc-fortran/reduction-6.f90
>> libgomp.oacc-fortran/reduction-7.f90
>> libgomp.oacc-fortran/reduction-8.f90
>> libgomp.oacc-fortran/routine-1.f90
>> libgomp.oacc-fortran/routine-2.f90
>> libgomp.oacc-fortran/routine-3.f90
>> libgomp.oacc-fortran/routine-4.f90
>> libgomp.oacc-fortran/routine-7.f90
>> libgomp.oacc-fortran/routine-9.f90
>> libgomp.oacc-fortran/subarrays-1.f90
>> libgomp.oacc-fortran/subarrays-2.f90
>> libgomp.oacc-fortran/update-2.f90
>>
>> Tobias
>>

