[patch,nvptx] Basic -misa support for nvptx

Wed Sep 5 14:31:00 GMT 2018

On 09/05/2018 12:19 AM, Cesar Philippidis wrote:
> On 09/02/2018 07:57 AM, Cesar Philippidis wrote:
>> On 09/01/2018 12:04 PM, Tom de Vries wrote:
>>> On 08/31/2018 04:14 PM, Cesar Philippidis wrote:
>>
>>>> Is this patch OK for trunk?
>>>>
>>>
>>> Well, how did you test this (
>>> https://gcc.gnu.org/contribute.html#patches : "Bootstrapping and
>>> testing. State the host and target combinations you used to do proper
>>> testing as described above, and the results of your testing.") ?
>>
>> I tested the standalone nvptx compiler. I'll retest with libgomp with
>> -misa=sm_35. Bootstrapping won't help much here, unfortunately.
>>>> +++ b/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c
>>>> @@ -0,0 +1,24 @@
>>>> +/* Test the nvptx atomic instructions for __atomic_fetch_OP for SM_35
>>>> +   targets.  */
>>>> +
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-O2 -misa=sm_35" } */
>>>> +
>>>> +int
>>>> +main()
>>>> +{
>>>> +  unsigned long long a = ~0;
>>>> +  unsigned b = 0xa;
>>>> +
>>>> +  __atomic_fetch_add (&a, b, 0);
>>>> +  __atomic_fetch_and (&a, b, 0);
>>>> +  __atomic_fetch_or (&a, b, 0);
>>>> +  __atomic_fetch_xor (&a, b, 0);
>>>> +  
>>>> +  return a;
>>>> +}
>>>> +
>>>> +/* { dg-final { scan-assembler "atom.add.u64" } } */
>>>> +/* { dg-final { scan-assembler "atom.b64.and" } } */
>>>> +/* { dg-final { scan-assembler "atom.b64.or" } } */
>>>> +/* { dg-final { scan-assembler "atom.b64.xor" } } */
>>>> -- 2.17.1
>>>>
>>>
>>> Hmm, the add.u64 vs b64.and looks odd (and the scan-assembler-not
>>> testcase does not use this difference, so that needs to be fixed, or for
>>> bonus points, changed into a scan-assembler testcase).
>>>
>>> The documentation uses "op.type", we should fix the compiler to emit
>>> that consistently. Separate patch that fixes that pre-approved.
>>
>> ACK. I think there are a lot of other cases like that in the BE.
>>
>>> This is ok (with, as I mentioned above, the SI part split off into a
>>> separate patch), on the condition that you test libgomp with
>>> -foffload=-misa=sm_35.
> 
> Adding -foffload=misa=sm_35 didn't work because the host gcc doesn't
> support the -misa flag.

That doesn't make sense to me. For me this works without any problems.
Have you tried a clean build?

> When I forced the nvptx BE to set TARGET_SM35 to
> always be true, I ran into problems with SM_30 code linking against
> SM_35 code.

I also cannot reproduce this, works for me.

> Therefore, I don't think this patch is ready for trunk yet.
>> By the way, is -misa really necessary for atomic_fetch_<logic><mode>?
> Looking at the PTX documentation I see
> <https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#changes-in-ptx-isa-version-3-1>:
> 
> PTX ISA version 3.1 introduces the following new features:
> 
> * Support for sm_35 target architecture.
> * Extends atomic and reduction instructions to perform 64-bit {and, or,
> xor} operations, and 64-bit integer {min, max} operations.
> 
> Is there a table for which list which GPUs are compatible with which
> instructions?

Yes, every instruction has a table in the ptx manual, and there's a "PTX
ISA Notes" entry.

For the atom instruction in ptx isa 3.1 manual, we have "PTX ISA Notes":
...
atom.global requires sm_11 or higher.
atom.shared requires sm_12 or higher.
64-bit atom.global.{add,cas,exch} require sm_12 or higher.
64-bit atom.shared.{add,cas,exch} require sm_20 or higher.
64-bit atom.{and,or.xor,min,max} require sm_35 or higher.
atom.add.f32 requires sm_20 or higher.
Use of generic addressing requires sm_20 or higher.
...

Thanks,
- Tom