[Bug target/104714] New: [nvptx] Means to specify any sm_xx

Mon Feb 28 11:14:09 GMT 2022

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104714

            Bug ID: 104714
           Summary: [nvptx] Means to specify any sm_xx
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

I'm testing on a couple of boards, with some different settings, and one of
those settings is: test native architecture.

That is, for an NVIDIA T400 with sm_75, test with -misa=sm_75.

But that doesn't work for all boards, because f.i. for a GeForce GT 1030, with
sm_61, gcc doesn't support -misa=sm_61.  It only supports values for which
different code may be generated.  So, we use -misa=sm_53 instead.

I have some code in a script, which has this mapped out:
...
case $id in
    GeForce-GT-710)
        sm=35
        opt_sm=35
        ;;
    Quadro-K620)
        sm=50
        opt_sm=35 # Next is 53, too high.
        ;;
    GeForce-GT-1030)
        sm=61
        opt_sm=53 # Next is 75, too high.
        ;;
    NVIDIA-T400)
        sm=75
        opt_sm=75
        ;;
    *)
        echo "Unknown id: $id"
        exit 1
        ;;
esac
...

There are two problems with this:
- it's cumbersome to do the mapping, possibly in various locations
- the mapping may have to be updated for newer releases, which introduce
  additional -misa values

It would be nice to be able to just specify what board sm you have, and then
have gcc figure out the current closest and supported -misa value.

We could do this by just allowing any -misa value, say allow -misa=sm_61 and
internally map it to sm_53.
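
A minimal sketch of that mapping, written as a shell function in the style of
the script above.  The name closest_misa and the list of supported values
(35, 53, 75, taken from the script) are assumptions for illustration only; the
real list depends on the gcc release.  It picks the largest supported value
that does not exceed the board's sm:

```shell
# Hypothetical helper, not part of gcc.
# Usage: closest_misa <board_sm> <supported_sm>...
# Prints the largest supported value that is <= board_sm.
closest_misa () {
    board=$1
    shift
    best=
    for s in "$@"; do
        if [ "$s" -le "$board" ] && { [ -z "$best" ] || [ "$s" -gt "$best" ]; }; then
            best=$s
        fi
    done
    if [ -z "$best" ]; then
        # The board is older than every supported value.
        echo "No supported value <= sm_$board" >&2
        return 1
    fi
    echo "$best"
}

# Assumed supported list (from the script above): 35 53 75.
closest_misa 61 35 53 75   # prints 53
```

With something like this inside the compiler, the per-board opt_sm entries in
the script would no longer be needed.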

OTOH, we could use this as an opportunity to sidestep the much regretted name
-misa (given that -mptx is used to specify the ptx isa version, and -misa the
ptx architecture) and introduce say -march for this.  This option would then
have to be mutually exclusive with -misa.

There's an open question though: when specifying sm_61, the code generation
internally will switch to sm_53, but what do we emit in the .target field:
...
// BEGIN PREAMBLE
.version 6.0
.target sm_xx
.address_size 64
// END PREAMBLE
...
? sm_53 or sm_61?

I'm not entirely sure yet what the benefit would be of having ".target sm_61".
F.i. the driver 510.x has dropped support for the Kepler architecture, so we
can't use it for a Kepler board.  But we can generate code for ".target sm_30"
and have that same driver map it onto a post-Kepler board.  So I don't see any
benefits here in terms of allowed driver version.

So for the moment, I'd go with sm_53.

[ FWIW, it would be great if we could simply specify -march=native, and have
gcc query the nvidia driver to see what board there is using
cuDeviceGetAttribute and CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR and
CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR.  And possibly handle the
situation of multiple boards by using the minimum.  But that is much more
involved to implement. ]
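
As a rough script-level stand-in for the "use the minimum" idea (not the
cuDeviceGetAttribute approach itself), the sketch below reads one compute
capability per line and prints the smallest as an sm number.  The name min_sm
is hypothetical, and feeding it from nvidia-smi assumes a driver whose
nvidia-smi supports the compute_cap query field:

```shell
# Hypothetical helper: read compute capabilities ("7.5", "6.1", ...) on
# stdin, one per line, and print the minimum as an sm number ("61").
min_sm () {
    min=
    while IFS= read -r cap || [ -n "$cap" ]; do
        # Drop spaces and the dot: " 7.5" -> "75".
        sm=$(printf '%s' "$cap" | tr -d ' .')
        [ -n "$sm" ] || continue
        if [ -z "$min" ] || [ "$sm" -lt "$min" ]; then
            min=$sm
        fi
    done
    # Fail if no board was listed at all.
    [ -n "$min" ] || return 1
    echo "$min"
}

# Assumed usage on a machine with the nvidia driver installed:
#   nvidia-smi --query-gpu=compute_cap --format=csv,noheader | min_sm
```

The result could then be passed on as the board sm, with the mapping to a
supported -misa value done separately.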