3.20.3 AMD GCN Options

These options are defined specifically for the AMD GCN port.

-march=gpu
-mtune=gpu

Set the architecture type or tuning for gpu. Supported values for gpu are:

gfx900

Compile for GCN5 Vega 10 devices (gfx900).

gfx902

Compile for GCN5 Vega gfx902 devices. (Experimental)

gfx904

Compile for GCN5 Vega gfx904 devices. (Experimental)

gfx906

Compile for GCN5 Vega 20 devices (gfx906).

gfx908

Compile for CDNA1 Instinct MI100 series devices (gfx908).

gfx909

Compile for GCN5 Vega gfx909 devices. (Experimental)

gfx90a

Compile for CDNA2 Instinct MI200 series devices (gfx90a).

gfx90c

Compile for GCN5 Vega 7 devices (gfx90c).

gfx9-generic

Compile generic code for Vega devices, executable on the following subset of GFX9 devices: gfx900, gfx902, gfx904, gfx906, gfx909 and gfx90c. (Experimental)

gfx1030

Compile for RDNA2 gfx1030 devices (GFX10 series).

gfx1031

Compile for RDNA2 gfx1031 devices (GFX10 series). (Experimental)

gfx1032

Compile for RDNA2 gfx1032 devices (GFX10 series). (Experimental)

gfx1033

Compile for RDNA2 gfx1033 devices (GFX10 series). (Experimental)

gfx1034

Compile for RDNA2 gfx1034 devices (GFX10 series). (Experimental)

gfx1035

Compile for RDNA2 gfx1035 devices (GFX10 series). (Experimental)

gfx1036

Compile for RDNA2 gfx1036 devices (GFX10 series).

gfx10-3-generic

Compile generic code for GFX10-3 devices, executable on gfx1030, gfx1031, gfx1032, gfx1033, gfx1034, gfx1035, and gfx1036. (Experimental)

gfx1100

Compile for RDNA3 gfx1100 devices (GFX11 series).

gfx1101

Compile for RDNA3 gfx1101 devices (GFX11 series). (Experimental)

gfx1102

Compile for RDNA3 gfx1102 devices (GFX11 series). (Experimental)

gfx1103

Compile for RDNA3 gfx1103 devices (GFX11 series).

gfx1150

Compile for RDNA3 gfx1150 devices (GFX11 series). (Experimental)

gfx1151

Compile for RDNA3 gfx1151 devices (GFX11 series). (Experimental)

gfx1152

Compile for RDNA3 gfx1152 devices (GFX11 series). (Experimental)

gfx1153

Compile for RDNA3 gfx1153 devices (GFX11 series). (Experimental)

gfx11-generic

Compile generic code for GFX11 devices, executable on gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, and gfx1153. (Experimental)
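As an illustration of how one of these values might be selected in an offloading build, the sketch below assembles a command line and prints it instead of running it. It assumes a GCC built with offloading support for the amdgcn-amdhsa target; the -foffload= and -foffload-options= options come from GCC's general option set, not from this section, and the source file name saxpy.c is hypothetical.

```shell
# Target device; gfx90a is one of the values listed above.
GPU_ARCH=gfx90a

# Assumption: the host GCC was configured with an amdgcn-amdhsa offload target.
# -foffload-options= forwards -march to the offload compiler only.
OFFLOAD_FLAGS="-foffload=amdgcn-amdhsa -foffload-options=amdgcn-amdhsa=-march=${GPU_ARCH}"

# Build the command as a string and print it; saxpy.c is a hypothetical file.
BUILD_CMD="gcc -O2 -fopenmp ${OFFLOAD_FLAGS} saxpy.c -o saxpy"
echo "${BUILD_CMD}"
```

Changing GPU_ARCH to another listed value, such as gfx1100 or gfx9-generic, changes only the -march argument; the rest of the command stays the same.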

-msram-ecc=on
-msram-ecc=off
-msram-ecc=any

Compile binaries suitable for devices with the SRAM-ECC feature enabled, disabled, or either mode. This feature can be enabled per-process on some devices. The compiled code must match the device mode. The default is ‘-msram-ecc=any’, for devices that support it.

-mstack-size=bytes

Specify how many bytes of stack space to request for each GPU thread (wave-front). Beware that there may be many threads and only limited memory available. The size of the stack allocation may also affect run-time performance. The default is 32KB when using OpenACC or OpenMP, and 1MB otherwise.
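To give a rough sense of why the per-thread stack size matters, the following sketch multiplies the two default sizes by a wave-front count. The count of 10240 is a made-up figure for illustration, not a property of any particular device, and the calculation assumes that KB and MB here mean 1024 and 1024*1024 bytes.

```shell
# Default stack sizes stated above, in bytes (assuming binary units).
STACK_OFFLOAD=$((32 * 1024))      # 32KB: OpenACC or OpenMP
STACK_DEFAULT=$((1024 * 1024))    # 1MB: otherwise

# Hypothetical number of wave-fronts that each need a stack.
WAVEFRONTS=10240

# Total stack memory requested, converted to MiB.
TOTAL_OFFLOAD_MIB=$((STACK_OFFLOAD * WAVEFRONTS / 1048576))
TOTAL_DEFAULT_MIB=$((STACK_DEFAULT * WAVEFRONTS / 1048576))

echo "32KB stacks: ${TOTAL_OFFLOAD_MIB} MiB in total"
echo "1MB stacks:  ${TOTAL_DEFAULT_MIB} MiB in total"
```

Under these assumptions the 32KB default requests 320 MiB in total, while the 1MB default requests 10240 MiB, which may exceed the memory of many devices.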

-mxnack=on
-mxnack=off
-mxnack=any

Compile binaries suitable for devices with the XNACK feature enabled, disabled, or either mode. Some devices always require XNACK and some allow the user to configure XNACK. The compiled code must match the device mode. The default is ‘-mxnack=any’ on devices that support Unified Shared Memory, and ‘-mxnack=off’ otherwise.
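Because both -msram-ecc and -mxnack require the compiled code to match the device mode, a build script might derive the flag from a known device setting. The helper below is a hypothetical sketch: the function name mode_flag and the mode words "enabled" and "disabled" are invented for illustration, and falling back to "any" is only appropriate on devices that support it.

```shell
# Hypothetical helper: map a feature name and a known device mode
# to the matching option value (on, off, or any).
mode_flag() {
    feature="$1"   # "xnack" or "sram-ecc"
    mode="$2"      # "enabled", "disabled", or anything else if unknown
    case "$mode" in
        enabled)  value=on ;;
        disabled) value=off ;;
        *)        value=any ;;   # assumption: the device supports "any"
    esac
    echo "-m${feature}=${value}"
}

XNACK_FLAG=$(mode_flag xnack enabled)
ECC_FLAG=$(mode_flag sram-ecc disabled)
echo "${XNACK_FLAG} ${ECC_FLAG}"
```

The resulting flags can be passed to the compiler alongside -march; how the device mode is discovered in the first place is outside the scope of this sketch.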