3.19.3 AMD GCN Options

These options are defined specifically for the AMD GCN port.

-march=gpu
-mtune=gpu

Set architecture type or tuning for gpu. Supported values for gpu are:

fiji

Compile for GCN3 Fiji devices (gfx803). Support is deprecated; availability depends on how GCC has been configured, see --with-arch and --with-multilib-list.

gfx900

Compile for GCN5 Vega 10 devices (gfx900).

gfx906

Compile for GCN5 Vega 20 devices (gfx906).

gfx908

Compile for CDNA1 Instinct MI100 series devices (gfx908).

gfx90a

Compile for CDNA2 Instinct MI200 series devices (gfx90a).

gfx1030

Compile for RDNA2 gfx1030 devices (GFX10 series).

gfx1036

Compile for RDNA2 gfx1036 devices (GFX10 series).

gfx1100

Compile for RDNA3 gfx1100 devices (GFX11 series).

gfx1103

Compile for RDNA3 gfx1103 devices (GFX11 series).
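
As an illustration, the following sketch assembles a command line that selects a device architecture for an OpenMP program offloaded to an AMD GPU. It assumes a GCC build configured with the amdgcn-amdhsa offload target; the source file name saxpy.c, the output name, and the shell variable names are hypothetical. The command is only printed, not executed.

```shell
# Hypothetical source file and target device; adjust to your setup.
src=saxpy.c
gpu=gfx90a

# -foffload-options forwards -march to the amdgcn-amdhsa offload compiler.
cmd="gcc -O2 -fopenmp -foffload=amdgcn-amdhsa -foffload-options=amdgcn-amdhsa=-march=$gpu $src -o saxpy"

# Print the command for inspection; run it with: eval "$cmd"
echo "$cmd"
```

When compiling directly with a standalone amdgcn-amdhsa compiler rather than through offloading, -march=gpu would be passed on that compiler's own command line instead.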

-msram-ecc=on
-msram-ecc=off
-msram-ecc=any

Compile binaries suitable for devices with the SRAM-ECC feature enabled, disabled, or either mode. This feature can be enabled per-process on some devices. The compiled code must match the device mode. The default is ‘any’, for devices that support it.

-mstack-size=bytes

Specify how many bytes of stack space will be requested for each GPU thread (wave-front). Beware that there may be many threads and limited memory available. The size of the stack allocation may also have an impact on run-time performance. The default is 32KB when using OpenACC or OpenMP, and 1MB otherwise.
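
To make the memory warning concrete, the following sketch estimates the total stack reservation when every resident wave-front receives its own allocation, as described above. The count of 2048 concurrent wave-fronts is purely illustrative and not a property of any particular device.

```shell
# Hypothetical number of wave-fronts resident at once (illustrative only).
waves=2048

# Documented defaults: 32KB with OpenACC/OpenMP, 1MB otherwise.
offload_stack=$((32 * 1024))
standalone_stack=$((1024 * 1024))

# Total reservation = per-wave-front stack size * number of wave-fronts.
offload_total_mib=$((offload_stack * waves / 1024 / 1024))
standalone_total_mib=$((standalone_stack * waves / 1024 / 1024))

echo "offload default:    ${offload_total_mib} MiB"
echo "standalone default: ${standalone_total_mib} MiB"
```

Under these assumptions the 32KB default reserves 64 MiB in total, while the 1MB default reserves 2048 MiB, which shows why a small increase in -mstack-size can consume a large share of device memory.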

-mxnack=on
-mxnack=off
-mxnack=any

Compile binaries suitable for devices with the XNACK feature enabled, disabled, or either mode. Some devices always require XNACK and some allow the user to configure XNACK. The compiled code must match the device mode. The default is ‘-mxnack=any’ on devices that support Unified Shared Memory, and ‘-mxnack=off’ otherwise.
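
The following sketch shows one way to pass an XNACK mode alongside the architecture when offloading. It assumes a GCC build configured with the amdgcn-amdhsa offload target; the file name app.c, the chosen device, and the variable names are hypothetical. The command is printed rather than executed, and the chosen mode still has to match how the device is actually configured.

```shell
gpu=gfx90a   # hypothetical target device
xnack=on     # one of: on, off, any

# Reject values other than the three documented modes.
case "$xnack" in
  on|off|any) ;;
  *) echo "invalid -mxnack value: $xnack" >&2; exit 1 ;;
esac

# Each -foffload-options flag forwards one option to the offload compiler.
offload="-foffload-options=amdgcn-amdhsa=-march=$gpu -foffload-options=amdgcn-amdhsa=-mxnack=$xnack"
cmd="gcc -O2 -fopenmp -foffload=amdgcn-amdhsa $offload app.c -o app"

echo "$cmd"
```

The same pattern would apply to -msram-ecc, which accepts the same three modes.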