This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][AArch64] Enable CLI for Armv8.6-a: armv8.6-a, i8mm and bf16
- From: Richard Sandiford <richard dot sandiford at arm dot com>
- To: Dennis Zhang <Dennis dot Zhang at arm dot com>
- Cc: "gcc-patches\@gcc.gnu.org" <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, James Greenhalgh <James dot Greenhalgh at arm dot com>, Marcus Shawcroft <Marcus dot Shawcroft at arm dot com>
- Date: Fri, 29 Nov 2019 13:00:09 +0000
- Subject: Re: [PATCH][AArch64] Enable CLI for Armv8.6-a: armv8.6-a, i8mm and bf16
- References: <e1f2c7e0-2512-e6c0-58d3-e18970fbffca@arm.com>
Hi Dennis,
Sorry for the slow response.
Dennis Zhang <Dennis.Zhang@arm.com> writes:
> Hi all,
>
> This patch is part of a series adding support for Armv8.6-A features.
> It enables options including -march=armv8.6-a, +i8mm and +bf16.
> The +i8mm and +bf16 features are mandatory for Armv8.6-a and optional
> for Armv8.2-a and onward.
> Documents are at https://developer.arm.com/docs/ddi0596/latest
>
> Regtested for aarch64-none-linux-gnu.
>
> Please help to check if it's ready for trunk.
>
> Many thanks!
> Dennis
>
> gcc/ChangeLog:
>
> 2019-11-26 Dennis Zhang <dennis.zhang@arm.com>
>
> * config/aarch64/aarch64-arches.def (armv8.6-a): New.
> * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
> __ARM_FEATURE_MATMUL_INT8, __ARM_FEATURE_BF16_VECTOR_ARITHMETIC and
> __ARM_FEATURE_BF16_SCALAR_ARITHMETIC when enabled.
> * config/aarch64/aarch64-option-extensions.def (i8mm, bf16): New.
> * config/aarch64/aarch64.h (AARCH64_FL_V8_6): New macro.
> (AARCH64_FL_I8MM, AARCH64_FL_BF16, AARCH64_FL_FOR_ARCH8_6): Likewise.
> (AARCH64_ISA_V8_6, AARCH64_ISA_I8MM, AARCH64_ISA_BF16): Likewise.
> (TARGET_I8MM, TARGET_BF16_FP, TARGET_BF16_SIMD): Likewise.
> * doc/invoke.texi (armv8.6-a, i8mm, bf16): Document new options.
>
> gcc/testsuite/ChangeLog:
>
> 2019-11-26 Dennis Zhang <dennis.zhang@arm.com>
>
> * gcc.target/aarch64/pragma_cpp_predefs_2.c: Add tests for i8mm
> and bf16 features.
>
> diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
> index d258bd49244..e464d329c1a 100644
> --- a/gcc/config/aarch64/aarch64-arches.def
> +++ b/gcc/config/aarch64/aarch64-arches.def
> @@ -36,5 +36,6 @@ AARCH64_ARCH("armv8.2-a", generic, 8_2A, 8, AARCH64_FL_FOR_ARCH8_2)
> AARCH64_ARCH("armv8.3-a", generic, 8_3A, 8, AARCH64_FL_FOR_ARCH8_3)
> AARCH64_ARCH("armv8.4-a", generic, 8_4A, 8, AARCH64_FL_FOR_ARCH8_4)
> AARCH64_ARCH("armv8.5-a", generic, 8_5A, 8, AARCH64_FL_FOR_ARCH8_5)
> +AARCH64_ARCH("armv8.6-a", generic, 8_6A, 8, AARCH64_FL_FOR_ARCH8_6)
>
> #undef AARCH64_ARCH
> diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
> index f3da07fd28a..20d1e00552b 100644
> --- a/gcc/config/aarch64/aarch64-c.c
> +++ b/gcc/config/aarch64/aarch64-c.c
> @@ -165,6 +165,12 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
> aarch64_def_or_undef (TARGET_RNG, "__ARM_FEATURE_RNG", pfile);
> aarch64_def_or_undef (TARGET_MEMTAG, "__ARM_FEATURE_MEMORY_TAGGING", pfile);
>
> + aarch64_def_or_undef (TARGET_I8MM, "__ARM_FEATURE_MATMUL_INT8", pfile);
> + aarch64_def_or_undef (TARGET_BF16_SIMD,
> + "__ARM_FEATURE_BF16_VECTOR_ARITHMETIC", pfile);
> + aarch64_def_or_undef (TARGET_BF16_FP,
> + "__ARM_FEATURE_BF16_SCALAR_ARITHMETIC", pfile);
> +
> /* Not for ACLE, but required to keep "float.h" correct if we switch
> target between implementations that do or do not support ARMv8.2-A
> 16-bit floating-point extensions. */
> diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
> index d3ae1b2431b..5b7c3b8a213 100644
> --- a/gcc/config/aarch64/aarch64-option-extensions.def
> +++ b/gcc/config/aarch64/aarch64-option-extensions.def
> @@ -198,4 +198,14 @@ AARCH64_OPT_EXTENSION("sve2-bitperm", AARCH64_FL_SVE2_BITPERM, AARCH64_FL_SIMD |
> /* Enabling or disabling "tme" only changes "tme". */
> AARCH64_OPT_EXTENSION("tme", AARCH64_FL_TME, 0, 0, false, "")
>
> +/* Enabling "i8mm" also enables "simd".
> + Disabling "i8mm" only disables "i8mm". */
> +AARCH64_OPT_EXTENSION("i8mm", AARCH64_FL_I8MM, AARCH64_FL_SIMD, \
> + 0, false, "i8mm")
We have to maintain the transitive closure of features by hand,
so anything that enables AARCH64_FL_SIMD also needs to enable
AARCH64_FL_FP.
We should also add i8mm to the list of things that +nosimd and +nofp
disable.
(It would be better to do this automatically, but that's future work.)
> +/* Enabling "bf16" also enables "simd" and "fp".
> + Disabling "bf16" only disables "bf16". */
> +AARCH64_OPT_EXTENSION("bf16", AARCH64_FL_BF16, AARCH64_FL_SIMD | AARCH64_FL_FP,
> + 0, false, "bf16")
Similarly here we should add bf16 to the list of things that +nofp disables.
> @@ -308,6 +323,13 @@ extern unsigned aarch64_architecture_version;
> /* Memory Tagging instructions optional to Armv8.5 enabled through +memtag. */
> #define TARGET_MEMTAG (AARCH64_ISA_V8_5 && AARCH64_ISA_MEMTAG)
>
> +/* I8MM instructions are enabled through +i8mm. */
> +#define TARGET_I8MM (TARGET_SIMD && AARCH64_ISA_I8MM)
This should then just be AARCH64_ISA_I8MM (i.e. no need to test
TARGET_SIMD).
> +
> +/* BF16 instructions are enabled through +bf16. */
> +#define TARGET_BF16_FP (AARCH64_ISA_BF16 && TARGET_FLOAT)
Similarly here we don't need a test for TARGET_FLOAT.
> +#define TARGET_BF16_SIMD (AARCH64_ISA_BF16 && TARGET_SIMD)
> +
> /* Make sure this is always defined so we don't have to check for ifdefs
> but rather use normal ifs. */
> #ifndef TARGET_FIX_ERR_A53_835769_DEFAULT
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 403be47d893..30647e52c07 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -16045,7 +16045,10 @@ more feature modifiers. This option has the form
>
> The permissible values for @var{arch} are @samp{armv8-a},
> @samp{armv8.1-a}, @samp{armv8.2-a}, @samp{armv8.3-a}, @samp{armv8.4-a},
> -@samp{armv8.5-a} or @var{native}.
> +@samp{armv8.5-a}, @samp{armv8.6-a} or @var{native}.
> +
> +The value @samp{armv8.6-a} implies @samp{armv8.5-a} and enables compiler
> +support for the ARMv8.6-A architecture extensions.
This then goes on to:
-----------------------------------------------------------------
The value @samp{armv8.5-a} implies @samp{armv8.4-a} and enables compiler
support for the ARMv8.5-A architecture extensions.
The value @samp{armv8.4-a} implies @samp{armv8.3-a} and enables compiler
support for the ARMv8.4-A architecture extensions.
The value @samp{armv8.3-a} implies @samp{armv8.2-a} and enables compiler
support for the ARMv8.3-A architecture extensions.
The value @samp{armv8.2-a} implies @samp{armv8.1-a} and enables compiler
support for the ARMv8.2-A architecture extensions.
The value @samp{armv8.1-a} implies @samp{armv8-a} and enables compiler
support for the ARMv8.1-A architecture extension. In particular, it
enables the @samp{+crc}, @samp{+lse}, and @samp{+rdma} features.
-----------------------------------------------------------------
I don't think we'd have written it like this if we'd been developing
and submitting Armv8-A to Armv8.6-A (or Armv8.5-A) all in one go.
It's just kind of grown by copying what Armv8.(x-1)-A did.
How about replacing "The permissible values..." onwards with something like:
-----------------------------------------------------------------
The table below summarizes the permissable values for @var{arch}
and the features that they enable by default:
@multitable @columnfractions .25 .25 .50
@headitem @var{arch} value @tab Architecture @tab Includes by default
...
@item @samp{armv8.6-a} @tab Armv8.6
@tab @samp{armv8.5-a}, @samp{+bf16}, @samp{+i8mm}
@end multitable
-----------------------------------------------------------------
(Completely untested.)
Or we could put "armv8.5-a" and "bf16, i8mm" (with or without "+"s)
into separate columns, perhaps "Extends" and "Includes by default".
(Any of those options would be fine with me FWIW.)
> @@ -16276,6 +16279,7 @@ generation. This option is enabled by default for @option{-march=armv8.5-a}.
> Enable the Armv8-a Execution and Data Prediction Restriction instructions.
> This option is only to enable the extension at the assembler level and does
> not affect code generation. This option is enabled by default for
> +@option{-march=armv8.5-a}.
> @item sve2
> Enable the Armv8-a Scalable Vector Extension 2. This also enables SVE
> instructions.
> @@ -16287,9 +16291,18 @@ Enable SVE2 sm4 instructions. This also enables SVE2 instructions.
> Enable SVE2 aes instructions. This also enables SVE2 instructions.
> @item sve2-sha3
> Enable SVE2 sha3 instructions. This also enables SVE2 instructions.
> -@option{-march=armv8.5-a}.
> @item tme
> Enable the Transactional Memory Extension.
> +@item i8mm
> +Enable 8-bit Integer Matrix Multiply instructions. This also enables
> +Advanced SIMD instructions. This option is enabled by default for
After the above: "Advanced SIMD and floating-point"
> +@option{-march=armv8.6-a}. Use of this option with architectures prior to
> +Armv8.2-A is not supported.
> +@item bf16
> +Enable brain half-precision floating-point instructions. This also enables
> +Advanced SIMD and floating-point instructions. This option is enabled by
> +default for @option{-march=armv8.6-a}. Use of this option with architectures
> +prior to Armv8.2-A is not supported.
>
> @end table
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_2.c b/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_2.c
> index 608b89d19ce..2983e271114 100644
> --- a/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_2.c
> @@ -13,6 +13,59 @@
> #error "__ARM_FEATURE_TME is defined but should not be!"
> #endif
>
> +#pragma GCC push_options
> +#pragma GCC target ("arch=armv8.6-a")
> +#ifndef __ARM_FEATURE_MATMUL_INT8
> +#error "__ARM_FEATURE_MATMUL_INT8 is not defined but should be!"
> +#endif
> +#pragma GCC pop_options
> +
> +#pragma GCC push_options
> +#pragma GCC target ("arch=armv8.2-a+i8mm")
> +#ifndef __ARM_FEATURE_MATMUL_INT8
> +#error "__ARM_FEATURE_MATMUL_INT8 is not defined but should be!"
> +#endif
> +#pragma GCC pop_options
> +
> +#ifdef __ARM_FEATURE_MATMUL_INT8
> +#error "__ARM_FEATURE_MATMUL_INT8 is defined but should not be!"
> +#endif
Not your bug, but we should start the file with:
#pragma GCC target ("arch=armv8-a")
otherwise anyone running the testsuite with -march=armv8.6-a will
get failures here.
> +
> +#pragma GCC push_options
> +#pragma GCC target ("arch=armv8.6-a")
> +#ifndef __ARM_FEATURE_BF16_VECTOR_ARITHMETIC
> +#error "__ARM_FEATURE_BF16_VECTOR_ARITHMETIC is not defined but should be!"
> +#endif
> +#ifndef __ARM_FEATURE_BF16_SCALAR_ARITHMETIC
> +#error "__ARM_FEATURE_BF16_SCALAR_ARITHMETIC is not defined but should be!"
> +#endif
> +#pragma GCC pop_options
> +
> +#pragma GCC push_options
> +#pragma GCC target ("arch=armv8.2-a+bf16")
> +#ifndef __ARM_FEATURE_BF16_SCALAR_ARITHMETIC
> +#error "__ARM_FEATURE_BF16_SCALAR_ARITHMETIC is not defined but should be!"
> +#endif
> +#ifndef __ARM_FEATURE_BF16_VECTOR_ARITHMETIC
> +#error "__ARM_FEATURE_BF16_VECTOR_ARITHMETIC is not defined but should be!"
> +#endif
> +#pragma GCC pop_options
> +
> +#pragma GCC push_options
> +#pragma GCC target ("arch=armv8.2-a+bf16+nosimd")
> +#ifdef __ARM_FEATURE_BF16_VECTOR_ARITHMETIC
> +#error "__ARM_FEATURE_BF16_VECTOR_ARITHMETIC is defined but should not be!"
> +#endif
> +#pragma GCC pop_options
> +
For completeness, we might as well test __ARM_FEATURE_BF16_SCALAR_ARITHMETIC
is defined.
> +#ifdef __ARM_FEATURE_BF16_SCALAR_ARITHMETIC
> +#error "__ARM_FEATURE_BF16_SCALAR_ARITHMETIC is defined but should not be!"
> +#endif
> +
> +#ifdef __ARM_FEATURE_BF16_VECTOR_ARITHMETIC
> +#error "__ARM_FEATURE_BF16_VECTOR_ARITHMETIC is defined but should not be!"
> +#endif
Very, very minor, but since the others have no blank line between the
two tests, I think it'd be more consistent not to have one here either.
Thanks,
Richard