This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] [aarch64] Introduce flags for SVE2.
Matthew Malcomson <Matthew.Malcomson@arm.com> writes:
> @@ -326,16 +326,22 @@ int opt_ext_cmp (const void* a, const void* b)
> turns on as a dependency. As an example +dotprod turns on FL_DOTPROD and
> FL_SIMD. As such the set of bits represented by this option is
> {FL_DOTPROD, FL_SIMD}. */
> - unsigned long total_flags_a = opt_a->flag_canonical & opt_a->flags_on;
> - unsigned long total_flags_b = opt_b->flag_canonical & opt_b->flags_on;
> + uint64_t total_flags_a = opt_a->flag_canonical & opt_a->flags_on;
> + uint64_t total_flags_b = opt_b->flag_canonical & opt_b->flags_on;
> int popcnt_a = popcount_hwi ((HOST_WIDE_INT)total_flags_a);
> int popcnt_b = popcount_hwi ((HOST_WIDE_INT)total_flags_b);
> int order = popcnt_b - popcnt_a;
>
> /* If they have the same amount of bits set, give it a more
> - deterministic ordering by using the value of the bits themselves. */
> + deterministic ordering by using the value of the bits themselves.
> + Since the value of the bits themselves can be larger than that
> + representable by an integer, we manually truncate the value. */
Comment no longer applies, can just keep the original.
> if (order == 0)
> - return total_flags_b - total_flags_a;
> + {
> + if (total_flags_a == total_flags_b)
> + return 0;
> + return total_flags_a < total_flags_b ? 1 : -1;
> + }
>
> return order;
Maybe a more obvious sequence would be:
if (order != 0)
return order;
if (total_flags_a != total_flags_b)
return total_flags_a < total_flags_b ? 1 : -1;
return 0;
I think that's what most of GCC's sort functions do.
> diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
> index 53dcd03590d2e4eebac83f03039c442fca7f5d5d..c023bca5c511c132575529b1d154f4a798ce10f9 100644
> --- a/gcc/config/aarch64/aarch64-option-extensions.def
> +++ b/gcc/config/aarch64/aarch64-option-extensions.def
> @@ -57,17 +57,20 @@
>
> /* Enabling "fp" just enables "fp".
> Disabling "fp" also disables "simd", "crypto", "fp16", "aes", "sha2",
> - "sha3", sm3/sm4 and "sve". */
> -AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2 | AARCH64_FL_SHA3 | AARCH64_FL_SM4 | AARCH64_FL_SVE, false, "fp")
> + "sha3", sm3/sm4, "sve", "sve2-sm4", "sve2-sha3", "sve2-aes", "bitperm" and
> + "sve2". */
> +AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | AARCH64_FL_SHA2 | AARCH64_FL_SHA3 | AARCH64_FL_SM4 | AARCH64_FL_SVE | AARCH64_FL_SVE2 | AARCH64_FL_SVE2_AES | AARCH64_FL_SVE2_SHA3 | AARCH64_FL_SVE2_SM4 | AARCH64_FL_SVE2_BITPERM, false, "fp")
Very minor (and I should have noticed last time, sorry), but it would
be good to keep the order in the code and the comment consistent.
I think the order in the code is more obvious, with "sve2" listed before
the optional extensions to "sve2".
Some for the other lists.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 5e3e8873d355b443a01fceb5c3df4cb98dd26d60..6ff548a2e902e876d8e2c6f56c75381941307f32 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -16008,6 +16008,21 @@ generation. This option is enabled by default for @option{-march=armv8.5-a}.
> Enable the Armv8-a Execution and Data Prediction Restriction instructions.
> This option is only to enable the extension at the assembler level and does
> not affect code generation. This option is enabled by default for
> +@item sve2
> + Enable the Armv8-a Scalable Vector Extension 2. This option is only to enable
> + the extension at the assembler level and does not affect code generation.
> +@item bitperm
> + Enable SVE2 bitperm Extension. This option is only to enable the extension at
> + the assembler level and does not affect code generation.
> +@item sve2-sm4
> + Enable SVE2 sm4 Extension. This option is only to enable the extension at the
> + assembler level and does not affect code generation.
> +@item sve2-aes
> + Enable SVE2 aes Extension. This option is only to enable the extension at the
> + assembler level and does not affect code generation.
> +@item sve2-sha3
> + Enable SVE2 sha3 Extension. This option is only to enable the extension at the
> + assembler level and does not affect code generation.
> @option{-march=armv8.5-a}.
Should be no indentation, and two spaces after ".".
I think we should leave out the bit about it only being for the assembler,
since hopefully by GCC 10 comes out that will no longer be true. ;-)
I think the sve2 entry should say:
This also enables SVE instructions.
and the the others should say:
This also enables SVE2 instructions.
Thanks,
Richard