Add IFN_COND_{MUL,DIV,MOD,RDIV}
Richard Biener
richard.guenther@gmail.com
Thu May 24 10:30:00 GMT 2018
On Thu, May 24, 2018 at 11:34 AM Richard Sandiford <richard.sandiford@linaro.org> wrote:
> This patch adds support for conditional multiplication and division.
> It's mostly mechanical, but a few notes:
> * The *_optab name and the .md names are the same as the unconditional
> forms, just with "cond_" added to the front. This means we still
> have the awkward difference between sdiv and div, etc.
> * It was easier to retain the difference between integer and FP
> division in the function names, given that they map to different
> tree codes (TRUNC_DIV_EXPR and RDIV_EXPR).
> * SVE has no direct support for IFN_COND_MOD, but it seemed more
> consistent to add it anyway.
> * Adding IFN_COND_MUL enables an extra fully-masked reduction
> in gcc.dg/vect/pr53773.c.
> * In practice we don't actually use the integer division forms without
> if-conversion support (added by a later patch).
> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
> and x86_64-linux-gnu. OK for the non-AArch64 bits?
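For readers unfamiliar with the conditional forms, the following is a minimal scalar sketch in C of what a conditional multiply and a conditional integer division compute per element. The function names, the int-array mask and the explicit fallback array are assumptions made for illustration; they are not GCC interfaces and this is not how the internal functions are implemented.

```c
#include <stddef.h>

/* Hypothetical scalar model of a conditional multiply on doubles:
   active lanes (mask[i] != 0) receive a[i] * b[i], inactive lanes
   receive fallback[i].  */
static void
cond_mul_double (size_t n, const int *mask, const double *a,
                 const double *b, const double *fallback, double *res)
{
  for (size_t i = 0; i < n; ++i)
    res[i] = mask[i] ? a[i] * b[i] : fallback[i];
}

/* Hypothetical scalar model of a conditional truncating integer
   division.  The division is only evaluated for active lanes, so a
   zero divisor in an inactive lane is never divided by.  */
static void
cond_div_int (size_t n, const int *mask, const int *a,
              const int *b, const int *fallback, int *res)
{
  for (size_t i = 0; i < n; ++i)
    res[i] = mask[i] ? a[i] / b[i] : fallback[i];
}
```

Calling cond_div_int with a mask of {1, 0, 1, 0}, dividends {10, 20, 30, 40}, divisors {2, 0, 3, 0} and the dividends as the fallback leaves the inactive lanes untouched and never evaluates the divisions by zero. That property is presumably part of why the integer division forms only become useful together with if-conversion support.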
OK.
Richard.
> Richard
> 2018-05-24 Richard Sandiford <richard.sandiford@linaro.org>
> gcc/
> * doc/sourcebuild.texi (vect_double_cond_arith): Include
> multiplication and division.
> * doc/md.texi (cond_mul@var{m}, cond_div@var{m}, cond_mod@var{m})
> (cond_udiv@var{m}, cond_umod@var{m}): Document.
> * optabs.def (cond_smul_optab, cond_sdiv_optab, cond_smod_optab)
> (cond_udiv_optab, cond_umod_optab): New optabs.
> * internal-fn.def (IFN_COND_MUL, IFN_COND_DIV, IFN_COND_MOD)
> (IFN_COND_RDIV): New internal functions.
> * internal-fn.c (get_conditional_internal_fn): Handle TRUNC_DIV_EXPR,
> TRUNC_MOD_EXPR and RDIV_EXPR.
> * genmatch.c (commutative_op): Handle CFN_COND_MUL.
> * match.pd (UNCOND_BINARY, COND_BINARY): Handle them.
> * config/aarch64/iterators.md (UNSPEC_COND_MUL, UNSPEC_COND_DIV):
> New unspecs.
> (SVE_INT_BINARY): Include mult.
> (SVE_COND_FP_BINARY): Include UNSPEC_COND_MUL and UNSPEC_COND_DIV.
> (optab, sve_int_op): Handle mult.
> (optab, sve_fp_op, commutative): Handle UNSPEC_COND_MUL and
> UNSPEC_COND_DIV.
> * config/aarch64/aarch64-sve.md (cond_<optab><mode>): New pattern
> for SVE_INT_BINARY_SD.
> gcc/testsuite/
> * lib/target-supports.exp
> (check_effective_target_vect_double_cond_arith): Include
> multiplication and division.
> * gcc.dg/vect/pr53773.c: Do not expect a scalar tail when using
> fully-masked loops with a fixed vector length.
> * gcc.dg/vect/vect-cond-arith-1.c: Add multiplication and division
> tests.
> * gcc.target/aarch64/sve/vcond_8.c: Likewise.
> * gcc.target/aarch64/sve/vcond_9.c: Likewise.
> * gcc.target/aarch64/sve/vcond_12.c: Add multiplication tests.
> Index: gcc/doc/sourcebuild.texi
> ===================================================================
> --- gcc/doc/sourcebuild.texi 2018-05-24 09:54:37.508451387 +0100
> +++ gcc/doc/sourcebuild.texi 2018-05-24 10:12:10.145352193 +0100
> @@ -1426,8 +1426,9 @@ have different type from the value opera
> Target supports hardware vectors of @code{double}.
> @item vect_double_cond_arith
> -Target supports conditional addition, subtraction, minimum and maximum
> -on vectors of @code{double}, via the @code{cond_} optabs.
> +Target supports conditional addition, subtraction, multiplication,
> +division, minimum and maximum on vectors of @code{double}, via the
> +@code{cond_} optabs.
> @item vect_element_align_preferred
> The target's preferred vector alignment is the same as the element
> Index: gcc/doc/md.texi
> ===================================================================
> --- gcc/doc/md.texi 2018-05-24 09:32:10.522816506 +0100
> +++ gcc/doc/md.texi 2018-05-24 10:12:10.142352315 +0100
> @@ -6333,6 +6333,11 @@ operand 0, otherwise (operand 2 + operan
> @cindex @code{cond_add@var{mode}} instruction pattern
> @cindex @code{cond_sub@var{mode}} instruction pattern
> +@cindex @code{cond_mul@var{mode}} instruction pattern
> +@cindex @code{cond_div@var{mode}} instruction pattern
> +@cindex @code{cond_udiv@var{mode}} instruction pattern
> +@cindex @code{cond_mod@var{mode}} instruction pattern
> +@cindex @code{cond_umod@var{mode}} instruction pattern
> @cindex @code{cond_and@var{mode}} instruction pattern
> @cindex @code{cond_ior@var{mode}} instruction pattern
> @cindex @code{cond_xor@var{mode}} instruction pattern
> @@ -6342,6 +6347,11 @@ operand 0, otherwise (operand 2 + operan
> @cindex @code{cond_umax@var{mode}} instruction pattern
> @item @samp{cond_add@var{mode}}
> @itemx @samp{cond_sub@var{mode}}
> +@itemx @samp{cond_mul@var{mode}}
> +@itemx @samp{cond_div@var{mode}}
> +@itemx @samp{cond_udiv@var{mode}}
> +@itemx @samp{cond_mod@var{mode}}
> +@itemx @samp{cond_umod@var{mode}}
> @itemx @samp{cond_and@var{mode}}
> @itemx @samp{cond_ior@var{mode}}
> @itemx @samp{cond_xor@var{mode}}
> Index: gcc/optabs.def
> ===================================================================
> --- gcc/optabs.def 2018-05-16 12:48:59.194282896 +0100
> +++ gcc/optabs.def 2018-05-24 10:12:10.146352152 +0100
> @@ -222,6 +222,11 @@ OPTAB_D (notcc_optab, "not$acc")
> OPTAB_D (movcc_optab, "mov$acc")
> OPTAB_D (cond_add_optab, "cond_add$a")
> OPTAB_D (cond_sub_optab, "cond_sub$a")
> +OPTAB_D (cond_smul_optab, "cond_mul$a")
> +OPTAB_D (cond_sdiv_optab, "cond_div$a")
> +OPTAB_D (cond_smod_optab, "cond_mod$a")
> +OPTAB_D (cond_udiv_optab, "cond_udiv$a")
> +OPTAB_D (cond_umod_optab, "cond_umod$a")
> OPTAB_D (cond_and_optab, "cond_and$a")
> OPTAB_D (cond_ior_optab, "cond_ior$a")
> OPTAB_D (cond_xor_optab, "cond_xor$a")
> Index: gcc/internal-fn.def
> ===================================================================
> --- gcc/internal-fn.def 2018-05-24 09:32:10.522816506 +0100
> +++ gcc/internal-fn.def 2018-05-24 10:12:10.146352152 +0100
> @@ -145,6 +145,12 @@ DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST,
> DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary)
> DEF_INTERNAL_OPTAB_FN (COND_SUB, ECF_CONST, cond_sub, cond_binary)
> +DEF_INTERNAL_OPTAB_FN (COND_MUL, ECF_CONST, cond_smul, cond_binary)
> +DEF_INTERNAL_SIGNED_OPTAB_FN (COND_DIV, ECF_CONST, first,
> + cond_sdiv, cond_udiv, cond_binary)
> +DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MOD, ECF_CONST, first,
> + cond_smod, cond_umod, cond_binary)
> +DEF_INTERNAL_OPTAB_FN (COND_RDIV, ECF_CONST, cond_sdiv, cond_binary)
> DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MIN, ECF_CONST, first,
> cond_smin, cond_umin, cond_binary)
> DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MAX, ECF_CONST, first,
> Index: gcc/internal-fn.c
> ===================================================================
> --- gcc/internal-fn.c 2018-05-24 09:32:10.522816506 +0100
> +++ gcc/internal-fn.c 2018-05-24 10:12:10.146352152 +0100
> @@ -3246,6 +3246,12 @@ get_conditional_internal_fn (tree_code c
> return IFN_COND_MIN;
> case MAX_EXPR:
> return IFN_COND_MAX;
> + case TRUNC_DIV_EXPR:
> + return IFN_COND_DIV;
> + case TRUNC_MOD_EXPR:
> + return IFN_COND_MOD;
> + case RDIV_EXPR:
> + return IFN_COND_RDIV;
> case BIT_AND_EXPR:
> return IFN_COND_AND;
> case BIT_IOR_EXPR:
> Index: gcc/genmatch.c
> ===================================================================
> --- gcc/genmatch.c 2018-05-24 09:54:37.508451387 +0100
> +++ gcc/genmatch.c 2018-05-24 10:12:10.145352193 +0100
> @@ -487,6 +487,7 @@ commutative_op (id_base *id)
> case CFN_COND_ADD:
> case CFN_COND_SUB:
> + case CFN_COND_MUL:
> case CFN_COND_MAX:
> case CFN_COND_MIN:
> case CFN_COND_AND:
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd 2018-05-24 09:54:37.509451356 +0100
> +++ gcc/match.pd 2018-05-24 10:12:10.146352152 +0100
> @@ -78,10 +78,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> /* Binary operations and their associated IFN_COND_* function. */
> (define_operator_list UNCOND_BINARY
> plus minus
> + mult trunc_div trunc_mod rdiv
> min max
> bit_and bit_ior bit_xor)
> (define_operator_list COND_BINARY
> IFN_COND_ADD IFN_COND_SUB
> + IFN_COND_MUL IFN_COND_DIV IFN_COND_MOD IFN_COND_RDIV
> IFN_COND_MIN IFN_COND_MAX
> IFN_COND_AND IFN_COND_IOR IFN_COND_XOR)
> Index: gcc/config/aarch64/iterators.md
> ===================================================================
> --- gcc/config/aarch64/iterators.md 2018-05-24 09:54:37.508451387 +0100
> +++ gcc/config/aarch64/iterators.md 2018-05-24 10:12:10.142352315 +0100
> @@ -464,6 +464,8 @@ (define_c_enum "unspec"
> UNSPEC_UMUL_HIGHPART ; Used in aarch64-sve.md.
> UNSPEC_COND_ADD ; Used in aarch64-sve.md.
> UNSPEC_COND_SUB ; Used in aarch64-sve.md.
> + UNSPEC_COND_MUL ; Used in aarch64-sve.md.
> + UNSPEC_COND_DIV ; Used in aarch64-sve.md.
> UNSPEC_COND_MAX ; Used in aarch64-sve.md.
> UNSPEC_COND_MIN ; Used in aarch64-sve.md.
> UNSPEC_COND_LT ; Used in aarch64-sve.md.
> @@ -1202,7 +1204,7 @@ (define_code_iterator SVE_INT_UNARY [neg
> ;; SVE floating-point unary operations.
> (define_code_iterator SVE_FP_UNARY [neg abs sqrt])
> -(define_code_iterator SVE_INT_BINARY [plus minus smax umax smin umin
> +(define_code_iterator SVE_INT_BINARY [plus minus mult smax umax smin umin
> and ior xor])
> (define_code_iterator SVE_INT_BINARY_REV [minus])
> @@ -1239,6 +1241,7 @@ (define_code_attr optab [(ashift "ashl")
> (neg "neg")
> (plus "add")
> (minus "sub")
> + (mult "mul")
> (div "div")
> (udiv "udiv")
> (ss_plus "qadd")
> @@ -1382,6 +1385,7 @@ (define_mode_attr lconst_atomic [(QI "K"
> ;; The integer SVE instruction that implements an rtx code.
> (define_code_attr sve_int_op [(plus "add")
> (minus "sub")
> + (mult "mul")
> (div "sdiv")
> (udiv "udiv")
> (neg "neg")
> @@ -1540,9 +1544,10 @@ (define_int_iterator UNPACK_UNSIGNED [UN
> (define_int_iterator MUL_HIGHPART [UNSPEC_SMUL_HIGHPART UNSPEC_UMUL_HIGHPART])
> (define_int_iterator SVE_COND_FP_BINARY [UNSPEC_COND_ADD UNSPEC_COND_SUB
> + UNSPEC_COND_MUL UNSPEC_COND_DIV
> UNSPEC_COND_MAX UNSPEC_COND_MIN])
> -(define_int_iterator SVE_COND_FP_BINARY_REV [UNSPEC_COND_SUB])
> +(define_int_iterator SVE_COND_FP_BINARY_REV [UNSPEC_COND_SUB UNSPEC_COND_DIV])
> (define_int_iterator SVE_COND_FP_CMP [UNSPEC_COND_LT UNSPEC_COND_LE
> UNSPEC_COND_EQ UNSPEC_COND_NE
> @@ -1573,6 +1578,8 @@ (define_int_attr optab [(UNSPEC_ANDF "an
> (UNSPEC_XORV "xor")
> (UNSPEC_COND_ADD "add")
> (UNSPEC_COND_SUB "sub")
> + (UNSPEC_COND_MUL "mul")
> + (UNSPEC_COND_DIV "div")
> (UNSPEC_COND_MAX "smax")
> (UNSPEC_COND_MIN "smin")])
> @@ -1787,10 +1794,14 @@ (define_int_attr cmp_op [(UNSPEC_COND_LT
> (define_int_attr sve_fp_op [(UNSPEC_COND_ADD "fadd")
> (UNSPEC_COND_SUB "fsub")
> + (UNSPEC_COND_MUL "fmul")
> + (UNSPEC_COND_DIV "fdiv")
> (UNSPEC_COND_MAX "fmaxnm")
> (UNSPEC_COND_MIN "fminnm")])
> (define_int_attr commutative [(UNSPEC_COND_ADD "true")
> (UNSPEC_COND_SUB "false")
> + (UNSPEC_COND_MUL "true")
> + (UNSPEC_COND_DIV "false")
> (UNSPEC_COND_MIN "true")
> (UNSPEC_COND_MAX "true")])
> Index: gcc/config/aarch64/aarch64-sve.md
> ===================================================================
> --- gcc/config/aarch64/aarch64-sve.md 2018-05-24 09:54:37.506451449 +0100
> +++ gcc/config/aarch64/aarch64-sve.md 2018-05-24 10:12:10.141352356 +0100
> @@ -1803,6 +1803,21 @@ (define_expand "cond_<optab><mode>"
> aarch64_sve_prepare_conditional_op (operands, 5, commutative_p);
> })
> +(define_expand "cond_<optab><mode>"
> + [(set (match_operand:SVE_SDI 0 "register_operand")
> + (unspec:SVE_SDI
> + [(match_operand:<VPRED> 1 "register_operand")
> + (SVE_INT_BINARY_SD:SVE_SDI
> + (match_operand:SVE_SDI 2 "register_operand")
> + (match_operand:SVE_SDI 3 "register_operand"))
> + (match_operand:SVE_SDI 4 "register_operand")]
> + UNSPEC_SEL))]
> + "TARGET_SVE"
> +{
> + bool commutative_p = (GET_RTX_CLASS (<CODE>) == RTX_COMM_ARITH);
> + aarch64_sve_prepare_conditional_op (operands, 5, commutative_p);
> +})
> +
> ;; Predicated integer operations.
> (define_insn "*cond_<optab><mode>"
> [(set (match_operand:SVE_I 0 "register_operand" "=w")
> @@ -1817,6 +1832,19 @@ (define_insn "*cond_<optab><mode>"
> "<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
> )
> +(define_insn "*cond_<optab><mode>"
> + [(set (match_operand:SVE_SDI 0 "register_operand" "=w")
> + (unspec:SVE_SDI
> + [(match_operand:<VPRED> 1 "register_operand" "Upl")
> + (SVE_INT_BINARY_SD:SVE_SDI
> + (match_operand:SVE_SDI 2 "register_operand" "0")
> + (match_operand:SVE_SDI 3 "register_operand" "w"))
> + (match_dup 2)]
> + UNSPEC_SEL))]
> + "TARGET_SVE"
> + "<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
> +)
> +
> ;; Predicated integer operations with the operands reversed.
> (define_insn "*cond_<optab><mode>"
> [(set (match_operand:SVE_I 0 "register_operand" "=w")
> @@ -1828,6 +1856,19 @@ (define_insn "*cond_<optab><mode>"
> (match_dup 3)]
> UNSPEC_SEL))]
> "TARGET_SVE"
> + "<sve_int_op>r\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
> +)
> +
> +(define_insn "*cond_<optab><mode>"
> + [(set (match_operand:SVE_SDI 0 "register_operand" "=w")
> + (unspec:SVE_SDI
> + [(match_operand:<VPRED> 1 "register_operand" "Upl")
> + (SVE_INT_BINARY_SD:SVE_SDI
> + (match_operand:SVE_SDI 2 "register_operand" "w")
> + (match_operand:SVE_SDI 3 "register_operand" "0"))
> + (match_dup 3)]
> + UNSPEC_SEL))]
> + "TARGET_SVE"
> "<sve_int_op>r\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
> )
> Index: gcc/testsuite/lib/target-supports.exp
> ===================================================================
> --- gcc/testsuite/lib/target-supports.exp 2018-05-24 09:54:37.511451293 +0100
> +++ gcc/testsuite/lib/target-supports.exp 2018-05-24 10:12:10.148352070 +0100
> @@ -5590,8 +5590,9 @@ proc check_effective_target_vect_double
> return $et_vect_double_saved($et_index)
> }
> -# Return 1 if the target supports conditional addition, subtraction, minimum
> -# and maximum on vectors of double, via the cond_ optabs. Return 0 otherwise.
> +# Return 1 if the target supports conditional addition, subtraction,
> +# multiplication, division, minimum and maximum on vectors of double,
> +# via the cond_ optabs. Return 0 otherwise.
> proc check_effective_target_vect_double_cond_arith { } {
> return [check_effective_target_aarch64_sve]
> Index: gcc/testsuite/gcc.dg/vect/pr53773.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/pr53773.c 2018-05-16 12:48:59.115202362 +0100
> +++ gcc/testsuite/gcc.dg/vect/pr53773.c 2018-05-24 10:12:10.147352111 +0100
> @@ -14,5 +14,8 @@ foo (int integral, int decimal, int powe
> return integral+decimal;
> }
> -/* { dg-final { scan-tree-dump-times "\\* 10" 2 "optimized" } } */
> +/* We can avoid a scalar tail when using fully-masked loops with a fixed
> + vector length. */
> +/* { dg-final { scan-tree-dump-times "\\* 10" 2 "optimized" { target { { ! vect_fully_masked } || vect_variable_length } } } } */
> +/* { dg-final { scan-tree-dump-times "\\* 10" 0 "optimized" { target { vect_fully_masked && { ! vect_variable_length } } } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c 2018-05-24 09:54:37.509451356 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c 2018-05-24 10:12:10.147352111 +0100
> @@ -6,6 +6,8 @@ #define N (VECTOR_BITS * 11 / 64 + 3)
> #define add(A, B) ((A) + (B))
> #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
> #define DEF(OP) \
> void __attribute__ ((noipa)) \
> @@ -34,6 +36,8 @@ #define TEST(OP) \
> #define FOR_EACH_OP(T) \
> T (add) \
> T (sub) \
> + T (mul) \
> + T (div) \
> T (__builtin_fmax) \
> T (__builtin_fmin)
> @@ -54,5 +58,7 @@ main (void)
> /* { dg-final { scan-tree-dump { = \.COND_ADD} "optimized" { target vect_double_cond_arith } } } */
> /* { dg-final { scan-tree-dump { = \.COND_SUB} "optimized" { target vect_double_cond_arith } } } */
> +/* { dg-final { scan-tree-dump { = \.COND_MUL} "optimized" { target vect_double_cond_arith } } } */
> +/* { dg-final { scan-tree-dump { = \.COND_RDIV} "optimized" { target vect_double_cond_arith } } } */
> /* { dg-final { scan-tree-dump { = \.COND_MAX} "optimized" { target vect_double_cond_arith } } } */
> /* { dg-final { scan-tree-dump { = \.COND_MIN} "optimized" { target vect_double_cond_arith } } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c 2018-05-24 09:54:37.510451324 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c 2018-05-24 10:12:10.147352111 +0100
> @@ -5,6 +5,8 @@
> #define add(A, B) ((A) + (B))
> #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
> #define max(A, B) ((A) > (B) ? (A) : (B))
> #define min(A, B) ((A) < (B) ? (A) : (B))
> #define and(A, B) ((A) & (B))
> @@ -27,6 +29,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP) \
> #define FOR_EACH_INT_TYPE(T, TYPE) \
> T (TYPE, TYPE, add) \
> T (TYPE, TYPE, sub) \
> + T (TYPE, TYPE, mul) \
> T (TYPE, TYPE, max) \
> T (TYPE, TYPE, min) \
> T (TYPE, TYPE, and) \
> @@ -36,6 +39,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
> #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
> T (TYPE, CMPTYPE, add) \
> T (TYPE, CMPTYPE, sub) \
> + T (TYPE, CMPTYPE, mul) \
> + T (TYPE, CMPTYPE, div) \
> T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
> T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)
> @@ -67,6 +72,11 @@ FOR_EACH_LOOP (DEF_LOOP)
> /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +
> /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> @@ -110,6 +120,14 @@ FOR_EACH_LOOP (DEF_LOOP)
> /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c 2018-05-24 09:54:37.510451324 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c 2018-05-24 10:12:10.148352070 +0100
> @@ -5,6 +5,8 @@
> #define add(A, B) ((A) + (B))
> #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
> #define max(A, B) ((A) > (B) ? (A) : (B))
> #define min(A, B) ((A) < (B) ? (A) : (B))
> #define and(A, B) ((A) & (B))
> @@ -27,6 +29,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP) \
> #define FOR_EACH_INT_TYPE(T, TYPE) \
> T (TYPE, TYPE, add) \
> T (TYPE, TYPE, sub) \
> + T (TYPE, TYPE, mul) \
> T (TYPE, TYPE, max) \
> T (TYPE, TYPE, min) \
> T (TYPE, TYPE, and) \
> @@ -36,6 +39,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
> #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
> T (TYPE, CMPTYPE, add) \
> T (TYPE, CMPTYPE, sub) \
> + T (TYPE, CMPTYPE, mul) \
> + T (TYPE, CMPTYPE, div) \
> T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
> T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)
> @@ -67,6 +72,11 @@ FOR_EACH_LOOP (DEF_LOOP)
> /* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> /* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +
> /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> @@ -110,6 +120,14 @@ FOR_EACH_LOOP (DEF_LOOP)
> /* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c 2018-05-24 09:54:37.510451324 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c 2018-05-24 10:12:10.147352111 +0100
> @@ -5,6 +5,8 @@
> #define add(A, B) ((A) + (B))
> #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
> #define max(A, B) ((A) > (B) ? (A) : (B))
> #define min(A, B) ((A) < (B) ? (A) : (B))
> #define and(A, B) ((A) & (B))
> @@ -29,6 +31,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP) \
> #define FOR_EACH_INT_TYPE(T, TYPE) \
> T (TYPE, TYPE, add) \
> T (TYPE, TYPE, sub) \
> + T (TYPE, TYPE, mul) \
> T (TYPE, TYPE, max) \
> T (TYPE, TYPE, min) \
> T (TYPE, TYPE, and) \
> @@ -38,6 +41,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
> #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
> T (TYPE, CMPTYPE, add) \
> T (TYPE, CMPTYPE, sub) \
> + T (TYPE, CMPTYPE, mul) \
> + /* No div because that gets converted into a mul anyway. */ \
> T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
> T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)
> @@ -58,10 +63,10 @@ FOR_EACH_LOOP (DEF_LOOP)
> /* { dg-final { scan-assembler-not {\tmov\tz[0-9]+\.., z[0-9]+} } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b,} 14 } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h,} 18 } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s,} 18 } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d,} 18 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b,} 16 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h,} 21 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s,} 21 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d,} 21 } } */
> /* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
> /* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> @@ -73,6 +78,11 @@ FOR_EACH_LOOP (DEF_LOOP)
> /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +
> /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> @@ -116,6 +126,10 @@ FOR_EACH_LOOP (DEF_LOOP)
> /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */