This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH i386 3/8] [AVX512] [19/n] Add AVX-512 patterns: Extracts and converts.
- From: Richard Henderson <rth at redhat dot com>
- To: Kirill Yukhin <kirill dot yukhin at gmail dot com>
- Cc: Uros Bizjak <ubizjak at gmail dot com>, Vladimir Makarov <vmakarov at redhat dot com>, Jakub Jelinek <jakub at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 15 Oct 2013 08:46:47 -0700
- Subject: Re: [PATCH i386 3/8] [AVX512] [19/n] Add AVX-512 patterns: Extracts and converts.
- Authentication-results: sourceware.org; auth=none
- References: <20130808112524 dot GA40277 at msticlxl57 dot ims dot intel dot com> <20130814072638 dot GD52726 at msticlxl57 dot ims dot intel dot com> <52129604 dot 6040305 at redhat dot com> <20131009103129 dot GR52466 at msticlxl57 dot ims dot intel dot com>
On 10/09/2013 03:31 AM, Kirill Yukhin wrote:
> + rtx op1 = operands[1];
> + if (REG_P (op1))
> + op1 = gen_rtx_REG (V16HImode, REGNO (op1));
> + else
> + op1 = gen_lowpart (V16HImode, op1);
The IF case is incorrect. You need to use gen_lowpart always.
> +(define_insn "*avx512f_unpcklpd512"
> + [(set (match_operand:V8DF 0 "register_operand" "=v,v")
> + (vec_select:V8DF
> + (vec_concat:V16DF
> + (match_operand:V8DF 1 "nonimmediate_operand" "v,vm")
> + (match_operand:V8DF 2 "nonimmediate_operand" "vm,1"))
> + (parallel [(const_int 0) (const_int 8)
> + (const_int 2) (const_int 10)
> + (const_int 4) (const_int 12)
> + (const_int 6) (const_int 14)])))]
> + "TARGET_AVX512F"
> + "@
> + vunpcklpd\t{%2, %1, %0|%0, %1, %2}
> + vmovddup\t{%1, %0|%0, %1}"
> + [(set_attr "type" "sselog")
> + (set_attr "prefix" "evex")
> + (set_attr "mode" "V8DF")])
The second alternative only matches for v/m/1. While I imagine that it doesn't
really matter, it might be better to swap the two so that vmovddup gets used
for the v/v/1 case too.
Why do you have separate define_expand and define_insn for this pattern?
> +(define_insn "*avx512f_<code>v8div16qi2_store"
> + [(set (match_operand:V16QI 0 "memory_operand" "=m")
> + (vec_concat:V16QI
> + (any_truncate:V8QI
> + (match_operand:V8DI 1 "register_operand" "v"))
> + (vec_select:V8QI
> + (match_dup 0)
> + (parallel [(const_int 8) (const_int 9)
> + (const_int 10) (const_int 11)
> + (const_int 12) (const_int 13)
> + (const_int 14) (const_int 15)]))))]
> + "TARGET_AVX512F"
> + "vpmov<trunsuffix>qb\t{%1, %0|%0, %1}"
> + [(set_attr "type" "ssemov")
> + (set_attr "memory" "store")
> + (set_attr "prefix" "evex")
> + (set_attr "mode" "TI")])
Any pattern that uses a match_dup of an input like this must use "+" not "=" to
signal that the operand is in_out not output.
And is this really the best description for this insn? It's a store to memory.
Surely it's better to say that we store V8QI.
> +(define_expand "vec_widen_umult_even_v16si"
> + [(set (match_operand:V8DI 0 "register_operand")
> + (mult:V8DI
> + (zero_extend:V8DI
> + (vec_select:V8SI
> + (match_operand:V16SI 1 "nonimmediate_operand")
> + (parallel [(const_int 0) (const_int 2)
> + (const_int 4) (const_int 6)
> + (const_int 8) (const_int 10)
> + (const_int 12) (const_int 14)])))
> + (zero_extend:V8DI
> + (vec_select:V8SI
> + (match_operand:V16SI 2 "nonimmediate_operand")
> + (parallel [(const_int 0) (const_int 2)
> + (const_int 4) (const_int 6)
> + (const_int 8) (const_int 10)
> + (const_int 12) (const_int 14)])))))]
> + "TARGET_AVX512F"
> + "ix86_fixup_binary_operands_no_copy (MULT, V16SImode, operands);")
We don't need to use ix86_fixup_binary_operands for any three-operand insn.
That function is in order to help SSE's two-operand insns. You can drop the
define-expand and just keep the define_insn.
> +(define_insn "*vec_widen_umult_even_v16si"
> + [(set (match_operand:V8DI 0 "register_operand" "=v")
> + (mult:V8DI
> + (zero_extend:V8DI
> + (vec_select:V8SI
> + (match_operand:V16SI 1 "nonimmediate_operand" "v")
> + (parallel [(const_int 0) (const_int 2)
> + (const_int 4) (const_int 6)
> + (const_int 8) (const_int 10)
> + (const_int 12) (const_int 14)])))
> + (zero_extend:V8DI
> + (vec_select:V8SI
> + (match_operand:V16SI 2 "nonimmediate_operand" "vm")
> + (parallel [(const_int 0) (const_int 2)
> + (const_int 4) (const_int 6)
> + (const_int 8) (const_int 10)
> + (const_int 12) (const_int 14)])))))]
> + "TARGET_AVX512F && ix86_binary_operator_ok (MULT, V16SImode, operands)"
You want a "%v" for operand 1, to make the operands commutative.
> +(define_insn "*vec_widen_smult_even_v16si"
> + [(set (match_operand:V8DI 0 "register_operand" "=x")
> + (mult:V8DI
Similarly, plus errant "x".
r~