This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Cong Hou <congh at google dot com>
- Cc: ramana dot gcc at googlemail dot com, Richard Biener <rguenther at suse dot de>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 1 Nov 2013 08:43:08 +0100
- Subject: Re: [PATCH] Introducing SAD (Sum of Absolute Differences) operation to GCC vectorizer.
- References: <CAFULd4Znkj7WxsP9kmng069XKXb2CWL3es4myY7tE-5JmykJFw at mail dot gmail dot com> <CAK=A3=33kpXdwCEhbPFfs4=mov0k6Z6J+O0HnBdE0fB41K7vvQ at mail dot gmail dot com>
On Fri, Nov 1, 2013 at 3:03 AM, Cong Hou <congh@google.com> wrote:
> According to your comments, I made the following modifications to this patch:
>
> 1. The SAD pattern no longer requires the first and second operands to
> be unsigned. Two versions (signed/unsigned) of the SAD optabs are
> defined: usad_optab and ssad_optab.
>
> 2. Use expand_simple_binop instead of gen_rtx_PLUS to generate the
> plus expression in sse.md. Also change the type of the second/third
> operands to be nonimmediate_operand.
>
> 3. Add the documentation for SAD_EXPR.
>
> 4. Verify the operands of SAD_EXPR.
>
> 5. Create a new target: vect_usad_char, and use it in the test case.
>
> The updated patch is pasted below.
> +(define_expand "usadv16qi"
> + [(match_operand:V4SI 0 "register_operand")
> + (match_operand:V16QI 1 "register_operand")
> + (match_operand:V16QI 2 "nonimmediate_operand")
> + (match_operand:V4SI 3 "nonimmediate_operand")]
> + "TARGET_SSE2"
> +{
> + rtx t1 = gen_reg_rtx (V2DImode);
> + rtx t2 = gen_reg_rtx (V4SImode);
> + emit_insn (gen_sse2_psadbw (t1, operands[1], operands[2]));
> + convert_move (t2, t1, 0);
> + emit_insn (gen_rtx_SET (VOIDmode, operands[0],
> + expand_simple_binop (V4SImode, PLUS, t2, operands[3],
> + NULL, 0, OPTAB_DIRECT)));
It seems to me that the generic expander won't bring any benefit here:
the operands are already in the correct form, so please change the last
lines simply to:
emit_insn (gen_addv4si3 (operands[0], t2, operands[3]));
> + DONE;
> +})
> +
> +(define_expand "usadv32qi"
> + [(match_operand:V8SI 0 "register_operand")
> + (match_operand:V32QI 1 "register_operand")
> + (match_operand:V32QI 2 "nonimmediate_operand")
> + (match_operand:V8SI 3 "nonimmediate_operand")]
> + "TARGET_AVX2"
> +{
> + rtx t1 = gen_reg_rtx (V4DImode);
> + rtx t2 = gen_reg_rtx (V8SImode);
> + emit_insn (gen_avx2_psadbw (t1, operands[1], operands[2]));
> + convert_move (t2, t1, 0);
> + emit_insn (gen_rtx_SET (VOIDmode, operands[0],
> + expand_simple_binop (V8SImode, PLUS, t2, operands[3],
> + NULL, 0, OPTAB_DIRECT)));
Same here: please use gen_addv8si3 instead.
No need to repost the patch with this trivial change.
Sorry for the confusion,
Uros.
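
For readers following the thread, here is a minimal scalar sketch in C of
what the quoted usadv16qi expander appears to compute, once the final
addition is done with gen_addv4si3 as suggested above. It is only an
illustration, not part of the patch. The function name usadv16qi_model is
hypothetical, and the sketch assumes the little-endian lane layout of x86
and that the convert_move from V2DImode to V4SImode acts as a plain
reinterpretation of the same 128 bits.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar model of the usadv16qi expander (not GCC code).
   a, b : the two V16QI inputs (operands 1 and 2), unsigned bytes
   acc  : the V4SI accumulator (operand 3)
   out  : the V4SI result (operand 0)  */
static void
usadv16qi_model (uint32_t out[4], const uint8_t a[16],
                 const uint8_t b[16], const uint32_t acc[4])
{
  for (int half = 0; half < 2; half++)
    {
      /* psadbw: one 64-bit sum of absolute differences per 8-byte half.  */
      uint64_t sum = 0;
      for (int i = 0; i < 8; i++)
        sum += (uint64_t) abs ((int) a[half * 8 + i] - (int) b[half * 8 + i]);

      /* Assumption: the V2DI result is viewed as V4SI, so each 64-bit sum
         occupies two 32-bit lanes (low word first on little-endian).
         The upper lane is always zero because sum <= 8 * 255.  */
      uint32_t lo = (uint32_t) sum;
      uint32_t hi = (uint32_t) (sum >> 32);

      /* Lane-wise V4SI addition with the accumulator.  */
      out[half * 2] = acc[half * 2] + lo;
      out[half * 2 + 1] = acc[half * 2 + 1] + hi;
    }
}
```

Under these assumptions only lanes 0 and 2 of the result receive a nonzero
contribution; lanes 1 and 3 pass the accumulator through unchanged. The
total is recovered when the vectorizer reduces the four lanes after the loop.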