This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH]: x86 specific patch, was: Update SSE5 vector multiplication, shift, rotate
- From: "Uros Bizjak" <ubizjak at gmail dot com>
- To: "Michael Meissner" <michael dot meissner at amd dot com>, "Uros Bizjak" <ubizjak at gmail dot com>, gcc-patches at gcc dot gnu dot org, dwarak dot rajagopal at amd dot com, christophe dot harle at amd dot com, hongjiu dot lu at intel dot com, "Richard Guenther" <rguenther at suse dot de>
- Date: Tue, 22 Apr 2008 11:47:27 +0200
- Subject: Re: [PATCH]: x86 specific patch, was: Update SSE5 vector multiplication, shift, rotate
- References: <20080417185036.GA15776@mmeissner-gold.amd.com> <5787cf470804172258t7caedb73m3b499abf6e57cc67@mail.gmail.com> <20080418154721.GB17904@mmeissner-gold.amd.com>
On Fri, Apr 18, 2008 at 5:47 PM, Michael Meissner
<michael.meissner@amd.com> wrote:
> ;; SSE5 packed rotate instructions
> -(define_insn "rotl<mode>3"
> +(define_expand "rotl<mode>3"
> + [(set (match_operand:SSEMODE1248 0 "register_operand" "")
> + (rotate:SSEMODE1248
> + (match_operand:SSEMODE1248 1 "nonimmediate_operand" "")
> + (match_operand:SI 2 "general_operand")))]
> + "TARGET_SSE5"
> +{
> + /* If we were given a scalar, convert it to parallel */
> + if (! const_0_to_<sserotatemax>_operand (operands[2], SImode))
> + {
> + rtvec vs = rtvec_alloc (<ssescalarnum>);
> + rtx par = gen_rtx_PARALLEL (<MODE>mode, vs);
VOIDmode, since x86_expand_vector_init doesn't care about the mode of PARALLEL.
> + rtx reg = gen_reg_rtx (<MODE>mode);
> + rtx op2 = operands[2];
> + int i;
> +
> + if (GET_MODE (op2) != <ssescalarmode>mode)
> + {
> + op2 = gen_reg_rtx (<ssescalarmode>mode);
> + convert_move (op2, operands[2], false);
> + }
Do we need the code above? AFAIK x86_expand_vector_init takes care about this.
> +
> + for (i = 0; i < <ssescalarnum>; i++)
> + RTVEC_ELT (vs, i) = op2;
> +
> + emit_insn (gen_vec_init<mode> (reg, par));
Please use ix86_expand_vector_init directly instead of gen_vec_init in
this and other patterns.
> +(define_expand "ashrv2di3"
> + [(match_operand:V2DI 0 "register_operand" "")
> + (match_operand:V2DI 1 "register_operand" "")
> + (match_operand:DI 2 "nonmemory_operand" "")]
> + "TARGET_SSE5"
> +{
> + rtvec vs = rtvec_alloc (2);
> + rtx par = gen_rtx_PARALLEL (V2DImode, vs);
VOIDmode.
> + rtx reg = gen_reg_rtx (V2DImode);
> + rtx ele;
> +
> + if (GET_CODE (operands[2]) == CONST_INT)
> + ele = GEN_INT (- INTVAL (operands[2]));
> + else if (GET_MODE (operands[2]) != DImode)
Something is wrong here. operands[2] will always be in DImode, please
look at the pattern above.
> + {
> + rtx move = gen_reg_rtx (DImode);
> + ele = gen_reg_rtx (DImode);
> + convert_move (move, operands[2], false);
> + emit_insn (gen_negdi2 (ele, move));
> + }
> + else
> + {
> + ele = gen_reg_rtx (DImode);
> + emit_insn (gen_negdi2 (ele, operands[2]));
> + }
> +
> + RTVEC_ELT (vs, 0) = ele;
> + RTVEC_ELT (vs, 1) = ele;
> + emit_insn (gen_vec_initv2di (reg, par));
I think that the pattern should require operand[2] in SImode, since
SImode requires movd (cheaper than movq?) instead of movq to load from
integer register to XMM register. ix86_expand_vector init (aka
gen_vec_initv2di) will AFAIK handle these just fine even if inner
values are in SImode.
Otherwise the patch is OK.
Thanks,
Uros.