This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [i386] scalar ops that preserve the high part of a vector
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Marc Glisse <marc dot glisse at inria dot fr>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Fri, 30 Nov 2012 14:42:38 +0100
- Subject: Re: [i386] scalar ops that preserve the high part of a vector
- References: <alpine.DEB.2.02.1210131032460.9651@stedding.saclay.inria.fr> <CAFULd4YHdLF1ZyxrMG8MhRjo40f-EfAJZnDOEBc80pOGa4WNGQ@mail.gmail.com> <alpine.DEB.2.02.1210141057010.3752@laptop-mg.saclay.inria.fr> <alpine.DEB.2.02.1211301317160.3783@laptop-mg.saclay.inria.fr>
On Fri, Nov 30, 2012 at 1:34 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Hello,
>
> I experimented with the simplify-rtx transformation you suggested, see:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54855
>
> It works when the argument is a register, but not for memory (which is where
> the constant is in the testcase). And the description of the operation in
> sse.md does seem problematic. It says the second argument is:
>
> (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
>
> but Intel's documentation says "The source operand can be an XMM register or
> a 64-bit memory location", not quite the same.
>
> Do you think the .md description should really stay this way, or could we
> change it to something that better reflects "64-bit memory location"?
For reference, we are talking about:
(define_insn "<sse>_vm<plusminus_insn><mode>3"
  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (plusminus:VF_128
            (match_operand:VF_128 1 "register_operand" "0,x")
            (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
          (match_dup 1)
          (const_int 1)))]
  "TARGET_SSE"
  "@
   <plusminus_mnemonic><ssescalarmodesuffix>\t{%2, %0|%0, %2}
   v<plusminus_mnemonic><ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "isa" "noavx,avx")
   (set_attr "type" "sseadd")
   (set_attr "prefix" "orig,vex")
   (set_attr "mode" "<ssescalarmode>")])
No, looking at your description, operand 2 should be a scalar
operand (we use the _s{s,d} scalar instructions here), and for doubles
it should refer to a 64-bit memory location. I don't remember all the
details about vec_merge scalar instructions, but it looks to me that
the canonical representation should be more like your proposal:
+(define_insn "*sse2_vm<plusminus_insn>v2df3"
+  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
+        (vec_concat:V2DF
+          (plusminus:DF
+            (vec_select:DF
+              (match_operand:V2DF 1 "register_operand" "0,x")
+              (parallel [(const_int 0)]))
+            (match_operand:DF 2 "nonimmediate_operand" "xm,xm"))
+          (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))))]
+  "TARGET_SSE2"
Uros.
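
To make the difference between the two RTL shapes discussed above concrete, here is a minimal sketch in plain C that models what each pattern describes for the V2DF addition case. It is only an illustration, not GCC code: the type v2df and the functions add_vec_merge and add_vec_concat are hypothetical names invented for this example. The vec_merge form describes a full-vector operand 2 even though only element 0 contributes to the result, while the vec_concat form takes a single double, which matches a 64-bit memory location.

```c
/* Hypothetical stand-in for a V2DF value: element 0 is the low part. */
typedef struct { double e[2]; } v2df;

/* Models (vec_merge (plus op1 op2) op1 (const_int 1)):
   the addition is described on the whole vector, then bit i of the mask
   picks element i from the sum (bit set) or from op1 (bit clear). */
static v2df add_vec_merge(v2df op1, v2df op2)
{
    const unsigned mask = 1;
    v2df sum, out;

    for (int i = 0; i < 2; i++)
        sum.e[i] = op1.e[i] + op2.e[i];
    for (int i = 0; i < 2; i++)
        out.e[i] = ((mask >> i) & 1) ? sum.e[i] : op1.e[i];
    return out;
}

/* Models (vec_concat (plus (vec_select op1 [0]) op2) (vec_select op1 [1])):
   operand 2 is a scalar double, so only 64 bits are ever read for it. */
static v2df add_vec_concat(v2df op1, double op2)
{
    v2df out;

    out.e[0] = op1.e[0] + op2;  /* scalar add on the low element */
    out.e[1] = op1.e[1];        /* high element preserved from op1 */
    return out;
}
```

Calling add_vec_merge(a, b) and add_vec_concat(a, b.e[0]) gives the same vector for any a and b, which shows that the high element of operand 2 never affects the result and that describing it as a 128-bit operand says more than the instruction actually reads.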