This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: V2 [PATCH] x86: Add pmovzx/pmovsx patterns with SI/DI operands


On Thu, Oct 25, 2018 at 9:10 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On 10/24/18, Uros Bizjak <ubizjak@gmail.com> wrote:
> > On Wed, Oct 24, 2018 at 1:51 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >>
> >> Hi Uros,
> >>
> >> Can you take a look at this?
> >
> > Many pmov(s|z)x instructions with memory operands are modelled in a wrong
> > way.
> >
> > For example:
> >
> > (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> >   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >     (any_extend:V8HI
> >       (vec_select:V8QI
> >         (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
> >         (parallel [(const_int 0) (const_int 1)
> >                (const_int 2) (const_int 3)
> >                (const_int 4) (const_int 5)
> >                (const_int 6) (const_int 7)]))))]
> >
> > should be defind for memory operands as:
> >
> > (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> >   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >     (any_extend:V8HI
> >       (match_operand:V8QI "memory_operand" "m,m,m")))]
> >
> > The insn, as is currently modelled, implies 128 bit read from memory,
> > which is not the case. -masm-intel will fail.
> >
> > The correct solution would be to use a different pattern for memory
> > operands (similar to the one above), and handle simplification of
> > vec_select part in a generic way. Please also note that V8QI implies
> > MMX registers, so there is danger of generating an unwanted move to
> > MMX reg, which will disable x87 registers.
> >
>
> Since vectorizer works only with XMM and YMM registers, wee can't
> change existing patterns.  As you pointed out that V8QI might trigger
> MMX registers.  New patterns must match the existing patterns.  Otherwise
> combiner won't work.   This is the updated patch to handle v8qi to v8si2 and
> v4hi to v4di2.

I don't like this patch at all. It just implements one particular
simplification that fits the combiner for the exact sequence of
builtins, and I guess we can live with one extra movq for those cases.
As said, several pmov{s,z}x insn patterns are simply wrong for memory
operands (they model access to 128bit memory, which is wrong). These
should be fixed first, and then either a) implement generic
simplification or b) use define_insn_and_split to convert combined
insn to a single pmov{s,z}x insn.

Uros.
> --
> H.J.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]