PATCH: Add SSE4.1 support

Thu Apr 19 20:57:00 GMT 2007

On Thu, Apr 19, 2007 at 11:33:09AM -0700, Seongbae Park wrote:
> On 4/19/07, H. J. Lu <hjl@lucon.org> wrote:
> >On Wed, Apr 18, 2007 at 10:56:39PM -0700, H. J. Lu wrote:
> >> > > -                 "=r  ,m  ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x")
> >> > > +                 "=r  ,m  ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x,?r ")
> >> > >   (match_operand:SI 1 "general_operand"
> >> > > -                 "rinm,rin,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]
> >> > > +                 "rinm,rin,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m ,*Y4"))]
> >> > >    "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
> >> > >  {
> >> > >    switch (get_attr_type (insn))
> >> > >      {
> >> > > +    case TYPE_SSELOG:
> >> > > +      return "pextrd\t{$0, %1, %0|%0, %1, $0}";
> >> >
> >> > This behaviour is a bit funny - ie when INTER_UNIT_MOVES is allowed, we
> >> > use mov instruction, but when it is disabled we go via memory while when
> >> > it is disabled and when enabled, we use pextrd that is likely just
> >> > having longer encoding.
> >> >
> >> > I would say that on !INTER_UNIT_MOVES the pextrd instruction with
> >> > integer register operand should probably be also banned (ie expanded
> >> > into memory store) that might result in slightly easier implementation
> >> > and no need for an extra double-letter constraint.
> >> >
> >> > But I am fine with the current implementation until we see how real
> >> > hardware behaves.
> >>
> >> I will double check with our hardware people on this.
> >>
> >
> >I checked with our hardware people. We prefer movd over both pextrd
> >and store/load for
> >
> >(set (reg:SI 0 ax [62]) (reg:SI 21 xmm0))
> >
> >The same goes for
> >
> >(set (reg:DI 0 ax [62]) (reg:DI 21 xmm0))
> >
> >That is we want movd (movq) and it should be independent of
> >INTER_UNIT_MOVES. However, I don't know what AMD prefers.
> 
> Can you tell us why movd is preferred over pextrd ?
> What different performance characteristics do they have
> from the compiler point of view ?

pextrd is shuffle + movd. movd is always faster than pextrd since
there is no shuffle.

H.J.
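
The "shuffle + movd" description above can be illustrated with a small model in plain C. This is only a sketch of the semantics under stated assumptions, not a statement about how any particular CPU implements either instruction: the type xmm_model and the model_* functions are hypothetical names invented for illustration, and the register is modelled as four 32-bit lanes. With real intrinsics, movd from an XMM register corresponds to _mm_cvtsi128_si32 and pextrd to _mm_extract_epi32.

```c
#include <stdint.h>

/* Hypothetical model: a 128-bit XMM register as four 32-bit lanes. */
typedef struct { uint32_t lane[4]; } xmm_model;

/* movd r32, xmm: copy the low lane to an integer register, no shuffle. */
static uint32_t model_movd(xmm_model v)
{
    return v.lane[0];
}

/* The shuffle step: bring the lane selected by imm[1:0] to the low position. */
static xmm_model model_select_lane(xmm_model v, unsigned imm)
{
    xmm_model out = v;
    out.lane[0] = v.lane[imm & 3u];
    return out;
}

/* pextrd r32, xmm, imm8 modelled as shuffle followed by movd. */
static uint32_t model_pextrd(xmm_model v, unsigned imm)
{
    return model_movd(model_select_lane(v, imm));
}

/* For lane 0 the shuffle is an identity, so pextrd $0 yields the same
   value as movd while still carrying the extra step. */
static int extracting_lane0_matches_movd(xmm_model v)
{
    return model_pextrd(v, 0) == model_movd(v);
}
```

In this model, the patch's "pextrd $0, %1, %0" alternative computes exactly what movd computes, which is why the extra shuffle buys nothing for a plain xmm-to-integer-register move.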