Re: [PATCH][x86] Match movss and movsd "blend" instructions

On Wednesday, 1 August 2018 18:51:41 CEST Marc Glisse wrote:
> On Wed, 1 Aug 2018, Allan Sandfeld Jensen wrote:
> >  extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> >  _mm_move_sd (__m128d __A, __m128d __B)
> >  {
> > -  return (__m128d) __builtin_ia32_movsd ((__v2df)__A, (__v2df)__B);
> > +  return __extension__ (__m128d)(__v2df){__B[0],__A[1]};
> >  }
> If the goal is to have it represented as a VEC_PERM_EXPR internally, I
> wonder if we should be explicit and use __builtin_shuffle instead of
> relying on some forwprop pass to transform it. Maybe not, just asking. And
> the answer need not even be the same for _mm_move_sd and _mm_move_ss.

I forgot: one of the things that makes using __builtin_shuffle ugly is that
__v4si, which is needed as the shuffle argument in _mm_move_ss, is declared
in emmintrin.h, but _mm_move_ss is in xmmintrin.h.

In general the gcc __builtin_shuffle syntax, with the mask passed as a vector,
is kind of awkward. At least for declaring intrinsics, the clang style,
where the permutation indices are extra arguments, is easier to deal with:
 __builtin_shuffle(a, b, (__v4si){4, 0, 1, 2})
 __builtin_shufflevector(a, b, 4, 0, 1, 2)
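For illustration, here is a minimal sketch of both spellings side by side.
It is written against plain vector extensions rather than the intrinsic
headers, so it does not depend on where __v4si is declared. The typedef names
and move_ss_sketch are made up for this example. The mask {4, 1, 2, 3} is
the one that matches _mm_move_ss (element 0 from the second operand, the
rest from the first), not the arbitrary {4, 0, 1, 2} from the comparison
above.

```c
/* Local vector types, assumed here only so the sketch needs no headers. */
typedef float v4sf __attribute__ ((vector_size (16)));
typedef int   v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: gives the same result as _mm_move_ss (a, b),
   i.e. { b[0], a[1], a[2], a[3] }.  Indices 0..3 select from the first
   operand and 4..7 from the second in both compilers.  */
static inline v4sf
move_ss_sketch (v4sf a, v4sf b)
{
#if defined (__clang__)
  /* clang: the permutation is a list of constant integer arguments.  */
  return __builtin_shufflevector (a, b, 4, 1, 2, 3);
#else
  /* gcc: the permutation is an integer vector of matching shape.  */
  return __builtin_shuffle (a, b, (v4si) {4, 1, 2, 3});
#endif
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40} this should yield
{10, 2, 3, 4}. Whether either form ends up as a single movss still depends
on the backend patterns, which is what the patch is about.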

