This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: x86: CPU type requirements incorrect for various vectormode operations
- From: "Jan Beulich" <JBeulich at novell dot com>
- To: <rth at redhat dot com>
- Cc: <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 12 Jul 2004 10:30:32 +0200
- Subject: Re: x86: CPU type requirements incorrect for various vectormode operations
>> (movv2sf_internal): Allow beginning with MMX.
>> (movv2df_internal, movv8hi_internal, movv16qi_internal): Allow
>> beginning with SSE and use SSE instruction when SSE2 unavailable.
>> (movv2df, movv8hi, movv16qi, pushv2di, pushv8hi, pushv16qi,
>> pushv4si):
>> Allow beginning with SSE.
>> (movv2sf, pushv2sf): Allow beginning with MMX.
>
>What's the logic here?
As I said in the other, 3DNow!-related email, the move and push patterns
must be available as soon as values of the types they cover are passed
in the respective registers as arguments or return values. Otherwise
the arguments cannot be stored to memory, which is what is needed to
emulate the vector operation on narrower vector types or even on
scalars. That said, the parameter passing model for MMX (and to a
certain degree for SSE) seems very broken to me right now; I'm going to
write a more exhaustive description of this soon.
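To illustrate the emulation I have in mind, here is a minimal sketch. It assumes the generic GCC vector extension (`vector_size`) rather than any particular machine mode from the patch, and the names `v4si` and `emulated_add_v4si` are made up for this example. The arguments arrive by value, get stored to memory, and are then processed one scalar lane at a time:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Generic GCC vector extension: four 32-bit lanes in 16 bytes.
   The type name is an assumption made for this sketch. */
typedef int32_t v4si __attribute__ ((vector_size (16)));

static_assert (sizeof (v4si) == 4 * sizeof (int32_t),
               "v4si must hold exactly four lanes");

/* Hypothetical helper: add two vectors without using a vector add.
   Both arguments are passed by value; they are first stored to memory
   (this is where a move pattern for the type is needed) and then
   added lane by lane with scalar arithmetic. */
static v4si
emulated_add_v4si (v4si a, v4si b)
{
  int32_t la[4], lb[4], lr[4];
  v4si r;

  memcpy (la, &a, sizeof la);   /* store first argument to memory */
  memcpy (lb, &b, sizeof lb);   /* store second argument to memory */
  for (int i = 0; i < 4; i++)
    lr[i] = la[i] + lb[i];      /* scalar emulation of one lane */
  memcpy (&r, lr, sizeof r);    /* reload the result as a vector */
  return r;
}
```

Without a usable move for the vector type, the two stores at the top of the function could not be generated at all, which is the situation described above.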
>> +/* Extracts one of the four words of A. The selector N must be immediate. */
>> +#if 0
>> +static __inline int
>> +_mm_extract_pi16 (__m64 __A, int __N)
>> +{
>> + return __builtin_ia32_pextrw ((__v4hi)__A, __N);
>> +}
>> +
>> +static __inline int
>> +_m_pextrw (__m64 __A, int __N)
>> +{
>> + return _mm_extract_pi16 (__A, __N);
>> +}
>> +#else
>> +#define _mm_extract_pi16(A, N) \
>> + __builtin_ia32_pextrw ((__v4hi)(A), (N))
>> +#define _m_pextrw(A, N) _mm_extract_pi16((A), (N))
>> +#endif
>
>These should be written
>
>static __inline int __attribute__((__always_inline__))
>_mm_extract_pi16 (__m64 const __A, int const __N)
>{
> return __builtin_ia32_pextrw ((__v4hi)__A, __N);
>}
>
>That should propagate the constant to the builtin even at -O0.
This would indeed seem to be an unrelated change now. The code here was
simply moved from xmmintrin.h.
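For reference, the two forms under discussion can be compared in a portable sketch. It deliberately does not call `__builtin_ia32_pextrw`; instead it assumes a stand-in, `EXTRACT_WORD16_IMPL`, that picks one 16-bit word out of a plain 64-bit integer. All names here are hypothetical, and unlike the real builtin the stand-in does not require the selector to be an immediate, so this only shows the shape of the two variants:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the builtin: word N (0..3) of a 64-bit
   value viewed as four 16-bit words, zero-extended to int. */
#define EXTRACT_WORD16_IMPL(A, N) \
  ((int) (uint16_t) ((uint64_t) (A) >> (16 * (N))))

/* Macro form: the selector is substituted textually, so a literal
   selector reaches the underlying operation as a literal. */
#define extract_word16_macro(A, N) EXTRACT_WORD16_IMPL ((A), (N))

/* Inline form as suggested in the review: const-qualified parameters
   plus always_inline, so the function body is inlined into the caller
   even when optimization is disabled. */
static __inline int __attribute__ ((__always_inline__))
extract_word16 (uint64_t const a, int const n)
{
  assert (n >= 0 && n < 4);     /* only four words are available */
  return EXTRACT_WORD16_IMPL (a, n);
}
```

Both variants are called the same way, e.g. `extract_word16 (v, 1)` or `extract_word16_macro (v, 1)`; the inline form additionally gives type checking on the arguments.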
Jan