Intel AVX has variable vector lengths of 128bit and 256bit.
There are 128bit integer and 256bit floating point vector arithmetic
operations, as well as asymmetric vector conversion operations:
256bit vector (V4DF/V4DI) <-> 128bit vector (V4SI/V4SF)
256bit vector (V8SI) <-> 256bit vector (V8SF)
The current vectorizer only supports choosing a different vector
size based on the scalar type. It supports neither asymmetric
vector conversions nor a different vector size per operation.
The current AVX branch limits the vector size to 128bit for the
vectorizer:
/* ??? No autovectorization into MMX or 3DNOW until we can reliably
   place emms and femms instructions.
   FIXME: AVX has 32byte floating point vector operations and 16byte
   integer vector operations.  But vectorizer doesn't support
   different sizes for integer and floating point vectors.  We limit
   vector size to 16byte.  */
#define UNITS_PER_SIMD_WORD(MODE) \
  (TARGET_AVX ? (((MODE) == DFmode || (MODE) == SFmode) ? 16 : 16) \
	      : (TARGET_SSE ? 16 : UNITS_PER_WORD))
One problem is vectorizable_conversion. Is there a way to support
V4DF/V4DI <-> V4SI/V4SF
V8SI <-> V8SF
(In reply to comment #1)
> One problem is vectorizable_conversion. Is there a way to support
> V4DF/V4DI <-> V4SI/V4SF
> V8SI <-> V8SF
With the current framework, the only way to support
V8SI <-> V8SF
is to implement the TARGET_VECTORIZE_BUILTIN_CONVERSION target hook for these modes.
There's no way in the current framework to support
V4DF <-> V4SI
V4DI <-> V4SF
because of the single-vector-size assumption. These, however, would be supported:
V4DF <-> V8SI
V4DI <-> V8SF
by modeling the idioms vec_unpack[s/u]_float_[lo/hi] and vec_pack_[s/u]fix_trunc for the respective modes.
I think that in order to really support AVX the vectorizer would need to be extended to consider multiple vector sizes (which would probably involve more than just extending the support for conversions).