Bug 36844

Summary: Vectorizer doesn't support INT<->FP conversions with different size
Product: gcc Reporter: H.J. Lu <hjl.tools>
Component: tree-optimization Assignee: Not yet assigned to anyone <unassigned>
Status: UNCONFIRMED ---    
Severity: enhancement CC: areg.melikadamyan, crazylht, dorit, gcc-bugs, joey.ye, rguenth, xuepeng.guo
Priority: P3 Keywords: missed-optimization
Version: 4.4.0   
Target Milestone: ---   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed:
Bug Depends on:    
Bug Blocks: 53947, 96654    

Description H.J. Lu 2008-07-15 21:41:55 UTC
Intel AVX has variable vector lengths of 128bit and 256bit.
There are 128bit INT and 256bit FP vector arithmetic operations
as well as asymmetric vector conversion operations:
256bit vector (V4DF/V4DI) <-> 128bit vector (V4SI/V4SF)
256bit vector (V8SI) <-> 256bit vector (V8SF)

The current vectorizer supports different vector sizes only
when they are based on the scalar type. It supports neither
asymmetric vector conversions nor different vector sizes based
on the operation. The current AVX branch limits the vector size
to 128bit for the vectorizer:

/* ??? No autovectorization into MMX or 3DNOW until we can reliably
   place emms and femms instructions.
   FIXME: AVX has 32byte floating point vector operations and 16byte
   integer vector operations.  But vectorizer doesn't support
   different sizes for integer and floating point vectors.  We limit
   vector size to 16byte.  */
#define UNITS_PER_SIMD_WORD(MODE)                                       \
  (TARGET_AVX ? (((MODE) == DFmode || (MODE) == SFmode) ? 16 : 16)      \
              : (TARGET_SSE ? 16 : UNITS_PER_WORD))
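
For illustration, the scalar loops below are the kind of source the vectorizer would have to handle. This is only a minimal sketch: the function names are hypothetical and are not taken from GCC or its testsuite. The first loop converts between element types of the same size (int and float, which would correspond to V8SI <-> V8SF with 256bit vectors). The other two convert between element types of different sizes (int and double), so four elements occupy a 128bit vector on one side and a 256bit vector on the other.

```c
#include <stddef.h>

/* Same-size conversion: 32bit int -> 32bit float.
   With 256bit vectors this would correspond to V8SI -> V8SF.  */
void
int_to_float (float *dst, const int *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (float) src[i];
}

/* Widening conversion: 32bit int -> 64bit double.
   Four elements are V4SI (128bit) on input and V4DF (256bit) on output.  */
void
int_to_double (double *dst, const int *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (double) src[i];
}

/* Narrowing conversion: 64bit double -> 32bit int.
   The C cast truncates toward zero.  */
void
double_to_int (int *dst, const double *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (int) src[i];
}
```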
Comment 1 H.J. Lu 2008-07-15 21:47:14 UTC
One problem is vectorizable_conversion. Is there a way to support

V4DF/V4DI <-> V4SI/V4SF
V8SI <-> V8SF
Comment 2 dorit 2008-07-22 10:39:39 UTC
(In reply to comment #1)
> One problem is vectorizable_conversion. Is there a way to support
> V4DF/V4DI <-> V4SI/V4SF
> V8SI <-> V8SF 

With the current framework, the only way to support 
V8SI <-> V8SF
is to implement the TARGET_VECTORIZE_BUILTIN_CONVERSION for these modes. 

There's no way in the current framework to support  
V4DF <-> V4SI
V4DI <-> V4SF
because of the single-vector-size assumption. These however would be supported:
V4DF <-> V8SI
V4DI <-> V8SF
by modeling the idioms vec_unpack[u/s]_float_[lo/hi] and vec_pack_[u/s]fix_trunc for the respective modes.
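
As a rough illustration of what these idioms compute for the V8SI/V4DF pair, here is a scalar model in plain C. It is a sketch under assumptions, not GCC code: the model_* function names are hypothetical, the "vectors" are plain arrays, and it assumes that the lo half means elements 0..3 (which half is lo on a real target depends on its element ordering). One vector of eight ints is unpacked into two vectors of four doubles, and two vectors of four doubles are packed into one vector of eight ints with truncation toward zero.

```c
/* Hypothetical scalar model of the signed unpack-and-convert idiom:
   convert the low four ints of an 8-int vector to doubles.  */
void
model_unpacks_float_lo (double dst[4], const int src[8])
{
  for (int i = 0; i < 4; i++)
    dst[i] = (double) src[i];
}

/* Same as above, for the high four ints.  */
void
model_unpacks_float_hi (double dst[4], const int src[8])
{
  for (int i = 0; i < 4; i++)
    dst[i] = (double) src[i + 4];
}

/* Hypothetical scalar model of the signed pack-with-truncation idiom:
   two 4-double vectors become one 8-int vector.  */
void
model_pack_sfix_trunc (int dst[8], const double lo[4], const double hi[4])
{
  for (int i = 0; i < 4; i++)
    {
      dst[i] = (int) lo[i];      /* truncates toward zero */
      dst[i + 4] = (int) hi[i];
    }
}
```

The point of the model is that both sides are full 256bit vectors (V8SI and V4DF), so one vector statement on the narrow-element side pairs with two on the wide-element side, and the single-vector-size assumption still holds.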

I think that in order to really support AVX the vectorizer would need to be extended to consider multiple vector sizes (which would probably involve more than just extending the support for conversions).