[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity

Wed Apr 21 11:44:00 GMT 2010

------- Comment #9 from rguenther at suse dot de  2010-04-21 11:44 -------
Subject: Re:  C complex numbers, amd64 SSE, missed
 optimization opportunity

On Wed, 21 Apr 2010, irar at il dot ibm dot com wrote:

> ------- Comment #8 from irar at il dot ibm dot com  2010-04-21 11:33 -------
> Yes, it's possible to add this to SLP. But I don't understand how 
> D.3154_3 = COMPLEX_EXPR <D.3163_8, D.3164_9>;
> should be vectorized. D.3154_3 is complex and the rhs will be a vector
> {D.3163_8, D.3164_9} (btw, we have to change float to double, otherwise, we
> don't have complete vectors and this is not supported).

Depending on how D.3154_3 is used afterwards, it will be much like
an interleaved/strided store (if {D.3163_8, D.3164_9} is in xmm2 and the
complex value lives in the lower halves of the register pair xmm0 and
xmm1, we'd emit vec_extracts).  On the tree level we can probably
represent this as

 D.3154_3 = VIEW_CONVERT_EXPR <complex_double> (vec_tmp_4);

where vec_tmp_4 is the {D.3163_8, D.3164_9} vector.
Or, similarly, using only trees that are known to work today:

 realpart = BIT_FIELD_REF <0, ..> (vec_tmp_4);
 imagpart = BIT_FIELD_REF <64, ..> (vec_tmp_4);
 D.3154_3 = COMPLEX_EXPR <realpart, imagpart>;
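
For illustration only, here is a rough C-level analogue of the two
representations above, written with the GCC/Clang vector extensions.
It is a sketch, not what the vectorizer emits: the type name v2df and
both helper functions are made up for this example, the memcpy stands
in for the VIEW_CONVERT_EXPR, and the lane reads stand in for the
BIT_FIELD_REFs at bit offsets 0 and 64.

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

/* Hypothetical name: two doubles in one 16-byte vector (GNU extension). */
typedef double v2df __attribute__((vector_size(16)));

/* Analogue of VIEW_CONVERT_EXPR: reinterpret the 16 bytes of the vector
   as a complex double.  This relies on a complex double being laid out
   as { real, imag } and on lane 0 sitting at the lowest address. */
static double complex complex_from_vec_view(v2df v)
{
    double complex z;
    memcpy(&z, &v, sizeof z);
    return z;
}

/* Analogue of the BIT_FIELD_REF pair followed by COMPLEX_EXPR:
   pull out each lane, then build the complex value from the parts.
   The "re + im * I" form is fine for finite values. */
static double complex complex_from_vec_lanes(v2df v)
{
    double realpart = v[0];   /* bits 0..63   */
    double imagpart = v[1];   /* bits 64..127 */
    return realpart + imagpart * I;
}
```

Both helpers should yield the same value for finite inputs; which form
is cheaper depends on where the complex result has to live afterwards.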

One could also see the COMPLEX_EXPR as a root for SLP induction
vectorization (I suppose we don't do SLP induction at the moment,
induction in the sense that we pick arbitrary scalars and combine
them into vectors).
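
To make "combine arbitrary scalars into vectors" concrete, here is a
hand-written C sketch using the GCC/Clang vector extensions.  It
assumes, purely for illustration, that the two COMPLEX_EXPR operands
come from two isomorphic scalar additions; the testcase in this PR may
compute them differently, and the names v2df, add_scalar and add_packed
are invented for the example.

```c
#include <assert.h>
#include <complex.h>

/* Hypothetical name: two doubles in one 16-byte vector (GNU extension). */
typedef double v2df __attribute__((vector_size(16)));

/* Scalar form: two independent, isomorphic additions feed the
   complex value, as the operands of a COMPLEX_EXPR would. */
static double complex add_scalar(double ar, double ai, double br, double bi)
{
    double re = ar + br;
    double im = ai + bi;
    return re + im * I;
}

/* Packed form: starting from the complex value as the root, the two
   scalar operands are gathered into vectors, the addition is done
   once on the whole vector, and the lanes are extracted at the end. */
static double complex add_packed(double ar, double ai, double br, double bi)
{
    v2df a = {ar, ai};
    v2df b = {br, bi};
    v2df sum = a + b;            /* one vector add instead of two scalar adds */
    return sum[0] + sum[1] * I;
}
```

The two functions compute the same result; the packed one is roughly
what one would hope SLP could produce from the scalar one.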

Richard.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485