This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/79102] gcc fails to auto-vectorise the multiplicative reduction of an array of complex floats
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 17 Jan 2017 09:07:37 +0000
- Subject: [Bug tree-optimization/79102] gcc fails to auto-vectorise the multiplicative reduction of an array of complex floats
- Auto-submitted: auto-generated
- References: <bug-79102-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79102
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-01-17
          Component|c                           |tree-optimization
             Blocks|                            |53947
            Summary|gcc fails to auto-vectorise |gcc fails to auto-vectorise
                   |the product of an array of  |the multiplicative
                   |complex floats              |reduction of an array of
                   |                            |complex floats
     Ever confirmed|0                           |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is that we do not support reductions of _Complex variables. The
vectorizer sees

  <bb 3> [99.00%]:
  # i_16 = PHI <i_11(4), 0(2)>
  # p$real_13 = PHI <_21(4), 1.0e+0(2)>
  # p$imag_14 = PHI <_22(4), 0.0(2)>
  # ivtmp_48 = PHI <ivtmp_47(4), 128(2)>
  _1 = (long unsigned int) i_16;
  _2 = _1 * 8;
  _3 = x_9(D) + _2;
  _7 = REALPART_EXPR <*_3>;
  _12 = IMAGPART_EXPR <*_3>;
  _17 = _7 * p$real_13;
  _18 = _12 * p$imag_14;
  _19 = _7 * p$imag_14;
  _20 = _12 * p$real_13;
  _21 = _17 - _18;
  _22 = _19 + _20;
  i_11 = i_16 + 1;
  ivtmp_47 = ivtmp_48 - 1;
  if (ivtmp_47 != 0)
    goto <bb 4>; [98.99%]
  else
    goto <bb 5>; [1.01%]

  <bb 4> [98.00%]:
  goto <bb 3>; [100.00%]

  <bb 5> [1.00%]:
  # _50 = PHI <_21(3)>
  # _49 = PHI <_22(3)>
  p_10 = COMPLEX_EXPR <_50, _49>;
  return p_10;

which would need to be detected as a single reduction with two components. I
guess not lowering complex operations would help here (with its own
complications, of course). Not sinking the COMPLEX_EXPR and keeping it as a
loop-carried dependency would eventually help as well.
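
For readers without the original test case at hand (it is not part of this
message), the following is a rough C sketch of the situation described above.
Both function names and the exact loop shape are assumptions made for
illustration; only the trip count of 128 and the arithmetic are taken from the
dump. The first function is a plausible source-level form of the reduction. The
second spells out by hand what complex lowering turns it into: two scalar
accumulators that each depend on both of the previous iteration's values, which
is why this does not look like two independent reductions. The __real__ and
__imag__ operators are GNU C extensions.

```c
#include <complex.h>

#define N 128  /* trip count, from the initial value of ivtmp_48 in the dump */

/* Hypothetical source-level form: one multiplicative reduction over a
   single _Complex accumulator. */
float complex prod_complex(const float complex *x)
{
    float complex p = 1.0f;
    for (int i = 0; i < N; i++)
        p *= x[i];
    return p;
}

/* Hand-written equivalent of the lowered loop in the dump.  The comments
   name the SSA values each statement corresponds to. */
float complex prod_lowered(const float complex *x)
{
    float p_real = 1.0f;  /* p$real_13 starts at 1.0 */
    float p_imag = 0.0f;  /* p$imag_14 starts at 0.0 */

    for (int i = 0; i < N; i++) {
        float xr = __real__ x[i];              /* _7  = REALPART_EXPR <*_3> */
        float xi = __imag__ x[i];              /* _12 = IMAGPART_EXPR <*_3> */
        float re = xr * p_real - xi * p_imag;  /* _21 = _17 - _18 */
        float im = xr * p_imag + xi * p_real;  /* _22 = _19 + _20 */
        p_real = re;  /* both accumulators are updated from both old values */
        p_imag = im;
    }

    /* The COMPLEX_EXPR is only built after the loop (bb 5 in the dump). */
    float complex p;
    __real__ p = p_real;
    __imag__ p = p_imag;
    return p;
}
```

In this sketch the cross-coupling between p_real and p_imag is the "single
reduction with two components" that would have to be recognised.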
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations