[Bug tree-optimization/88873] missing vectorization for decomposed operations on a vector type
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Jan 16 10:51:00 GMT 2019
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88873
Richard Biener <rguenth at gcc dot gnu.org> changed:
What            |Removed     |Added
----------------------------------------------------------------------------
Status          |UNCONFIRMED |NEW
Last reconfirmed|            |2019-01-16
Blocks          |            |53947
Ever confirmed  |0           |1
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed. bar is not vectorized because it looks like
<bb 2> [local count: 1073741824]:
_1 = BIT_FIELD_REF <c_10(D), 64, 0>;
_2 = BIT_FIELD_REF <b_11(D), 64, 0>;
_3 = BIT_FIELD_REF <a_12(D), 64, 0>;
_4 = fma (_3, _2, _1);
r_14 = BIT_INSERT_EXPR <r_13(D), _4, 0 (64 bits)>;
_5 = BIT_FIELD_REF <c_10(D), 64, 64>;
_6 = BIT_FIELD_REF <b_11(D), 64, 64>;
_7 = BIT_FIELD_REF <a_12(D), 64, 64>;
_8 = fma (_7, _6, _5);
r_15 = BIT_INSERT_EXPR <r_14, _8, 64 (64 bits)>;
return r_15;
and there are no loads/stores BB vectorization can work with. There's
an enhancement request for BB vectorization to key off
vector constructors and this one is similar. Eventually
r_14 = BIT_INSERT_EXPR <r_13(D), _4, 0 (64 bits)>;
r_15 = BIT_INSERT_EXPR <r_14, _8, 64 (64 bits)>;
should be combined to
r_15 = { _4, _8 };
but then the dependence on BB SLP handling vector CONSTRUCTORs remains. There
are also still no loads, but perhaps the BIT_FIELD_REFs are enough here.
Apparently not:
v2df r;
v2df bar (v2df a, v2df b, v2df c)
{
r[0] = fma (a[0], b[0], c[0]);
r[1] = fma (a[1], b[1], c[1]);
return r;
}
results in
<bb 2> [local count: 1073741824]:
_1 = BIT_FIELD_REF <c_13(D), 64, 0>;
_2 = BIT_FIELD_REF <b_14(D), 64, 0>;
_3 = BIT_FIELD_REF <a_15(D), 64, 0>;
_4 = fma (_3, _2, _1);
_5 = BIT_FIELD_REF <c_13(D), 64, 64>;
_6 = BIT_FIELD_REF <b_14(D), 64, 64>;
_7 = BIT_FIELD_REF <a_15(D), 64, 64>;
_8 = fma (_7, _6, _5);
_16 = {_4, _8};
vect_cst__17 = _16;
MEM[(vector(2) double *)&r] = vect_cst__17;
_12 = r;
return _12;
so we only vectorize the store:
t.c:18:10: missed: Build SLP failed: not grouped load _3 = BIT_FIELD_REF <a_15(D), 64, 0>;
but that should be possible to fix as well.
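For illustration only, the equivalence the vectorizer would have to discover can be sketched with GCC's vector extensions. This is not code from the bug report: the v2df typedef is an assumption, the names bar_lanes, bar_vector and check_forms_agree are hypothetical, and a plain multiply-add stands in for fma so the sketch does not need libm. The per-lane reads correspond to the BIT_FIELD_REFs above and the per-lane writes to the BIT_INSERT_EXPRs.

```c
#include <assert.h>

/* Assumed typedef: two doubles packed into one 128-bit GCC vector.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Decomposed form: every a[i]/b[i]/c[i] read is a lane extract
   (BIT_FIELD_REF) and every r[i] write is a lane insert
   (BIT_INSERT_EXPR).  Multiply-add is used instead of fma here.  */
v2df
bar_lanes (v2df a, v2df b, v2df c)
{
  v2df r = { 0.0, 0.0 };
  r[0] = a[0] * b[0] + c[0];
  r[1] = a[1] * b[1] + c[1];
  return r;
}

/* Whole-vector form: the same computation written as one vector
   expression, with no lane extracts or inserts in the source.  */
v2df
bar_vector (v2df a, v2df b, v2df c)
{
  return a * b + c;
}

/* Aborts if the two forms disagree on either lane.  */
void
check_forms_agree (v2df a, v2df b, v2df c)
{
  v2df x = bar_lanes (a, b, c);
  v2df y = bar_vector (a, b, c);
  assert (x[0] == y[0]);
  assert (x[1] == y[1]);
}
```

Building something like this with -O2 -fopt-info-vec-missed is one way to see "missed" notes like the one quoted above, though the exact messages depend on the GCC version and target.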
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations