[Bug tree-optimization/91954] [10 Regression] gcc.dg/vect/pr66142.c should not need early inlining to be vectorized since r10-3311-gff6686d2e5f797d6
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Fri Feb 7 11:22:00 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91954
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
IPA SRA splits the struct B parameter:
Evaluating analysis results for bar/0
Will split parameter 0
- component at byte offset 0, size 4
- component at byte offset 4, size 4
- component at byte offset 8, size 4
- component at byte offset 12, size 4
which is passed as
D.2220[_14].t.x = 1.0e+0;
D.2220[_14].t.y = u_11;
D.2220[_14].w.x = x_16;
D.2220[_14].w.y = y_18;
_23 = &D.2220[_14];
_24 = bar (_23);
producing
bar.isra (const float ISRA.5, const float ISRA.6, const float ISRA.7, const float ISRA.8)
the inlining of this then ends up as
_23 = &D.2220[_14];
_27 = MEM[(float *)_23];
_28 = MEM[(float *)_23 + 4B];
_29 = MEM[(float *)_23 + 8B];
_30 = MEM[(float *)_23 + 12B];
where formerly forwprop ended up combining the &D.2220[_14] into the
four dereferences, giving nice access paths, which it cannot do from
the above:
D.2220[_20].t.x = 1.0e+0;
D.2220[_20].t.y = u_13;
D.2220[_20].w.x = x_22;
D.2220[_20].w.y = y_24;
_30 = MEM <struct B[32]> [(const struct B *)&D.2220][_20].t.x;
_34 = MEM <struct B[32]> [(const struct B *)&D.2220][_20].t.y;
_35 = MEM <struct B[32]> [(const struct B *)&D.2220][_20].w.x;
_36 = _30 * _35;
_37 = MEM <struct B[32]> [(const struct B *)&D.2220][_20].w.y;
so instead we get
D.2220[_14].t.x = 1.0e+0;
D.2220[_14].t.y = u_11;
D.2220[_14].w.x = x_16;
D.2220[_14].w.y = y_18;
_23 = &D.2220[_14];
_27 = MEM[(float *)_23];
_28 = MEM[(float *)_23 + 4B];
_29 = MEM[(float *)_23 + 8B];
_30 = MEM[(float *)_23 + 12B];
which we are not able to elide (which is of course kind-of lame).
IIRC we have a duplicate report for this somewhere assigned to me (for the VN
side). VN can already do some tricks, but it doesn't handle MEM_REFs with a
non-zero offset there (vn_reference_maybe_forwprop_address), and for some
reason it doesn't work for the zero-offset one either.
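
For readers without the dumps at hand, here is a rough source-level sketch of the two shapes discussed above. It is not the pr66142.c testcase: the struct layout is inferred from the access paths in the dumps (.t.x, .t.y, .w.x, .w.y), and the body of bar as well as the names bar_split, load_at and call_via_offsets are hypothetical, chosen only for illustration. bar_split stands in for bar.isra, and call_via_offsets mimics the caller after inlining, where the four components are read at byte offsets from the address instead of through struct access paths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Layout inferred from the dumps: two pairs of floats. */
struct A { float x, y; };
struct B { struct A t, w; };

/* Original shape: the aggregate is passed by reference.
   The body is made up; the real testcase computes something else. */
static float bar(const struct B *b)
{
    return b->t.x * b->w.x + b->t.y * b->w.y;
}

/* Hand-written stand-in for bar.isra: one scalar per component. */
static float bar_split(float tx, float ty, float wx, float wy)
{
    return tx * wx + ty * wy;
}

/* Read a float at a byte offset from a base address, similar in
   spirit to MEM[(float *)p + off] in the dump. */
static float load_at(const void *base, size_t off)
{
    float v;
    memcpy(&v, (const char *)base + off, sizeof v);
    return v;
}

/* Caller after inlining the split clone: four offset loads.  With
   4-byte floats and no padding the offsets are 0, 4, 8 and 12. */
static float call_via_offsets(const struct B *p)
{
    return bar_split(load_at(p, offsetof(struct B, t.x)),
                     load_at(p, offsetof(struct B, t.y)),
                     load_at(p, offsetof(struct B, w.x)),
                     load_at(p, offsetof(struct B, w.y)));
}

/* Both shapes compute the same value; only the form of the loads
   differs, which is what matters to the optimizers. */
static void self_check(void)
{
    struct B b = { { 1.0f, 2.0f }, { 3.0f, 4.0f } };
    assert(bar(&b) == call_via_offsets(&b));
}
```

The two callers are equivalent at the source level; the point of the comment above is that the offset-based loads no longer carry the component access paths, so the preceding stores to D.2220[_14] are not matched up with them.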