Currently we need early inlining to vectorize the testcase, even though late inlining does all the necessary work. This looks like a pass ordering problem.
Hmm.

> make check-gcc RUNTESTFLAGS="--target_board=unix/-fno-early-inlining vect.exp=pr66142.c"
...
		=== gcc Summary ===

# of expected passes		4

so it works? (OK, this is on the GCC 9 branch.)
It has not worked since the addition of IPA SRA in r10-3311-gff6686d2e5f797d6.
IPA SRA splits the struct B parameter:

  Evaluating analysis results for bar/0
    Will split parameter 0
      - component at byte offset 0, size 4
      - component at byte offset 4, size 4
      - component at byte offset 8, size 4
      - component at byte offset 12, size 4

which is passed as

  D.2220[_14].t.x = 1.0e+0;
  D.2220[_14].t.y = u_11;
  D.2220[_14].w.x = x_16;
  D.2220[_14].w.y = y_18;
  _23 = &D.2220[_14];
  _24 = bar (_23);

producing

  bar.isra (const float ISRA.5, const float ISRA.6, const float ISRA.7, const float ISRA.8)

The inlining of this then ends up as

  _23 = &D.2220[_14];
  _27 = MEM[(float *)_23];
  _28 = MEM[(float *)_23 + 4B];
  _29 = MEM[(float *)_23 + 8B];
  _30 = MEM[(float *)_23 + 12B];

Formerly forwprop combined the &D.2220[_14] into the four dereferences, giving nice access paths, which it cannot do from the above:

  D.2220[_20].t.x = 1.0e+0;
  D.2220[_20].t.y = u_13;
  D.2220[_20].w.x = x_22;
  D.2220[_20].w.y = y_24;
  _30 = MEM <struct B[32]> [(const struct B *)&D.2220][_20].t.x;
  _34 = MEM <struct B[32]> [(const struct B *)&D.2220][_20].t.y;
  _35 = MEM <struct B[32]> [(const struct B *)&D.2220][_20].w.x;
  _36 = _30 * _35;
  _37 = MEM <struct B[32]> [(const struct B *)&D.2220][_20].w.y;

so instead we get

  D.2220[_14].t.x = 1.0e+0;
  D.2220[_14].t.y = u_11;
  D.2220[_14].w.x = x_16;
  D.2220[_14].w.y = y_18;
  _23 = &D.2220[_14];
  _27 = MEM[(float *)_23];
  _28 = MEM[(float *)_23 + 4B];
  _29 = MEM[(float *)_23 + 8B];
  _30 = MEM[(float *)_23 + 12B];

where we are not able to elide the loads (which is of course kind of lame). IIRC we have a duplicate report for this somewhere, assigned to me (for the VN side). VN can already do some tricks, but it doesn't handle offsetted MEM_REFs there (vn_reference_maybe_forwprop_address), and somehow it doesn't work for the non-offsetted one either.
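To make the two IL shapes concrete, here is a minimal C sketch of the same load written both ways. The function names load_by_path and load_by_offset are hypothetical and only for illustration. The struct layout is inferred from the dumps above (two pairs of floats, 16 bytes in total), not copied from the actual testcase. Both functions read the same float; the difference is only in how much of the access path stays visible to the optimizers.

```c
#include <assert.h>
#include <stddef.h>

/* Layout inferred from the dumps: four floats at byte offsets 0, 4, 8, 12
   on typical targets.  This is an assumption, not the real testcase.  */
struct A { float x, y; };
struct B { struct A t; struct A w; };

/* Access-path form: array index and member names remain visible,
   like MEM <struct B[32]> [...][_20].t.y in the dump.  */
float load_by_path (const struct B *arr, int i)
{
  return arr[i].t.y;
}

/* Pointer-plus-offset form: take the element address first, then load a
   float at a byte offset from it, like MEM[(float *)_23 + 4B].  */
float load_by_offset (const struct B *arr, int i)
{
  const struct B *p = &arr[i];
  return *(const float *) ((const char *) p + offsetof (struct B, t.y));
}

/* Hypothetical self-check: both forms must return the stored value.  */
void check_forms_agree (void)
{
  struct B arr[32];
  arr[3].t.x = 1.0f;
  arr[3].t.y = 2.5f;
  arr[3].w.x = 3.0f;
  arr[3].w.y = 4.0f;
  assert (load_by_path (arr, 3) == 2.5f);
  assert (load_by_offset (arr, 3) == 2.5f);
}
```

Calling check_forms_agree () from main should pass silently. The sketch uses offsetof rather than the literal 4 so it does not depend on sizeof (float).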
GIMPLE testcase:

struct A { float x, y; };
struct B { struct A t; };

float __GIMPLE (ssa,startwith("fre")) foo (float a, int i)
{
  struct B D_2220[32];
  float *_23;
  float _27;
  float _28;
  float _31;

  __BB(2):
  D_2220[i_14(D)].t.x = 1.0e+0f;
  D_2220[i_14(D)].t.y = a_11(D);
  _23 = &D_2220[i_14(D)];
  _27 = __MEM <const float> ((float *)_23);
  _28 = __MEM <const float> ((float *)_23 + _Literal (float *) 4);
  _31 = _27 + _28;
  return _31;
}

Note that the issue isn't only ref matching but also alias disambiguation of the second store against the first load. For the first load we don't have an access path (VN has one, but its representation is not the same as the one the alias oracle uses...). Maybe we need to embrace a canonical decomposed form in ao_ref.
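For readers less used to the GIMPLE front end, the testcase corresponds roughly to the plain C below. This is a hand-written approximation, not compiler output, and it assumes 0 <= i < 32. If FRE could look through the address computation, the two loads would be replaced by the stored values and the function would reduce to returning 1.0f + a.

```c
#include <stddef.h>

struct A { float x, y; };
struct B { struct A t; };

/* Approximate C equivalent of the GIMPLE testcase.  The caller must pass
   an index in the range 0..31.  */
float foo (float a, int i)
{
  struct B d[32];
  d[i].t.x = 1.0f;
  d[i].t.y = a;

  /* Address taken once, then two loads relative to it, mirroring
     _23, _27 and _28 in the GIMPLE above.  */
  const struct B *p = &d[i];
  float v0 = *(const float *) p;
  float v1 = *(const float *) ((const char *) p + offsetof (struct B, t.y));

  return v0 + v1;
}
```

For example, foo (2.0f, 3) evaluates to 3.0f, the same value an ideally optimized version would compute without touching memory.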
The VN issue is old and has dups. So is this issue about IPA SRA making a mess of the IL and/or not playing well with IPA inline, given that it worked with the old IPA SRA? I wonder why we use the IPA SRA'd clone for inlining rather than the original function.
GCC 10.1 has been released.
GCC 10.2 is released, adjusting target milestone.
GCC 10.3 is being released, retargeting bugs to GCC 10.4.
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
GCC 10 branch is being closed.