This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/68956] [6 regression] Vectorizer miscompilation of 416.gamess


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68956

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
                 CC|                            |rguenth at gcc dot gnu.org
           Assignee|rguenth at gcc dot gnu.org         |unassigned at gcc dot gnu.org

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
t.f:13:0: note: loop vectorized

is the offending vectorization.  We if-convert with masked-loads:

  <bb 7>:
  # i_1 = PHI <1(6), i_37(8)>
  # ij_3 = PHI <ij_2(6), ij_25(8)>
  ij_25 = ij_3 + 1;
  ic_26 = i_1 <= _39;
  _27 = jc_24 & ic_26;
  _54 = &*in1_28(D)[ij_3];
  _ifc__55 = _27;
  _29 = MASK_LOAD (_54, 64B, _ifc__55);
  _56 = &*in2_30(D)[ij_3];
  _31 = MASK_LOAD (_56, 64B, _ifc__55);
  _32 = _29 + _31;
  sum_33 = (real(kind=4)) _32;
  _43 = (real(kind=8)) sum_33;
  prephitmp_41 = _27 ? _43 : 0.0;
  *out_35(D)[ij_3] = prephitmp_41;
  i_37 = i_1 + 1;
  if (i_1 == j_5)
    goto <bb 14>;

but to me there is nothing obviously wrong with .optimized:

  <bb 7>:
  # vect_vec_iv_.20_99 = PHI <{ 1, 2, 3, 4, 5, 6, 7, 8 }(6), vect_vec_iv_.20_100(7)>
  # ivtmp.51_57 = PHI <0(6), ivtmp.51_15(7)>
  # ivtmp.52_18 = PHI <ivtmp.52_13(6), ivtmp.52_42(7)>
  # ivtmp.55_45 = PHI <ivtmp.55_50(6), ivtmp.55_16(7)>
  # ivtmp.57_51 = PHI <ivtmp.57_34(6), ivtmp.57_52(7)>
  vectp.30_122 = (vector(4) real(kind=8) *) ivtmp.55_45;
  vectp.26_112 = (vector(4) real(kind=8) *) ivtmp.52_18;
  vect_vec_iv_.20_100 = vect_vec_iv_.20_99 + { 8, 8, 8, 8, 8, 8, 8, 8 };
  mask_ic_26.21_102 = vect_vec_iv_.20_99 <= vect_cst__101;
  mask__27.22_105 = mask_ic_26.21_102 & vect_cst__104;
  mask_patt_58.24_107 = [vec_unpack_lo_expr] mask__27.22_105;
  mask_patt_58.24_108 = [vec_unpack_hi_expr] mask__27.22_105;
  vect_patt_59.25_114 = MASK_LOAD (vectp.26_112, 8B, mask_patt_58.24_107);
  _47 = ivtmp.52_18 + 32;
  _46 = (vector(4) real(kind=8) *) _47;
  vect_patt_59.25_116 = MASK_LOAD (_46, 8B, mask_patt_58.24_108);
  vect_patt_61.29_124 = MASK_LOAD (vectp.30_122, 8B, mask_patt_58.24_107);
  _48 = ivtmp.55_45 + 32;
  _49 = (vector(4) real(kind=8) *) _48;
  vect_patt_61.29_126 = MASK_LOAD (_49, 8B, mask_patt_58.24_108);
  vect__32.32_127 = vect_patt_59.25_114 + vect_patt_61.29_124;
  vect__32.32_128 = vect_patt_59.25_116 + vect_patt_61.29_126;
  vect_sum_33.33_129 = VEC_PACK_TRUNC_EXPR <vect__32.32_127, vect__32.32_128>;
  vect__43.34_130 = [vec_unpack_lo_expr] vect_sum_33.33_129;
  vect__43.34_131 = [vec_unpack_hi_expr] vect_sum_33.33_129;
  vect_patt_63.36_135 = VEC_COND_EXPR <mask_patt_58.24_107, vect__43.34_130, { 0.0, 0.0, 0.0, 0.0 }>;
  vect_patt_63.36_136 = VEC_COND_EXPR <mask_patt_58.24_108, vect__43.34_131, { 0.0, 0.0, 0.0, 0.0 }>;
  _62 = (void *) ivtmp.57_51;
  MEM[base: _62, offset: 0B] = vect_patt_63.36_135;
  MEM[base: _62, offset: 32B] = vect_patt_63.36_136;
  ivtmp.51_15 = ivtmp.51_57 + 1;
  ivtmp.52_42 = ivtmp.52_18 + 64;
  ivtmp.55_16 = ivtmp.55_45 + 64;
  ivtmp.57_52 = ivtmp.57_51 + 64;
  if (ivtmp.51_15 >= bnd.16_65)
    goto <bb 11>;

so I suspect a backend / RTL optimization issue.
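
For anyone wanting to double-check that reading of the two dumps, here is a rough
C model of both loop bodies, compared lane by lane. It is only a sketch under
several assumptions and not taken from the testcase: the function names are made
up, ij is simplified to i - 1, masked-off load lanes are modelled as 0.0 (their
value is discarded by the select anyway), the scalar epilogue after the vector
loop is assumed since the dump does not show it, and the input data is invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { VF = 8, N = 32 };   /* VF: lanes per vector iteration; N: model array size */

/* Scalar model of the if-converted body for i = first .. j (hypothetical helper). */
static void scalar_model(const double *in1, const double *in2, double *out,
                         int first, int j, int limit, bool jc)
{
    for (int i = first; i <= j; ++i) {
        size_t ij = (size_t)(i - 1);           /* assumption: ij tracks i - 1 */
        bool mask = jc && (i <= limit);        /* _27 = jc_24 & ic_26 */
        double a = mask ? in1[ij] : 0.0;       /* MASK_LOAD: no access if mask is false */
        double b = mask ? in2[ij] : 0.0;
        float sum = (float)(a + b);            /* sum_33 = (real(kind=4)) _32 */
        out[ij] = mask ? (double)sum : 0.0;    /* select, then unconditional store */
    }
}

/* Lane-wise model of the vectorized body: 8 elements per iteration. */
static void vector_model(const double *in1, const double *in2, double *out,
                         int j, int limit, bool jc)
{
    int i = 1;
    for (; i + VF - 1 <= j; i += VF) {
        bool mask[VF];
        double a[VF], b[VF];
        float packed[VF];
        for (int l = 0; l < VF; ++l)           /* IV compare, then & with jc */
            mask[l] = jc && (i + l <= limit);
        for (int l = 0; l < VF; ++l) {         /* four MASK_LOADs (lo/hi halves) */
            size_t ij = (size_t)(i - 1 + l);
            a[l] = mask[l] ? in1[ij] : 0.0;
            b[l] = mask[l] ? in2[ij] : 0.0;
        }
        for (int l = 0; l < VF; ++l)           /* add, then VEC_PACK_TRUNC_EXPR */
            packed[l] = (float)(a[l] + b[l]);
        for (int l = 0; l < VF; ++l)           /* unpack, VEC_COND_EXPR, store */
            out[i - 1 + l] = mask[l] ? (double)packed[l] : 0.0;
    }
    scalar_model(in1, in2, out, i, j, limit, jc);  /* assumed scalar epilogue */
}

/* Runs both models on invented data; returns the number of differing elements. */
int models_disagree(int j, int limit, bool jc)
{
    double in1[N], in2[N], ref[N], vec[N];
    assert(j >= 1 && j <= N);
    for (int k = 0; k < N; ++k) {
        in1[k] = 1.0 + 0.1 * k;
        in2[k] = 1.0 / (k + 1);
        ref[k] = vec[k] = -1.0;                /* sentinel for untouched elements */
    }
    scalar_model(in1, in2, ref, 1, j, limit, jc);
    vector_model(in1, in2, vec, j, limit, jc);
    int bad = 0;
    for (int k = 0; k < N; ++k)
        bad += (ref[k] != vec[k]);
    return bad;
}
```

If models_disagree() returns 0 for the trip counts of interest, the GIMPLE-level
transformation at least matches this reading of the scalar loop, which would be
consistent with the problem being introduced later.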

Confirmed at least.  Bisection would be nice.
