This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug tree-optimization/69760] [4.9/5/6 Regression] Wrong 64-bit memory address caused by an unneeded overflowing 32-bit integer multiplication on x86_64 under -O2 and -O3 code optimization


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69760

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Patch in comment #6 bootstrapped and tested ok.

The following patch also fixes the issue:

Index: gcc/tree-chrec.c
===================================================================
--- gcc/tree-chrec.c    (revision 233634)
+++ gcc/tree-chrec.c    (working copy)
@@ -28,6 +28,7 @@ along with GCC; see the file COPYING3.
 #include "coretypes.h"
 #include "backend.h"
 #include "tree.h"
+#include "gimple.h"
 #include "gimple-expr.h"
 #include "tree-pretty-print.h"
 #include "fold-const.h"
@@ -1196,6 +1197,12 @@ convert_affine_scev (struct loop *loop,
        happen.  */
   must_check_src_overflow = TYPE_PRECISION (ct) < TYPE_PRECISION (type);

+  /* If at_stmt is not always executed we cannot use overflow semantics
+     as that might end up generating a CHREC that overflows.  */
+  if (! at_stmt
+      || ! dominated_by_p (CDI_DOMINATORS, loop->latch, gimple_bb (at_stmt)))
+    use_overflow_semantics = false;
+
   enforce_overflow_semantics = (use_overflow_semantics
                                && nowrap_type_p (type));
   if (enforce_overflow_semantics)

note that we still generate the questionable multiplication on the loop latch
so VRP might mess up things there:

  <bb 4>:
  _21 = -m_6(D);
  ivtmp.6_22 = (unsigned int) _21;
  _27 = (unsigned int) L_12(D);
  _30 = -m_6(D);
  _31 = L_12(D) * _30;
  ivtmp.7_29 = (unsigned int) _31;
  _34 = (unsigned int) N_5(D);
  _35 = (unsigned int) m_6(D);
  _36 = _34 - _35;

  <bb 5>:
  # ivtmp.6_26 = PHI <ivtmp.6_22(4), ivtmp.6_25(9)>
  # ivtmp.7_3 = PHI <ivtmp.7_29(4), ivtmp.7_1(9)>
  k_32 = (int) ivtmp.6_26;
  _8 = k_32 >= 0;
  k_33 = (int) ivtmp.6_26;
  _10 = k_33 < n_9(D);
  _11 = _8 & _10;
  if (_11 != 0)
    goto <bb 6>;
  else
    goto <bb 7>;

  <bb 6>:
  _13 = (int) ivtmp.7_3;
  _14 = (long unsigned int) _13;
  _15 = _14 * 8;
  _17 = a_16(D) + _15;
  *_17 = 0.0;

  <bb 7>:
  ivtmp.6_25 = ivtmp.6_26 + 1;
  ivtmp.7_1 = ivtmp.7_3 + _27;
  if (ivtmp.6_25 != _36)

but the IV at least works ok and we'd only need to make sure to rewrite
the initial value compute into unsigned arithmetic.  The CHREC is
still questionable, it's still {-m_6(D) * L_12(D), +, L_12(D)}_1
which means it overflows during the loop evolution.  But at least that
overflowing is preserved with the widening conversion:
(long unsigned int) {-m_6(D) * L_12(D), +, L_12(D)}_1

Note that instantiate_scev doesn't pass a stmt context and thus we have
to assume the worst here (otherwise it re-does the invalid transform
for this testcase).  Eventually we'd simply need to do the unsigned
re-write during the initial SCEV analysis to avoid the need to mess
with use_overflow_semantics.  Thus when analyzing the multiply on a
not always executed stmt create (int)((unsigned)(-m) * (unsigned)L).

The above patch causes at least

FAIL: gcc.dg/tree-ssa/pr21959.c scan-tree-dump-times vrp1 "Folding predicate "
0
FAIL: gcc.dg/tree-ssa/scev-6.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/scev-6.c execution test
FAIL: gcc.dg/tree-ssa/scev-6.c scan-tree-dump-not optimized "abort"
FAIL: gcc.dg/tree-ssa/scev-cast.c (test for excess errors)

and a load of vect.exp FAILs (also execute fails ...)

I'm going to investigate the effects of re-writing to unsigned during SCEV
analysis now.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]