This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/69760] [4.9/5/6 Regression] Wrong 64-bit memory address caused by an unneeded overflowing 32-bit integer multiplication on x86_64 under -O2 and -O3 code optimization
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 24 Feb 2016 10:33:51 +0000
- Subject: [Bug tree-optimization/69760] [4.9/5/6 Regression] Wrong 64-bit memory address caused by an unneeded overflowing 32-bit integer multiplication on x86_64 under -O2 and -O3 code optimization
- Auto-submitted: auto-generated
- References: <bug-69760-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69760
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Patch in comment #6 bootstrapped and tested ok.
The following patch also fixes the issue:
Index: gcc/tree-chrec.c
===================================================================
--- gcc/tree-chrec.c (revision 233634)
+++ gcc/tree-chrec.c (working copy)
@@ -28,6 +28,7 @@ along with GCC; see the file COPYING3.
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
+#include "gimple.h"
#include "gimple-expr.h"
#include "tree-pretty-print.h"
#include "fold-const.h"
@@ -1196,6 +1197,12 @@ convert_affine_scev (struct loop *loop,
happen. */
must_check_src_overflow = TYPE_PRECISION (ct) < TYPE_PRECISION (type);
+ /* If at_stmt is not always executed we cannot use overflow semantics
+ as that might end up generating a CHREC that overflows. */
+ if (! at_stmt
+ || ! dominated_by_p (CDI_DOMINATORS, loop->latch, gimple_bb (at_stmt)))
+ use_overflow_semantics = false;
+
enforce_overflow_semantics = (use_overflow_semantics
&& nowrap_type_p (type));
if (enforce_overflow_semantics)
note that we still generate the questionable multiplication on the loop latch
so VRP might mess up things there:
<bb 4>:
_21 = -m_6(D);
ivtmp.6_22 = (unsigned int) _21;
_27 = (unsigned int) L_12(D);
_30 = -m_6(D);
_31 = L_12(D) * _30;
ivtmp.7_29 = (unsigned int) _31;
_34 = (unsigned int) N_5(D);
_35 = (unsigned int) m_6(D);
_36 = _34 - _35;
<bb 5>:
# ivtmp.6_26 = PHI <ivtmp.6_22(4), ivtmp.6_25(9)>
# ivtmp.7_3 = PHI <ivtmp.7_29(4), ivtmp.7_1(9)>
k_32 = (int) ivtmp.6_26;
_8 = k_32 >= 0;
k_33 = (int) ivtmp.6_26;
_10 = k_33 < n_9(D);
_11 = _8 & _10;
if (_11 != 0)
goto <bb 6>;
else
goto <bb 7>;
<bb 6>:
_13 = (int) ivtmp.7_3;
_14 = (long unsigned int) _13;
_15 = _14 * 8;
_17 = a_16(D) + _15;
*_17 = 0.0;
<bb 7>:
ivtmp.6_25 = ivtmp.6_26 + 1;
ivtmp.7_1 = ivtmp.7_3 + _27;
if (ivtmp.6_25 != _36)
but the IV at least works ok and we'd only need to make sure to rewrite
the initial value compute into unsigned arithmetic. The CHREC is
still questionable, it's still {-m_6(D) * L_12(D), +, L_12(D)}_1
which means it overflows during the loop evolution. But at least that
overflowing is preserved with the widening conversion:
(long unsigned int) {-m_6(D) * L_12(D), +, L_12(D)}_1
Note that instantiate_scev doesn't pass a stmt context and thus we have
to assume the worst here (otherwise it re-does the invalid transform
for this testcase). Eventually we'd simply need to do the unsigned
re-write during the initial SCEV analysis to avoid the need to mess
with use_overflow_semantics. Thus when analyzing the multiply on a
not always executed stmt create (int)((unsigned)(-m) * (unsigned)L).
The above patch causes at least
FAIL: gcc.dg/tree-ssa/pr21959.c scan-tree-dump-times vrp1 "Folding predicate "
0
FAIL: gcc.dg/tree-ssa/scev-6.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/scev-6.c execution test
FAIL: gcc.dg/tree-ssa/scev-6.c scan-tree-dump-not optimized "abort"
FAIL: gcc.dg/tree-ssa/scev-cast.c (test for excess errors)
and a load of vect.exp FAILs (also execute fails ...)
I'm going to investigate the effects of re-writing to unsigned during SCEV
analysis now.