gcc 7.1 on x86 produces incorrect output for correct code. gcc 6.3 produced correct assembly. See https://godbolt.org/g/mLyxGz or below for the code and generated assembly. code: .---- // -O1 -fno-strict-overflow void bar(); void foo(char *dst) { char *const end = dst; do { bar(); dst += 2; } while (dst < end); } `---- assembly: .---- foo(char*): push rbx movabs rbx, -9223372036854775808 .L2: call bar() sub rbx, 1 jne .L2 pop rbx ret `----
-fno-strict-overflow Does not effect Pointers. Pointers overflow is separate from integer signed overflow and it is always undefined if the pointer overflow. That is once a pointer is non-null, adding to it can never reach be null.
void bar(); void foo(char *dst) { char *const end = dst; do { bar(); dst += 2; } while (dst < end); } Hmm actually I am backwards here. end + 2 < end is always false.
(it helps to describe in what way the code is wrong, for instance in a way that makes the testcase abort) Apparently ivcanon thinks the number of iterations is constant, as it would be for the condition dst != end.
Good point Marc. The generated assembly is wrong because the loop should be run once, but it is run 2^63 times. Changing the amount dst is incremented by to 4 makes it run 2^62 times, and changing it to 8 makes it run 2^61 times.
Comment #1 says that "-fno-strict-overflow Does not effect Pointers", but the manual says for -fstrict-overflow: "This option also allows the compiler to assume strict pointer semantics: given a pointer to an object, if adding an offset to that pointer does not produce a pointer to the same object, the addition is undefined. This permits the compiler to conclude that p + u > p is always true for a pointer p and unsigned integer u. This assumption is only valid because pointer wraparound is undefined, as the expression is false if p + u overflows using twos complement arithmetic." At least I read this as -fno-strict-overflow permit pointer overflow too. Is that incorrect? If so, then I think the manual should be corrected/clarified...
Smells like a niter analysis issue to me. With -fno-tree-ivcanon I get unsigned long ivtmp.3; <bb 3> [100.00%]: # ivtmp.3_7 = PHI <0(2), ivtmp.3_8(3)> bar (); ivtmp.3_8 = ivtmp.3_7 + 1; if (ivtmp.3_8 != 9223372036854775808) goto <bb 3>; [99.00%] else goto <bb 4>; [1.00%] GCC 6 computes: Loop 1 iterates at most -1 times. while GCC 7 does: Analyzing # of iterations of loop 1 exit condition [dst_3(D) + 2, + , 2](no_overflow) < dst_3(D) bounds on difference of bases: -2 ... -2 Applying pattern match.pd:1308, generic-match.c:13648 result: # of iterations 9223372036854775807, bounded by 9223372036854775807 so likely (T)P - (T)(P + A) -> -(T) A makes things go downhill here, producing pointer (unsigned) typed negated 2 and us interpreting that as unsigned. The same pattern is in GCC 6 so sth in niter analysis broke. -fno-strict-overflow doesn't seem to be necessary btw, -fwrapv does do the trick as well. Bin?
Thanks for reporting, I will investigate this.
My change @r238585 assumed that "pointer + 2 < pointer" must be folded before calling to number_of_iterations_lt_to_ne. This is not true when pointer_plus can overflow. I will look for either fix or revert. Sorry for the breakage.
Author: amker Date: Thu Jul 20 12:02:19 2017 New Revision: 250384 URL: https://gcc.gnu.org/viewcvs?rev=250384&root=gcc&view=rev Log: PR tree-optimization/81388 Revert r238585: 2016-07-21 Bin Cheng <bin.cheng@arm.com> * tree-ssa-loop-niter.c (number_of_iterations_lt_to_ne): Clean up by removing computation of may_be_zero. gcc/testsuite PR tree-optimization/81388 * gcc.dg/tree-ssa/pr81388-1.c: New test. * gcc.dg/tree-ssa/pr81388-2.c: New test. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/pr81388-1.c trunk/gcc/testsuite/gcc.dg/tree-ssa/pr81388-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-loop-niter.c
Author: amker Date: Tue Jul 25 08:56:26 2017 New Revision: 250497 URL: https://gcc.gnu.org/viewcvs?rev=250497&root=gcc&view=rev Log: Backport from 2017-07-20 trunk r250384. PR tree-optimization/81388 Revert r238585: 2016-07-21 Bin Cheng <bin.cheng@arm.com> * tree-ssa-loop-niter.c (number_of_iterations_lt_to_ne): Clean up by removing computation of may_be_zero. gcc/testsuite PR tree-optimization/81388 * gcc.dg/tree-ssa/pr81388-1.c: New test. * gcc.dg/tree-ssa/pr81388-2.c: New test. Added: branches/gcc-7-branch/gcc/testsuite/gcc.dg/tree-ssa/pr81388-1.c branches/gcc-7-branch/gcc/testsuite/gcc.dg/tree-ssa/pr81388-2.c Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/testsuite/ChangeLog branches/gcc-7-branch/gcc/tree-ssa-loop-niter.c
Backport to gcc-7-branch with test cases slightly adjusted.
Fixed.