Bug 81388 - [7/8 Regression] Incorrect code generation with -O1
Summary: [7/8 Regression] Incorrect code generation with -O1
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 7.1.0
: P2 normal
Target Milestone: 7.2
Assignee: bin cheng
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2017-07-11 02:55 UTC by Nick Terrell
Modified: 2017-07-28 11:34 UTC (History)
0 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 7.1.0
Last reconfirmed: 2017-07-17 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Nick Terrell 2017-07-11 02:55:39 UTC
gcc 7.1 on x86 produces incorrect output for correct code. gcc 6.3 produced correct assembly. See https://godbolt.org/g/mLyxGz or below for the code and generated assembly.

code:
.----
// -O1 -fno-strict-overflow

void bar();
void foo(char *dst)
{
	char *const end = dst;
	do {
                bar();
                dst += 2;
        } while (dst < end);
}
`----

assembly:
.----
foo(char*):
        push    rbx
        movabs  rbx, -9223372036854775808
.L2:
        call    bar()
        sub     rbx, 1
        jne     .L2
        pop     rbx
        ret
`----
Comment 1 Andrew Pinski 2017-07-11 03:07:46 UTC
-fno-strict-overflow Does not effect Pointers.
Pointers overflow is separate from integer signed overflow and it is always undefined if the pointer overflow.  That is once a pointer is non-null, adding to it can never reach be null.
Comment 2 Andrew Pinski 2017-07-11 03:13:12 UTC
void bar();
void foo(char *dst)
{
	char *const end = dst;
	do {
                bar();
                dst += 2;
        } while (dst < end);
}

Hmm actually I am backwards here.  end + 2 < end is always false.
Comment 3 Marc Glisse 2017-07-11 06:30:56 UTC
(it helps to describe in what way the code is wrong, for instance in a way that makes the testcase abort)
Apparently ivcanon thinks the number of iterations is constant, as it would be for the condition dst != end.
Comment 4 Nick Terrell 2017-07-11 06:39:59 UTC
Good point Marc.

The generated assembly is wrong because the loop should be run once, but it is run 2^63  times. Changing the amount dst is incremented by to 4 makes it run 2^62 times, and changing it to 8 makes it run 2^61 times.
Comment 5 krister.walfridsson 2017-07-12 08:26:29 UTC
Comment #1 says that "-fno-strict-overflow Does not effect Pointers", but the manual says for -fstrict-overflow:

"This option also allows the compiler to assume strict pointer semantics: given a pointer to an object, if adding an offset to that pointer does not produce a pointer to the same object, the addition is undefined. This permits the compiler to conclude that p + u > p is always true for a pointer p and unsigned integer u.  This assumption is only valid because pointer wraparound is undefined, as the expression is false if p + u overflows using twos complement arithmetic."

At least I read this as -fno-strict-overflow permit pointer overflow too. Is that incorrect? If so, then I think the manual should be corrected/clarified...
Comment 6 Richard Biener 2017-07-17 11:11:26 UTC
Smells like a niter analysis issue to me.  With -fno-tree-ivcanon I get

  unsigned long ivtmp.3;

  <bb 3> [100.00%]:
  # ivtmp.3_7 = PHI <0(2), ivtmp.3_8(3)>
  bar ();
  ivtmp.3_8 = ivtmp.3_7 + 1;
  if (ivtmp.3_8 != 9223372036854775808)
    goto <bb 3>; [99.00%]
  else
    goto <bb 4>; [1.00%]

GCC 6 computes:

Loop 1 iterates at most -1 times.

while GCC 7 does:

Analyzing # of iterations of loop 1
  exit condition [dst_3(D) + 2, + , 2](no_overflow) < dst_3(D)
  bounds on difference of bases: -2 ... -2
Applying pattern match.pd:1308, generic-match.c:13648
  result:
    # of iterations 9223372036854775807, bounded by 9223372036854775807

so likely (T)P - (T)(P + A) -> -(T) A makes things go downhill here,
producing pointer (unsigned) typed negated 2 and us interpreting that
as unsigned.  The same pattern is in GCC 6 so sth in niter analysis
broke.  -fno-strict-overflow doesn't seem to be necessary btw, -fwrapv
does do the trick as well.

Bin?
Comment 7 bin cheng 2017-07-17 11:45:41 UTC
Thanks for reporting, I will investigate this.
Comment 8 bin cheng 2017-07-18 14:51:28 UTC
My change @r238585 assumed that "pointer + 2 < pointer" must be folded before calling to number_of_iterations_lt_to_ne.  This is not true when pointer_plus can overflow.  I will look for either fix or revert.  Sorry for the breakage.
Comment 9 bin cheng 2017-07-20 12:02:51 UTC
Author: amker
Date: Thu Jul 20 12:02:19 2017
New Revision: 250384

URL: https://gcc.gnu.org/viewcvs?rev=250384&root=gcc&view=rev
Log:

	PR tree-optimization/81388
	Revert r238585:
	2016-07-21  Bin Cheng  <bin.cheng@arm.com>

	* tree-ssa-loop-niter.c (number_of_iterations_lt_to_ne): Clean up
	by removing computation of may_be_zero.

	gcc/testsuite
	PR tree-optimization/81388
	* gcc.dg/tree-ssa/pr81388-1.c: New test.
	* gcc.dg/tree-ssa/pr81388-2.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr81388-1.c
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr81388-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-loop-niter.c
Comment 10 bin cheng 2017-07-25 08:56:58 UTC
Author: amker
Date: Tue Jul 25 08:56:26 2017
New Revision: 250497

URL: https://gcc.gnu.org/viewcvs?rev=250497&root=gcc&view=rev
Log:
	Backport from 2017-07-20 trunk r250384.

	PR tree-optimization/81388
	Revert r238585:
	2016-07-21  Bin Cheng  <bin.cheng@arm.com>

	* tree-ssa-loop-niter.c (number_of_iterations_lt_to_ne): Clean up
	by removing computation of may_be_zero.

	gcc/testsuite
	PR tree-optimization/81388
	* gcc.dg/tree-ssa/pr81388-1.c: New test.
	* gcc.dg/tree-ssa/pr81388-2.c: New test.

Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.dg/tree-ssa/pr81388-1.c
    branches/gcc-7-branch/gcc/testsuite/gcc.dg/tree-ssa/pr81388-2.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
    branches/gcc-7-branch/gcc/tree-ssa-loop-niter.c
Comment 11 bin cheng 2017-07-25 08:57:32 UTC
Backport to gcc-7-branch with test cases slightly adjusted.
Comment 12 Richard Biener 2017-07-28 11:34:36 UTC
Fixed.