User account creation filtered due to spam.

Bug 71691 - [6 Regression] wrong code at -O3 in both 32-bit and 64-bit modes on x86_64-linux-gnu (Floating point exception)
Summary: [6 Regression] wrong code at -O3 in both 32-bit and 64-bit modes on x86_64-li...
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 7.0
: P2 normal
Target Milestone: 6.5
Assignee: Aldy Hernandez
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2016-06-29 06:13 UTC by Chengnian Sun
Modified: 2017-07-04 11:07 UTC (History)
7 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2016-06-29 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Chengnian Sun 2016-06-29 06:13:18 UTC
$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-source-trunk/configure --enable-languages=c,c++,lto --prefix=/usr/local/gcc-trunk --disable-bootstrap
Thread model: posix
gcc version 7.0.0 20160628 (experimental) [trunk revision 237830] (GCC) 
$ 
$ gcc-trunk -O3 -m32 small.c ; ./a.out
Floating point exception (core dumped)
$ 
$ gcc-trunk -O3 -m64 small.c ; ./a.out
Floating point exception (core dumped)
$ 
$ gcc-trunk -O2 -m64 small.c ; ./a.out
$ 
$ cat small.c
char b;
short f;
unsigned e;
int g = 20;
void fn1() {
  int l = 0;
  for (; l <= 7; l++) {
    int h, j = 38;
    if (g)
      h = 0;
    for (; h <= 7; h++) {
      int i, k = b % (j % 4);
      g = f;
      for (;;) {
        j = 6 || b;
        if (e) {
          for (; j; --j)
            if (k)
              __builtin_printf("%d", 9);
          if (i)
            __builtin_printf("%d", j);
        }
        if (l)
          continue;
        break;
      }
      i = f || b; 
    }
  }
}

int main() { 
  fn1(); 
  return 0;
}
$
Comment 1 Martin Liška 2016-06-29 08:39:52 UTC
Confirmed that -m64 started to fail with GCC 6.1.0.
Comment 2 Jakub Jelinek 2016-06-29 12:25:55 UTC
Started with r232453.
Comment 3 Jakub Jelinek 2016-07-04 10:11:29 UTC
The testcase looks invalid to me.
In the second iteration of the outermost loop, l = 1, g = 0, so it compares uninitialized h with 7.
Comment 4 Jakub Jelinek 2016-07-04 10:22:14 UTC
That said,
char b;
short f;
unsigned e;
int g = 20;

void
foo ()
{
  int l, h;
  for (l = 0; l <= 7; l++)
    {
      int j = 38;
      if (g)
	h = 0;
      for (; h <= 7; h++)
	{
	  int i, k = b % (j % 4);
	  g = f;
	  for (;;)
	    {
	      j = 6 || b;
	      if (e)
		{
		  for (; j; --j)
		    if (k)
		      __builtin_printf ("%d", 9);
		  if (i)
		    __builtin_printf ("%d", j);
		}
	      if (l)
		continue;
	      break;
	    }
	  i = f || b;
	}
    }
}

int
main ()
{
  foo ();
  return 0;
}

(just moving the int h; declaration before the outermost loop) is IMHO already valid, and still SIGFPEs.
Comment 5 Jakub Jelinek 2016-07-04 11:51:37 UTC
I bet this is related to the uninitialized i.
We have in bb3:

  # RANGE [0, 1] NONZERO 1
  # i_57 = PHI <prephitmp_78(36), i_27(D)(2)>

and in bb5:

  # RANGE [0, 1] NONZERO 1
  # i_48 = PHI <i_57(4), prephitmp_78(34)>

and in bb6:

  if (i_48 != 0)

and in basic blocks dominated by the succ edges of bb6 uncprop replaces some 0s or 1s in phi with i_48.  The value ranges ignore the uninitialized values, with the rationale that if the SSA_NAME is used somewhere, then in valid program it should be initialized.  The problem here is that the if (i_48 != 0) comparison has been added artificially in the ch2 pass, hoisted before the condition that guarded its use.  But at the point where the artificial i_48 != 0 is present even valid program can reach the point with uninitialized i and thus with the value range assumption violated.  And then uncprop adds further uses of i_48, in places that are dominated by this artificial i_48 != 0, but aren't dominated by the original uses of i in the program.

Not sure what can be done here though.
Penalizing VRP by using always VARING for undefined values would probably slow down lots of code, not hoisting uses of partially undefined values is problematic, reverting the uncprop patch would be unfortunate.
Comment 6 Richard Biener 2016-07-07 11:18:17 UTC
So this is loop unswitching unswitching on a condition that is not always executed
(i_48 != 0), it is guarded by e.3_30 != 0 (which is not invariant).  It's not
trivial to see the use of i_48 can be uninitialized and thus we have to
error conservatively.
Comment 7 Jeffrey A. Law 2016-08-05 22:03:41 UTC
Damn interesting test.

I really wonder if what uncprop is doing is valid.

Essentially if i_48 has a potentially undefined value, but there are no paths where its used undefined, then we're OK.  Uncprop comes along and introduces a use of i_48 on a path where it didn't exist before, now causing us to use the uninitialized value.  And boom.

And note this is a potential issue for all the transformations made by uncprop.
Comment 8 Jeffrey A. Law 2016-08-05 23:15:40 UTC
Actually it's just the code that implies a range on the "other" side of an equality test against a boolean.  So it'd just be reverting the patch for 69270 that hits uncprop -- which just improves codesize & coalescing in a tiny way.  I don't offhand think the DOM changes related to 69270 would have to be removed.
Comment 9 Jeffrey A. Law 2016-08-09 17:29:43 UTC
Based on c#6 I started thinking about how to make tree-ssa-loop-unswitch.c appropriately conservative when a condition references a maybe-undefined SSA_NAME.

To recap, tree_may_unswitch_on has this test:

     /* Unswitching on undefined values would introduce undefined
         behavior that the original program might never exercise.  */
      if (ssa_undefined_value_p (use, true))
        return NULL_TREE;

The problem is ssa_undefined_value_p returns an optimistic result -- ie, it will returns true iff the definition statement is empty.  So it will return false for an SSA_NAME that is set from a PHI node where one or more arguments have undefined values.

ISTM this can be pretty easily fixed.

Create a bitmap and initialize it to all the SSA_NAMEs where ssa_undefined_value_p is true.

Examine each PHI, if the PHI has an argument on its RHS that has the bit set in the bitmap, then set the bit for the LHS of the PHI

Iterate on the PHIs until nothing has changed.

[ Yes, this isn't the most efficient.  Implementation would be different to improve efficiency. ]

The result is the conservative set of SSA_NAMEs that may be undefined.  This satisfies the need of unswitching and potentially other passes that really want the conservative set.  


Thoughts?
Comment 10 rguenther@suse.de 2016-08-10 07:57:56 UTC
On Tue, 9 Aug 2016, law at redhat dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71691
> 
> --- Comment #9 from Jeffrey A. Law <law at redhat dot com> ---
> Based on c#6 I started thinking about how to make tree-ssa-loop-unswitch.c
> appropriately conservative when a condition references a maybe-undefined
> SSA_NAME.
> 
> To recap, tree_may_unswitch_on has this test:
> 
>      /* Unswitching on undefined values would introduce undefined
>          behavior that the original program might never exercise.  */
>       if (ssa_undefined_value_p (use, true))
>         return NULL_TREE;
> 
> The problem is ssa_undefined_value_p returns an optimistic result -- ie, it
> will returns true iff the definition statement is empty.  So it will return
> false for an SSA_NAME that is set from a PHI node where one or more arguments
> have undefined values.

ssa_undefined_value_p returns a conservative result for the
must-be-undefined question which it is used for.  It doesn't 
conservatively answer maybe-undefined.

> ISTM this can be pretty easily fixed.
> 
> Create a bitmap and initialize it to all the SSA_NAMEs where
> ssa_undefined_value_p is true.
> 
> Examine each PHI, if the PHI has an argument on its RHS that has the bit set in
> the bitmap, then set the bit for the LHS of the PHI
> 
> Iterate on the PHIs until nothing has changed.

Simpler: in dominator order walk all PHIs and stmts, for PHIs mark
the def defined if all args are defined, for stmts mark all defs defined

no iteration necessary, no pre-seeding with ssa_undefined_value_p needed,
simply fall back to it when looking at PHI args (for parameter handling).

> [ Yes, this isn't the most efficient.  Implementation would be different to
> improve efficiency. ]
> 
> The result is the conservative set of SSA_NAMEs that may be undefined.  This
> satisfies the need of unswitching and potentially other passes that really want
> the conservative set.  

Yeah, though it computes undefinedness in the strict SSA sense.

OTOH what unswitching wants to know is whether the value is either known
to be defined or is known to be already used in the path leading from
function entry to the place we want to place the new condition.  That
can be tested with iterating over all immediate uses and a dominance
check which should improve things a bit.
Comment 11 Richard Biener 2016-08-22 08:32:24 UTC
GCC 6.2 is being released, adjusting target milestone.
Comment 12 Richard Biener 2016-08-22 08:33:27 UTC
GCC 6.2 is being released, adjusting target milestone.
Comment 13 Richard Biener 2016-08-22 08:55:55 UTC
GCC 6.2 is being released, adjusting target milestone.
Comment 14 Richard Biener 2016-08-22 08:57:17 UTC
GCC 6.2 is being released, adjusting target milestone.
Comment 15 Aldy Hernandez 2016-12-07 10:17:12 UTC
Mine for now.
Comment 16 Aldy Hernandez 2016-12-07 10:41:39 UTC
This is no longer reproducible in trunk because of the patch below.  Can we close  this PR, or is the problem just latent now?:

commit c2423c1d39514e05b710fe6a038cfe704c69860a
Author: rguenth <rguenth@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Mon Oct 24 11:22:42 2016 +0000

    2016-10-24  Richard Biener  <rguenther@suse.de>
    
        * tree-vrp.c (evrp_dom_walker::before_dom_children): Ignore
        backedges when identifying the single predecessor to take
        conditional info from.  Use SCEV to get at ranges for loop IVs.
        * lto-streamer-out.c (lto_write_mode_table): CSE inner mode to
        avoid false warning.
    
        * gcc.dg/tree-ssa/cunroll-13.c: Disable EVRP.
        * gcc.dg/tree-ssa/pr21458.c: Likewise.
        * gcc.dg/tree-ssa/pr21458-2.c: New testcase for EVRP.
    
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@241470 138bc75d-0d04-0410-961f-82ee72b054a4
Comment 17 Jeffrey A. Law 2016-12-07 16:38:27 UTC
It's just latent.  We still need to fix it appropriately.
Comment 18 Aldy Hernandez 2016-12-07 17:17:06 UTC
On 12/07/2016 11:38 AM, law at redhat dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71691
>
> --- Comment #17 from Jeffrey A. Law <law at redhat dot com> ---
> It's just latent.  We still need to fix it appropriately.
>

Ok.  I see that compiling with -O3 -fno-tree-vrp shows a segfault at 
execution.

I'll take a look.  Hopefully it's still the same bug.
Comment 19 Jakub Jelinek 2016-12-21 10:57:19 UTC
GCC 6.3 is being released, adjusting target milestone.
Comment 20 Aldy Hernandez 2017-01-26 12:10:53 UTC
A short recap for anyone wanting to tackle this:

Jeff had an initial patch here:
https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01359.html

Then I proposed an on-demand approach here:
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01481.html

...which continued onto various suggestions through here:
https://gcc.gnu.org/ml/gcc-patches/2017-01/msg00104.html
Comment 21 Aldy Hernandez 2017-01-31 10:31:19 UTC
Author: aldyh
Date: Tue Jan 31 10:30:47 2017
New Revision: 245057

URL: https://gcc.gnu.org/viewcvs?rev=245057&root=gcc&view=rev
Log:
	PR tree-optimization/71691
	* bitmap.h (class auto_bitmap): New.
	* tree-ssa-loop-unswitch.c (tree_may_unswitch_on): Call
	is_maybe_undefined instead of ssa_undefined_value_p.

Added:
    trunk/gcc/testsuite/gcc.dg/loop-unswitch-5.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/bitmap.h
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/loop-unswitch-1.c
    trunk/gcc/tree-ssa-loop-unswitch.c
Comment 22 Aldy Hernandez 2017-01-31 10:33:27 UTC
Fixed in trunk.  Removing GCC7 regression marker.
Comment 23 Richard Biener 2017-07-04 08:47:15 UTC
GCC 6.4 is being released, adjusting target milestone.