Bug 69400 - [5 Regression] wrong code with -O and int128
Summary: [5 Regression] wrong code with -O and int128
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 6.0
: P2 normal
Target Milestone: 5.4
Assignee: rsandifo@gcc.gnu.org
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2016-01-20 21:33 UTC by Zdenek Sojka
Modified: 2016-05-19 08:39 UTC (History)
5 users (show)

See Also:
Host: x86_64-pc-linux-gnu
Target: x86_64-*, aarch64-*, powerpc64-*, sparc64-*
Build:
Known to work: 4.6.4, 4.7.4, 4.8.5, 4.9.4
Known to fail: 5.3.1, 6.0
Last reconfirmed: 2016-01-21 00:00:00


Attachments
reduced testcase (162 bytes, text/plain)
2016-01-20 21:33 UTC, Zdenek Sojka
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Zdenek Sojka 2016-01-20 21:33:53 UTC
Created attachment 37412 [details]
reduced testcase

This PR shows the problem might be more common than just ccp (PR69399).

Output:
$ x86_64-pc-linux-gnu-gcc -O testcase.c
$ ./a.out 
Aborted

First broken dump is .ccp1:
...
Folding statement: return _3;
Folded into: return 0xfffffffffffffffffffffffffffffffe;
...
Thus:
$ x86_64-pc-linux-gnu-gcc -O testcase.c -fno-tree-ccp
$ ./a.out 
Aborted

First broken dump is .forwprop1:
...
gimple_simplified to u_2 = 0xfffffffffffffffffffffffffffffffe;
gimple_simplified to _3 = 0xfffffffffffffffffffffffffffffffe;
...
Thus:
$ x86_64-pc-linux-gnu-gcc -O testcase.c -fno-tree-ccp -fno-tree-forwprop
$ ./a.out 
Aborted

First broken dump is .fre1:
...
Replaced u_1 % 18446744073709551615 with 0xfffffffffffffffffffffffffffffffe in all uses of u_2 = u_1 % 18446744073709551615;
...
Thus:
$ x86_64-pc-linux-gnu-gcc -O testcase.c -fno-tree-ccp -fno-tree-forwprop -fno-tree-fre
$ ./a.out 
Aborted

I stopped there.


$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-232548-checking-yes-rtl-df-nographite/bin/../libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-checking=yes,rtl,df --without-cloog --without-ppl --without-isl --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-232548-checking-yes-rtl-df-nographite
Thread model: posix
gcc version 6.0.0 20160119 (experimental) (GCC) 



Tested revisions:
trunk r232548 - FAIL
5-branch r232545 - FAIL
4_[6789]-branch - OK
Comment 1 H.J. Lu 2016-01-21 01:36:22 UTC
It is very likely a wide-int bug.
Comment 2 H.J. Lu 2016-01-21 03:31:30 UTC
It is caused by r210113.
Comment 3 H.J. Lu 2016-01-21 05:49:25 UTC
Fix this may also fix PR 69399.
Comment 4 H.J. Lu 2016-01-21 06:08:30 UTC
Before wide-int:

Lattice value changed to CONSTANT 18446744073709551614.  Adding SSA edges to worklist.

After wide-int:

Lattice value changed to CONSTANT 0xfffffffffffffffffffffffffffffffe.  Adding SSA edges to worklist.

It is wrong to sign-extend (__int128) 0xfffffffffffffffe.
Comment 5 Jakub Jelinek 2016-01-21 08:32:01 UTC
Seems the bug is somewhere around wi::divmod_internal.
Comment 6 Richard Biener 2016-01-21 09:20:40 UTC
The dividend is val = {-1, 0}, len = 2  (bah, we need a debug_wide_int for
gdb use!)

We do

  n = divisor_blocks_needed;
  while (n > 1 && b_divisor[n - 1] == 0)
    n--;

stripping the leading zeros and feed that n to

  if (remainder)
    {
      wi_pack ((unsigned HOST_WIDE_INT *) remainder, b_remainder, n);
      *remainder_len = canonize (remainder, (n + 1) / 2, dividend_prec);
      /* The remainder is always the same sign as the dividend.  */
      if (dividend_neg)
        *remainder_len = wi::sub_large (remainder, zeros, 1, remainder,
                                        *remainder_len, dividend_prec,
                                        UNSIGNED, 0);

so in int_const_binop we end up with val = {-2 }, len = 1 (that's still correct
I think given wide-int rules).

Interestingly things go downhill in wide_int_to_tree where we compute
ext_len as 3 and go the build_new_int_cst path which does

  if (len < ext_len)
    {
      --ext_len;
      TREE_INT_CST_ELT (nt, ext_len)
        = zext_hwi (-1, cst.get_precision () % HOST_BITS_PER_WIDE_INT);
      for (unsigned int i = len; i < ext_len; ++i)
        TREE_INT_CST_ELT (nt, i) = -1;

thus puts { -2, -1, 0 } in the INTEGER_CST storage (instead of { -2, 0, 0 }).

I think it needs to put 0 there for TYPE_UNSIGNED?  But then not sure
why it computes ext_len to 3 instead of 2 for this case anyway...  it does

  if (TYPE_UNSIGNED (type) && wi::neg_p (cst))
    return cst.get_precision () / HOST_BITS_PER_WIDE_INT + 1;

but I'd have expected

    return cst.get_len () + 1;

well, the three kinds of lengths are somewhat odd, so I simply assume 3
is correct for now.

Index: gcc/tree.c
===================================================================
--- gcc/tree.c  (revision 232666)
+++ gcc/tree.c  (working copy)
@@ -1267,7 +1267,7 @@ build_new_int_cst (tree type, const wide
       TREE_INT_CST_ELT (nt, ext_len)
        = zext_hwi (-1, cst.get_precision () % HOST_BITS_PER_WIDE_INT);
       for (unsigned int i = len; i < ext_len; ++i)
-       TREE_INT_CST_ELT (nt, i) = -1;
+       TREE_INT_CST_ELT (nt, i) = TYPE_UNSIGNED (type) ? 0 : -1;
     }
   else if (TYPE_UNSIGNED (type)
           && cst.get_precision () < len * HOST_BITS_PER_WIDE_INT)

doesn't fix this unfortunately, so the bug must be elsewhere (eventually
-2 and len = 1 is not correct as result from divmod).
Comment 7 Jakub Jelinek 2016-01-21 09:44:10 UTC
(In reply to Richard Biener from comment #6)
> so in int_const_binop we end up with val = {-2 }, len = 1 (that's still

I think already this is wrong, because I think val = { -2 }, len = 1 for precision 128 is 0xfffffffffffffffffffffffffffffffeU128 rather than
0xfffffffffffffffeU128.  But I could be wrong.  It would be nice if all the rules were clearly documented.

At least, if I e.g. in the testcase swap -2 with 0xffffffffffffffffULL, so that int_const_binop_1 converts the (__uint128_t) -2 to wide_int when calling wi::mod_trunc, then arg2 is:
$10 = {<wide_int_storage> = {val = {-2, 0, 140737488343280}, len = 1, precision = 128}, static is_sign_extended = <optimized out>}
Thus, IMNSHO { -2 }, 1 is (__uint128_t) -2 rather than (__uint128_t) -2ULL.
Comment 8 rguenther@suse.de 2016-01-21 10:07:19 UTC
nOn Thu, 21 Jan 2016, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69400
> 
> --- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #6)
> > so in int_const_binop we end up with val = {-2 }, len = 1 (that's still
> 
> I think already this is wrong, because I think val = { -2 }, len = 1 for
> precision 128 is 0xfffffffffffffffffffffffffffffffeU128 rather than
> 0xfffffffffffffffeU128.  But I could be wrong.  It would be nice if all the
> rules were clearly documented.
> 
> At least, if I e.g. in the testcase swap -2 with 0xffffffffffffffffULL, so that
> int_const_binop_1 converts the (__uint128_t) -2 to wide_int when calling
> wi::mod_trunc, then arg2 is:
> $10 = {<wide_int_storage> = {val = {-2, 0, 140737488343280}, len = 1, precision
> = 128}, static is_sign_extended = <optimized out>}
> Thus, IMNSHO { -2 }, 1 is (__uint128_t) -2 rather than (__uint128_t) -2ULL.

Ok, so things go wrong then with building the wide-int result using
the "reduced" divisor_blocks_needed.  Thus

  if (remainder)
    {
      wi_pack ((unsigned HOST_WIDE_INT *) remainder, b_remainder, n);
      *remainder_len = canonize (remainder, (n + 1) / 2, dividend_prec);

should "extend" the remainder (if UNSINGED?) to the divisor len and
canonize should be fed with that instead?

Index: wide-int.cc
===================================================================
--- wide-int.cc (revision 232666)
+++ wide-int.cc (working copy)
@@ -1860,6 +1860,11 @@ wi::divmod_internal (HOST_WIDE_INT *quot
   if (remainder)
     {
       wi_pack ((unsigned HOST_WIDE_INT *) remainder, b_remainder, n);
+      if (sgn == UNSIGNED && n < divisor_blocks_needed)
+       {
+         remainder[(n + 1) / 2] = 0;
+         n++;
+       }
       *remainder_len = canonize (remainder, (n + 1) / 2, dividend_prec);
       /* The remainder is always the same sign as the dividend.  */
       if (dividend_neg)

fixes the particular testcase.  Similar adjustment needed for
the quotiend
Comment 9 rsandifo@gcc.gnu.org 2016-01-21 13:48:09 UTC
Testing a patch.
Comment 10 rsandifo@gcc.gnu.org 2016-01-26 09:54:05 UTC
Author: rsandifo
Date: Tue Jan 26 09:53:33 2016
New Revision: 232817

URL: https://gcc.gnu.org/viewcvs?rev=232817&root=gcc&view=rev
Log:
PR 69400: Invalid 128-bit modulus result

As described in the PR, wi::divmod_internal was sign- rather than
zero-extending a modulus result in cases where the result has fewer
HWIs than the precision and the upper bit of the upper HWI was set.

This patch tries to make things more robust by getting wi_pack
to handle the canonicalisation step itself.

Tested on x86_64-linux-gnu.  I added tests to the wide-int
plugin since that seemed more direct.

gcc/
	PR tree-optimization/69400
	* wide-int.cc (wi_pack): Take the precision as argument and
	perform canonicalization here rather than in the callers.
	Use the main loop to handle all full-width HWIs.  Add a
	zero HWI if in_len isn't a full result.
	(wi::divmod_internal): Update accordingly.
	(wi::mul_internal): Likewise.  Simplify.

gcc/testsuite/
	PR tree-optimization/69400
	* gcc.dg/plugin/wide-int_plugin.c (test_wide_int_mod_trunc): New
	function.
	(plugin_init): Call it.
	* gcc.dg/torture/pr69400.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr69400.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/plugin/wide-int_plugin.c
    trunk/gcc/wide-int.cc
Comment 11 Jakub Jelinek 2016-01-26 10:00:26 UTC
Fixed on the trunk so far.
Comment 12 rsandifo@gcc.gnu.org 2016-05-19 08:38:55 UTC
Author: rsandifo
Date: Thu May 19 08:38:23 2016
New Revision: 236444

URL: https://gcc.gnu.org/viewcvs?rev=236444&root=gcc&view=rev
Log:
PR 69400: Invalid 128-bit modulus result

Backport from mainline:

gcc/
2016-01-26  Richard Sandiford  <richard.sandiford@arm.com>

	PR tree-optimization/69400
	* wide-int.cc (wi_pack): Take the precision as argument and
	perform canonicalization here rather than in the callers.
	Use the main loop to handle all full-width HWIs.  Add a
	zero HWI if in_len isn't a full result.
	(wi::divmod_internal): Update accordingly.
	(wi::mul_internal): Likewise.  Simplify.

gcc/testsuite/
2016-01-26  Richard Sandiford  <richard.sandiford@arm.com>

	PR tree-optimization/69400
	* gcc.dg/plugin/wide-int_plugin.c (test_wide_int_mod_trunc): New
	function.
	(plugin_init): Call it.
	* gcc.dg/torture/pr69400.c: New test.

Added:
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr69400.c
Modified:
    branches/gcc-5-branch/gcc/ChangeLog
    branches/gcc-5-branch/gcc/testsuite/ChangeLog
    branches/gcc-5-branch/gcc/testsuite/gcc.dg/plugin/wide-int_plugin.c
    branches/gcc-5-branch/gcc/wide-int.cc
Comment 13 rsandifo@gcc.gnu.org 2016-05-19 08:39:58 UTC
Patch applied to mainline before gcc 6, and now to gcc 5.