Bug 88223 - [8 Regression] Wrong code for intrinsic memmove
Summary: [8 Regression] Wrong code for intrinsic memmove
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 8.2.0
: P2 normal
Target Milestone: 8.3
Assignee: Richard Biener
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2018-11-27 14:11 UTC by Robert Clausecker
Modified: 2019-02-07 08:22 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work: 8.2.1, 9.0
Known to fail: 8.2.0
Last reconfirmed: 2018-11-28 00:00:00


Attachments
preprocessed test case (1.10 KB, text/plain)
2018-11-27 14:11 UTC, Robert Clausecker
Details
Verbose compiler output (2.40 KB, text/plain)
2018-11-27 14:11 UTC, Robert Clausecker
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Robert Clausecker 2018-11-27 14:11:18 UTC
Created attachment 45103 [details]
preprocessed test case

A user on reddit has pointed out that gcc 8 generates wrong code
for intrinsic memmove() when compiling with optimisations:

    https://www.reddit.com/r/C_Programming/comments/a0tdux

The following minimal reproduction was provided:

    char s[] = "12345";
    memmove(s + 1, s, 4);    /* 11234 */
    memmove(s + 1, s, 4);    /* 11123 */
    memmove(s + 1, s, 4);    /* 11112 */
    puts(s);

Without optimisations, this prints "11112" as expected whereas
with optimisations, it prints "11123".

I was able to reproduce this problem on FreeBSD with gcc 8.2.0.
The relevant files are attached.
Comment 1 Robert Clausecker 2018-11-27 14:11:54 UTC
Created attachment 45104 [details]
Verbose compiler output
Comment 2 Alexander Monakov 2018-11-27 14:27:25 UTC
Nice find; wrong code due to FRE (in fre1).
Comment 3 Richard Biener 2018-11-28 08:47:29 UTC
Mine.
Comment 4 Richard Biener 2018-11-28 09:06:20 UTC
extern void *memmove(void *, const void *, __SIZE_TYPE__);
extern void abort(void);

extern int
main(void)
{
 char s[] = "12345";
 memmove(s + 1, s, 4);
 memmove(s + 1, s, 4);
 memmove(s + 1, s, 4);
 if (s[0] != '1' || s[1] != '1' || s[2] != '1' || s[3] != '1' || s[4] != '2')
   abort ();
 return (0);
}
Comment 5 Richard Biener 2018-11-28 09:20:37 UTC
OK, so the bug is in

      /* If we reach a clobbering statement try to skip it and see if
         we find a VN result with exactly the same value as the
         possible clobber.  In this case we can ignore the clobber
         and return the found value.
         Note that we don't need to worry about partial overlapping
         accesses as we then can use TBAA to disambiguate against the
         clobbering statement when looking up a load (thus the
         VN_WALKREWRITE guard).  */
      if (vn_walk_kind == VN_WALKREWRITE
          && is_gimple_reg_type (TREE_TYPE (lhs))
          && types_compatible_p (TREE_TYPE (lhs), vr->type))
        {
          tree *saved_last_vuse_ptr = last_vuse_ptr;
          /* Do not update last_vuse_ptr in vn_reference_lookup_2.  */
          last_vuse_ptr = NULL;
          tree saved_vuse = vr->vuse;
          hashval_t saved_hashcode = vr->hashcode;
          void *res = vn_reference_lookup_2 (ref,
                                             gimple_vuse (def_stmt), 0, vr);
          /* Need to restore vr->vuse and vr->hashcode.  */
          vr->vuse = saved_vuse;
          vr->hashcode = saved_hashcode;
          last_vuse_ptr = saved_last_vuse_ptr;
          if (res && res != (void *)-1)
            {
              vn_reference_t vnresult = (vn_reference_t) res;
              if (vnresult->result
                  && operand_equal_p (vnresult->result,
                                      gimple_assign_rhs1 (def_stmt), 0))
                return res;
            }
        }

where the "we don't need to worry about partial overlapping accesses" is
of course wrong in the case of ref-all accesses.
Comment 6 Richard Biener 2018-11-28 11:18:22 UTC
Hmm, and with unions we can break the TBAA rule.

union U {
  struct X { int a : 3; int b : 8; } a;
  struct Y { int a : 4; int b : 8; } b;
};

union U u;
int __attribute__((noinline)) bar ()
{
  int x = u.a.b;
  u.b.b = x;
  return u.a.b;
}

int main()
{
  u.a.b = 1;
  if (bar () != 3)
    abort ();
  return 0;
}

testcase is specific on bitfield layout of course and to the GCC extension
allowing type-punning via unions.
Comment 7 Richard Biener 2018-11-28 11:32:14 UTC
(In reply to Richard Biener from comment #6)
> Hmm, and with unions we can break the TBAA rule.
> 
> union U {
>   struct X { int a : 3; int b : 8; } a;
>   struct Y { int a : 4; int b : 8; } b;
> };
> 
> union U u;
> int __attribute__((noinline)) bar ()
> {
>   int x = u.a.b;
>   u.b.b = x;
>   return u.a.b;
> }
> 
> int main()
> {
>   u.a.b = 1;
>   if (bar () != 3)
>     abort ();
>   return 0;
> }
> 
> testcase is specific on bitfield layout of course and to the GCC extension
> allowing type-punning via unions.

Hmm, OTOH the C standard specifies that the store to u.b.b makes it the
u.b the active member and it makes the old contents undefined.  That
would mean when loading u.a.b after the store we cannot rely on the
exact value we get but at least the bits touched by u.b.b should be
up-to-date (which is enough to show breakage).  I thought about
using

  u.a.b = 1;
  union U x;
  x.a.b = 1;
  x.b.b = 1;
  if (bar () != x.a.b)
    abort ();

to make the test portable across bitfield layout but I guess I need to
somehow compute a mask to ignore the bit...

The C standard says the values are unspecified.  And it talks about
bytes, not bits...
Comment 8 Richard Biener 2018-11-28 12:02:34 UTC
The C standard is unclear here (C18, 6.2.6.1/7).

union { struct X { int i; int j; } x; struct Y { int i; int j; } y; } u;
u.x.i = 1;
u.y.j = 1;

after the second store does u.y.i have unspecified value?  Because the
union member is y and there are no bytes that are not also in the member x.

I'm probably missing the correct place to look at here.
Comment 9 Richard Biener 2018-11-28 13:52:14 UTC
Author: rguenth
Date: Wed Nov 28 13:51:42 2018
New Revision: 266560

URL: https://gcc.gnu.org/viewcvs?rev=266560&root=gcc&view=rev
Log:
2018-11-28  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/88223
	* tree-ssa-sccvn.c (vn_reference_lookup_3): When skipping
	over a stored-same value may-alias store make sure to consider
	partial overlaps which are valid when TBAA reasonings do not
	apply and byte-granular overlaps are possible at all.

	* gcc.dg/torture/pr88223.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr88223.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-sccvn.c
Comment 10 jsm-csl@polyomino.org.uk 2018-11-28 18:04:05 UTC
On Wed, 28 Nov 2018, rguenth at gcc dot gnu.org wrote:

> Hmm, OTOH the C standard specifies that the store to u.b.b makes it the
> u.b the active member and it makes the old contents undefined.  That
> would mean when loading u.a.b after the store we cannot rely on the
> exact value we get but at least the bits touched by u.b.b should be
> up-to-date (which is enough to show breakage).  I thought about
> using

In my view, storing to u.b.b should be equivalent to loading u.b (a 
type-punning operation if u.a was previously the active member), storing 
the new value of u.b.b in a copy of the loaded value of u.b, and storing 
that result back in u.b (making u.b the active member of u).  (So if bits 
set in u.a are not padding in u.b, and not part of the space occupied by 
u.b.b, they should be preserved.)
Comment 11 rguenther@suse.de 2018-11-29 09:20:52 UTC
On Wed, 28 Nov 2018, joseph at codesourcery dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88223
> 
> --- Comment #10 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
> On Wed, 28 Nov 2018, rguenth at gcc dot gnu.org wrote:
> 
> > Hmm, OTOH the C standard specifies that the store to u.b.b makes it the
> > u.b the active member and it makes the old contents undefined.  That
> > would mean when loading u.a.b after the store we cannot rely on the
> > exact value we get but at least the bits touched by u.b.b should be
> > up-to-date (which is enough to show breakage).  I thought about
> > using
> 
> In my view, storing to u.b.b should be equivalent to loading u.b (a 
> type-punning operation if u.a was previously the active member), storing 
> the new value of u.b.b in a copy of the loaded value of u.b, and storing 
> that result back in u.b (making u.b the active member of u).  (So if bits 
> set in u.a are not padding in u.b, and not part of the space occupied by 
> u.b.b, they should be preserved.)

OK, that's convenient enough to support for the compiler.  Unfortunately
that means my testcases become valid and we still miscompile them...

Now, for type-punning to be valid the union has to be visible in the
access, but this visibility can be lost in the middle-end given
the middle-end implements sth more conservative.

Still that a store is a punning access makes the following valid?

union u { struct { int i; int j; } x; struct { float f; float g; } y; };

union u A;
float foo (union u *B)
{
  u->x.i = 0;
  A.y.g = 1.0;
  return u->y.f;
}

is this valid punning of u->x.i to u->y.f (int->float) because the
actual type-punning is implicitely done via A.y.g = 1.0?  Of course the
constraint is that B points to A.

We certainly happily re-order the store and load via u.
Comment 12 jsm-csl@polyomino.org.uk 2018-11-29 18:08:37 UTC
This seems a reasonable enough example for punning to me, given that all 
accesses are via the union type.
Comment 13 Richard Biener 2019-02-07 08:22:17 UTC
Fixed.
Comment 14 Richard Biener 2019-02-07 08:22:33 UTC
Author: rguenth
Date: Thu Feb  7 08:22:01 2019
New Revision: 268609

URL: https://gcc.gnu.org/viewcvs?rev=268609&root=gcc&view=rev
Log:
2019-02-07  Richard Biener  <rguenther@suse.de>

	Backport from mainline
	2018-11-20  Richard Biener  <rguenther@suse.de>
 
	PR tree-optimization/88105
	* tree-ssa-dom.c (pass_dominator::execute): Do not walk
	backedges.

	* gcc.dg/gomp/pr88105.c: New testcase.

	2018-11-28  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/88223
	* tree-ssa-sccvn.c (vn_reference_lookup_3): When skipping
	over a stored-same value may-alias store make sure to consider
	partial overlaps which are valid when TBAA reasonings do not
	apply and byte-granular overlaps are possible at all.

	* gcc.dg/torture/pr88223.c: New testcase.

Added:
    branches/gcc-8-branch/gcc/testsuite/gcc.dg/gomp/pr88105.c
    branches/gcc-8-branch/gcc/testsuite/gcc.dg/torture/pr88223.c
Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
    branches/gcc-8-branch/gcc/tree-ssa-dom.c
    branches/gcc-8-branch/gcc/tree-ssa-sccvn.c