Bug 32667

Summary: block copy with exact overlap is expanded as memcpy
Product: gcc Reporter: merkert
Component: middle-endAssignee: Not yet assigned to anyone <unassigned>
Status: NEW ---    
Severity: normal CC: fang, fweimer, gcc-bugs, g_sauthoff, jakub, lopresti, lu_zero, mans, msebor, rguenth
Priority: P3 Keywords: wrong-code
Version: 4.2.0   
Target Milestone: ---   
Host: i686-pc-linux-gnu Target:
Build: Known to work:
Known to fail: 4.7.0 Last reconfirmed: 2007-07-08 19:46:19

Description merkert 2007-07-07 17:55:22 UTC
This code generates a warning when run with valgrind:
#include <vector>

using namespace ::std;

struct X {
  double values[10];
};


int main()
{
  vector<X> x;

  x.push_back(X());
  for (vector<X>::iterator i=x.begin();i!=x.end();++i) {
    *i = *(x.end()-1);
  }
  return 0;
}

g++ test.cpp -o foo -O3 

Valgrind error:
valgrind --tool=memcheck foo
==24513== Memcheck, a memory error detector for x86-linux.
==24513== Copyright (C) 2002-2004, and GNU GPL'd, by Julian Seward et al.
==24513== Using valgrind-2.2.0, a program supervision framework for x86-linux.
==24513== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et al.
==24513== For more details, rerun with: -v
==24513==
==24513== Source and destination overlap in memcpy(0x1BB68028, 0x1BB68028, 80)
==24513==    at 0x1B9057E5: memcpy (in /usr/lib/valgrind/vgpreload_memcheck.so)
==24513==    by 0x8048696: main (in /home/ray/tmp/foo)
==24513==
==24513== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 17 from 1)
==24513== malloc/free: in use at exit: 0 bytes in 0 blocks.
==24513== malloc/free: 1 allocs, 1 frees, 80 bytes allocated.
==24513== For a detailed leak analysis,  rerun with: --leak-check=yes
==24513== For counts of detected errors, rerun with: -v

On my machine "man memcpy" says "...The memory areas may not overlap. Use memmove(3) if the memory areas do overlap. ..."
Comment 1 Paolo Carlini 2007-07-07 18:19:08 UTC
Interesting: mainline is not affected by the problem. I would guess thanks to fixing libstdc++/29286 ??? 
Comment 2 merkert 2007-07-07 22:36:14 UTC
This may be an old bug and may have crept in between 3.3.3 and 3.4.0 (latter has it, former doesn't)
Comment 3 Richard Biener 2007-07-08 19:46:19 UTC
No, this doesn't have anything to do with aliasing.  4.3 simply inlines the memcpys.  This is probably a middle-end issue with expanding structure assignments.  2.95 and 3.3 also expanded the calls inline.  3.4 does not,
likewise 4.0, 4.1 and 4.2.

Note that this is likely not a problem in practice as memcpy (p, p, sizeof (*p))
is difficult to implement in a way that would make it not work.  So, downgrading
severity.
Comment 4 Paolo Carlini 2007-07-08 20:53:20 UTC
Thanks for the clarification, Richard.
Comment 5 Paul Pluzhnikov 2009-05-06 16:36:24 UTC
(In reply to comment #3)

> Note that this is likely not a problem in practice as 
>  memcpy (p, p, sizeof (*p)) is difficult to implement
> in a way that would make it not work.

From Julian Seward:

JS> AIUI, POSIX says the src==dst case is not allowed (along with all other
JS> overlapping cases) because (eg) on PowerPC, it is possible to make a high
JS> performance memcpy that preallocates the destination area in D1 using
JS> dcbz instructions, which create the line in D1 and fill it full of
JS> zeroes.  This avoids dragging the destination line up the memory
JS> hierarchy only to completely overwrite it with stuff from the source.
JS>
JS> Result is however that if the src and dst overlap, in any way, including
JS> completely, then this causes zeroes to be written into the src area (!)
JS> which is certainly not what you want.

This bug is likely fixed by:
http://gcc.gnu.org/ml/gcc-patches/2009-04/msg00932.html
Comment 6 lu_zero 2011-12-05 16:59:07 UTC
Doesn't seem.

Here a reduced testcase courtesy of Mans Rullgard

struct foo {
    int x[64];
};

void __attribute__((noinline)) foo(struct foo *a, struct foo *b)
{
    *a = *b;
}

int main(void)
{
    struct foo a = { 0 };
    foo(&a, &a);
    return 0;
}
Comment 7 Mans Rullgard 2011-12-05 20:37:18 UTC
Note that the test case in comment #6 does not trigger the problem on x86 where the copying is inlined. It does trigger on ARM and other targets where this assignment results in a memcpy() call. All current stable releases show the bug. I did not test with trunk.
Comment 8 Richard Biener 2011-12-06 08:44:46 UTC
This is a generic middle-end issue (at best).  There is at least one duplicate
of this bug (can't find it with a quick search).
Comment 9 Andrew Pinski 2015-02-12 05:38:22 UTC
*** Bug 65029 has been marked as a duplicate of this bug. ***
Comment 10 Richard Biener 2015-02-12 09:00:08 UTC
One way to "fix" this is to emit the memcpy as

  if (p != q)
    memcpy (p, q, ...);

but of course that comes at a cost in code-size and runtime for no obvious good
reason (in practice).
Comment 11 Jakub Jelinek 2015-02-12 09:42:40 UTC
And another way to fix this is add a new library entrypoint, that would have the same semantics as memcpy, but would allow full overlap (dst == src).
When we e.g. expand this without a library call, we could do exactly what we do for memcpy, because we know we do handle the dst == src case fine.  Similarly, e.g. in glibc it could very well be just another alias to memcpy, because we know it handles that too.  On targets which would not have the library function we'd use the #c10 approach.  Of course, this would require coordination in between glibc, gcc, valgrind, libsanitizer, memstomp etc.