In this example, a file compiled with gcc-4.10 -O3 inlines the memcpy but not the memmove:

#include <string.h>

void a(int *a, int *b) {
    memcpy(a, b, sizeof(*a));
}

void b(int *a, int *b) {
    memmove(a, b, sizeof(*a));
}

At least on x86, the integer can be held entirely in a register and stored back to arbitrarily aligned memory, so it should not matter whether the two addresses overlap, and gcc should be able to replace the call with a much faster inline copy, as it does for memcpy.
Currently gcc only inlines the memmove for 1-byte types.

I am using a recent svn copy of gcc:

$ gcc-4.10 --version
gcc (GCC) 4.10.0 20140605 (experimental)
I am using glibc 2.19-0ubuntu6 from the Ubuntu 14.04 (trusty) repository.
The inlining transform is simply not implemented for memmove. Simplification always goes through memmove -> memcpy first and then applies this optimization to the memcpy. But of course the memmove -> memcpy transform is not valid here, because the two pointers may overlap.
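To illustrate why a move of this size can still be inlined even though the regions may overlap, here is a minimal sketch. It is not GCC code: move_int and matches_memmove_on_overlap are hypothetical helpers written for this comment. The point is only that reading the whole source into a temporary before writing the destination gives the same result as memmove, whether or not the regions overlap.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not GCC code: move one int through a temporary.
   The whole source is read before any byte of the destination is
   written, so overlap between src and dst cannot change the result. */
static void move_int(void *dst, const void *src)
{
    int tmp;
    memcpy(&tmp, src, sizeof tmp);  /* the single load; tmp never overlaps src */
    memcpy(dst, &tmp, sizeof tmp);  /* the single store */
}

/* Hypothetical check: returns 1 if move_int and memmove leave two
   identical buffers in the same state for an overlapping move. */
static int matches_memmove_on_overlap(void)
{
    unsigned char x[16], y[16];
    size_t i;

    assert(sizeof(int) + 2 <= sizeof x);

    for (i = 0; i < sizeof x; i++)
        x[i] = y[i] = (unsigned char)(i + 1);

    /* Destination starts 2 bytes into the source, so the regions overlap. */
    memmove(x + 2, x, sizeof(int));
    move_int(y + 2, y);

    return memcmp(x, y, sizeof x) == 0;
}
```

Calling matches_memmove_on_overlap() from a small main is enough to try it. The two inner memcpy calls operate on a local temporary, so they are always non-overlapping and a compiler is free to lower them to a plain load and store.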
Author: rguenth
Date: Fri Jul 11 13:42:55 2014
New Revision: 212452

URL: https://gcc.gnu.org/viewcvs?rev=212452&root=gcc&view=rev
Log:
2014-07-11  Richard Biener  <rguenther@suse.de>

    PR middle-end/61473
    * builtins.c (fold_builtin_memory_op): Inline memory moves
    that can be implemented with a single load followed by a
    single store.
    (c_strlen): Only warn when only_value is not 2.

    * gcc.dg/memmove-4.c: New testcase.
    * gcc.dg/strlenopt-8.c: XFAIL.
    * gfortran.dg/coarray_lib_realloc_1.f90: Adjust.

Added:
    trunk/gcc/testsuite/gcc.dg/memmove-4.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/builtins.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/strlenopt-8.c
    trunk/gcc/testsuite/gfortran.dg/coarray_lib_realloc_1.f90
Fixed.