[Bug c/57571] New: linux kernel function memcpy() execute with low efficiency on Intel Ivybridge platform
yiyi8761 at gmail dot com
gcc-bugzilla@gcc.gnu.org
Sun Jun 9 06:56:00 GMT 2013
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57571
Bug ID: 57571
Summary: linux kernel function memcpy() execute with low
efficiency on Intel Ivybridge platform
Product: gcc
Version: 4.7.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: yiyi8761 at gmail dot com
OS type: openSUSE 12.3 or SUSE 11 SP2
CPU type: Intel Ivy Bridge i7-3612QE or Intel Ivy Bridge i7-3615QE
GCC Ver: 4.7.2 (openSUSE 12.3) or 4.3.4 (SUSE 11 SP2)
GCC 4.7.2 Configured with: ../configure --prefix=/usr
--infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64
--libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.7
--enable-ssp --disable-libssp --disable-libitm --disable-plugin
--with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux'
--disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib
--enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --enable-linker-build-id
--program-suffix=-4.7 --enable-linux-futex --without-system-libunwind
--with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux
GCC 4.3.4 Configured with: ../configure --prefix=/usr
--infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64
--libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada
--enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.7
--enable-ssp --disable-libssp --disable-libitm --disable-plugin
--with-bugurl=http://bugs.opensuse.org/ --with-pkgversion='SUSE Linux'
--disable-libgcj --disable-libmudflap --with-slibdir=/lib64 --with-system-zlib
--enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch
--enable-version-specific-runtime-libs --enable-linker-build-id
--program-suffix=-4.7 --enable-linux-futex --without-system-libunwind
--with-arch-32=i586 --with-tune=generic --build=x86_64-suse-linux
description:
1. With the configurations above, the memcpy() used by the Linux kernel
performs very poorly. Viewed in gdb, the disassembled memcpy() looks like
this:
(gdb) set disassembly-flavor intel
(gdb) x/20i 0xffffffff812ca220
0xffffffff812ca220: mov rax,rdi
0xffffffff812ca223: mov rcx,rdx
0xffffffff812ca226: rep movs BYTE PTR es:[rdi],BYTE PTR ds:[rsi]
0xffffffff812ca228: ret
0xffffffff812ca229: add eax,DWORD PTR [rbx+0x48f307e2]
0xffffffff812ca22f: movs DWORD PTR es:[rdi],DWORD PTR ds:[rsi]
0xffffffff812ca230: mov ecx,edx
0xffffffff812ca232: rep movs BYTE PTR es:[rdi],BYTE PTR ds:[rsi]
0xffffffff812ca234: ret
2. However, with the same OS (same GCC version and configuration) on an Intel
Arrandale platform (i7 CPU L620), the disassembled memcpy() in gdb looks like
this:
(gdb) set disassembly-flavor intel
(gdb) x/20i 0xffffffff81250e80
0xffffffff81250e80: mov rax,rdi
0xffffffff81250e83: mov ecx,edx
0xffffffff81250e85: shr ecx,0x3
0xffffffff81250e88: and edx,0x7
0xffffffff81250e8b: rep movs QWORD PTR es:[rdi],QWORD PTR ds:[rsi]
0xffffffff81250e8e: mov ecx,edx
0xffffffff81250e90: rep movs BYTE PTR es:[rdi],BYTE PTR ds:[rsi]
0xffffffff81250e92: ret
3. So memcpy() on the i7 L620 is eight times as efficient as on the Intel
Ivy Bridge platform when the copy length is greater than 8 bytes.
4. We have already raised this with Intel and Novell; their engineers said the
issue may be related to the compiler.
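
For reference, the "eight times" figure in point 3 comes from counting
`rep movs` iterations in the two listings. The sketch below is a portable C
model of both sequences, not the kernel's code: the function names
copy_bytes() and copy_qwords_then_bytes() are hypothetical, and size_t is used
for the length although the second listing works on the 32-bit ecx/edx
registers. It counts iterations only. It does not measure time, and the cost
of a `rep movs` iteration on real hardware is not necessarily constant.

```c
#include <stddef.h>
#include <string.h>

/* Model of the first listing: a single `rep movs BYTE`, one byte per
 * iteration. Returns the number of iterations performed. */
size_t copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t iters = 0;

    while (n--) {
        *d++ = *s++;
        iters++;
    }
    return iters;
}

/* Model of the second listing: `rep movs QWORD` for n >> 3 iterations
 * (shr ecx,0x3), then `rep movs BYTE` for the remaining n & 7 bytes
 * (and edx,0x7). Returns the total number of iterations performed. */
size_t copy_qwords_then_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t qwords = n >> 3;
    size_t tail = n & 7;
    size_t iters = 0;

    while (qwords--) {
        memcpy(d, s, 8);   /* stands in for one 8-byte move */
        d += 8;
        s += 8;
        iters++;
    }
    while (tail--) {
        *d++ = *s++;
        iters++;
    }
    return iters;
}
```

For a 100-byte copy, copy_bytes() performs 100 iterations, while
copy_qwords_then_bytes() performs 12 qword iterations plus 4 byte iterations,
16 in total. The ratio approaches 8 only for lengths that are large multiples
of 8.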