This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/53712] Does not combine unaligned load with _mm_cmpistri, redundant instruction at -O0
- From: "pinskia at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 18 Sep 2015 01:56:40 +0000
- Subject: [Bug target/53712] Does not combine unaligned load with _mm_cmpistri, redundant instruction at -O0
- Auto-submitted: auto-generated
- References: <bug-53712-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53712
--- Comment #10 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Marco Leise from comment #9)
> If this was fixed three years ago, then how does the same test program
> produce this assembly with gcc 5.2.0 (and earlier)?
Because the test program in comment #0 is invalid. What is allowed instead is
to do an explicit unaligned load and then pass the result to the other
intrinsic, and that is the case this bug was changed to cover.
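
For illustration, here is a minimal sketch of the valid pattern described above, not the test program from comment #0. The helper name first_match_index and the requirement that both pointers reference at least 16 readable bytes are assumptions made for this example. The immediate 0 matches the 0x0 operand in the quoted disassembly. The target attribute is only there so the sketch builds without -msse4.2 on GCC or Clang.

```c
#include <nmmintrin.h>  /* SSE4.2 string intrinsics: _mm_cmpistri */

/*
 * Hypothetical helper for illustration only.
 * Both pointers must reference at least 16 readable bytes; no alignment
 * is required because the loads are explicitly unaligned.
 */
__attribute__((target("sse4.2")))
int first_match_index(const char *set, const char *text)
{
    /* Explicit unaligned loads (movdqu): valid for any pointer alignment. */
    __m128i a = _mm_loadu_si128((const __m128i *)set);
    __m128i b = _mm_loadu_si128((const __m128i *)text);

    /*
     * Mode 0: unsigned bytes, "equal any", least significant index.
     * Returns the index of the first byte of text that occurs in set,
     * or 16 if there is none before the terminating NUL.
     */
    return _mm_cmpistri(a, b, 0);
}
```

Calling it with a deliberately offset pointer such as buf + 1 is fine here, since nothing in the helper assumes 16-byte alignment.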
>
> Dump of assembler code for function test:
> 0x0000000000400596 <+0>: push rbp
> 0x0000000000400597 <+1>: mov rbp,rsp
> 0x000000000040059a <+4>: mov QWORD PTR [rbp-0x28],rdi
> 0x000000000040059e <+8>: mov QWORD PTR [rbp-0x30],rsi
> 0x00000000004005a2 <+12>: mov rax,QWORD PTR [rbp-0x30]
> 0x00000000004005a6 <+16>: mov QWORD PTR [rbp-0x18],rax
> 0x00000000004005aa <+20>: mov rax,QWORD PTR [rbp-0x18]
> 0x00000000004005ae <+24>: movdqu xmm0,XMMWORD PTR [rax]
> 0x00000000004005b2 <+28>: movaps XMMWORD PTR [rbp-0x10],xmm0
> 0x00000000004005b6 <+32>: mov rax,QWORD PTR [rbp-0x28]
> => 0x00000000004005ba <+36>: movdqa xmm0,XMMWORD PTR [rax]
> 0x00000000004005be <+40>: movdqa xmm1,xmm0
> 0x00000000004005c2 <+44>: movdqa xmm0,XMMWORD PTR [rbp-0x10]
> 0x00000000004005c7 <+49>: pcmpistri xmm0,xmm1,0x0
> 0x00000000004005cd <+55>: mov eax,ecx
> 0x00000000004005cf <+57>: pcmpistrm xmm0,xmm1,0x0
> 0x00000000004005d5 <+63>: pop rbp
> 0x00000000004005d6 <+64>: ret
>
> gcc -v
> Using built-in specs.
> COLLECT_GCC=/usr/x86_64-pc-linux-gnu/gcc-bin/5.2.0/gcc
> COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/5.2.0/lto-wrapper
> Target: x86_64-pc-linux-gnu
> Configured with:
> /var/tmp/portage/sys-devel/gcc-5.2.0/work/gcc-5.2.0/configure
> --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
> --bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/5.2.0
> --includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/5.2.0/include
> --datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/5.2.0
> --mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/5.2.0/man
> --infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/5.2.0/info
> --with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/5.2.0/include/g++-v5
> --with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/5.2.0/python
> --enable-languages=c,c++ --enable-obsolete --enable-secureplt
> --disable-werror --with-system-zlib --enable-nls --without-included-gettext
> --enable-checking=release --with-bugurl=https://bugs.gentoo.org/
> --with-pkgversion='Gentoo 5.2.0 p1.1, pie-0.6.4' --enable-libstdcxx-time
> --enable-shared --enable-threads=posix --enable-__cxa_atexit
> --enable-clocale=gnu --enable-multilib --with-multilib-list=m32,m64
> --disable-altivec --disable-fixed-point --enable-targets=all
> --disable-libgcj --enable-libgomp --disable-libmudflap --disable-libssp
> --disable-libcilkrts --disable-libquadmath --enable-lto --without-isl
> --enable-libsanitizer
> Thread model: posix
> gcc version 5.2.0 (Gentoo 5.2.0 p1.1, pie-0.6.4)