Test code as follows:

------------------------
typedef float v4sf __attribute__ ((vector_size (4*4)));
typedef float v2sf __attribute__ ((vector_size (4*2)));

v2sf mem[1];

int main()
{
  v4sf reg = (v4sf){0,0,0,0};
  reg = __builtin_ia32_loadlps(reg, mem);
  return reg[0];
}
------------------------

With -msse, gcc emits the following code:

    xorps   %xmm0, %xmm0
    movlps  mem, %xmm0

However, with -mavx, gcc emits:

    vxorps  %xmm0, %xmm0, %xmm0
    vmovlps mem, %xmm1, %xmm1
    vshufps $0xe4, %xmm0, %xmm1, %xmm0

Shouldn't this rather have been something like the following?

    vxorps  %xmm0, %xmm0, %xmm0
    vmovlps mem, %xmm0, %xmm0
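
For reference, the value the test expects from the load can be modeled in portable C. This is only a sketch: loadlps_model is a hypothetical helper, not GCC's implementation of __builtin_ia32_loadlps. It assumes the builtin replaces the two low lanes of the register with the two floats read from memory and leaves the two high lanes unchanged, which is what movlps does.

```c
/* Portable model of the expected result; assumes GCC/Clang vector extensions. */
typedef float v4sf __attribute__ ((vector_size (4*4)));
typedef float v2sf __attribute__ ((vector_size (4*2)));

/* Hypothetical helper (not part of GCC): low two lanes come from memory,
   high two lanes are carried over from reg.  */
static v4sf
loadlps_model (v4sf reg, const v2sf *mem)
{
  v2sf lo = *mem;
  v4sf result = { lo[0], lo[1], reg[2], reg[3] };
  return result;
}
```

Under this model, a single vmovlps into the already zeroed register produces the complete result, which is why the additional vshufps in the -mavx output looks redundant.
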

GCC 4.6 doesn't have this problem:

[hjl@gnu-6 pr53759]$ cat x.i
typedef float v4sf __attribute__ ((vector_size (4*4)));
typedef float v2sf __attribute__ ((vector_size (4*2)));

v2sf mem[1];

int main()
{
  v4sf reg = (v4sf){0,0,0,0};
  reg = __builtin_ia32_loadlps(reg, mem);
  return reg[0];
}
[hjl@gnu-6 pr53759]$ gcc -S -mavx -O x.i
[hjl@gnu-6 pr53759]$ cat x.s
        .file   "x.i"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        vxorps  %xmm0, %xmm0, %xmm0
        vmovlps mem(%rip), %xmm0, %xmm0
        vcvttss2si      %xmm0, %eax
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .comm   mem,8,8
        .ident  "GCC: (GNU) 4.6.3 20120306 (Red Hat 4.6.3-2)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-6 pr53759]$

It is caused by revision 172123: http://gcc.gnu.org/ml/gcc-cvs/2011-04/msg00316.html
Created attachment 27699: gcc48-pr53759.patch

Sounds like an obvious typo in that change: the x, x, x alternative already appears earlier and shouldn't use the vmovlps insn, so this alternative obviously should have been x, m, x.

Author: jakub
Date: Mon Jun 25 14:52:59 2012
New Revision: 188937

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188937
Log:
PR target/53759
* config/i386/sse.md (sse_loadlps): Use x m x constraints instead
of x x x in the vmovlps load alternative.

* gcc.target/i386/pr53759.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr53759.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog

Author: jakub
Date: Mon Jun 25 14:56:17 2012
New Revision: 188938

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188938
Log:
PR target/53759
* config/i386/sse.md (sse_loadlps): Use x m x constraints instead
of x x x in the vmovlps load alternative.

* gcc.target/i386/pr53759.c: New test.

Added:
    branches/gcc-4_7-branch/gcc/testsuite/gcc.target/i386/pr53759.c
Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/config/i386/sse.md
    branches/gcc-4_7-branch/gcc/testsuite/ChangeLog

Should be fixed now, thanks.