when i compile the below source with gcc 3.2.3 20030316 (debian prerelease), or with gcc version 3.3 20030129 (Debian prerelease) the result is code which only compares the bottom half of the 64-bit value which i've asked it to compare.

notice the "movd %mm1,%esi" to get the bottom 32... and mm1 is also stored to the stack, and the bottom 32-bits are compared. i think there's an offset by 4 missing somewhere.

Release:
gcc version 3.2.3 20030316 (Debian prerelease)

Environment:
debian unstable

How-To-Repeat:
dean@zim:~/verification$ cat gccbug.c
#include <stdint.h>
#include <mmintrin.h>

typedef union {
        uint64_t uq[1];
        __m64 m;
} mm_t __attribute__((aligned(8)));

int foo(mm_t *p, const mm_t *q)
{
        mm_t t;

        //asm("psllw $1,%0" : "=&y" (t.m) : "0" (p));
        t.m = _mm_slli_pi16(p->m, 1);
        return t.uq[0] == q->uq[0];
}
dean@zim:~/verification$ gcc -c -O3 -mmmx -Wall gccbug.c
dean@zim:~/verification$ objdump -dr gccbug.o

gccbug.o:     file format elf32-i386

Disassembly of section .text:

00000000 <foo>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   57                      push   %edi
   4:   56                      push   %esi
   5:   53                      push   %ebx
   6:   83 ec 0c                sub    $0xc,%esp
   9:   8b 75 08                mov    0x8(%ebp),%esi
   c:   0f 6f 0e                movq   (%esi),%mm1
   f:   8b 4d 0c                mov    0xc(%ebp),%ecx
  12:   0f 71 f1 01             psllw  $0x1,%mm1
  16:   8b 39                   mov    (%ecx),%edi
  18:   8b 51 04                mov    0x4(%ecx),%edx
  1b:   0f 7e ce                movd   %mm1,%esi
  1e:   0f 7f 4d e8             movq   %mm1,0xffffffe8(%ebp)
  22:   31 d6                   xor    %edx,%esi
  24:   8b 5d e8                mov    0xffffffe8(%ebp),%ebx
  27:   31 fb                   xor    %edi,%ebx
  29:   89 f0                   mov    %esi,%eax
  2b:   09 d8                   or     %ebx,%eax
  2d:   0f 94 c2                sete   %dl
  30:   83 c4 0c                add    $0xc,%esp
  33:   5b                      pop    %ebx
  34:   5e                      pop    %esi
  35:   0f b6 c2                movzbl %dl,%eax
  38:   5f                      pop    %edi
  39:   c9                      leave
  3a:   c3                      ret
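
For readers without an affected compiler, the following is a rough portable model of what the disassembly above appears to do, not the reporter's code and not gcc output. It assumes my reading of the listing is right: the low 32 bits of t are read twice (once via movd, once from the stack slot) and xor'ed against both halves of q, while the high 32 bits of t are never read. The names psllw1, compare_expected and compare_as_compiled are made up for illustration, and psllw1 is a plain-C stand-in for the MMX shift.

```c
#include <stdint.h>

/* Hypothetical stand-in for psllw $1: shift each 16-bit lane left by
   one, discarding bits that overflow out of the lane. */
static uint64_t psllw1(uint64_t v)
{
    uint64_t out = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint16_t w = (uint16_t)(v >> (16 * lane));
        out |= (uint64_t)(uint16_t)(w << 1) << (16 * lane);
    }
    return out;
}

/* What the source asks for: a full 64-bit comparison of t and q. */
static int compare_expected(uint64_t p, uint64_t q)
{
    return psllw1(p) == q;
}

/* Model of the generated code (an assumption based on the listing):
   the low half of t is used for both xors, so the high half of t
   never takes part in the comparison. */
static int compare_as_compiled(uint64_t p, uint64_t q)
{
    uint64_t t = psllw1(p);
    uint32_t t_lo = (uint32_t)t;          /* movd %mm1,%esi and the stack load */
    uint32_t q_lo = (uint32_t)q;          /* mov (%ecx),%edi */
    uint32_t q_hi = (uint32_t)(q >> 32);  /* mov 0x4(%ecx),%edx */
    return ((t_lo ^ q_hi) | (t_lo ^ q_lo)) == 0;
}
```

Under this model the two functions agree only when both halves of t happen to be equal. For example, with p = 0x0000000500000001 the shifted value is 0x0000000A00000002; comparing it against itself should succeed but the model reports a mismatch, and comparing it against 0x0000000200000002 should fail but the model reports a match.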

State-Changed-From-To: open->analyzed
State-Changed-Why: Confirmed on 3.2 branch. 3.3 and mainline are not affected.

Responsible-Changed-From-To: unassigned->ebotcazou
Responsible-Changed-Why: Investigating.

State-Changed-From-To: analyzed->closed
State-Changed-Why: Fixed in 3.3.