[Bug c/21239] New: Illegal elimination of SSE2 load/store using xmm intrinsics

kurt at garloff dot de gcc-bugzilla@gcc.gnu.org
Tue Apr 26 23:46:00 GMT 2005


/** intrin.c   
 *   
 * gcc-4.0 misoptimizes the _mm_load_sd() away with   
 * -O1 (x86-64), with or without -m32 -msse2.   
 *   
 * (c) Kurt Garloff <garloff@suse.de>, Artistic v2   
 */   
   
#include <stdlib.h>                                                                              
#include <emmintrin.h>                                                                           
   
#ifdef WORKAROUND                                                                                
# define ACCESS(X) asm("": : "x"(X))                                                             
#else                                                                                            
# define ACCESS(X)                                                                               
#endif                                                                                           
   
void do_copy(const unsigned int ln, double* const dst,      
                const double* const src)   
{   
        int i = ln;   
        const register double *s = src;   
        register double *d = dst;   
        __m128d TMP;   
        while (i) {   
                TMP = _mm_load_sd(s);   
                ACCESS(TMP);   
                _mm_store_sd(d, TMP);   
                --i; ++s; ++d;   
        }                                                                                        
}                                                                                                
   
int main()   
{   
        unsigned int i;   
        double *a, *b ,*c;   
        a = (double*) malloc(19*sizeof(double));   
        b = (double*) malloc(19*sizeof(double));   
        for (i = 0; i < 19; ++i) {   
                a[i] = 1; b[i] = 2;   
        }                                                                                        
        do_copy(19, a, b);   
        return (a[18] != 2);   
}                                                                                                
   
The test program should return 0, which it does when built with gcc-3.3/3.4 or
when compiled with -DWORKAROUND. gcc-4.0, 4_0-branch, HEAD, and
tree-profiling-branch all fail: the _mm_load_sd() is optimized away.
I guess the compiler does not consider the _mm_store_sd() a consumer of
the vector register, so adding the fake consumer asm("" : : "x"(XMMREG)); helps.
Compiling with -m32 -msse2 exposes the same problem, and I strongly suspect
that a native x86 compiler would have it as well.
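
For readers who want to try the fake-consumer idea outside the macro, here is
a minimal sketch. The names copy_with_fake_consumer() and count_mismatches()
are hypothetical and not part of the report; the plain-C fallback for targets
without SSE2 is likewise an assumption added only so the sketch builds
everywhere. It is an illustration of the workaround, not a fix for the
compiler.

```c
#include <stddef.h>
#if defined(__SSE2__) && defined(__GNUC__)
# include <emmintrin.h>
#endif

/* Hypothetical helper (not from the report): copies n doubles one at a
 * time. On SSE2 targets it uses the same intrinsics as do_copy() and an
 * empty asm whose "x" input operand asks for the value in an SSE register,
 * which acts as the fake consumer. */
static void copy_with_fake_consumer(size_t n, double *dst, const double *src)
{
    size_t i;
    for (i = 0; i < n; ++i) {
#if defined(__SSE2__) && defined(__GNUC__)
        __m128d tmp = _mm_load_sd(&src[i]);
        /* No instructions are emitted; an asm without output operands is
         * implicitly volatile in GCC, so it is not removed. */
        __asm__ ("" : : "x"(tmp));
        _mm_store_sd(&dst[i], tmp);
#else
        dst[i] = src[i];   /* assumed fallback for non-SSE2 builds */
#endif
    }
}

/* Hypothetical checker: returns how many of the n elements differ. */
static size_t count_mismatches(size_t n, const double *a, const double *b)
{
    size_t i, bad = 0;
    for (i = 0; i < n; ++i)
        if (a[i] != b[i])
            ++bad;
    return bad;
}
```

Calling copy_with_fake_consumer(19, a, b) in place of do_copy(19, a, b) and
returning count_mismatches(19, a, b) != 0 from main() would check all 19
elements rather than only a[18].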
  
   
Here's the wrong assembly produced by gcc-4.0 (on x86-64, using -O2):  
do_copy:  
.LFB495:  
        testl   %edi, %edi  
        jne     .L8  
        rep ; ret  
        .p2align 4,,7  
.L8:  
        xorl    %eax, %eax  
        .p2align 4,,7  
.L4:  
        incl    %eax  
        movq    $0, (%rsi)  
        addq    $8, %rsi  
        cmpl    %eax, %edi  
        jne     .L4  
        rep ; ret  
  
... and here the correct assembly with -DWORKAROUND added:  
do_copy:  
.LFB495:  
        testl   %edi, %edi  
        jne     .L8  
        rep ; ret  
        .p2align 4,,7  
.L8:  
        xorl    %eax, %eax  
        .p2align 4,,7  
.L4:  
        movsd   (%rdx), %xmm0  
        incl    %eax  
        movlpd  %xmm0, (%rsi)  
        addq    $8, %rdx  
        addq    $8, %rsi  
        cmpl    %eax, %edi  
        jne     .L4  
        rep ; ret
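
To make the difference between the two listings easier to see, here is a
rough C rendering of what each loop does. This is only an illustration: the
function names are hypothetical, and the mapping assumes the usual x86-64
System V argument registers (%edi = ln, %rsi = dst, %rdx = src). In the
wrong listing %rdx is never read and "movq $0, (%rsi)" stores eight zero
bytes, which is +0.0 when read back as an IEEE 754 double.

```c
#include <stddef.h>

/* Hypothetical C equivalent of the miscompiled loop: every destination
 * element is overwritten with 0.0 and src is never read. */
static void do_copy_as_miscompiled(unsigned int ln, double *dst,
                                   const double *src)
{
    unsigned int i;
    (void)src;                 /* the wrong code ignores the source */
    for (i = 0; i < ln; ++i)
        dst[i] = 0.0;          /* movq $0, (%rsi) */
}

/* Hypothetical C equivalent of the correct loop: one double is loaded
 * from src and stored to dst per iteration. */
static void do_copy_as_intended(unsigned int ln, double *dst,
                                const double *src)
{
    unsigned int i;
    for (i = 0; i < ln; ++i)
        dst[i] = src[i];       /* movsd (%rdx), %xmm0; movlpd %xmm0, (%rsi) */
}
```

With the arrays from main(), the first variant leaves a[18] at 0, so
(a[18] != 2) is true and the program exits with 1, which matches the
reported failure.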

-- 
           Summary: Illegal elimination of SSE2 load/store using xmm
                    intrinsics
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: kurt at garloff dot de
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: x86_64-suse-linux
  GCC host triplet: x86_64-suse-linux
GCC target triplet: x86_64-suse-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21239


