This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: vector extension bug?


"David Mathog" <mathog@caltech.edu> writes:

> I tried to track down the bug mentioned previously in testing my
> software SSE2 when compiled with -m64 and ended up removing all 
> of the CHECK and my own includes without eliminating the bug.  The test
> program works fine with -m32, or with -m64 -msse2, but it fails with
> -m64 -mno-sse2.  Here is the greatly reduced gccprob2.c:
>
> 8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<
> #include <stdio.h> /* for printf */
> typedef double    __m128d __attribute__ ((__vector_size__ (16),
> __may_alias__));
> typedef union
> {
>   __m128d x;
>   double a[2];
> } union128d;
> #define EMM_FLT8(a)    ((double *)&(a))
>
> void test ( __m128d s1, __m128d s2)
> {
> printf("test s1 %lf %lf\n",EMM_FLT8(s1)[0],EMM_FLT8(s1)[1]);
> printf("test s2 %lf %lf\n",EMM_FLT8(s2)[0],EMM_FLT8(s2)[1]);
> }
>
> int main (void)
> {
> __attribute__ ((aligned (16)))  union128d s1;
>   s1.a[0] = 1.0;
>   s1.a[1] = 2.0;
> printf("s1      %lf %lf\n",s1.a[0],s1.a[1]);
>   test (s1.x, s1.x);
> }
> 8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<8<
>
> Test runs:
>
> % gcc -msse -mno-sse2 -m64   -o foo gccprob2.c
> % ./foo  #first value in s2 is wrong
> s1      1.000000 2.000000
> test s1 1.000000 2.000000
> test s2 2.000000 2.000000
> % gcc -msse -msse2 -m64   -o foo gccprob2.c
> % ./foo
> s1      1.000000 2.000000
> test s1 1.000000 2.000000
> test s2 1.000000 2.000000
> % gcc -msse -mno-sse2 -lm -m32   -o foo gccprob2.c
> % ./foo
> s1      1.000000 2.000000
> test s1 1.000000 2.000000
> test s2 1.000000 2.000000
> % gcc --version
> gcc (GCC) 4.4.1
> % cat /etc/release
> Mandriva Linux release 2010.0 (Official) for x86_64
> % cat /proc/cpuinfo | head -10 
> processor       : 0
> vendor_id       : AuthenticAMD
> cpu family      : 15
> model           : 33
> model name      : Dual Core AMD Opteron(tm) Processor 280
> stepping        : 2
> cpu MHz         : 1000.000
> cache size      : 1024 KB
> physical id     : 0
> siblings        : 2
>
> Is there something wrong with this program or is this a compiler bug?


I think this is a compiler bug in the i386 backend.  The
classify_argument function uses X86_64_SSEUP_CLASS for V2DFmode, and
examine_argument counts that as requiring a single SSE register.
However, since the SSE2 instructions are not available, the argument is
split into two SSE registers.  The result is that the first argument is
passed in %xmm0/%xmm1, and the second argument is passed in %xmm1/%xmm2.
That is, the arguments overlap, leading to the incorrect result.

Basically, the 64-bit calling convention support assumes that the SSE2
instructions are always available, and fails silently when -mno-sse2 is
used.  I don't have a strong opinion on whether the compiler needs to
support this case correctly, but it clearly must not fail silently.

Please consider opening a bug report for this.  Thanks.

Ian

