From: Tim Prince <timothyprince@sbcglobal.net>
To: vgrebinski@yahoo.com, gcc-gnats@gcc.gnu.org
Cc: vgrebinski@yahoo.com
Subject: Re: target/10395: sse2 datatype is not 16bytes aligned in threaded code
Date: Sun, 13 Apr 2003 16:31:16 -0700

On Sunday 13 April 2003 15:44, vgrebinski@yahoo.com wrote:
> >Number: 10395
> >Category: target
> >Synopsis: sse2 types are incorrectly aligned causing crash in
> > multi-threaded apps
> >Confidential: no
> >Severity: serious
> >Priority: medium
> >Responsible: unassigned
> >State: open
> >Class: wrong-code
> >Submitter-Id: net
> >Arrival-Date: Sun Apr 13 22:46:00 UTC 2003
> >Closed-Date:
> >Last-Modified:
> >Originator: Vladimir Grebinskiy
> >Release: 3.3 20030410 (prerelease)
> >Organization:
> >Fix:

Similar problem here, but I had to change __m128i to __m128. Did you try
rebuilding libpthread with the proper options passed to gcc? Otherwise, it
looks like an error in the library build. gcc won't work unless libpthread
passes an aligned stack to your function. I tried another compiler, which
doesn't expect an aligned stack, and it was OK.
--
Tim Prince

From: Vladimir Grebinskiy <vgrebinski@yahoo.com>
To: tprince@computer.org, gcc-gnats@gcc.gnu.org
Cc: vgrebinski@yahoo.com
Subject: Re: target/10395: sse2 datatype is not 16bytes aligned in threaded code
Date: Sun, 13 Apr 2003 18:24:27 -0700 (PDT)

--- Tim Prince <timothyprince@sbcglobal.net> wrote:
> Similar problem here, but I had to change __m128i to __m128. Did you try
> rebuilding libpthread with the proper options passed to gcc? Otherwise, it
> looks like an error in the library build. gcc won't work unless libpthread
> passes aligned stack to your function. I tried another compiler, which
> doesn't expect aligned stack, and it was OK.

Thanks for the suggestion. I'll try to find out whether libpthread should be
compiled in a special way (I guess the whole libc should be recompiled).
Currently, all libraries are whatever is in Debian/unstable as of today. The
code works as expected with Intel's compiler (build 7.1.011), which I guess
uses the same libpthread as gcc does.

I just checked that there is a similar problem with MMX datatypes (in which
case unaligned access is just slower?): data are not 8-byte aligned when
called inside a thread (checked also with 3.2.3 20030407 (Debian prerelease)).

Vladimir

> --
> Tim Prince

Support for SSE2 instructions is an important addition to gcc-3.3. Unfortunately, code generated for functions called via pthread_create() does not provide 16-byte alignment for local SSE2 data, which causes a crash when these variables are used.

Release: 3.3 20030410 (prerelease)

Environment:
System: Linux vag 2.4.21-pre5 #2 Sun Mar 2 00:28:31 EST 2003 i686 unknown unknown GNU/Linux
Architecture: i686
host: i386-pc-linux-gnu
build: i386-pc-linux-gnu
target: i386-pc-linux-gnu
configured with: /build/packages/gcc/snap/gcc-snapshot-20030410/src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada --prefix=/usr/lib/gcc-snapshot --infodir=/share/info --mandir=/share/man --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-java-gc=boehm --enable-java-awt=xlib --with-cpp-install-dir=bin --enable-multilibs --enable-objc-gc i386-linux

How-To-Repeat:
The following short program demonstrates the problem. The second call to function "f" shows that the variable is not aligned to 16 bytes:

/* *** start *** */
#include <pthread.h>
#include <stdio.h>
#include <assert.h>

#include <xmmintrin.h>
#include <mmintrin.h>

#ifdef __ICC
#include <emmintrin.h>
#endif

void * f(void *p)
{
  int x = (p == NULL) ? 0 : * (int *) p;
  __m128i s;
  printf("&x = %p &s= %p\n", &x, &s);
  return NULL;
}

int main(int argc, char ** argv)
{
  pthread_t th;

  f(& argc);
  assert(pthread_create(& th, NULL, f, &argc)==0);
  assert(pthread_join(th, NULL)==0);
  return 0;
}
/* *** end *** */

$ /usr/lib/gcc-snapshot/bin/gcc -pthread -msse2 gcc_test.c -o gcc_test.LINUX
$ ./gcc_test.LINUX
&x = 0xbffffb6c &s= 0xbffffb50
&x = 0xbf7ffae8 &s= 0xbf7ffacc <---- error
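The two &s addresses in the output can be checked mechanically rather than by eye. The following is only an illustrative sketch: the helper name sse_misalignment is made up for this note and is not part of the original test case. It computes how many bytes an address sits past the previous 16-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the original report): returns the number
 * of bytes by which addr exceeds the previous `alignment` boundary.
 * A result of 0 means the address is aligned. `alignment` must be a
 * nonzero power of two.
 */
unsigned sse_misalignment(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (unsigned)(addr & (alignment - 1));
}
```

Applied to the reported values, 0xbffffb50 (main thread) gives 0, while 0xbf7ffacc (created thread) gives 12, so the second variable is 12 bytes past a 16-byte boundary.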
Fix: Call alloca() in the first function of the thread (the start routine passed to pthread_create()).
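The note above does not say how alloca() is meant to be used. One possible reading, assumed here and not confirmed anywhere in this report, is to obtain the vector storage from alloca() inside the thread's start routine and round the pointer up to a 16-byte boundary by hand, so the result does not depend on how the incoming stack was aligned. All names below (thread_start, VEC_ALIGN, run_start_directly) are hypothetical.

```c
#include <alloca.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VEC_ALIGN 16u   /* alignment required for SSE2 data */
#define VEC_SIZE  16u   /* size of one 128-bit vector */

/*
 * Hypothetical thread start routine. It over-allocates by VEC_ALIGN - 1
 * bytes so that an aligned VEC_SIZE-byte slot always fits, then rounds
 * the pointer up. The return value is the remaining misalignment cast to
 * a pointer, so NULL means the slot is 16-byte aligned.
 */
void *thread_start(void *arg)
{
    (void)arg;
    unsigned char *raw = alloca(VEC_SIZE + VEC_ALIGN - 1);
    uintptr_t aligned =
        ((uintptr_t)raw + (VEC_ALIGN - 1)) & ~(uintptr_t)(VEC_ALIGN - 1);
    unsigned char *vec = (unsigned char *)aligned;

    /* The slot is only valid until this function returns. */
    memset(vec, 0, VEC_SIZE);
    return (void *)(aligned & (VEC_ALIGN - 1));
}

/* Calls the start routine on the current thread and checks the result. */
void run_start_directly(void)
{
    assert(thread_start(NULL) == NULL);
}
```

The same routine can be passed to pthread_create(). Because the storage lives in the start routine's frame, any SSE2 work on it has to happen in that function or in functions it calls.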
Responsible-Changed-From-To: unassigned->hubicka Responsible-Changed-Why: Master of SSE
Responsible-Changed-From-To: hubicka->bernds Responsible-Changed-Why: Bernd, what is the status of dynamic alignment patch? Should I look for the last version and update it?
State-Changed-From-To: open->suspended State-Changed-Why: The bug is partly caused by glibc not aligning the stack frame of threads properly (I've notified glibc folks) and partly by GCC (unlike ICC) still not being able to deal with dynamic stack alignment. This won't be solved for 3.3.
*** Bug 11802 has been marked as a duplicate of this bug. ***
*** Bug 11961 has been marked as a duplicate of this bug. ***
*** Bug 9633 has been marked as a duplicate of this bug. ***
*** Bug 12105 has been marked as a duplicate of this bug. ***
*** Bug 13190 has been marked as a duplicate of this bug. ***
*** Bug 13245 has been marked as a duplicate of this bug. ***
*** Bug 13240 has been marked as a duplicate of this bug. ***
AFAICS GCC 3.3.2 generates wrong code (cf. #13245):

ebp is 0xbffff328

0x08048243 <main_(int, char**)+63>: call 0x80481f4 <prepare_fpu()>
0x08048248 <main_(int, char**)+68>: fldl 0x80ed420
0x0804824e <main_(int, char**)+74>: fstpl 0xfffffff0(%ebp)
0x08048251 <main_(int, char**)+77>: fldl 0xfffffff0(%ebp)
===>0x08048254 <main_(int, char**)+80>: fstpl 0x4(%esp,1) <=====
0x08048258 <main_(int, char**)+84>: lea 0xffffffff(%ebp),%eax
0x0804825b <main_(int, char**)+87>: mov %eax,(%esp,1)
0x0804825e <main_(int, char**)+90>: call 0x80482fe <XM& operator<<AF

This is main_ () not main ().
I have doubts that Bernd Schmidt is working on this any more.
*** Bug 14451 has been marked as a duplicate of this bug. ***
*** Bug 17934 has been marked as a duplicate of this bug. ***
My bug #17934 was merged with this one. However, my test case demonstrates the problem WITHOUT threads. The same issue appears with a single-threaded application of sufficient complexity.
Comment #32 in PR 17990 says:

> In response to comment #30: it is libpthread's responsibility to align the
> stack of subthreads properly. It doesn't do that, however, but I believe
> that we have another PR for that (this is something that we can't really do
> anything about, however).

This is the mentioned PR. The testcase in PR 17990 is the testcase from comment #2 from this PR.

The default pthread library in RedHat 8.0 does align stack properly, as I am able to trigger the bug. Testcase shows:

&x = 0xbffff9cc &s= 0xbffff9b0
&x = 0x4085fac8 &s= 0x4085faac

Should this bug be closed as WONTFIX?
> The default pthread library in RedHat 8.0 does align stack properly, as I am
> able to trigger the bug. Testcase shows:

Uros, did you mean "does *not* align the stack properly"? If so, I would say yes, we should close the PR as WONTFIX, since we can't do anything about it.

W.
(In reply to comment #21)
> > The default pthread library in RedHat 8.0 does align stack properly, as I am
> > able to trigger the bug. Testcase shows:
>
> Uros, did you mean "does *not* align the stack properly"? If so, I would
> say yes, we should close the PR as WONTFIX, since we can't do anything
> about it.

Ouch... RH 8.0 does _NOT_ align the stack properly.

However, on second thought, is it worth adding some kind of -mforce-stack-align parameter to gcc? Or maybe a function attribute? That way, one could add stack alignment code similar to the stack alignment done in main():

pushl %ebp
movl %esp, %ebp
subl $8, %esp
andl $-16, %esp   <- this
subl $16, %esp
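For readers less familiar with the idiom: the andl $-16, %esp instruction rounds the stack pointer down to the nearest multiple of 16, because -16 in two's complement has all bits set except the low four. A minimal C sketch of the same arithmetic follows. The function name align_down is illustrative only, and the sample address is simply the misaligned value from the original report, not an actual %esp value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rounds addr down to a multiple of `alignment`, which must be a nonzero
 * power of two. For alignment == 16 the mask ~(16 - 1) has the same bit
 * pattern as the immediate -16 used by "andl $-16, %esp".
 */
uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}
```

With the sample value, align_down(0xbf7ffacc, 16) yields 0xbf7ffac0. Rounding down is the safe direction on x86 because the stack grows toward lower addresses, so the adjustment only ever reserves extra space.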
An attribute could work. I doubt that a general flag would be useful, since one in general doesn't know which functions are thread entry points, so the compiler would have to emit such stack alignment code into the prolog of each function it finds. W.
A fixed pthread library must be installed to resolve this bug: http://gcc.gnu.org/ml/gcc/2004-12/msg00918.html
FYI: glibc fix is here: http://sources.redhat.com/ml/libc-hacker/2004-12/msg00068.html