Bug 28069 - __m128 local variables don't get properly aligned.
Status: RESOLVED DUPLICATE of bug 27537
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.2.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on: 13685
Blocks:
Reported: 2006-06-17 06:14 UTC by Zuxy
Modified: 2007-08-29 00:38 UTC
CC List: 13 users

See Also:
Host:
Target: i686
Build:
Known to work:
Known to fail:
Last reconfirmed: 2006-09-20 07:54:00


Attachments
The file that gcc fails to compile correctly. (1.89 KB, text/plain)
2006-06-17 06:25 UTC, Zuxy

Description Zuxy 2006-06-17 06:14:15 UTC
GCC doesn't allocate __m128 locals on a 16-byte boundary, but still uses movaps to access them, causing general protection faults at run time.
Comment 1 Zuxy 2006-06-17 06:25:41 UTC
Created attachment 11685
The file that gcc fails to compile correctly.

Use gcc -S -msse and look at the assembly. GCC allocates __m128 locals directly on the stack without adjusting ESP, so they are 16-byte aligned only if ESP happened to be 16-byte aligned on entry. But GCC accesses those locals with movaps, which requires its memory operand to be 16-byte aligned.

ICC solves this problem by adding
        pushl     %ebp
        movl      %esp, %ebp
        andl      $-16, %esp
to the function prolog.
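
For readers unfamiliar with the idiom, the effect of `andl $-16, %esp` can be sketched in C. This is only an illustration of the arithmetic, not code from the attachment or from ICC; the helper names `align_down_16` and `is_aligned_16` are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper (not from the bug report): round an address down to
   the nearest multiple of 16, which is what `andl $-16, %esp` does to ESP. */
uintptr_t align_down_16(uintptr_t addr)
{
    /* -16 is ...11110000 in two's complement, so the AND clears the low
       four bits and leaves all other bits unchanged. */
    return addr & ~(uintptr_t)15;
}

/* Hypothetical helper: nonzero if addr is a multiple of 16. */
int is_aligned_16(uintptr_t addr)
{
    return (addr & (uintptr_t)15) == 0;
}
```

Because the x86 stack grows downward, rounding ESP down only adds up to 15 bytes of padding below the caller's frame. Saving the old ESP in EBP first is what lets the epilogue restore the original stack pointer.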
Comment 2 Andrew Pinski 2006-06-17 06:30:42 UTC
(In reply to comment #1) 
> Use gcc -S -msse and look at the assembly. GCC allocates __m128 locals directly
> on the stack without adjusting ESP, which might not be 16-byte aligned. But GCC
> uses movaps, which requires its operand to be 16-byte aligned, to access those
> locals.

In a way this is a dup of bug 27537, though 4.2.0 has an attribute to realign the stack, so using that might just fix this issue instead.
Comment 3 Francois-Xavier Coudert 2006-09-20 07:54:00 UTC
Not specific to mingw32.
Comment 4 Danny Smith 2006-09-23 06:34:52 UTC
(In reply to comment #2)

> In a way this is a dup of bug 27537.  Though there is an attribute to realign
> the stack in 4.2.0 so using that might just fix this issue instead.

Indeed,

5c5
< void dct64_sse(float *a,float *b,float *c)  
---
> void __attribute__ ((force_align_arg_pointer)) dct64_sse(float *a,float *b,float *c)  

fixes it on 4.2.

BTW, this issue is particularly important for mingw32 multithreaded programs,
since the Win32 API CreateThread and the corresponding CRT _beginthreadex functions do not guarantee that the stack will be 16-byte aligned on entry to
the thread start-function callback. Marking the thread start function with the force_align_arg_pointer attribute fixes this. Hmm, should that go in gcc.info?

Danny
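
A minimal sketch of the workaround described above, assuming GCC (4.2 or later) on an x86 target with SSE enabled (add -msse on 32-bit). The function `double4_realigned` and the helper `local_is_aligned_16` are hypothetical stand-ins, not the `dct64_sse` routine from the attachment and not a real thread start function.

```c
#include <stdint.h>
#include <xmmintrin.h>

/* Hypothetical helper: 1 if p is 16-byte aligned, 0 otherwise. */
static int local_is_aligned_16(const void *p)
{
    return ((uintptr_t)p & (uintptr_t)15) == 0;
}

/* Hypothetical entry point that may be called with a misaligned stack,
   for example as a thread start callback. The attribute asks GCC to
   realign the stack in the prologue, so __m128 locals that live on the
   stack can be accessed safely with movaps. */
__attribute__((force_align_arg_pointer))
int double4_realigned(const float *in, float *out)
{
    __m128 v = _mm_loadu_ps(in);   /* unaligned load: caller memory may be unaligned */
    __m128 s = _mm_add_ps(v, v);   /* s = 2 * v, element-wise */
    _mm_storeu_ps(out, s);         /* unaligned store back to caller memory */

    /* Taking the address asks for s to live in memory. The compiler is
       entitled to assume s is aligned, so this check is illustrative only. */
    return local_is_aligned_16(&s);
}
```

Only the entry function needs the attribute, provided the functions it calls keep the stack 16-byte aligned from there on.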
Comment 5 Zuxy 2007-08-26 07:58:18 UTC

*** This bug has been marked as a duplicate of 27537 ***