Bug#: 16660
Status: NEW
Assigned To: Not yet assigned to anyone <unassigned@gcc.gnu.org>
Reporter: Jens Maurer <jens.maurer@gmx.net>
Summary: attribute((aligned)) doesn't work for variables on the stack for greater than required alignment

Bug 16660 depends on: 33721

Description:   Last confirmed: 2007-07-09 08:24 Opened: 2004-07-21 20:43
Running the program below compiled with "-mpreferred-stack-boundary=2"
gets a "segmentation fault" because the variable "tmp"
is not properly aligned on a 16-byte boundary (required for
movaps), violating the aligned(16) request in the attribute.

void f()
{
  unsigned long tmp[4] __attribute__((aligned(16)));
  asm("movaps %%xmm0, (%0)" : : "r" (tmp) : "memory");
}

int main()
{
  f();
}
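
The same condition can be observed without crashing by inspecting the low address bits instead of executing movaps. This is a minimal sketch and not part of the original report; is_aligned and local_is_aligned are hypothetical helper names, and the volatile copy is only an attempt to keep the compiler from folding the check based on the declared alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is a multiple of align (a power of two). */
int is_aligned(const void *p, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Same declaration as in the report, but the address is only inspected. */
int local_is_aligned(void)
{
    unsigned long tmp[4] __attribute__((aligned(16)));
    void *volatile p = tmp;   /* volatile copy discourages folding the check */
    tmp[0] = 0;
    return is_aligned(p, 16);
}
```

A result of 0 from local_is_aligned() would correspond to the case in which the movaps in f() faults.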

------- Comment #1 From Andrew Pinski 2004-08-15 04:27 -------
Confirmed, note that the inline-asm can be improved:
  asm("movaps %%xmm0, %0" : : "m" (*tmp) );
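
Since movaps in this operand order stores xmm0 to memory, the memory operand is written rather than read. A possible variant, sketched here under the assumption of an x86 build with SSE enabled, declares the array as an output operand so that no "memory" clobber is needed. The function name store_xmm0 is hypothetical, and the fallback branch exists only so the sketch compiles on other targets.

```c
#include <stdint.h>

/* Returns 1 if tmp is 16-byte aligned; a misaligned movaps would fault first. */
int store_xmm0(void)
{
#if defined(__SSE__) && (defined(__i386__) || defined(__x86_64__))
    unsigned long tmp[4] __attribute__((aligned(16)));
    /* The array lvalue is the output operand, so the compiler knows it is written. */
    __asm__ volatile ("movaps %%xmm0, %0" : "=m" (tmp));
    return ((uintptr_t)tmp & 0xF) == 0;
#else
    return 1;   /* not an SSE-capable x86 build: nothing to demonstrate */
#endif
}
```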

------- Comment #2 From Andrew Pinski 2005-11-06 06:11 -------
*** Bug 24691 has been marked as a duplicate of this bug. ***

------- Comment #3 From Andrew Pinski 2006-03-18 17:43 -------
Actually this is just a missed diagnostic.  The compiler cannot align stack
variables whose requested alignment is greater than the alignment that the
compiler can guarantee for the stack itself.

------- Comment #4 From Chris Lattner 2006-03-18 23:43 -------
Huh?  Why can't it?

------- Comment #5 From Thomas Weidenmueller 2006-09-06 07:14 -------
(In reply to comment #3)
> Actually this is just a missed diagnostic.  The compiler cannot align stack
> variables whose requested alignment is greater than the alignment that the
> compiler can guarantee for the stack itself.

The least GCC could and should do then is warn about it...

If the code is not very complex, the alignment appears to work, though. But as
soon as the code becomes complex, GCC gets the alignment wrong and even
accesses variables that don't exist (I'll go into detail later).

Basically code like this is affected (this is *NOT* a test case, I'm going to
post a test case as soon as I get it to work):

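#include <stdio.h>
#include <stdint.h>

struct _somestruct {
    int a;
};

static void checkstruct (volatile struct _somestruct *palignedvar)
{
    /* Any of the low four address bits set means not 16-byte aligned. */
    if ((uintptr_t)palignedvar & 0xF)
        printf("structure misaligned!\n");
}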

void somefunc(int a, int b, int c) {
    __attribute__((aligned (16))) volatile struct _somestruct alignedvar;

    while (1)
    {
        /* [other code] */
        if (a) {
            if (c) {
                /* [other code] */
                alignedvar.a = c;
                checkstruct(&alignedvar);
            } else {
                /* [other code] */
                break;
            }
        } else {
            if (b) {
                /* [other code] */
                alignedvar.a = a;
                checkstruct(&alignedvar);
            } else {
                if (c) {
                    break;
                } else {
                    /* [other code] */
                    alignedvar.a = a;
                    checkstruct(&alignedvar);
                }
            }
        }
        /* [other code] */
    }
}

I analyzed the generated assembly code. GCC reserves an area big enough to hold
the structure plus padding, so it can align the structure dynamically at
runtime. It stores a pointer to the reserved area and a pointer to the
structure within the area. As long as the code is simple, GCC uses the pointer
to the structure to access the data. However, if the code is complex enough,
GCC mistakenly uses the pointer to the reserved area - which of course is
sometimes not properly aligned. As a result, the structure members are also
read and written incorrectly.

The stack is organized like this (the actual order may differ from this
abstracted example):

struct {
    void *reserved_area;     /* this is the pointer GCC sometimes accidentally grabs */
    void *aligned_structure; /* this is the pointer GCC should always grab */

    char reserved[sizeof(structure) + sizeof(padding)];
};
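
The over-allocate-and-round-up scheme described above can be sketched in plain C. This is only an illustration of the idea, not GCC's generated code; align_up and padded_slot_demo are hypothetical names, and the payload is a single int copied with memcpy to keep the example well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round p up to the next multiple of align (a power of two). */
void *align_up(void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t addr = (uintptr_t)p;
    return (void *)((addr + (align - 1)) & ~(uintptr_t)(align - 1));
}

/* Reserve sizeof(int) + 15 bytes, then locate a 16-byte aligned slot inside. */
int padded_slot_demo(int value)
{
    unsigned char reserved[sizeof(int) + 15];   /* payload plus worst-case padding */
    unsigned char *aligned = align_up(reserved, 16);
    int out;

    /* All accesses must go through `aligned`, never through `reserved`. */
    memcpy(aligned, &value, sizeof value);
    memcpy(&out, aligned, sizeof out);

    assert(((uintptr_t)aligned & 0xF) == 0);
    assert((size_t)(aligned - reserved) <= 15);
    return out;
}
```

Here `reserved` plays the role of reserved_area and `aligned` the role of aligned_structure. The miscompilation described above corresponds to an access through `reserved` where `aligned` was intended.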

I encountered this bug with -O3, I don't know if GCC also generates broken code
without optimizations. I tried to create a simple test case that triggers the
problem, but I failed. I'm going to do that in the next few days.

------- Comment #6 From Andrew Pinski 2007-01-13 01:58 -------
I am implementing something for this.

------- Comment #7 From H.J. Lu 2007-10-03 22:04 -------
What is the performance impact of

http://gcc.gnu.org/ml/gcc-patches/2007-05/msg01167.html

The Intel compiler has a very efficient way to align the stack:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28074

It saves the stack pointer in the frame pointer. Can we implement it for
suitable cases/backends and properly handle

1. tail call optimization.
2. stack unwind.
3. nested functions.
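
As a rough model of the "save the stack pointer in the frame pointer" idea, the register values can be simulated with integers. This is a sketch under assumptions: the struct regs type and the prologue, epilogue, and roundtrip_ok functions are hypothetical, a downward-growing stack is assumed, and the subtract-then-mask order is one possible choice rather than what any particular compiler emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two registers involved. */
struct regs {
    uintptr_t sp;   /* stack pointer */
    uintptr_t fp;   /* frame pointer, used here to remember the incoming sp */
};

/* Save the incoming sp, allocate the frame, then round sp down to align. */
void prologue(struct regs *r, uintptr_t frame_size, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    r->fp = r->sp;
    r->sp -= frame_size;
    r->sp &= ~(align - 1);
}

/* Undo the realignment by restoring the saved value. */
void epilogue(struct regs *r)
{
    r->sp = r->fp;
}

/* Returns 1 if the frame is aligned, large enough, and sp is restored. */
int roundtrip_ok(uintptr_t sp, uintptr_t frame_size, uintptr_t align)
{
    struct regs r = { sp, 0 };
    prologue(&r, frame_size, align);
    int aligned = (r.sp & (align - 1)) == 0;
    int roomy = r.sp <= sp - frame_size;
    epilogue(&r);
    return aligned && roomy && r.sp == sp;
}
```

The model says nothing about the three items listed above. Tail calls, unwinding, and nested functions all need to know where the saved stack pointer lives, which is what makes them the hard cases.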

------- Comment #8 From Andrew Pinski 2007-10-03 22:07 -------
Subject: Re:  attribute((aligned)) doesn't work for variables on the stack for
greater than required alignment

On 3 Oct 2007 22:04:28 -0000, hjl at lucon dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #7 from hjl at lucon dot org  2007-10-03 22:04 -------
> What is the performance impact of
>
> http://gcc.gnu.org/ml/gcc-patches/2007-05/msg01167.html

The performance impact is none if the variables don't need to be aligned.
Otherwise you get a small penalty at the very beginning for the
alignment of the variable itself.  Really this is only to be used with
big alignments like 128-byte alignment (for use with a DMA system
like in the Cell).

-- Pinski

------- Comment #9 From H.J. Lu 2007-10-03 22:17 -------
(In reply to comment #8)
> Subject: Re:  attribute((aligned)) doesn't work for variables on the stack for
> greater than required alignment
> 
> On 3 Oct 2007 22:04:28 -0000, hjl at lucon dot org
> <gcc-bugzilla@gcc.gnu.org> wrote:
> >
> >
> > ------- Comment #7 from hjl at lucon dot org  2007-10-03 22:04 -------
> > What is the performance impact of
> >
> > http://gcc.gnu.org/ml/gcc-patches/2007-05/msg01167.html
> 
> The performance impact is none if the variables don't need to be aligned.
> Otherwise you get a small penalty at the very beginning for the
> alignment of the variable itself.  Really this is only to be used with
> big alignments like 128-byte alignment (for use with a DMA system
> like in the Cell).
> 

What is the performance impact if the stack alignment adjustment is required
in all functions with floating point variables on the stack?

------- Comment #10 From H.J. Lu 2007-10-03 22:27 -------
For backends with a frame pointer and a working -fomit-frame-pointer -g, can
we

1. Make -fomit-frame-pointer per function, instead of per file.
2. Enable -fomit-frame-pointer for functions which need stack alignment.
3. Mark the frame pointer as reserved and use it to save the stack
pointer.

and make sure that

1. tail call optimization.
2. stack unwind.
3. nested functions.
4. inline functions.

work properly?

------- Comment #11 From Andrew Pinski 2007-10-03 22:40 -------
(In reply to comment #7)
> It saves the stack pointer in the frame pointer. Can we implement it for
> suitable cases/backends and properly handle

This only helps x86 really.  If you look at my patch, it already (correctly)
handles large cases like 128-byte alignment (which people use with the Cell).
What you are proposing will cause more stack to be used than actually required
and will be more complex for the normal case.  If you look at my patch, you
will see it handles issues 1-4 nicely without any problems (because the stack
itself is not realigned).  Also, on PPC the stack pointer has to be correct, so
you cannot use the frame pointer to hold the old stack pointer.

------- Comment #12 From H.J. Lu 2007-10-04 00:01 -------
(In reply to comment #11)

> This only helps x86 really.  If you look at my patch, it already (correctly)
> handles large cases like 128-byte alignment (which people use with the Cell).
> What you are proposing will cause more stack to be used than actually required
> and will be more complex for the normal case.  If you look at my patch, you
> will see it handles issues 1-4 nicely without any problems (because the stack
> itself is not realigned).  Also, on PPC the stack pointer has to be correct, so
> you cannot use the frame pointer to hold the old stack pointer.

Does your patch handle register spills which need a larger alignment? What
is the impact of your approach on performance when stack alignment is needed
for local variables as well as register spills?

------- Comment #13 From Andrew Pinski 2008-01-03 17:47 -------
I am getting tired of pinging this patch.  I guess if nobody wants to comment,
that is up to them.

------- Comment #14 From H.J. Lu 2008-07-31 01:04 -------
(In reply to comment #0)
> Running the program below compiled with "-mpreferred-stack-boundary=2"
> gets a "segmentation fault" because the variable "tmp"
> is not properly aligned on a 16-byte boundary (required for
> movaps), violating the aligned(16) request in the attribute.
> 
> void f()
> {
>   unsigned long tmp[4] __attribute__((aligned(16)));
>   asm("movaps %%xmm0, (%0)" : : "r" (tmp) : "memory");
> }
> 
> int main()
> {
>   f();
> }

This should work with gcc 4.4 revision 138335.

------- Comment #15 From Andrew Pinski 2008-07-31 01:05 -------
Subject: Re:  attribute((aligned)) doesn't work for variables on the stack for
greater than required alignment

> This should work with gcc 4.4 revision 138335.

Only on x86 and not on any other target ...

-- Pinski

------- Comment #16 From H.J. Lu 2009-03-05 14:02 -------
*** Bug 39373 has been marked as a duplicate of this bug. ***

------- Comment #17 From bernd_afa 2009-04-16 09:22 -------
I get the same alignment problem on the 68k AmigaOS target. The report and the
fix are old.

It's a middle-end bug, and I see the fix is not in the source I downloaded
(4.3.3). I can test this patch if you like, or do you have something newer?

Here is the latest mail I got on the GCC mailing list:

http://gcc.gnu.org/ml/gcc/2009-04/msg00395.html

------- Comment #18 From David Conrad 2009-08-20 08:49 -------
This still doesn't work on ARM either (tested with 4.4.0). The EABI only
mandates that the stack be 8-byte aligned, and GCC silently clips any alignment
request above 8 bytes to 8 (so even if the stack were 16-byte aligned by
accident, the variables still wouldn't be).

Even a simple sp -= sp & (align-1) for every function with variables needing
more alignment would be faster than unaligned NEON loads/stores.
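
The adjustment suggested above is plain integer arithmetic and can be sketched as follows. The function name realign_sp is hypothetical, a downward-growing stack and a power-of-two alignment are assumed, and a real prologue would also have to keep the original value somewhere in order to restore it on return.

```c
#include <assert.h>
#include <stdint.h>

/* Apply sp -= sp & (align - 1): drop sp to the nearest lower multiple of align. */
uintptr_t realign_sp(uintptr_t sp, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    sp -= sp & (align - 1);   /* same result as sp & ~(align - 1) */
    return sp;
}
```

For example, with an 8-byte aligned sp of 0x1008 and a requested alignment of 16, the adjustment subtracts 8 and yields 0x1000. An sp that is already aligned is left unchanged.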
