[Bug target/53383] Allow -mpreferred-stack-boundary=3 on x86-64

Sun Jul 5 21:04:00 GMT 2015

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383

--- Comment #22 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Andy Lutomirski from comment #21)

> $ touch foo.c
> $ gcc -c -mno-sse -mpreferred-stack-boundary=3 -mincoming-stack-boundary=3 foo.c
> foo.c:1:0: error: -mincoming-stack-boundary=3 is not between 4 and 12
> $ gcc -c -mno-sse -mpreferred-stack-boundary=3 foo.c

I will fix it.

> > 
> > > This makes no sense, since they should be equivalent.
> > > 
> > > Also, I find the docs to be unclear as to what different values of the
> > > incoming and preferred stack boundaries mean.
> > > 
> > > Finally, why is -mno-sse required in order to set a low stack boundary? 
> > > Couldn't gcc figure out that the existence of a stack variable (SSE,
> > > alignas, __attribute__((aligned(32))), etc) should force dynamic stack
> > > alignment? 
> > 
> > Since the x86-64 psABI says that the stack must be 16-byte aligned, code
> > with SSE insns that follows the psABI will crash when called with an
> > 8-byte aligned stack.
> 
> I'm confused here.  I agree in principle, but I don't actually think that
> gcc works this way, or, if it does, it shouldn't.
> 
> If I compile with -mpreferred-stack-boundary=3 and create an aligned(32)
> local variable, then gcc will dynamically align the stack and the variable
> will have correct alignment even if the incoming stack was not 16-byte
> aligned.

That is correct.
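
A minimal sketch of the case described in the quoted paragraph, for anyone who wants to check it locally. The names is_aligned and check_local_alignment are hypothetical and not from this bug report. The function asks for a 32-byte aligned local and reports whether the address it received honours that request.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is a multiple of align (a power of two). */
static int is_aligned(const void *p, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* noinline so the function keeps its own frame and its own prologue. */
__attribute__((noinline)) int check_local_alignment(void)
{
    /* Over-aligned local: the compiler has to realign the stack
       dynamically if the incoming alignment is lower than 32 bytes. */
    __attribute__((aligned(32))) char buf[32];

    buf[0] = 1;
    return is_aligned(buf, 32);
}
```

Calling check_local_alignment() from a test program built with and without -mno-sse -mpreferred-stack-boundary=3 should return 1 in both cases if the dynamic realignment behaves as described above.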

> Sure, the generated SSE code will be less efficient with
> -mpreferred-stack-boundary=3 (because neither "and $-16,%rsp" nor the
> required frame pointer is free), but it should still work, right?

It works only if ALL code is compiled with -mpreferred-stack-boundary=3.
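
To illustrate why mixing the two conventions is a problem, here is a rough model using plain integer arithmetic. It is not real code generation: the function names and the fixed offset of 24 are assumptions chosen for the example. One callee trusts the psABI guarantee that (rsp + 8) is a multiple of 16 at entry, the other realigns the way "and $-16,%rsp" would.

```c
#include <stdint.h>

/* Models "and $-16,%rsp": round a stack pointer value down to 16 bytes. */
uintptr_t realign_down16(uintptr_t sp)
{
    return sp & ~(uintptr_t)15;
}

/* Models a callee that trusts the psABI. It assumes (sp + 8) % 16 == 0 at
   entry and places a 16-byte slot at a fixed offset below the entry sp. */
uintptr_t slot_trusting_abi(uintptr_t sp_at_entry)
{
    return sp_at_entry - 24;
}

/* Models a callee that makes no assumption and realigns dynamically. */
uintptr_t slot_with_realign(uintptr_t sp_at_entry)
{
    return realign_down16(sp_at_entry - 16);
}
```

With an entry value such as 0x1008 (psABI-conforming caller) both slots are 16-byte aligned. With 0x1000 (a caller that only keeps 8-byte alignment) the slot from slot_trusting_abi is misaligned by 8, which is where an aligned SSE load or store would fault, while slot_with_realign still yields an aligned slot.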