This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/53383] Allow -mpreferred-stack-boundary=3 on x86-64
- From: "luto at mit dot edu" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 05 Jul 2015 20:49:22 +0000
- Subject: [Bug target/53383] Allow -mpreferred-stack-boundary=3 on x86-64
- Auto-submitted: auto-generated
- References: <bug-53383-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
--- Comment #21 from Andy Lutomirski <luto at mit dot edu> ---
(In reply to H.J. Lu from comment #20)
> (In reply to Andy Lutomirski from comment #19)
> > I don't think the fix is correct.
> >
> > This works:
> >
> > gcc -mno-sse -mpreferred-stack-boundary=3 ...
> >
> > This does not:
> >
> > gcc -mno-sse -mpreferred-stack-boundary=3 -mincoming-stack-boundary=3 ...
> >
>
> Please provide a testcase.
No code needed:
$ touch foo.c
$ gcc -c -mno-sse -mpreferred-stack-boundary=3 -mincoming-stack-boundary=3 foo.c
foo.c:1:0: error: -mincoming-stack-boundary=3 is not between 4 and 12
$ gcc -c -mno-sse -mpreferred-stack-boundary=3 foo.c
>
> > This makes no sense, since they should be equivalent.
> >
> > Also, I find the docs to be unclear as to what different values of the
> > incoming and preferred stack boundaries mean.
> >
> > Finally, why is -mno-sse required in order to set a low stack boundary?
> > Couldn't gcc figure out that the existence of a stack variable (SSE,
> > alignas, __attribute__((aligned(32))), etc) should force dynamic stack
> > alignment?
>
> Since the x86-64 psABI says that stack must be 16 byte aligned, if the stack
> isn't 16-byte aligned, the code with SSE insn, which follows the psABI,
> will crash when called with 8-byte aligned stack.
I'm confused here. I agree in principle, but I don't actually think that gcc
works this way, or, if it does, it shouldn't.
If I compile with -mpreferred-stack-boundary=3 and create an aligned(32) local
variable, then gcc will dynamically align the stack and the variable will have
correct alignment even if the incoming stack was not 16-byte aligned.
Shouldn't an SSE variable work exactly the same way? That is, if gcc is
generating an SSE instruction with a memory reference to an on-stack variable
that requires 16-byte alignment (movdqa, for example), wouldn't that variable
be effectively aligned(16) or greater and thus trigger dynamic stack alignment?
Sure, the generated SSE code will be less efficient with
-mpreferred-stack-boundary=3 (because neither "and $-16,%rsp" nor the required
frame pointer is free), but it should still work, right?