This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[RFC PATCH, i386]: Allow -mincoming-stack-boundary=3 for x86_64
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Cc: "H.J. Lu" <hjl dot tools at gmail dot com>
- Date: Sun, 4 Oct 2015 17:26:53 +0200
- Subject: [RFC PATCH, i386]: Allow -mincoming-stack-boundary=3 for x86_64
Hello!
As shown in PR 66697 [1] and a WineHQ bug [2], an application can
misalign the incoming stack to less than the ABI-mandated 16 bytes. While it
is possible to use -mincoming-stack-boundary=2 (= 4 bytes) on 32-bit
targets to emit stack realignment code, the option's minimum is artificially
limited to 4 (= 16 bytes) on 64-bit targets.
The attached patch lowers this limit to 3 (= 8 bytes, which is in
practice the smallest alignment the incoming stack can have) for 64-bit
targets. The "outside" code is out of the user's control, so the last
resort is -mincoming-stack-boundary=3, which emits realignment code for
all functions.
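To make the encoding explicit: the option argument is the base-2 logarithm of the alignment in bytes, so 2, 3 and 4 stand for 4, 8 and 16 bytes. The following is only a standalone sketch, not code from GCC. The helpers boundary_to_bytes, min_boundary_old, min_boundary_new and boundary_ok are hypothetical names, and the target flags are reduced to plain booleans. It mirrors the minimum check before and after the patch, together with the upper bound of 12 visible in the patch context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: an option argument N means 2^N bytes.  */
static unsigned
boundary_to_bytes (int n)
{
  assert (n >= 0 && n < 32);
  return 1u << n;
}

/* Minimum accepted argument before the patch:
   4 for 64-bit with SSE, 3 for 64-bit without SSE, 2 for 32-bit.  */
static int
min_boundary_old (bool is_64bit, bool has_sse)
{
  return is_64bit ? (has_sse ? 4 : 3) : 2;
}

/* Minimum accepted argument with the patch applied:
   3 for any 64-bit target, 2 for 32-bit.  */
static int
min_boundary_new (bool is_64bit)
{
  return is_64bit ? 3 : 2;
}

/* Range check modelled on the patch context: MIN <= ARG <= 12.  */
static bool
boundary_ok (int arg, int min)
{
  return arg >= min && arg <= 12;
}
```

With these helpers, boundary_ok (3, min_boundary_old (true, true)) is false, while boundary_ok (3, min_boundary_new (true)) is true, which is the whole effect of the change.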
So, for the following testcase:
--cut here--
typedef float v4sf __attribute__((vector_size(16)));

v4sf test (v4sf a, v4sf b)
{
  volatile v4sf z = a + b;
  return z;
}
--cut here--
gcc -O2 -mincoming-stack-boundary=3 generates:
0000000000000000 <test>:
   0:   4c 8d 54 24 08          lea    0x8(%rsp),%r10
   5:   0f 58 c8                addps  %xmm0,%xmm1
   8:   48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
   c:   41 ff 72 f8             pushq  -0x8(%r10)
  10:   55                      push   %rbp
  11:   48 89 e5                mov    %rsp,%rbp
  14:   41 52                   push   %r10
  16:   0f 29 4d e0             movaps %xmm1,-0x20(%rbp)
  1a:   0f 28 45 e0             movaps -0x20(%rbp),%xmm0
  1e:   41 5a                   pop    %r10
  20:   5d                      pop    %rbp
  21:   49 8d 62 f8             lea    -0x8(%r10),%rsp
  25:   c3                      retq
instead of:
0000000000000000 <test>:
   0:   0f 58 c8                addps  %xmm0,%xmm1
   3:   0f 29 4c 24 e8          movaps %xmm1,-0x18(%rsp)
   8:   0f 28 44 24 e8          movaps -0x18(%rsp),%xmm0
   d:   c3                      retq
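For readers less familiar with the prologue in the first listing: the and $0xfffffffffffffff0,%rsp instruction rounds the stack pointer down to a multiple of 16, which is what makes the later movaps accesses safe even when the caller supplied only 8-byte alignment. A minimal sketch of that arithmetic in C follows. The helpers align_down_16 and is_aligned are hypothetical names chosen for illustration, and the addresses in the usage note are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Round SP down to a 16-byte boundary, the same masking that
   `and $0xfffffffffffffff0,%rsp` applies to %rsp.  */
static uint64_t
align_down_16 (uint64_t sp)
{
  return sp & ~(uint64_t) 0xf;
}

/* Return nonzero if ADDR is a multiple of 2^BOUNDARY bytes, using the
   same log2 encoding as -mincoming-stack-boundary.  */
static int
is_aligned (uint64_t addr, unsigned boundary)
{
  assert (boundary < 64);
  return (addr & ((UINT64_C (1) << boundary) - 1)) == 0;
}
```

For example, an incoming stack pointer of 0x7ffc1238 satisfies is_aligned (sp, 3) but not is_aligned (sp, 4); align_down_16 turns it into 0x7ffc1230, which is 16-byte aligned. The second listing skips this step and relies on the ABI guarantee instead, so its movaps to -0x18(%rsp) is only safe when the caller kept the stack 16-byte aligned.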
IMO, the additional stack realignment code is also a good punishment for
a rogue application :)
2015-10-04  Uros Bizjak  <ubizjak@gmail.com>

	* config/i386/i386.c (ix86_option_override_internal): Lower minimum
	allowed incoming stack boundary to 3 also for 64bit SSE targets.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66697
[2] https://bugs.winehq.org/show_bug.cgi?id=27680
Uros.
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 228460)
+++ config/i386/i386.c (working copy)
@@ -5102,8 +5102,7 @@ ix86_option_override_internal (bool main_args_p,
   ix86_incoming_stack_boundary = ix86_default_incoming_stack_boundary;
   if (opts_set->x_ix86_incoming_stack_boundary_arg)
     {
-      int min = (TARGET_64BIT_P (opts->x_ix86_isa_flags)
-                 ? (TARGET_SSE_P (opts->x_ix86_isa_flags) ? 4 : 3) : 2);
+      int min = TARGET_64BIT_P (opts->x_ix86_isa_flags) ? 3 : 2;
       if (opts->x_ix86_incoming_stack_boundary_arg < min
           || opts->x_ix86_incoming_stack_boundary_arg > 12)