Bug 81769 - Unnecessary stack realign with -mavx
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 8.0
Importance: P3 normal
Target Milestone: 8.0
Assignee: Not yet assigned to anyone
Reported: 2017-08-08 12:39 UTC by H.J. Lu
Modified: 2017-10-24 11:18 UTC

Target: x86


Description H.J. Lu 2017-08-08 12:39:54 UTC
[hjl@gnu-6 pr59501]$ cat k.i
typedef int v8si __attribute__ ((vector_size (32)));

void
foo (unsigned long long *idx, v8si *out_start, v8si *regions)
{
  if (*idx < 20 ) {
    v8si base = regions[*idx];
    *out_start = base;
  }
}
[hjl@gnu-6 pr59501]$ make k.s
/export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O -mregparm=3 -m32 -mavx -S k.i
[hjl@gnu-6 pr59501]$ cat k.s
	.file	"k.i"
	.text
	.globl	foo
	.type	foo, @function
foo:
.LFB0:
	.cfi_startproc
	pushl	%ebp
	.cfi_def_cfa_offset 8
	.cfi_offset 5, -8
	movl	%esp, %ebp
	.cfi_def_cfa_register 5
	pushl	%ebx
	andl	$-32, %esp	# <<< This realignment isn't needed.
	.cfi_offset 3, -12
	movl	(%eax), %ebx
	cmpl	$0, 4(%eax)
	jne	.L1
	cmpl	$19, %ebx
	ja	.L1
	sall	$5, %ebx
	vmovdqa	(%ecx,%ebx), %ymm0
	vmovdqa	%ymm0, (%edx)
.L1:
	movl	-4(%ebp), %ebx
	leave
	.cfi_restore 5
	.cfi_restore 3
	.cfi_def_cfa 4, 4
	ret
	.cfi_endproc
.LFE0:
	.size	foo, .-foo
	.ident	"GCC: (GNU) 8.0.0 20170807 (experimental)"
	.section	.note.GNU-stack,"",@progbits
[hjl@gnu-6 pr59501]$
Comment 1 hjl@gcc.gnu.org 2017-09-05 16:39:55 UTC
Author: hjl
Date: Tue Sep  5 16:39:24 2017
New Revision: 251718

URL: https://gcc.gnu.org/viewcvs?rev=251718&root=gcc&view=rev
Log:
i386: Avoid stack realignment if possible

ix86_finalize_stack_frame_flags has been extended to eliminate the frame
pointer when a new stack frame isn't needed, with and without
-maccumulate-outgoing-args as well as -fomit-frame-pointer.  Since stack
accesses with larger alignment may be optimized out, deciding whether
stack realignment is needed requires checking not only for stack frame
accesses but also the alignment of those accesses.  Since the alignment
of memory accesses via arg_pointer is set up by the caller, not by the
callee, we should find the maximum stack alignment from the stack frame
access instructions that go through the stack pointer and frame pointer,
and avoid stack realignment when the stack alignment needed is less than
the incoming stack boundary.

gcc/

	PR target/59501
	PR target/81624
	PR target/81769
	* config/i386/i386.c (ix86_finalize_stack_frame_flags): Don't
	realign stack if stack alignment needed is less than incoming
	stack boundary.

gcc/testsuite/

	PR target/59501
	PR target/81624
	PR target/81769
	* gcc.target/i386/pr59501-4a.c: Remove xfail.
	* gcc.target/i386/pr81769-1a.c: New test.
	* gcc.target/i386/pr81769-1b.c: Likewise.
	* gcc.target/i386/pr81769-2.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr81769-1a.c
    trunk/gcc/testsuite/gcc.target/i386/pr81769-1b.c
    trunk/gcc/testsuite/gcc.target/i386/pr81769-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/i386/pr59501-4a.c
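
The decision rule described in the commit log of Comment 1 can be sketched as a small standalone C model. This is only an illustration under stated assumptions, not the code in config/i386/i386.c: the types `enum base_reg` and `struct mem_access` and the functions `max_frame_access_alignment` and `needs_stack_realign` are hypothetical names invented here, alignments are given in bits, and the sketch treats "needed alignment equal to the incoming boundary" as not requiring realignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: which register a memory access is based on.  */
enum base_reg
{
  BASE_STACK_POINTER,
  BASE_FRAME_POINTER,
  BASE_ARG_POINTER,	/* Alignment is set up by the caller.  */
  BASE_OTHER		/* E.g. a pointer argument held in a register.  */
};

/* Hypothetical model of one memory access left in the function body.  */
struct mem_access
{
  enum base_reg base;
  unsigned int align_bits;	/* Alignment the access requires.  */
};

/* Return the largest alignment required by accesses that go through
   the stack pointer or frame pointer.  Accesses via the arg pointer
   or any other register are ignored, since the callee's own stack
   alignment does not affect them.  */
static unsigned int
max_frame_access_alignment (const struct mem_access *acc, size_t n)
{
  unsigned int max_align = 0;

  assert (acc != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if ((acc[i].base == BASE_STACK_POINTER
	 || acc[i].base == BASE_FRAME_POINTER)
	&& acc[i].align_bits > max_align)
      max_align = acc[i].align_bits;

  return max_align;
}

/* Realign only when some remaining stack frame access needs more
   alignment than the caller already guarantees on entry.  */
static bool
needs_stack_realign (const struct mem_access *acc, size_t n,
		     unsigned int incoming_boundary_bits)
{
  return max_frame_access_alignment (acc, n) > incoming_boundary_bits;
}
```

Modelling foo from the description this way, the two 256-bit vector accesses go through %ecx and %edx (`BASE_OTHER`), and the only stack access is the 32-bit save of %ebx. Assuming an incoming stack boundary of 128 bits, `needs_stack_realign` returns false, which matches the report's claim that `andl $-32, %esp` is unnecessary. A function that spills a 256-bit vector to its own frame would still return true.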
Comment 2 H.J. Lu 2017-10-24 11:18:40 UTC
Fixed.