Bug 61296 - Excessive alignment in ix86_data_alignment
Summary: Excessive alignment in ix86_data_alignment
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.9.0
: P3 normal
Target Milestone: 5.0
Assignee: Not yet assigned to anyone
URL:
Keywords:
: 54167 70451 (view as bug list)
Depends on:
Blocks:
 
Reported: 2014-05-23 16:49 UTC by H.J. Lu
Modified: 2021-09-12 07:47 UTC (History)
6 users (show)

See Also:
Host:
Target: x86
Build:
Known to work:
Known to fail:
Last reconfirmed: 2014-12-11 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description H.J. Lu 2014-05-23 16:49:45 UTC
ix86_data_alignment was introduced by

https://gcc.gnu.org/ml/gcc-patches/2000-06/msg00871.html

It aligns struct larger than 32 bytes to 32 bytes. This change

https://gcc.gnu.org/ml/gcc-patches/2001-03/msg01356.html

aligns struct/union/array bigger than 16 bytes to 16 bytes, which
causes PR 56564 and leads to DATA_ABI_ALIGNMENT.  When OPT is false,
ix86_data_alignment may return alignment bigger than ABI required.
It happens when the references is bound to the current definition.
It improves the performance when data can be accessed with the biggest
alignment.  If data is aligned bigger than the biggest alignment, we
may not get performance benefit while wasting alignment padding.
Comment 1 H.J. Lu 2014-05-27 18:23:44 UTC
The comdat definition needs to the biggest alignment
generated by any compilers.
Comment 2 H.J. Lu 2014-05-30 16:52:49 UTC
After r199898, DATA_ALIGNMENT is only for optimization purposes.
Align struct >= 64 bytes to 64 bytes may increase data size due
to excessive alignment.
Comment 3 H.J. Lu 2014-06-11 16:38:41 UTC
Comments for DATA_ALIGNMENT

   One use of this macro is to increase alignment of medium-size
   data to make it all fit in fewer cache lines.  Another is to
   cause character arrays to be word-aligned so that `strcpy' calls
   that copy constants to character arrays can be done inline.

was added by

https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=ed45e834f305d1f2709bf200a13d5beebc2fcfee

to improve x86 FP performance, which might be partially copied from
CONSTANT_ALIGNMENT:

https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=f7d6703c5d83fc9fb06246d6eb49e9b61098045c
Comment 4 Jan Hubicka 2014-12-11 17:21:42 UTC
Confirmed. I think this is counts as a regression because the alignment slowly increased as SSE and AVX was added (and it is regression to pre-2001 compilers).  We currently produce 5-15% bigger data sections than LLVM on Firefox/libreoffice.
Comment 5 Jakub Jelinek 2014-12-11 17:33:17 UTC
I doubt we can do anything about it though, decreasing DATA_ALIGNMENT would break backwards compatibility with older gcc versions.
Comment 6 H.J. Lu 2014-12-11 17:37:56 UTC
(In reply to Jakub Jelinek from comment #5)
> I doubt we can do anything about it though, decreasing DATA_ALIGNMENT would
> break backwards compatibility with older gcc versions.

We need a testcase to verify the claim.
If it is true, GCC is incompatible with
ICC nor LLVM. We aren't conform to psABI.
Comment 7 Jakub Jelinek 2014-12-11 23:24:56 UTC
See discussions when I've added DATA_ABI_ALIGNMENT.
Comment 8 H.J. Lu 2014-12-16 13:21:10 UTC
(In reply to Jakub Jelinek from comment #7)
> See discussions when I've added DATA_ABI_ALIGNMENT.

DATA_ABI_ALIGNMENT was added for PR 56564:

/* Similar to DATA_ALIGNMENT, but for the cases where the ABI mandates
   some alignment increase, instead of optimization only purposes.  E.g.
   AMD x86-64 psABI says that variables with array type larger than 15 bytes
   must be aligned to 16 byte boundaries.

   If this macro is not defined, then ALIGN is used.  */

#define DATA_ABI_ALIGNMENT(TYPE, ALIGN) \
  ix86_data_alignment ((TYPE), (ALIGN), false)

and ix86_data_alignment was changed by

https://gcc.gnu.org/ml/gcc-patches/2014-01/msg01086.html

There is a discussion we should always align to DATA_ABI_ALIGNMENT:

https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01435.html

My question was "Should we limit DATA_ALIGNMENT to MAX (ABI alignment,
natural alignment)?"

GCC will only use the excessive alignment with locally defined data,
which has no psABI implications:

[hjl@gnu-6 pr61296]$ cat z.c
struct foo
{
  char i[128];
};

struct foo x = { 1 };
struct foo y = { 1 };
struct foo z = { 1 };

void
bar ()
{
  int i;

  for (i = 0; i < sizeof (x.i); i++)
    x.i[i] = y.i[i] + z.i[i];
}

[hjl@gnu-6 pr61296]$ /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O3 -mavx2 -S z.c -fPIC   
[hjl@gnu-6 pr61296]$ cat z.s
	.file	"z.c"
	.section	.text.unlikely,"ax",@progbits
.LCOLDB0:
	.text
.LHOTB0:
	.p2align 4,,15
	.globl	bar
	.type	bar, @function
bar:
.LFB0:
	.cfi_startproc
	leaq	8(%rsp), %r10
	.cfi_def_cfa 10, 0
	movq	y@GOTPCREL(%rip), %rdi
	andq	$-32, %rsp
	pushq	-8(%r10)
	pushq	%rbp
	.cfi_escape 0x10,0x6,0x2,0x76,0
	movq	%rsp, %rbp
	pushq	%r12
	pushq	%r10
	.cfi_escape 0xf,0x3,0x76,0x70,0x6
	.cfi_escape 0x10,0xc,0x2,0x76,0x78
	movq	%rdi, %r10
	pushq	%rbx
	.cfi_escape 0x10,0x3,0x2,0x76,0x68
	negq	%r10
	andl	$31, %r10d
	je	.L7
	movq	x@GOTPCREL(%rip), %r8
	movq	z@GOTPCREL(%rip), %r9
	movl	%r10d, %esi
	xorl	%eax, %eax
	xorl	%edx, %edx
	movl	$128, %r11d
	.p2align 4,,10
	.p2align 3
.L3:
	movzbl	(%rdi,%rax), %ecx
	addl	$1, %edx
	addb	(%r9,%rax), %cl
	movb	%cl, (%r8,%rax)
	movl	%r11d, %ecx
	addq	$1, %rax
	subl	%edx, %ecx
	cmpl	%r10d, %edx
	jne	.L3
	movl	%edx, %eax
	movl	%ecx, %ebx
	movl	$96, %r11d
	movl	$3, %r12d
.L2:
	leaq	(%r9,%rax), %rdx
	leaq	(%rdi,%rax), %r10
	addq	%r8, %rax
	cmpl	$4, %r12d
	vmovdqu	(%rdx), %xmm0
	vinserti128	$0x1, 16(%rdx), %ymm0, %ymm0
	vpaddb	(%r10), %ymm0, %ymm0
	vmovups	%xmm0, (%rax)
	vextracti128	$0x1, %ymm0, 16(%rax)
	vmovdqu	32(%rdx), %xmm0
	vinserti128	$0x1, 48(%rdx), %ymm0, %ymm0
	vpaddb	32(%r10), %ymm0, %ymm0
	vmovups	%xmm0, 32(%rax)
	vextracti128	$0x1, %ymm0, 48(%rax)
	vmovdqu	64(%rdx), %xmm0
	vinserti128	$0x1, 80(%rdx), %ymm0, %ymm0
	vpaddb	64(%r10), %ymm0, %ymm0
	vmovups	%xmm0, 64(%rax)
	vextracti128	$0x1, %ymm0, 80(%rax)
	jne	.L4
	vmovdqu	96(%rdx), %xmm0
	vinserti128	$0x1, 112(%rdx), %ymm0, %ymm0
	vpaddb	96(%r10), %ymm0, %ymm0
	vmovups	%xmm0, 96(%rax)
	vextracti128	$0x1, %ymm0, 112(%rax)
.L4:
	leal	(%r11,%rsi), %eax
	subl	%r11d, %ecx
	cmpl	%r11d, %ebx
	leal	(%rax,%rcx), %esi
	je	.L10
	.p2align 4,,10
	.p2align 3
.L5:
	movslq	%eax, %rdx
	addl	$1, %eax
	movzbl	(%r9,%rdx), %ecx
	addb	(%rdi,%rdx), %cl
	cmpl	%esi, %eax
	movb	%cl, (%r8,%rdx)
	jne	.L5
.L10:
	vzeroupper
	popq	%rbx
	popq	%r10
	.cfi_remember_state
	.cfi_def_cfa 10, 0
	popq	%r12
	popq	%rbp
	leaq	-8(%r10), %rsp
	.cfi_def_cfa 7, 8
	ret
	.p2align 4,,10
	.p2align 3
.L7:
	.cfi_restore_state
	movl	$128, %r11d
	movl	$4, %r12d
	movl	$128, %ebx
	xorl	%eax, %eax
	movl	$128, %ecx
	xorl	%esi, %esi
	movq	x@GOTPCREL(%rip), %r8
	movq	z@GOTPCREL(%rip), %r9
	jmp	.L2
	.cfi_endproc
.LFE0:
	.size	bar, .-bar
	.section	.text.unlikely
.LCOLDE0:
	.text
.LHOTE0:
	.globl	z
	.data
	.align 64
	.type	z, @object
	.size	z, 128
z:
	.byte	1
	.zero	127
	.globl	y
	.align 64
	.type	y, @object
	.size	y, 128
y:
	.byte	1
	.zero	127
	.globl	x
	.align 64
	.type	x, @object
	.size	x, 128
x:
	.byte	1
	.zero	127
	.ident	"GCC: (GNU) 5.0.0 20141205 (experimental)"
	.section	.note.GNU-stack,"",@progbits

Do you have a testcase to show decreasing DATA_ALIGNMENT would break
backwards compatibility with older gcc versions?
Our data show that the excessive alignment doesn't improve performance.
Comment 9 Jakub Jelinek 2014-12-16 13:31:23 UTC
(In reply to H.J. Lu from comment #8)
> Do you have a testcase to show decreasing DATA_ALIGNMENT would break
> backwards compatibility with older gcc versions?

Older GCC versions used DATA_ALIGNMENT (what is now DATA_ABI_ALIGNMENT) on MEM_ALIGN etc., and gradually more and more optimizations relied on it, even when the data was defined in some other compilation unit or was comdat or could be interposed.  See e.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564#c9
Comment 10 H.J. Lu 2014-12-16 13:40:59 UTC
(In reply to Jakub Jelinek from comment #9)
> (In reply to H.J. Lu from comment #8)
> > Do you have a testcase to show decreasing DATA_ALIGNMENT would break
> > backwards compatibility with older gcc versions?
> 
> Older GCC versions used DATA_ALIGNMENT (what is now DATA_ABI_ALIGNMENT) on
> MEM_ALIGN etc., and gradually more and more optimizations relied on it, even
> when the data was defined in some other compilation unit or was comdat or
> could be interposed.  See e.g.
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56564#c9

I still don't see that limit DATA_ALIGNMENT to MAX (ABI alignment,
natural alignment) will cause additional ABI problems which don't
exist today.
Comment 11 Jakub Jelinek 2014-12-16 13:47:15 UTC
If you hit the assumption beyond what ABI mandates on some public symbol issue in some older GCC version, then sure, if you have that public symbol defined by ICC, it will misbehave.  But, if it is compiled with current GCC without your proposed changes, it will work fine, newer GCCs don't assume anything beyond ABI mandates alignments, unless they control the definition and uses bind locally to it, but still align as optimization as much or more as older GCC used to align.
With your proposed changes that would be no longer true.  What is so hard to understand on it?
Comment 12 H.J. Lu 2014-12-16 13:53:25 UTC
(In reply to Jakub Jelinek from comment #11)
> If you hit the assumption beyond what ABI mandates on some public symbol
> issue in some older GCC version, then sure, if you have that public symbol
> defined by ICC, it will misbehave.  But, if it is compiled with current GCC
> without your proposed changes, it will work fine, newer GCCs don't assume
> anything beyond ABI mandates alignments, unless they control the definition
> and uses bind locally to it, but still align as optimization as much or more
> as older GCC used to align.
> With your proposed changes that would be no longer true.  What is so hard to
> understand on it?

What is the maximum alignment the older GCCs may assume?
Comment 13 Jakub Jelinek 2014-12-16 13:58:05 UTC
Read the sources?  It really depends on many factors.
Comment 14 H.J. Lu 2014-12-16 15:51:48 UTC
I got

[hjl@gnu-6 pr61296]$ cat a.c
struct foo
{
  char i[128];
};

struct foo x = { 1 };
[hjl@gnu-6 pr61296]$ make a.s
/export/build/gnu/gcc-misc/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/gcc-misc/build-x86_64-linux/gcc/ -O3 -mavx2 -S a.c
[hjl@gnu-6 pr61296]$ cat a.s
	.file	"a.c"
	.globl	x
	.data
	.align 64
	.type	x, @object
	.size	x, 128
x:
	.byte	1
	.zero	127
	.ident	"GCC: (GNU) 5.0.0 20141216 (experimental)"
	.section	.note.GNU-stack,"",@progbits
[hjl@gnu-6 pr61296]$ 

Which older GCC expects 64-byte alignment here?  We should limit
DATA_ALIGNMENT to the maximum alignment actually expected by the
older GCC, which I believe is 32 byte, and have an option to limit
DATA_ALIGNMENT to MAX (ABI alignment, natural alignment).
Comment 15 H.J. Lu 2014-12-16 19:24:40 UTC
(In reply to Jakub Jelinek from comment #13)
> Read the sources?  It really depends on many factors.

There are

  int max_align_compat
    = optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);

With -Os, we aren't compatible with the older GCC anyway.
Comment 16 hjl@gcc.gnu.org 2014-12-17 14:23:29 UTC
Author: hjl
Date: Wed Dec 17 14:22:57 2014
New Revision: 218818

URL: https://gcc.gnu.org/viewcvs?rev=218818&root=gcc&view=rev
Log:
Add -malign-data={abi|compat|cachineline}

Add -malign-data={abi|compat,cachineline} to control how GCC aligns
variables.  "compat" uses increased alignment value compatible with
GCC 4.8 and earlier, "abi" uses alignment value as specified by the
psABI, and "cacheline" uses increased alignment value to match the
cache line size.  "compat" is the default.

gcc/

	PR target/61296
	* config/i386/i386-opts.h (ix86_align_data): New enum.
	* config/i386/i386.c (ix86_data_alignment): Return the ABI
	alignment value for -malign-data=abi, the cachine line size
	for -malign-data=cachineline and the older GCC compatible
	alignment value for for -malign-data=compat.
	* config/i386/i386.opt (malign-data=): New.
	* doc/invoke.texi: Document -malign-data=.

gcc/testsuite/

	PR target/61296
	* gcc.target/i386/pr61296-2.c: New.
	* gcc.target/i386/pr61296-2.c: Likewise.
	* gcc.target/i386/pr61296-3.c: Likewise.
	* gcc.target/i386/pr61296-4.c: Likewise.
	* gcc.target/i386/pr61296-5.c: Likewise.
	* gcc.target/i386/pr61296-6.c: Likewise.
	* gcc.target/i386/pr61296-7.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr61296-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr61296-2.c
    trunk/gcc/testsuite/gcc.target/i386/pr61296-3.c
    trunk/gcc/testsuite/gcc.target/i386/pr61296-4.c
    trunk/gcc/testsuite/gcc.target/i386/pr61296-5.c
    trunk/gcc/testsuite/gcc.target/i386/pr61296-6.c
    trunk/gcc/testsuite/gcc.target/i386/pr61296-7.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386-opts.h
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.opt
    trunk/gcc/doc/invoke.texi
    trunk/gcc/testsuite/ChangeLog
Comment 17 H.J. Lu 2014-12-17 14:50:26 UTC
Fixed in GCC 5.
Comment 18 H.J. Lu 2016-03-30 11:10:36 UTC
*** Bug 70451 has been marked as a duplicate of this bug. ***
Comment 19 Andrew Pinski 2021-09-12 07:47:18 UTC
*** Bug 54167 has been marked as a duplicate of this bug. ***