This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Severe problems with vectorizing stuff in 4.0.3 HEAD


> It indicated that sibling calling optimization in main should
> be disabled for targets that need to up the stack alignment,
> otherwise you get the stack alignment of a lower one than

While that may be true, I think the problem is broader.

I took out the main1() function and put it into a separate
file, and compiled just that. So now there is no carnal
knowledge of main or its stack alignment. The generated
code for this stand-alone main1() makes no attempt to
align the stack or the stack variables it is going to be
passing to the movdqa instruction. Unless that's what you
mean by:
> that is required.  You have to look to see what changed
> between 3.4.0 and 4.0.0 that caused this since it is a
> regression.  I think the issue is that we are detecting them
> at the tree level but not rejecting them when expanding.  So you
> have to look at the expand functions for that.

You're using internals verbiage that's beyond me :) I'm a simple porter; I have very little understanding of the actual internals of GCC.

> The reason why nobody noticed this before is because most x86 OS's
> nowadays align their stack going into main as 16-byte aligned
> which was what my comment about fixing your OS was about, it was
> more of a joke rather than anything else.
OK, I apologise, Andrew. I took it as a SCO-bash. My bad.

However, I don't think the stack being aligned on a 16-byte
boundary into main will help, unless GCC is assuming (and I
don't see how it possibly could) that every function would
likewise be aligned. The fact that a stand-alone version of
main1() was not correctly aligned leads me to believe that
the real error is that gcc is not making an attempt to
align the stack variables for use by the alignment-sensitive
vector insns.

Also, when you say "stack going into main is 16 byte aligned",
what specifically do you mean? That it's 16-byte aligned before
the call to main() itself? That at the first insn in main, most
likely a push %ebp, it's 16-byte aligned (i.e. does the call
to main from crt1.o have to take the push of the return address
into account)?

Kean

PS, here is the generated assembly for main1() as a stand-alone
function, nothing else defined in the .c file:

	.file	"foo.c"
	.version	"01.01"
	.section	.rodata
	.align 32
	.type	C.0.1458, @object
	.size	C.0.1458, 32
C.0.1458:
	.long	0
	.long	3
	.long	6
	.long	9
	.long	12
	.long	15
	.long	18
	.long	21
	.text
	.align 16
	.globl	main1
	.type	main1, @function
main1:
	pushl	%ebp
	movl	$8, %ecx
	movl	%esp, %ebp
	pushl	%edi
	cld
	pushl	%esi
	leal	-40(%ebp), %edi
	subl	$64, %esp
	movl	$C.0.1458, %esi
	rep
	movsl
	xorl	%edx, %edx
	leal	-40(%ebp), %esi
	leal	-72(%ebp), %ecx
	.align 16
.L2:
	leal	0(,%edx,4), %eax
	addl	$4, %edx
	cmpl	$8, %edx
	movdqa	(%esi,%eax), %xmm0
	movdqa	%xmm0, (%ecx,%eax)
	jne	.L2
	movb	$1, %dl
	.align 16
.L4:
	movl	-4(%ecx,%edx,4), %eax
	cmpl	-4(%esi,%edx,4), %eax
	jne	.L14
	incl	%edx
	cmpl	$9, %edx
	jne	.L4
	addl	$64, %esp
	xorl	%eax, %eax
	popl	%esi
	popl	%edi
	popl	%ebp
	ret
.L14:
	call	abort
	.size	main1, .-main1
	.ident	"GCC: (GNU) 4.0.3 20051013 (prerelease)"

# cat foo.c
#define N 8

int main1 ()
{
  int b[N] = {0,3,6,9,12,15,18,21};
  int a[N];
  int i;

  for (i = 0; i < N; i++)
    {
      a[i] = b[i];
    }

  /* check results:  */
  for (i = 0; i < N; i++)
    {
      if (a[i] != b[i])
        abort ();
    }

  return 0;
}

