This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



ix86 `double' alignment (was Re: egcs-1.1 release schedule)


>  In message <rzqaf79nru7.fsf@djlvig.dl.ac.uk> you write:
>  > _Please_ include some means of allowing Fortran (at least) to get
>  > stack-allocated doubles double-aligned on x86 (modulo libc).  (I hope
>  > I haven't missed this going in at some stage!)  The one-line patch for
>  > STACK_BOUNDARY used by g77 0.5.22 is good enough.
>I'm still waiting on some kind of solution that doesn't totally break
>the ABI.

I'm willing to do some serious work to make this happen for 1.1,
which assumes it can be done in the next couple of weeks, right?

>To do this "right" you have to:
>
>	* Make sure gcc always allocates stack in multiples of 8 bytes,
>	  adding dummy outgoing args as necessary to keep the stack
>	  properly aligned at call points.
>
>	  You can't do this with STACK_BOUNDARY since that says we
>	  will 100% always have a properly aligned stack, which can
>	  never be true since we might be linking in code from
>	  another compiler which didn't keep the stack suitably
>	  aligned.

For Fortran code, we can usually hand-wave that; this case would
only come up when the call tree has an *embedded* procedure
that doesn't maintain proper alignment, and since the big
computational problem with g77 performance is in code compiled
by g77, and such code is rarely called by C code, I don't think
this would represent a huge deficiency.

>	  If the stack gets mis-aligned relative to STACK_BOUNDARY
>	  combine could end up removing a seemingly useless
>	  stack operation/address calculation.

I don't understand this, but presumably I need to look into it
further.

>	  The idea is to make sure the stack is 8 byte aligned in the
>	  common cases, but not absolutely rely on it for correct code
>	  generation.

Absolutely.

>	* Second, assuming that gcc always keeps the pointer aligned
>	  for itself, then arrange for doubles to end up 8 byte
>	  aligned relative to the stack pointer.
>
>	  If the stack gets mis-aligned due to an old module, then
>	  our doubles won't be aligned correctly, but the vast majority
>	  of the time they will be suitably aligned.
>
>	  I don't think there's any mechanism to do this when the
>	  desired alignment is less than STACK_BOUNDARY.  In fact
>	  I know that to be the case since I worked on a similar
>	  problem recently.

Okay, that makes sense to me.  We want to hit a majority of cases
anyway.  We don't care (for now) about cases where users are
combining multiple languages in weird ways, for example.

>	* The ABI is still going to mandate that some doubles in
>	  argument lists are going to be mis-aligned.  We'd have
>	  to arrange to copy them from the arglist into a suitable
>	  stack slot.  This may be more trouble than it's worth.

I'm not sure how this can ever happen in the x86 architecture?

Well, I mean, not when passing arguments by reference, which is
generally how g77 works anyway.
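
For by-value arguments, the case raised above can be sketched as
follows.  This is only an illustration, assuming the usual i386
convention that each argument occupies a whole number of 4-byte
slots; the helper name arg_offsets is hypothetical and is not
anything in gcc.

```c
#include <stddef.h>

/* Hypothetical helper: compute the byte offset of each argument from
   the first argument slot, assuming every argument is rounded up to
   a multiple of 4 bytes (the assumed i386 slot size).
   Returns the total size of the argument block in bytes. */
static size_t arg_offsets(const size_t *sizes, size_t n, size_t *offsets)
{
    size_t off = 0;
    for (size_t i = 0; i < n; i++) {
        offsets[i] = off;
        off += (sizes[i] + 3) & ~(size_t)3;  /* round up to a 4-byte slot */
    }
    return off;
}
```

For a callee taking (int, double), the sizes are {4, 8} and the
double lands at offset 4, so even when the argument block itself
starts on an 8-byte boundary the double does not.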

>Note that some non-ABI breaking changes to align doubles and other
>values have gone into the x86 compiler.  In particular we should be
>properly aligning all data in the static store.

Right.  The Next Big Thing is to, by default, 64-bit-align any
stack-based VAR_DECLs.  Just doing that would be Great.

What I'd like to see, and think wouldn't be too hard, is a change
that'd leave TYPE_ALIGN for doubles at 32, so g77 would still
be able to produce COMMON and EQUIVALENCE blocks ("aggregates")
containing doubles without breaking the ABI or rejecting
standard-conforming code.  (Never mind that g77 already does this
for systems like SPARC; SPARC users expect that, apparently,
while x86 users don't, mostly.)

But this change would set DECL_ALIGN for stack-based VAR_DECLs
to 64, and implement that, presumably by assuring that the
stack frame is itself 64-bit aligned.
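
The distinction between the alignment of the type and the alignment
of one particular declaration can be mimicked in plain C.  This is a
rough analogy only, not gcc internals: #pragma pack(4) stands in for
the x86 ABI's 32-bit alignment of doubles inside aggregates, and C11
alignas stands in for a 64-bit DECL_ALIGN on a stack variable.  The
names common_block and local_double_is_8_aligned are made up for the
example.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Aggregate layout with doubles only 4-byte aligned, standing in for
   a COMMON block laid out under the x86 ABI rules. */
#pragma pack(push, 4)
struct common_block {
    int    i;
    double d;   /* sits at offset 4 under the 4-byte packing */
};
#pragma pack(pop)

/* A stack variable whose declaration asks for 8-byte alignment,
   independent of how doubles are laid out inside aggregates.
   Returns 1 if the local ended up on an 8-byte boundary. */
static int local_double_is_8_aligned(void)
{
    alignas(8) double x = 0.0;
    return ((uintptr_t)&x % 8) == 0;
}
```

The aggregate keeps its ABI layout while the local is aligned more
strictly, which is the split described above.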

What I don't know (having not looked into it in any detail) is
how best to ensure the stack frame is 64-bit aligned.  Presumably
%esp will always be 32-bit aligned upon entry to any procedure
(according to the ABI; perhaps the hardware?).  Is it reasonable
to just subtract an extra 8 bytes when creating the frame
pointer upon procedure entry and then AND it with ~7 (that is,
clear its low three bits) to align it?  Or would that make for
problems with debugger, profiling, and/or exception support, or
is there no quick way to mask the frame pointer like that on the
x86?
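
The masking step asked about above (clearing the low three bits of
an address) looks like this in C.  It is a sketch of the arithmetic
only, not of what gcc would emit in a prologue; align_down and
align_slack are hypothetical names.

```c
#include <stdint.h>

/* Round addr down to a multiple of align.
   align must be a power of two (8 for 64-bit alignment). */
static uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    return addr & ~(align - 1);
}

/* Number of bytes given up by rounding addr down to align. */
static uintptr_t align_slack(uintptr_t addr, uintptr_t align)
{
    return addr - align_down(addr, align);
}
```

Since the stack grows downward, rounding down is the safe direction.
If the incoming pointer is already 32-bit aligned, the slack for an
8-byte boundary is either 0 or 4 bytes.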

It seems like everyone else thinks the right way to do this is
to try to always assure %esp is 64-bit aligned across calls by
modifying all the code that is in the procedure-call chain.
That probably means an extra dummy push before odd-number-of-args
calls, etc., right?
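
The padding rule for that approach can be written down as
arithmetic.  The sketch below assumes the stack pointer is on the
desired boundary before any arguments are pushed and that the goal
is to be back on that boundary once they all have been; it ignores
the return address pushed by the call itself.  call_padding is a
hypothetical name.

```c
#include <stddef.h>

/* Bytes of dummy outgoing arguments needed so that arg_bytes plus
   the padding is a multiple of boundary (e.g. 8). */
static size_t call_padding(size_t arg_bytes, size_t boundary)
{
    return (boundary - arg_bytes % boundary) % boundary;
}
```

With 4-byte argument slots and an 8-byte boundary this gives 4 bytes
(one dummy push) for an odd number of slots and 0 for an even
number, and the caller pays it whether or not the callee has any
doubles.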

The reason I'd generally prefer the former approach to the latter
is that either one is likely to cost some performance, but the
latter *always* costs performance since the caller doesn't know
whether the callee uses doubles, whereas the former costs only
when the procedure doing the extra dance actually uses doubles.
(Whether we can teach gcc to not do the AND(%ebp,~7) if there
are no doubles on the stack is another issue.)

        tq vm, (burley)

