[PATCH] 69780 - [4.9/5/6 Regression] ICE on __builtin_alloca_with_align, with small alignment

Wed Feb 17 22:51:00 GMT 2016

On Wed, Feb 17, 2016 at 01:48:26PM -0700, Martin Sebor wrote:
> I shifted the alignment so that it could be compared against
> MAX_STACK_ALIGNMENT.  But after some searching it seems as though
> MAX_STACK_ALIGNMENT is in bits, rather than bytes as I had assumed,
> so I've removed the shift.

The reason why MAX_STACK_ALIGNMENT is wrong is that on most targets
it is terribly small number (a couple of bytes usually), only i?86/x86_64 is
an exception, because it is the only target that supports dynamic stack
realignment.
All other targets do support more aligned variables than that, but because
they don't support dynamic stack realignment, they handle those more aligned
automatic variables by doing alloca instead.  Which is exactly why we need
__builtin_alloca_with_align to support those larger alignments.
There is no inherent reason why __builtin_alloca_with_align can't support
arbitrary (power of 2 > BITS_PER_UNITS of course) alignments, as long as
it fits into address space and the alignment doesn't run into other memory,
but that is the general problem of alloca, it is up to the user to ensure
he doesn't run out of the stack, and the alignment is no different.

> It's not obvious to me that this is guaranteed to be correct.
> IMO, even if it happens to be, I find it much clearer to check
> against MAX_STACK_ALIGNMENT (or whatever macro describes the
> limit if not this one).

See above, there is no macro describing such limit, it is solely about
doing pretty much __builtin_alloca (size + alignment - 1);
and realign the pointer.

> That would be incorrect because ~0U isn't neither a power of 2, nor
> the enforced stack alignment.  Again, using the actual limit encoded
> in MAX_STACK_ALIGNMENT seems correct and IMO results in much clearer
> code.

No, see above.  And, if you want the exact largest possible power of 2
smaller than ~0U, you can use (unsigned int) INTTYPE_MINIMUM (int).

> I don't mind waiting a bit for the documentation review, but I do
> feel it's important to update the documentation at the same time
> as making the change to the interface of the builtin.  The GCC
> Coding Conventions even requires it:
> 
>   Any change to documented behavior (for example, the behavior of
>   a command-line option or a GNU language extension) must include
>   the necessary changes to the manual.

But the builtin (which IMHO really was never meant to be user accessible,
but has been added before we had internal functions) is not documented yet.
So, the documentation can be added before or after that IMHO.

> FWIW, since I haven't noticed a clear preference for either of
> these two styles in the code base I decided to count the number
> of occurrences of each to see if one is prevalent.  Although
> the results are mildly in favor of the style you suggest, they
> clearly indicate the lack of consensus:

First of all, you are also counting the static (typically aggregate)
initializers, which are indeed often written as
static ... var =
{
....
};
But even then the greps I've done were like 2440 vs. 819, and that also
included the file scope initializers.

	Jakub