This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH: PR middle-end/36253: Caller-save stack slot may not have proper alignment
On Sun, May 25, 2008 at 8:33 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, May 25, 2008 at 07:07:49PM +0200, Richard Guenther wrote:
>> On Sun, May 25, 2008 at 6:31 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> > On Sun, May 25, 2008 at 06:17:02PM +0200, Richard Guenther wrote:
>> >> On Sat, May 17, 2008 at 10:20 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> >> > On Sat, May 17, 2008 at 08:10:54AM -0700, H.J. Lu wrote:
>> >> >> setup_save_areas calls assign_stack_local to allocate a stack slot for
>> >> >> a hard register in the widest mode. assign_stack_local has
>> >> >>
>> >> >> if (align == 0)
>> >> >> {
>> >> >> tree type;
>> >> >>
>> >> >> if (mode == BLKmode)
>> >> >> alignment = BIGGEST_ALIGNMENT;
>> >> >> else
>> >> >> alignment = GET_MODE_ALIGNMENT (mode);
>> >> >>
>> >> >> /* Allow the target to (possibly) increase the alignment of this
>> >> >> stack slot. */
>> >> >> type = lang_hooks.types.type_for_mode (mode, 0);
>> >> >> if (type)
>> >> >> alignment = LOCAL_ALIGNMENT (type, alignment);
>> >> >>
>> >> >> alignment /= BITS_PER_UNIT;
>> >> >> }
>> >> >>
>> >> >> For x87, the widest mode is XF, which is 12byte aligned at 4 byte.
>> >> >> So a 12byte stack slot aligned at 4 byte is allocated for an x87
>> >> >> register. For x87, DF is 8 byte aligned at 8 byte. If it turns out
>> >> >> later that we need to save/restore DF instead XF, we load/store the DF
>> >> >> register at 4 byte instead of 8 byte. It won't give us the best x87
>> >> >> performance.
>> >> >>
>> >> >> This patch fixes this problem by passing NULL type and mode to
>> >> >> LOCAL_ALIGNMENT so that the backend can provide the largest alignment
>> >> >> for a caller-save stack slot.
>> >> >>
>> >> >> This patch moves the duplicated LOCAL_ALIGNMENT definitions in
>> >> >> cfgexpand.c and function.c to defaults.h. It also removed
>> >> >> the duplicated stack slot alignment code.
>> >> >>
>> >> >> Is this a right approach?
>> >>
>> >> First of all I don't like the overloading of LOCAL_ALIGNMENT for the
>> >> special case of
>> >> allocating caller save stack slots. But thinking about it anything
>> >> else wouldn't make
>> >> it simpler (but maybe easier to understand).
>> >>
>> >> Second, changes like this:
>> >>
>> >> -#define LOCAL_ALIGNMENT(TYPE,ALIGN) ((ALIGN) > 128 ? (ALIGN) : 128)
>> >> +#define LOCAL_ALIGNMENT(TYPE,MODE,ALIGN) \
>> >> + ((!(TYPE) || (ALIGN) > 128) ? (ALIGN) : 128)
>> >>
>> >> do change the alignment of stack slots to less alignment. So I'd like
>> >> all the affected
>> >> target maintainers to have a look or you to make the change a no-op
>> >> for all affected
>> >> targets instead.
>> >
>> > When LOCAL_ALIGNMENT is used on stack slot, type isn't NULL:
>> >
>> > if (type)
>> > alignment = LOCAL_ALIGNMENT (type, alignment);
>> >
>> > That is if type is NULL, alignment is unchanged. My patch shouldn't
>> > reduce alignment on stack slots.
>>
>> It changes alignment if the type is NULL in which case 128 was used before
>> and now we use ALIGN. If that never happens, why check type at all? Thus,
>
> LOCAL_ALIGNMENT is never used on the NULL type:
>
> if (type)
> alignment = LOCAL_ALIGNMENT (type, alignment);
>
>> why not do
>>
>> >> +#define LOCAL_ALIGNMENT(TYPE,MODE,ALIGN) \
>> >> + ((ALIGN) > 128) ? (ALIGN) : 128)
>>
>> leaving the macro basically unchanged?
>
> LOCAL_ALIGNMENT is only used on stack slots with
>
> if (type)
> alignment = LOCAL_ALIGNMENT (type, alignment);
>
> and on variables:
>
> align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
>
> where TREE_TYPE (decl) shouldn't be NULL. That is if type is NULL,
> alignment will be unchanged. My patch changes alignment of stack slot
> from
>
> if (type)
> alignment = LOCAL_ALIGNMENT (type, alignment);
>
> to
>
> alignment = LOCAL_ALIGNMENT (type, mode, alignment);
>
> If we use
>
> +#define LOCAL_ALIGNMENT(TYPE,MODE,ALIGN) \
> + ((ALIGN) > 128) ? (ALIGN) : 128)
>
> the alignment of stack slot will be capped to 128. That is different
> from before the patch. Basically, my patch allows a NULL type to be
> passed to LOCAL_ALIGNMENT and lets each backend decide the alignment
> of stack slots. If the type is NULL, the mode alignment will be used,
> which is the same as before, unless a backend wants something
> different, like x86.
Ok, I see. I like the overloading of LOCAL_ALIGNMENT even less than before
given the amount of questions I needed to ask to get at the difference.
Richard.