This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: PATCH: PR middle-end/36253: Caller-save stack slot may not have proper alignment


On Sun, May 25, 2008 at 8:33 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, May 25, 2008 at 07:07:49PM +0200, Richard Guenther wrote:
>> On Sun, May 25, 2008 at 6:31 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> > On Sun, May 25, 2008 at 06:17:02PM +0200, Richard Guenther wrote:
>> >> On Sat, May 17, 2008 at 10:20 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> >> > On Sat, May 17, 2008 at 08:10:54AM -0700, H.J. Lu wrote:
>> >> >> setup_save_areas calls assign_stack_local to allocate a stack slot for
>> >> >>  a hard register in the widest mode. assign_stack_local has
>> >> >>
>> >> >>   if (align == 0)
>> >> >>     {
>> >> >>       tree type;
>> >> >>
>> >> >>       if (mode == BLKmode)
>> >> >>         alignment = BIGGEST_ALIGNMENT;
>> >> >>       else
>> >> >>         alignment = GET_MODE_ALIGNMENT (mode);
>> >> >>
>> >> >>       /* Allow the target to (possibly) increase the alignment of this
>> >> >>          stack slot.  */
>> >> >>       type = lang_hooks.types.type_for_mode (mode, 0);
>> >> >>       if (type)
>> >> >>         alignment = LOCAL_ALIGNMENT (type, alignment);
>> >> >>
>> >> >>       alignment /= BITS_PER_UNIT;
>> >> >>     }
>> >> >>
>> >> >> For x87, the widest mode is XF, which is 12byte aligned at 4 byte.
>> >> >> So a 12byte stack slot aligned at 4 byte is allocated for an x87
>> >> >> register.  For x87, DF is 8 byte aligned at 8 byte. If it turns out
>> >> >> later that we need to save/restore DF instead XF, we load/store the DF
>> >> >> register at 4 byte instead of 8 byte. It won't give us the best x87
>> >> >> performance.
>> >> >>
>> >> >> This patch fixes this problem by passing NULL type and mode to
>> >> >> LOCAL_ALIGNMENT so that the backend can provide the largest alignment
>> >> >> for a caller-save stack slot.
>> >> >>
>> >> >> This patch moves the duplicated LOCAL_ALIGNMENT definitions in
>> >> >> cfgexpand.c and function.c to defaults.h.  It also removed
>> >> >> the duplicated stack slot alignment code.
>> >> >>
>> >> >> Is this a right approach?
>> >>
>> >> First of all I don't like the overloading of  LOCAL_ALIGNMENT for the
>> >> special case of
>> >> allocating caller save stack slots.  But thinking about it anything
>> >> else wouldn't make
>> >> it simpler (but maybe easier to understand).
>> >>
>> >> Second, changes like this:
>> >>
>> >> -#define LOCAL_ALIGNMENT(TYPE,ALIGN) ((ALIGN) > 128 ? (ALIGN) : 128)
>> >> +#define LOCAL_ALIGNMENT(TYPE,MODE,ALIGN) \
>> >> +  ((!(TYPE) || (ALIGN) > 128) ? (ALIGN) : 128)
>> >>
>> >> do change the alignment of stack slots to less alignment.  So I'd like
>> >> all the affected
>> >> target maintainers to have a look or you to make the change a no-op
>> >> for all affected
>> >> targets instead.
>> >
>> > When LOCAL_ALIGNMENT is used on stack slot, type isn't NULL:
>> >
>> >  if (type)
>> >    alignment = LOCAL_ALIGNMENT (type, alignment);
>> >
>> > That is if type is NULL, alignment is unchanged.  My patch shouldn't
>> > reduce alignment on stack slots.
>>
>> It changes alignment if the type is NULL in which case 128 was used before
>> and now we use ALIGN.  If that never happens, why check type at all?  Thus,
>
> LOCAL_ALIGNMENT is never used on the NULL type:
>
>  if (type)
>    alignment = LOCAL_ALIGNMENT (type, alignment);
>
>> why not do
>>
>> >> +#define LOCAL_ALIGNMENT(TYPE,MODE,ALIGN) \
>> >> +  ((ALIGN) > 128) ? (ALIGN) : 128)
>>
>> leaving the macro basically unchanged?
>
> LOCAL_ALIGNMENT is only used on stack slots with
>
>  if (type)
>    alignment = LOCAL_ALIGNMENT (type, alignment);
>
> and on variables:
>
>  align = LOCAL_ALIGNMENT (TREE_TYPE (decl), align);
>
> where TREE_TYPE (decl) shouldn't be NULL.  That is if type is NULL,
> alignment will be unchanged.  My patch changes alignment of stack slot
> from
>
>  if (type)
>    alignment = LOCAL_ALIGNMENT (type, alignment);
>
> to
>
>  alignment = LOCAL_ALIGNMENT (type, mode, alignment);
>
> If we use
>
> +#define LOCAL_ALIGNMENT(TYPE,MODE,ALIGN) \
> +  ((ALIGN) > 128) ? (ALIGN) : 128)
>
> the alignment of stack slot will be capped to 128. That is different
> from before the patch.  Basically, my patch allows a NULL type to be
> passed to LOCAL_ALIGNMENT and lets each backend decide the alignment
> of stack slots.  If the type is NULL, the mode alignment will be used,
> which is the same as before, unless a backend wants something
> different, like x86.

Ok, I see.  I like the overloading of LOCAL_ALIGNMENT even less than before
given the amount of questions I needed to ask to get at the difference.

Richard.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]