This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

PING^2: [PATCH] i386: Avoid stack realignment if possible


On Tue, Jul 25, 2017 at 7:54 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Jul 25, 2017 at 3:52 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Fri, Jul 14, 2017 at 4:46 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Fri, Jul 7, 2017 at 5:56 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Fri, Jul 07, 2017 at 09:58:42AM -0700, H.J. Lu wrote:
>>>>> On Fri, Dec 20, 2013 at 8:06 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>> > Hi!
>>>>> >
>>>>> > Honza recently changed the i?86 backend, so that it often doesn't
>>>>> > do -maccumulate-outgoing-args by default on x86_64.
>>>>> > Unfortunately, on some of the here included testcases this regressed
>>>>> > quite a bit the generated code.  As AVX vectors are used, the dynamic
>>>>> > realignment code needs to assume e.g. that some of them will need to be
>>>>> > spilled, and for -mno-accumulate-outgoing-args the code needs to set
>>>>> > need_drap early as well.  But in when emitting the prologue/epilogue,
>>>>> > if need_drap is set, we don't perform the optimization for leaf functions
>>>>> > which have zero size stack frame, thus we end up with uselessly doing
>>>>> > dynamic stack realignment, setting up DRAP that nothing uses and later on
>>>>> > restore everything back.
>>>>> >
>>>>> > This patch improves it, if the DRAP register isn't live at the start of
>>>>> > entry bb successor and we aren't going to realign the stack, we don't
>>>>> > need DRAP at all, and even if we need DRAP register, that can't be the sole
>>>>> > reason for doing stack realignment, the prologue code is able to set up DRAP
>>>>> > even without dynamic stack realignment.
>>>>> >
>>>>> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>>>> >
>>>>> > 2013-12-20  Jakub Jelinek  <jakub@redhat.com>
>>>>> >
>>>>> >         PR target/59501
>>>>> >         * config/i386/i386.c (ix86_save_reg): Don't return true for drap_reg
>>>>> >         if !crtl->stack_realign_needed.
>>>>> >         (ix86_finalize_stack_realign_flags): If drap_reg isn't live on entry
>>>>> >         and stack_realign_needed will be false, clear drap_reg and need_drap.
>>>>> >         Optimize leaf functions that don't need stack frame even if
>>>>> >         crtl->need_drap.
>>>>> >
>>>>> >         * gcc.target/i386/pr59501-1.c: New test.
>>>>> >         * gcc.target/i386/pr59501-1a.c: New test.
>>>>> >         * gcc.target/i386/pr59501-2.c: New test.
>>>>> >         * gcc.target/i386/pr59501-2a.c: New test.
>>>>> >         * gcc.target/i386/pr59501-3.c: New test.
>>>>> >         * gcc.target/i386/pr59501-3a.c: New test.
>>>>> >         * gcc.target/i386/pr59501-4.c: New test.
>>>>> >         * gcc.target/i386/pr59501-4a.c: New test.
>>>>> >         * gcc.target/i386/pr59501-5.c: New test.
>>>>> >         * gcc.target/i386/pr59501-6.c: New test.
>
> LGTM, assuming Jakub is OK with the patch.
>
> Thanks,
> Uros.

Jakub, can you take a look at this:

https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00400.html

Thanks.
>
>>>>> >
>>>>> > --- gcc/testsuite/gcc.target/i386/pr59501-4a.c.jj       2013-12-20 12:19:20.603212859 +0100
>>>>> > +++ gcc/testsuite/gcc.target/i386/pr59501-4a.c  2013-12-20 12:23:33.647881672 +0100
>>>>> > @@ -0,0 +1,8 @@
>>>>> > +/* PR target/59501 */
>>>>> > +/* { dg-do compile { target { ! ia32 } } } */
>>>>> > +/* { dg-options "-O2 -mavx -maccumulate-outgoing-args" } */
>>>>> > +
>>>>> > +#include "pr59501-3a.c"
>>>>> > +
>>>>> > +/* Verify no dynamic realignment is performed.  */
>>>>> > +/* { dg-final { scan-assembler-not "and\[^\n\r]*sp" { xfail *-*-* } } } */
>>>>> >
>>>>>
>>>>> Since DRAP isn't used with -maccumulate-outgoing-args, pr59501-4a.c was
>>>>> xfailed due to stack frame access via frame pointer instead of DARP.
>>>>> This patch finds the maximum stack alignment from the stack frame access
>>>>> instructions and avoids stack realignment if stack alignment needed is
>>>>> less than incoming stack boundary.
>>>>>
>>>>> I am testing this patch.  OK for trunk if there is no regression?
>>>>>
>>>>>
>>>>
>>>> We need to keep the preferred stack alignment as the minimum stack
>>>> alignment. Here is the updated patch.  Tested on x86-64.  OK for
>>>> trunk?
>>>>
>>>> Thanks.
>>>
>>> Hi Jakub,
>>>
>>> This patch fixes the xfailed testcase in your patch:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2013-12/msg01767.html
>>>
>>> Your original patch:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2011-11/msg01058.html
>>>
>>> assumes that any instructions accessing stack require stack
>>> realignment:
>>>
>>> +      FOR_EACH_BB (bb)
>>> +        {
>>> +          rtx insn;
>>> +  FOR_BB_INSNS (bb, insn)
>>> +    if (NONDEBUG_INSN_P (insn)
>>> + && requires_stack_frame_p (insn, prologue_used,
>>> +   set_up_by_prologue))
>>> +      {
>>> + crtl->stack_realign_needed = stack_realign;
>>> + crtl->stack_realign_finalized = true;
>>> + return;
>>> +      }
>>> + }
>>>
>>> This patch checks the actual alignment needed for any instructions
>>> accessing stack.  It skips stack realignment if the actual stack alignment
>>> needed is less than or equal to incoming stack alignment.
>>>
>>> Can you take look at it?
>>>
>>> Thanks.
>>>
>>
>> PING
>>
>> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00400.html
>>
>> --
>> H.J.



-- 
H.J.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]