This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: PATCH: PR middle-end/45678: [4.4/4.5/4.6 Regression] crash on vector code with -m32 -msse

From: "H.J. Lu" <hjl dot tools at gmail dot com>
To: Jakub Jelinek <jakub at redhat dot com>
Cc: Richard Henderson <rth at redhat dot com>, gcc-patches at gcc dot gnu dot org
Date: Fri, 17 Sep 2010 14:34:00 -0700
Subject: Re: PATCH: PR middle-end/45678: [4.4/4.5/4.6 Regression] crash on vector code with -m32 -msse
References: <20100917165833.GA17163@intel.com> <4C93A958.6000502@redhat.com> <20100917181519.GE1269@tyan-ft48-01.lab.bos.redhat.com> <AANLkTimJOFRkVb6WnJHgoY_dWU37jjmLCeOyMAFzc-Cj@mail.gmail.com> <AANLkTi=w1=O3j0R1Ym1ZhfpsSR5k=gBmgGhUm6hkamL6@mail.gmail.com> <AANLkTinZ9XNhEq0YhxnNNoW7YWqQaCEi7CybK8muKay+@mail.gmail.com> <AANLkTikVRyKD9kRbF_unro1qHUOywvCx-exG4bYd-ESC@mail.gmail.com>

On Fri, Sep 17, 2010 at 2:31 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, Sep 17, 2010 at 1:03 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Fri, Sep 17, 2010 at 12:18 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Fri, Sep 17, 2010 at 11:32 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Fri, Sep 17, 2010 at 11:15 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>> On Fri, Sep 17, 2010 at 10:46:00AM -0700, Richard Henderson wrote:
>>>>>> On 09/17/2010 09:58 AM, H.J. Lu wrote:
>>>>>> > ? ? PR middle-end/45678
>>>>>> > ? ? * cfgexpand.c (update_stack_alignment): New.
>>>>>> > ? ? (get_decl_align_unit): Use it.
>>>>>> > ? ? (expand_one_stack_var_at): Call update_stack_alignment.
>>>>>>
>>>>>> Ok.
>>>>>
>>>>> I don't deny this fixes the problem and probably is a good idea in any case,
>>>>> the question I have is just if this won't pessimize i?86 code too much,
>>>>> especially with non-default options where incoming stack boundary is smaller
>>>>> than PREFERRED_STACK_BOUNDARY.
>>>>> The thing is that this expand_one_stack_var_at is just an optimization,
>>>>> but we don't know if anything will actually make use of the increased
>>>>> alignment. ?If there will be just variables that need 32-bit alignment
>>>>> and say incoming stack alignment is also 32-bit, but preferred stack
>>>>> boundary is 128-bit, as soon as some variable will be placed in slot 16
>>>>> bytes from the base, we'll force realignment even when nothing actually
>>>>> needs it.
>>>>>
>>>>> So, I wonder if
>>>>> ? ? ?max_align = MAX (crtl->max_used_stack_slot_alignment,
>>>>> ? ? ? ? ? ? ? ? ? ? ? PREFERRED_STACK_BOUNDARY);
>>>>> shouldn't be replaced by something like:
>>>>> 1)
>>>>> ? ? ?max_align = MAX (crtl->max_used_stack_slot_alignment,
>>>>> ? ? ? ? ? ? ? ? ? ? ? SUPPORTS_STACK_ALIGNMENT
>>>>> ? ? ? ? ? ? ? ? ? ? ? ? STACK_BOUNDARY : PREFERRED_STACK_BOUNDARY);
>>>>> or:
>>>>> 2)
>>>>> ? ? ?max_align = MAX (crtl->max_used_stack_slot_alignment,
>>>>> ? ? ? ? ? ? ? ? ? ? ? SUPPORTS_STACK_ALIGNMENT
>>>>> ? ? ? ? ? ? ? ? ? ? ? ? INCOMING_STACK_BOUNDARY : PREFERRED_STACK_BOUNDARY);
>>>>> or:
>>>>> 3)
>>>>> ? ? ?max_align = SUPPORTS_STACK_ALIGNMENT
>>>>> ? ? ? ? ? ? ? ? ?? MAX (crtl->max_used_stack_slot_alignment, STACK_BOUNDARY)
>>>>> ? ? ? ? ? ? ? ? ?: PREFERRED_STACK_BOUNDARY;
>>>>> or:
>>>>> 4)
>>>>> ? ? ?max_align = SUPPORTS_STACK_ALIGNMENT
>>>>> ? ? ? ? ? ? ? ? ?? MAX (crtl->max_used_stack_slot_alignment,
>>>>> ? ? ? ? ? ? ? ? ? ? ? ? INCOMING_STACK_BOUNDARY)
>>>>> ? ? ? ? ? ? ? ? ?: PREFERRED_STACK_BOUNDARY;
>>>>> For 2) or 4) we'd need to do
>>>>> ?if (targetm.calls.update_stack_boundary)
>>>>> ? ?targetm.calls.update_stack_boundary ();
>>>>> earlier (before all the expand_one_stack_var_at calls).
>>>>> 1) and 3) are conservative choices for i?86, seems without H.J's patch gcc
>>>>> would often happily use just 32-bit alignment of the stack base and even if
>>>>> incoming stack boundary was 128-bit, it would tell other routines they don't
>>>>> need to bother with that alignment, but e.g. on x86-64 it would still assume
>>>>> max_align of 8 bytes initially, even when e.g. x86-64 doesn't allow setting
>>>>> PREFERRED_STACK_BOUNDARY below 128 bits. ?I think usually it doesn't buy us
>>>>> too much to align below INCOMING_STACK_BOUNDARY. ?So perhaps 2)/4), where
>>>>> needlessly bumping alignment in expand_one_stack_var_at too much would be
>>>>> far more costly (would cause stack realignment).
>>>>> The 1)/2) vs. 3)/4) difference is for !SUPPORTS_STACK_ALIGNMENT targets,
>>>>> I wonder as they are never realigning the stack whether we should ever
>>>>> use there higher alignment.
>>>>>
>>>>
>>>> I think it is related to
>>>>
>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44542
>>>>
>>>> One of my attempts may work.
>>>>
>>>> FWIW, I am fine with any solution as long as when we do
>>>>
>>>> DECL_ALIGN (decl) = align;
>>>>
>>>> we guarantee that it will be aligned at that boundary at run-time.
>>>
>>> I don't think we should use offset as alignment unless the variable
>>> should be aligned at offset bytes. ?My patch
>>>
>>> http://gcc.gnu.org/bugzilla/attachment.cgi?id=20922&action=view
>>>
>>> may align the local variable at boundary specified for the variable.
>>>
>>>
>>>
>>> --
>>> H.J.
>>>
>>
>> Using stack offset for local variable alignment is a bad idea. My patch
>> caused:
>>
>> FAIL: gcc.target/i386/incoming-9.c scan-assembler-not andl[\\t
>> ]*\\$-16,[\\t ]*%esp
>
> Here is a patch to align local variable with specified alignment.
> There is no need to call update_stack_alignment in
> expand_one_stack_var_at since we don't use its offset
> to increase is alignment. ? OK for trunk if there are regressions?

I meant "OK for trunk if there are NO regressions?"

H.J.
--
> Thanks.
>
>
> --
> H.J.
> ---gcc/
>
> 2010-09-17 ?H.J. Lu ?<hongjiu.lu@intel.com>
>
> ? ? ? ?PR target/44542
> ? ? ? ?PR middle-end/45678
> ? ? ? ?* cfgexpand.c (expand_one_stack_var_at): Take an alignment
> ? ? ? ?parameter and use it for alignment. ?Don't call
> ? ? ? ?update_stack_alignment.
> ? ? ? ?(expand_stack_vars): Call expand_one_stack_var_at with specified
> ? ? ? ?alignment.
> ? ? ? ?(expand_one_stack_var): Likewise.
>
> gcc/testsuite/
>
> 2010-09-17 ?H.J. Lu ?<hongjiu.lu@intel.com>
>
> ? ? ? ?PR middle-end/45678
> ? ? ? ?* gcc.dg/torture/pr45678-3.c: New.
> -
>



-- 
H.J.

References:
- PATCH: PR middle-end/45678: [4.4/4.5/4.6 Regression] crash on vector code with -m32 -msse
  - From: H.J. Lu
- Re: PATCH: PR middle-end/45678: [4.4/4.5/4.6 Regression] crash on vector code with -m32 -msse
  - From: Richard Henderson
- Re: PATCH: PR middle-end/45678: [4.4/4.5/4.6 Regression] crash on vector code with -m32 -msse
  - From: Jakub Jelinek
- Re: PATCH: PR middle-end/45678: [4.4/4.5/4.6 Regression] crash on vector code with -m32 -msse
  - From: H.J. Lu
- Re: PATCH: PR middle-end/45678: [4.4/4.5/4.6 Regression] crash on vector code with -m32 -msse
  - From: H.J. Lu
- Re: PATCH: PR middle-end/45678: [4.4/4.5/4.6 Regression] crash on vector code with -m32 -msse
  - From: H.J. Lu
- Re: PATCH: PR middle-end/45678: [4.4/4.5/4.6 Regression] crash on vector code with -m32 -msse
  - From: H.J. Lu

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]