This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: New regression on ARM Linux


On 31/03/15 12:08, Richard Biener wrote:
> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
> 
>> On 31/03/15 11:45, Richard Biener wrote:
>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
>>>
>>>> On 31/03/15 11:36, Richard Biener wrote:
>>>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
>>>>>
>>>>>> On 31/03/15 11:00, Richard Biener wrote:
>>>>>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
>>>>>>>
>>>>>>>> On 31/03/15 08:50, Richard Biener wrote:
>>>>>>>>> On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener <rguenther@suse.de> wrote:
>>>>>>>>>> On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence <alan.lawrence@arm.com> wrote:
>>>>>>>>>>> -O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes
>>>>>>>>>>> it.
>>>>>>>>>>>
>>>>>>>>>>> The problem appears to be in laying out arguments, specifically
>>>>>>>>>>> varargs. From
>>>>>>>>>>> the "good" -fdump-rtl-expand:
>>>>>>>>>>>
>>>>>>>>>>> (insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) [0
>>>>>>>>>>> S4 A32])
>>>>>>>>>>>         (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
>>>>>>>>>>>      (nil))
>>>>>>>>>>> (insn 19 18 20 2 (set (reg:DF 2 r2)
>>>>>>>>>>>         (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
>>>>>>>>>>>      (nil))
>>>>>>>>>>> (insn 20 19 21 2 (set (reg:SI 1 r1)
>>>>>>>>>>>         (reg:SI 113 [ b1 ])) reduced.c:14 -1
>>>>>>>>>>>      (nil))
>>>>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0)
>>>>>>>>>>>         (reg:SI 118)) reduced.c:14 -1
>>>>>>>>>>>      (nil))
>>>>>>>>>>> (call_insn 22 21 23 2 (parallel [
>>>>>>>>>>>             (set (reg:SI 0 r0)
>>>>>>>>>>>                 (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
>>>>>>>>>>> <function_decl 0x2ab50e856d00 __builtin_printf>) [0 __builtin_printf S4
>>>>>>>>>>> A32])
>>>>>>>>>>>
>>>>>>>>>>> The struct members are
>>>>>>>>>>> reg:SI 113 => int a;
>>>>>>>>>>> reg:DF 112 => double b;
>>>>>>>>>>> reg:SI 111 => int c;
>>>>>>>>>>>
>>>>>>>>>>> r0 gets the format string; r1 gets int a; r2+r3 get double b; int c is
>>>>>>>>>>> pushed
>>>>>>>>>>> into virtual-outgoing-args. In contrast, post-change to
>>>>>>>>>>> build_ref_of_offset, we get:
>>>>>>>>>>>
>>>>>>>>>>> (insn 17 16 18 2 (set (reg:SI 118)
>>>>>>>>>>>   (symbol_ref/v/f:SI ("*.LC1") [flags 0x82]  <var_decl 0x2ba57fa8d750
>>>>>>>>>>> *.LC1>)) reduced.c:14 -1
>>>>>>>>>>>      (nil))
>>>>>>>>>>> (insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107
>>>>>>>>>>> virtual-outgoing-args)
>>>>>>>>>>>                 (const_int 8 [0x8])) [0  S4 A64])
>>>>>>>>>>>         (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
>>>>>>>>>>>      (nil))
>>>>>>>>>>> (insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) [0
>>>>>>>>>>> S8 A64])
>>>>>>>>>>>         (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
>>>>>>>>>>>      (nil))
>>>>>>>>>>> (insn 20 19 21 2 (set (reg:SI 2 r2)
>>>>>>>>>>>         (reg:SI 113 [ b1 ])) reduced.c:14 -1
>>>>>>>>>>>      (nil))
>>>>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0)
>>>>>>>>>>>         (reg:SI 118)) reduced.c:14 -1
>>>>>>>>>>>      (nil))
>>>>>>>>>>> (call_insn 22 21 23 2 (parallel [
>>>>>>>>>>>             (set (reg:SI 0 r0)
>>>>>>>>>>>                 (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
>>>>>>>>>>> <function_decl 0x2ba57f843d00 __builtin_printf>) [0 __builtin_printf S4
>>>>>>>>>>> A32])
>>>>>>>>>>>
>>>>>>>>>>> r0 still gets the format string, but 'int b1.a' now goes in r2, and the
>>>>>>>>>>>
>>>>>>>>>>> double+following int are all pushed into virtual-outgoing-args. This is
>>>>>>>>>>> because
>>>>>>>>>>> arm_function_arg is fed a 64-bit-aligned int as type of the second
>>>>>>>>>>> argument (the
>>>>>>>>>>> type constructed by build_ref_for_offset); it then executes
>>>>>>>>>>> (aapcs_layout_arg,
>>>>>>>>>>> arm.c line ~~5914)
>>>>>>>>>>>
>>>>>>>>>>>   /* C3 - For double-word aligned arguments, round the NCRN up to the
>>>>>>>>>>>      next even number.  */
>>>>>>>>>>>   ncrn = pcum->aapcs_ncrn;
>>>>>>>>>>>   if ((ncrn & 1) && arm_needs_doubleword_align (mode, type))
>>>>>>>>>>>     ncrn++;
>>>>>>>>>>>
>>>>>>>>>>> Which changes r1 to r2. Passing -fno-tree-sra, or removing from the
>>>>>>>>>>> testcase
>>>>>>>>>>> "*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be fed a
>>>>>>>>>>> 32-bit-aligned int instead, which works as previously.
>>>>>>>>>>>
>>>>>>>>>>> Passing the same members of that struct in a non-vargs call, works ok -
>>>>>>>>>>> I think
>>>>>>>>>>> because these use the type of the declared parameters, rather than the
>>>>>>>>>>> provided
>>>>>>>>>>> arguments, and the former do not have the increased alignment from
>>>>>>>>>>> build_ref_for_offset.
>>>>>>>>>>
>>>>>>>>>> It doesn't make sense to use the alignment of passed values.  That looks like bs.
>>>>>>>>>>
>>>>>>>>>> This means that
>>>>>>>>>>
>>>>>>>>>> Int I __aligned__(8);
>>>>>>>>>>
>>>>>>>>>> Is passed differently than int.
>>>>>>>>>>
>>>>>>>>>> Arm_function_arg needs to be fixed.
>>>>>>>>>
>>>>>>>>> That is,
>>>>>>>>>
>>>>>>>>> typedef int myint __attribute__((aligned(8)));
>>>>>>>>>
>>>>>>>>> int main()
>>>>>>>>> {
>>>>>>>>>   myint i = 1;
>>>>>>>>>   int j = 2;
>>>>>>>>>   __builtin_printf("%d %d\n", i, j);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> or
>>>>>>>>>
>>>>>>>>> myint i;
>>>>>>>>> int j;
>>>>>>>>> myint *p = &i;
>>>>>>>>> int *q = &j;
>>>>>>>>>
>>>>>>>>> int main()
>>>>>>>>> {
>>>>>>>>>   __builtin_printf("%d %d", *p, *q);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> should behave the same.  There isn't a printf modifier for an "aligned int"
>>>>>>>>> because that sort of thing doesn't make sense.  Special-casing aligned vs.
>>>>>>>>> non-aligned values only makes sense for things passed by value on the stack.
>>>>>>>>> And then obviously only dependent on the functuion type signature, not
>>>>>>>>> on the type of the passed value.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I think the testcase is ill-formed.  Just because printf doesn't have
>>>>>>>> such a modifier, doesn't mean that another variadic function might not
>>>>>>>> have a means to detect when an object in the variadic list needs to be
>>>>>>>> over-aligned.  As such, the test should really be written as:
>>>>>>>
>>>>>>> A value doesn't have "alignment".  A function may have alignment
>>>>>>> requirements on its arguments, clearly printf doesn't.
>>>>>>>
>>>>>>
>>>>>> Values don't.  But types do and variadic functions are special in that
>>>>>> they derive their types from the types of the actual parameters passed
>>>>>> not from the formals in the prototype.  Any manipulation of the types
>>>>>> should be done in the front end, not in the back end.
>>>>>
>>>>> The following seems to help the testcase (by luck I'd say?).  It
>>>>> makes us drop alignment information from the temporaries that
>>>>> SRA creates as memory replacement.
>>>>>
>>>>> But I find it odd that on ARM passing *((aligned_int *)p) as
>>>>> vararg (only as varargs?) changes calling conventions independent
>>>>> of the functions type signature.
>>>>>
>>>>> Richard.
>>>>>
>>>>> Index: gcc/tree-sra.c
>>>>> ===================================================================
>>>>> --- gcc/tree-sra.c      (revision 221770)
>>>>> +++ gcc/tree-sra.c      (working copy)
>>>>> @@ -2012,7 +2012,9 @@ create_access_replacement (struct access
>>>>>        DECL_CONTEXT (repl) = current_function_decl;
>>>>>      }
>>>>>    else
>>>>> -    repl = create_tmp_var (access->type, "SR");
>>>>> +    repl = create_tmp_var (build_qualified_type
>>>>> +                            (TYPE_MAIN_VARIANT (access->type),
>>>>> +                             TYPE_QUALS (access->type)), "SR");
>>>>>    if (TREE_CODE (access->type) == COMPLEX_TYPE
>>>>>        || TREE_CODE (access->type) == VECTOR_TYPE)
>>>>>      {
>>>>>
>>>>
>>>> I guess we could do something similar (use TYPE_MAIN_VARIANT) in the
>>>> backend code - but then we'd always ignore any over-alignment
>>>> requirements (or under, for that matter).
>>>
>>> The question is whether those are even in the scope of AACPS...
>>>
>>
>> Technically, they aren't.  The argument passing rules are all based on
>> the alignment rules for fundamental types (and derived rules for
>> composite types based on those fundamental types) written in the AAPCS
>> itself.  There's no provision for over-aligning a data type, so all of
>> this is off-piste.
> 
> So in arm_needs_doubleword_align you could do
> 
> /* Return true if mode/type need doubleword alignment.  */
> static bool
> arm_needs_doubleword_align (machine_mode mode, const_tree type)
> {
>   return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
>           || (type
>               && (AGGREGATE_TYPE_P (type) || mode == BLKmode)
>               && TYPE_ALIGN (type) > PARM_BOUNDARY));
> }
> 
> ?  Thus always trust the 'mode' when it doesn't apply to an aggregate?
> 
> Richard.
> 

Experience has shown that the front/mid ends always end up buggering the
mode argument to the point where it even causes the wrong register class
to be chosen.  It really can't be trusted unless type is NULL
(libcalls).  The type, on the other hand, should always be correct and
should always be the preferred way to get the information needed.

R.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]