This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: New regression on ARM Linux
- From: Richard Earnshaw <Richard dot Earnshaw at foss dot arm dot com>
- To: Richard Biener <rguenther at suse dot de>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, Alan Lawrence <Alan dot Lawrence at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Marcus Shawcroft <Marcus dot Shawcroft at arm dot com>
- Date: Tue, 31 Mar 2015 13:15:15 +0100
- Subject: Re: New regression on ARM Linux
- Authentication-results: sourceware.org; auth=none
- References: <alpine dot LSU dot 2 dot 11 dot 1503111607530 dot 10796 at zhemvz dot fhfr dot qr> <55193E77 dot 3040401 at arm dot com> <55193EFB dot 6030009 at arm dot com> <alpine dot LSU dot 2 dot 11 dot 1503301501070 dot 31545 at zhemvz dot fhfr dot qr> <alpine dot LSU dot 2 dot 11 dot 1503301509150 dot 31545 at zhemvz dot fhfr dot qr> <55197DAE dot 1050004 at arm dot com> <16A8C5AA-994C-4FA3-BC25-8547166DC13C at suse dot de> <CAFiYyc1MOL8AAwtVhvYDYmSSo+MdGi2FFi-6ELfRxdsvDxtSew at mail dot gmail dot com> <551A6C46 dot 7010305 at foss dot arm dot com> <alpine dot LSU dot 2 dot 11 dot 1503311200090 dot 31545 at zhemvz dot fhfr dot qr> <551A729F dot 3090200 at foss dot arm dot com> <alpine dot LSU dot 2 dot 11 dot 1503311229310 dot 31545 at zhemvz dot fhfr dot qr> <551A7981 dot 30805 at foss dot arm dot com> <alpine dot LSU dot 2 dot 11 dot 1503311245090 dot 31545 at zhemvz dot fhfr dot qr> <551A7C29 dot 2080105 at foss dot arm dot com> <alpine dot LSU dot 2 dot 11 dot 1503311306050 dot 31545 at zhemvz dot fhfr dot qr>
On 31/03/15 12:08, Richard Biener wrote:
> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
>
>> On 31/03/15 11:45, Richard Biener wrote:
>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
>>>
>>>> On 31/03/15 11:36, Richard Biener wrote:
>>>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
>>>>>
>>>>>> On 31/03/15 11:00, Richard Biener wrote:
>>>>>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
>>>>>>>
>>>>>>>> On 31/03/15 08:50, Richard Biener wrote:
>>>>>>>>> On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener <rguenther@suse.de> wrote:
>>>>>>>>>> On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence <alan.lawrence@arm.com> wrote:
>>>>>>>>>>> -O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes
>>>>>>>>>>> it.
>>>>>>>>>>>
>>>>>>>>>>> The problem appears to be in laying out arguments, specifically
>>>>>>>>>>> varargs. From
>>>>>>>>>>> the "good" -fdump-rtl-expand:
>>>>>>>>>>>
>>>>>>>>>>> (insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) [0
>>>>>>>>>>> S4 A32])
>>>>>>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
>>>>>>>>>>> (nil))
>>>>>>>>>>> (insn 19 18 20 2 (set (reg:DF 2 r2)
>>>>>>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
>>>>>>>>>>> (nil))
>>>>>>>>>>> (insn 20 19 21 2 (set (reg:SI 1 r1)
>>>>>>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1
>>>>>>>>>>> (nil))
>>>>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0)
>>>>>>>>>>> (reg:SI 118)) reduced.c:14 -1
>>>>>>>>>>> (nil))
>>>>>>>>>>> (call_insn 22 21 23 2 (parallel [
>>>>>>>>>>> (set (reg:SI 0 r0)
>>>>>>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
>>>>>>>>>>> <function_decl 0x2ab50e856d00 __builtin_printf>) [0 __builtin_printf S4
>>>>>>>>>>> A32])
>>>>>>>>>>>
>>>>>>>>>>> The struct members are
>>>>>>>>>>> reg:SI 113 => int a;
>>>>>>>>>>> reg:DF 112 => double b;
>>>>>>>>>>> reg:SI 111 => int c;
>>>>>>>>>>>
>>>>>>>>>>> r0 gets the format string; r1 gets int a; r2+r3 get double b; int c is
>>>>>>>>>>> pushed
>>>>>>>>>>> into virtual-outgoing-args. In contrast, post-change to
>>>>>>>>>>> build_ref_of_offset, we get:
>>>>>>>>>>>
>>>>>>>>>>> (insn 17 16 18 2 (set (reg:SI 118)
>>>>>>>>>>> (symbol_ref/v/f:SI ("*.LC1") [flags 0x82] <var_decl 0x2ba57fa8d750
>>>>>>>>>>> *.LC1>)) reduced.c:14 -1
>>>>>>>>>>> (nil))
>>>>>>>>>>> (insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107
>>>>>>>>>>> virtual-outgoing-args)
>>>>>>>>>>> (const_int 8 [0x8])) [0 S4 A64])
>>>>>>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
>>>>>>>>>>> (nil))
>>>>>>>>>>> (insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) [0
>>>>>>>>>>> S8 A64])
>>>>>>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
>>>>>>>>>>> (nil))
>>>>>>>>>>> (insn 20 19 21 2 (set (reg:SI 2 r2)
>>>>>>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1
>>>>>>>>>>> (nil))
>>>>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0)
>>>>>>>>>>> (reg:SI 118)) reduced.c:14 -1
>>>>>>>>>>> (nil))
>>>>>>>>>>> (call_insn 22 21 23 2 (parallel [
>>>>>>>>>>> (set (reg:SI 0 r0)
>>>>>>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
>>>>>>>>>>> <function_decl 0x2ba57f843d00 __builtin_printf>) [0 __builtin_printf S4
>>>>>>>>>>> A32])
>>>>>>>>>>>
>>>>>>>>>>> r0 still gets the format string, but 'int b1.a' now goes in r2, and the
>>>>>>>>>>>
>>>>>>>>>>> double+following int are all pushed into virtual-outgoing-args. This is
>>>>>>>>>>> because
>>>>>>>>>>> arm_function_arg is fed a 64-bit-aligned int as type of the second
>>>>>>>>>>> argument (the
>>>>>>>>>>> type constructed by build_ref_for_offset); it then executes
>>>>>>>>>>> (aapcs_layout_arg,
>>>>>>>>>>> arm.c line ~~5914)
>>>>>>>>>>>
>>>>>>>>>>> /* C3 - For double-word aligned arguments, round the NCRN up to the
>>>>>>>>>>> next even number. */
>>>>>>>>>>> ncrn = pcum->aapcs_ncrn;
>>>>>>>>>>> if ((ncrn & 1) && arm_needs_doubleword_align (mode, type))
>>>>>>>>>>> ncrn++;
>>>>>>>>>>>
>>>>>>>>>>> Which changes r1 to r2. Passing -fno-tree-sra, or removing from the
>>>>>>>>>>> testcase
>>>>>>>>>>> "*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be fed a
>>>>>>>>>>> 32-bit-aligned int instead, which works as previously.
>>>>>>>>>>>
>>>>>>>>>>> Passing the same members of that struct in a non-vargs call, works ok -
>>>>>>>>>>> I think
>>>>>>>>>>> because these use the type of the declared parameters, rather than the
>>>>>>>>>>> provided
>>>>>>>>>>> arguments, and the former do not have the increased alignment from
>>>>>>>>>>> build_ref_for_offset.
>>>>>>>>>>
>>>>>>>>>> It doesn't make sense to use the alignment of passed values. That looks like bs.
>>>>>>>>>>
>>>>>>>>>> This means that
>>>>>>>>>>
>>>>>>>>>> Int I __aligned__(8);
>>>>>>>>>>
>>>>>>>>>> Is passed differently than int.
>>>>>>>>>>
>>>>>>>>>> Arm_function_arg needs to be fixed.
>>>>>>>>>
>>>>>>>>> That is,
>>>>>>>>>
>>>>>>>>> typedef int myint __attribute__((aligned(8)));
>>>>>>>>>
>>>>>>>>> int main()
>>>>>>>>> {
>>>>>>>>> myint i = 1;
>>>>>>>>> int j = 2;
>>>>>>>>> __builtin_printf("%d %d\n", i, j);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> or
>>>>>>>>>
>>>>>>>>> myint i;
>>>>>>>>> int j;
>>>>>>>>> myint *p = &i;
>>>>>>>>> int *q = &j;
>>>>>>>>>
>>>>>>>>> int main()
>>>>>>>>> {
>>>>>>>>> __builtin_printf("%d %d", *p, *q);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> should behave the same. There isn't a printf modifier for an "aligned int"
>>>>>>>>> because that sort of thing doesn't make sense. Special-casing aligned vs.
>>>>>>>>> non-aligned values only makes sense for things passed by value on the stack.
>>>>>>>>> And then obviously only dependent on the functuion type signature, not
>>>>>>>>> on the type of the passed value.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I think the testcase is ill-formed. Just because printf doesn't have
>>>>>>>> such a modifier, doesn't mean that another variadic function might not
>>>>>>>> have a means to detect when an object in the variadic list needs to be
>>>>>>>> over-aligned. As such, the test should really be written as:
>>>>>>>
>>>>>>> A value doesn't have "alignment". A function may have alignment
>>>>>>> requirements on its arguments, clearly printf doesn't.
>>>>>>>
>>>>>>
>>>>>> Values don't. But types do and variadic functions are special in that
>>>>>> they derive their types from the types of the actual parameters passed
>>>>>> not from the formals in the prototype. Any manipulation of the types
>>>>>> should be done in the front end, not in the back end.
>>>>>
>>>>> The following seems to help the testcase (by luck I'd say?). It
>>>>> makes us drop alignment information from the temporaries that
>>>>> SRA creates as memory replacement.
>>>>>
>>>>> But I find it odd that on ARM passing *((aligned_int *)p) as
>>>>> vararg (only as varargs?) changes calling conventions independent
>>>>> of the functions type signature.
>>>>>
>>>>> Richard.
>>>>>
>>>>> Index: gcc/tree-sra.c
>>>>> ===================================================================
>>>>> --- gcc/tree-sra.c (revision 221770)
>>>>> +++ gcc/tree-sra.c (working copy)
>>>>> @@ -2012,7 +2012,9 @@ create_access_replacement (struct access
>>>>> DECL_CONTEXT (repl) = current_function_decl;
>>>>> }
>>>>> else
>>>>> - repl = create_tmp_var (access->type, "SR");
>>>>> + repl = create_tmp_var (build_qualified_type
>>>>> + (TYPE_MAIN_VARIANT (access->type),
>>>>> + TYPE_QUALS (access->type)), "SR");
>>>>> if (TREE_CODE (access->type) == COMPLEX_TYPE
>>>>> || TREE_CODE (access->type) == VECTOR_TYPE)
>>>>> {
>>>>>
>>>>
>>>> I guess we could do something similar (use TYPE_MAIN_VARIANT) in the
>>>> backend code - but then we'd always ignore any over-alignment
>>>> requirements (or under, for that matter).
>>>
>>> The question is whether those are even in the scope of AACPS...
>>>
>>
>> Technically, they aren't. The argument passing rules are all based on
>> the alignment rules for fundamental types (and derived rules for
>> composite types based on those fundamental types) written in the AAPCS
>> itself. There's no provision for over-aligning a data type, so all of
>> this is off-piste.
>
> So in arm_needs_doubleword_align you could do
>
> /* Return true if mode/type need doubleword alignment. */
> static bool
> arm_needs_doubleword_align (machine_mode mode, const_tree type)
> {
> return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
> || (type
> && (AGGREGATE_TYPE_P (type) || mode == BLKmode)
> && TYPE_ALIGN (type) > PARM_BOUNDARY));
> }
>
> ? Thus always trust the 'mode' when it doesn't apply to an aggregate?
>
> Richard.
>
Experience has shown that the front/mid ends always end up buggering the
mode argument to the point where it even causes the wrong register class
to be chosen. It really can't be trusted unless type is NULL
(libcalls). The type, on the other hand, should always be correct and
should always be the preferred way to get the information needed.
R.