This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: New regression on ARM Linux
- From: Richard Biener <rguenther at suse dot de>
- To: Richard Earnshaw <Richard dot Earnshaw at foss dot arm dot com>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, Alan Lawrence <Alan dot Lawrence at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Marcus Shawcroft <Marcus dot Shawcroft at arm dot com>
- Date: Tue, 31 Mar 2015 13:08:56 +0200 (CEST)
- Subject: Re: New regression on ARM Linux
- Authentication-results: sourceware.org; auth=none
- References: <alpine dot LSU dot 2 dot 11 dot 1503111607530 dot 10796 at zhemvz dot fhfr dot qr> <55193E77 dot 3040401 at arm dot com> <55193EFB dot 6030009 at arm dot com> <alpine dot LSU dot 2 dot 11 dot 1503301501070 dot 31545 at zhemvz dot fhfr dot qr> <alpine dot LSU dot 2 dot 11 dot 1503301509150 dot 31545 at zhemvz dot fhfr dot qr> <55197DAE dot 1050004 at arm dot com> <16A8C5AA-994C-4FA3-BC25-8547166DC13C at suse dot de> <CAFiYyc1MOL8AAwtVhvYDYmSSo+MdGi2FFi-6ELfRxdsvDxtSew at mail dot gmail dot com> <551A6C46 dot 7010305 at foss dot arm dot com> <alpine dot LSU dot 2 dot 11 dot 1503311200090 dot 31545 at zhemvz dot fhfr dot qr> <551A729F dot 3090200 at foss dot arm dot com> <alpine dot LSU dot 2 dot 11 dot 1503311229310 dot 31545 at zhemvz dot fhfr dot qr> <551A7981 dot 30805 at foss dot arm dot com> <alpine dot LSU dot 2 dot 11 dot 1503311245090 dot 31545 at zhemvz dot fhfr dot qr> <551A7C29 dot 2080105 at foss dot arm dot com>
On Tue, 31 Mar 2015, Richard Earnshaw wrote:
> On 31/03/15 11:45, Richard Biener wrote:
> > On Tue, 31 Mar 2015, Richard Earnshaw wrote:
> >
> >> On 31/03/15 11:36, Richard Biener wrote:
> >>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
> >>>
> >>>> On 31/03/15 11:00, Richard Biener wrote:
> >>>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
> >>>>>
> >>>>>> On 31/03/15 08:50, Richard Biener wrote:
> >>>>>>> On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener <rguenther@suse.de> wrote:
> >>>>>>>> On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence <alan.lawrence@arm.com> wrote:
> >>>>>>>>> -O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes
> >>>>>>>>> it.
> >>>>>>>>>
> >>>>>>>>> The problem appears to be in laying out arguments, specifically
> >>>>>>>>> varargs. From
> >>>>>>>>> the "good" -fdump-rtl-expand:
> >>>>>>>>>
> >>>>>>>>> (insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) [0
> >>>>>>>>> S4 A32])
> >>>>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
> >>>>>>>>> (nil))
> >>>>>>>>> (insn 19 18 20 2 (set (reg:DF 2 r2)
> >>>>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
> >>>>>>>>> (nil))
> >>>>>>>>> (insn 20 19 21 2 (set (reg:SI 1 r1)
> >>>>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1
> >>>>>>>>> (nil))
> >>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0)
> >>>>>>>>> (reg:SI 118)) reduced.c:14 -1
> >>>>>>>>> (nil))
> >>>>>>>>> (call_insn 22 21 23 2 (parallel [
> >>>>>>>>> (set (reg:SI 0 r0)
> >>>>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
> >>>>>>>>> <function_decl 0x2ab50e856d00 __builtin_printf>) [0 __builtin_printf S4
> >>>>>>>>> A32])
> >>>>>>>>>
> >>>>>>>>> The struct members are
> >>>>>>>>> reg:SI 113 => int a;
> >>>>>>>>> reg:DF 112 => double b;
> >>>>>>>>> reg:SI 111 => int c;
> >>>>>>>>>
> >>>>>>>>> r0 gets the format string; r1 gets int a; r2+r3 get double b; int c is
> >>>>>>>>> pushed
> >>>>>>>>> into virtual-outgoing-args. In contrast, post-change to
> >>>>>>>>> build_ref_of_offset, we get:
> >>>>>>>>>
> >>>>>>>>> (insn 17 16 18 2 (set (reg:SI 118)
> >>>>>>>>> (symbol_ref/v/f:SI ("*.LC1") [flags 0x82] <var_decl 0x2ba57fa8d750
> >>>>>>>>> *.LC1>)) reduced.c:14 -1
> >>>>>>>>> (nil))
> >>>>>>>>> (insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107
> >>>>>>>>> virtual-outgoing-args)
> >>>>>>>>> (const_int 8 [0x8])) [0 S4 A64])
> >>>>>>>>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
> >>>>>>>>> (nil))
> >>>>>>>>> (insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) [0
> >>>>>>>>> S8 A64])
> >>>>>>>>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
> >>>>>>>>> (nil))
> >>>>>>>>> (insn 20 19 21 2 (set (reg:SI 2 r2)
> >>>>>>>>> (reg:SI 113 [ b1 ])) reduced.c:14 -1
> >>>>>>>>> (nil))
> >>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0)
> >>>>>>>>> (reg:SI 118)) reduced.c:14 -1
> >>>>>>>>> (nil))
> >>>>>>>>> (call_insn 22 21 23 2 (parallel [
> >>>>>>>>> (set (reg:SI 0 r0)
> >>>>>>>>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
> >>>>>>>>> <function_decl 0x2ba57f843d00 __builtin_printf>) [0 __builtin_printf S4
> >>>>>>>>> A32])
> >>>>>>>>>
> >>>>>>>>> r0 still gets the format string, but 'int b1.a' now goes in r2, and the
> >>>>>>>>>
> >>>>>>>>> double+following int are all pushed into virtual-outgoing-args. This is
> >>>>>>>>> because
> >>>>>>>>> arm_function_arg is fed a 64-bit-aligned int as type of the second
> >>>>>>>>> argument (the
> >>>>>>>>> type constructed by build_ref_for_offset); it then executes
> >>>>>>>>> (aapcs_layout_arg,
> >>>>>>>>> arm.c line ~~5914)
> >>>>>>>>>
> >>>>>>>>> /* C3 - For double-word aligned arguments, round the NCRN up to the
> >>>>>>>>> next even number. */
> >>>>>>>>> ncrn = pcum->aapcs_ncrn;
> >>>>>>>>> if ((ncrn & 1) && arm_needs_doubleword_align (mode, type))
> >>>>>>>>> ncrn++;
> >>>>>>>>>
> >>>>>>>>> Which changes r1 to r2. Passing -fno-tree-sra, or removing from the
> >>>>>>>>> testcase
> >>>>>>>>> "*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be fed a
> >>>>>>>>> 32-bit-aligned int instead, which works as previously.
> >>>>>>>>>
> >>>>>>>>> Passing the same members of that struct in a non-vargs call, works ok -
> >>>>>>>>> I think
> >>>>>>>>> because these use the type of the declared parameters, rather than the
> >>>>>>>>> provided
> >>>>>>>>> arguments, and the former do not have the increased alignment from
> >>>>>>>>> build_ref_for_offset.
> >>>>>>>>
> >>>>>>>> It doesn't make sense to use the alignment of passed values. That looks like bs.
> >>>>>>>>
> >>>>>>>> This means that
> >>>>>>>>
> >>>>>>>> Int I __aligned__(8);
> >>>>>>>>
> >>>>>>>> Is passed differently than int.
> >>>>>>>>
> >>>>>>>> Arm_function_arg needs to be fixed.
> >>>>>>>
> >>>>>>> That is,
> >>>>>>>
> >>>>>>> typedef int myint __attribute__((aligned(8)));
> >>>>>>>
> >>>>>>> int main()
> >>>>>>> {
> >>>>>>> myint i = 1;
> >>>>>>> int j = 2;
> >>>>>>> __builtin_printf("%d %d\n", i, j);
> >>>>>>> }
> >>>>>>>
> >>>>>>> or
> >>>>>>>
> >>>>>>> myint i;
> >>>>>>> int j;
> >>>>>>> myint *p = &i;
> >>>>>>> int *q = &j;
> >>>>>>>
> >>>>>>> int main()
> >>>>>>> {
> >>>>>>> __builtin_printf("%d %d", *p, *q);
> >>>>>>> }
> >>>>>>>
> >>>>>>> should behave the same. There isn't a printf modifier for an "aligned int"
> >>>>>>> because that sort of thing doesn't make sense. Special-casing aligned vs.
> >>>>>>> non-aligned values only makes sense for things passed by value on the stack.
> >>>>>>> And then obviously only dependent on the functuion type signature, not
> >>>>>>> on the type of the passed value.
> >>>>>>>
> >>>>>>
> >>>>>> I think the testcase is ill-formed. Just because printf doesn't have
> >>>>>> such a modifier, doesn't mean that another variadic function might not
> >>>>>> have a means to detect when an object in the variadic list needs to be
> >>>>>> over-aligned. As such, the test should really be written as:
> >>>>>
> >>>>> A value doesn't have "alignment". A function may have alignment
> >>>>> requirements on its arguments, clearly printf doesn't.
> >>>>>
> >>>>
> >>>> Values don't. But types do and variadic functions are special in that
> >>>> they derive their types from the types of the actual parameters passed
> >>>> not from the formals in the prototype. Any manipulation of the types
> >>>> should be done in the front end, not in the back end.
> >>>
> >>> The following seems to help the testcase (by luck I'd say?). It
> >>> makes us drop alignment information from the temporaries that
> >>> SRA creates as memory replacement.
> >>>
> >>> But I find it odd that on ARM passing *((aligned_int *)p) as
> >>> vararg (only as varargs?) changes calling conventions independent
> >>> of the functions type signature.
> >>>
> >>> Richard.
> >>>
> >>> Index: gcc/tree-sra.c
> >>> ===================================================================
> >>> --- gcc/tree-sra.c (revision 221770)
> >>> +++ gcc/tree-sra.c (working copy)
> >>> @@ -2012,7 +2012,9 @@ create_access_replacement (struct access
> >>> DECL_CONTEXT (repl) = current_function_decl;
> >>> }
> >>> else
> >>> - repl = create_tmp_var (access->type, "SR");
> >>> + repl = create_tmp_var (build_qualified_type
> >>> + (TYPE_MAIN_VARIANT (access->type),
> >>> + TYPE_QUALS (access->type)), "SR");
> >>> if (TREE_CODE (access->type) == COMPLEX_TYPE
> >>> || TREE_CODE (access->type) == VECTOR_TYPE)
> >>> {
> >>>
> >>
> >> I guess we could do something similar (use TYPE_MAIN_VARIANT) in the
> >> backend code - but then we'd always ignore any over-alignment
> >> requirements (or under, for that matter).
> >
> > The question is whether those are even in the scope of AACPS...
> >
>
> Technically, they aren't. The argument passing rules are all based on
> the alignment rules for fundamental types (and derived rules for
> composite types based on those fundamental types) written in the AAPCS
> itself. There's no provision for over-aligning a data type, so all of
> this is off-piste.
So in arm_needs_doubleword_align you could do
/* Return true if mode/type need doubleword alignment. */
static bool
arm_needs_doubleword_align (machine_mode mode, const_tree type)
{
return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
|| (type
&& (AGGREGATE_TYPE_P (type) || mode == BLKmode)
&& TYPE_ALIGN (type) > PARM_BOUNDARY));
}
? Thus always trust the 'mode' when it doesn't apply to an aggregate?
Richard.
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)