This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: New regression on ARM Linux


On Tue, 31 Mar 2015, Richard Earnshaw wrote:

> On 31/03/15 11:45, Richard Biener wrote:
> > On Tue, 31 Mar 2015, Richard Earnshaw wrote:
> > 
> >> On 31/03/15 11:36, Richard Biener wrote:
> >>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
> >>>
> >>>> On 31/03/15 11:00, Richard Biener wrote:
> >>>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
> >>>>>
> >>>>>> On 31/03/15 08:50, Richard Biener wrote:
> >>>>>>> On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener <rguenther@suse.de> wrote:
> >>>>>>>> On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence <alan.lawrence@arm.com> wrote:
> >>>>>>>>> -O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes
> >>>>>>>>> it.
> >>>>>>>>>
> >>>>>>>>> The problem appears to be in laying out arguments, specifically
> >>>>>>>>> varargs. From
> >>>>>>>>> the "good" -fdump-rtl-expand:
> >>>>>>>>>
> >>>>>>>>> (insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) [0
> >>>>>>>>> S4 A32])
> >>>>>>>>>         (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
> >>>>>>>>>      (nil))
> >>>>>>>>> (insn 19 18 20 2 (set (reg:DF 2 r2)
> >>>>>>>>>         (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
> >>>>>>>>>      (nil))
> >>>>>>>>> (insn 20 19 21 2 (set (reg:SI 1 r1)
> >>>>>>>>>         (reg:SI 113 [ b1 ])) reduced.c:14 -1
> >>>>>>>>>      (nil))
> >>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0)
> >>>>>>>>>         (reg:SI 118)) reduced.c:14 -1
> >>>>>>>>>      (nil))
> >>>>>>>>> (call_insn 22 21 23 2 (parallel [
> >>>>>>>>>             (set (reg:SI 0 r0)
> >>>>>>>>>                 (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
> >>>>>>>>> <function_decl 0x2ab50e856d00 __builtin_printf>) [0 __builtin_printf S4
> >>>>>>>>> A32])
> >>>>>>>>>
> >>>>>>>>> The struct members are
> >>>>>>>>> reg:SI 113 => int a;
> >>>>>>>>> reg:DF 112 => double b;
> >>>>>>>>> reg:SI 111 => int c;
> >>>>>>>>>
> >>>>>>>>> r0 gets the format string; r1 gets int a; r2+r3 get double b; int c is
> >>>>>>>>> pushed
> >>>>>>>>> into virtual-outgoing-args. In contrast, post-change to
> >>>>>>>>> build_ref_of_offset, we get:
> >>>>>>>>>
> >>>>>>>>> (insn 17 16 18 2 (set (reg:SI 118)
> >>>>>>>>>   (symbol_ref/v/f:SI ("*.LC1") [flags 0x82]  <var_decl 0x2ba57fa8d750
> >>>>>>>>> *.LC1>)) reduced.c:14 -1
> >>>>>>>>>      (nil))
> >>>>>>>>> (insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107
> >>>>>>>>> virtual-outgoing-args)
> >>>>>>>>>                 (const_int 8 [0x8])) [0  S4 A64])
> >>>>>>>>>         (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
> >>>>>>>>>      (nil))
> >>>>>>>>> (insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) [0
> >>>>>>>>> S8 A64])
> >>>>>>>>>         (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
> >>>>>>>>>      (nil))
> >>>>>>>>> (insn 20 19 21 2 (set (reg:SI 2 r2)
> >>>>>>>>>         (reg:SI 113 [ b1 ])) reduced.c:14 -1
> >>>>>>>>>      (nil))
> >>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0)
> >>>>>>>>>         (reg:SI 118)) reduced.c:14 -1
> >>>>>>>>>      (nil))
> >>>>>>>>> (call_insn 22 21 23 2 (parallel [
> >>>>>>>>>             (set (reg:SI 0 r0)
> >>>>>>>>>                 (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
> >>>>>>>>> <function_decl 0x2ba57f843d00 __builtin_printf>) [0 __builtin_printf S4
> >>>>>>>>> A32])
> >>>>>>>>>
> >>>>>>>>> r0 still gets the format string, but 'int b1.a' now goes in r2, and the
> >>>>>>>>>
> >>>>>>>>> double+following int are all pushed into virtual-outgoing-args. This is
> >>>>>>>>> because
> >>>>>>>>> arm_function_arg is fed a 64-bit-aligned int as type of the second
> >>>>>>>>> argument (the
> >>>>>>>>> type constructed by build_ref_for_offset); it then executes
> >>>>>>>>> (aapcs_layout_arg,
> >>>>>>>>> arm.c line ~~5914)
> >>>>>>>>>
> >>>>>>>>>   /* C3 - For double-word aligned arguments, round the NCRN up to the
> >>>>>>>>>      next even number.  */
> >>>>>>>>>   ncrn = pcum->aapcs_ncrn;
> >>>>>>>>>   if ((ncrn & 1) && arm_needs_doubleword_align (mode, type))
> >>>>>>>>>     ncrn++;
> >>>>>>>>>
> >>>>>>>>> Which changes r1 to r2. Passing -fno-tree-sra, or removing from the
> >>>>>>>>> testcase
> >>>>>>>>> "*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be fed a
> >>>>>>>>> 32-bit-aligned int instead, which works as previously.
> >>>>>>>>>
> >>>>>>>>> Passing the same members of that struct in a non-vargs call, works ok -
> >>>>>>>>> I think
> >>>>>>>>> because these use the type of the declared parameters, rather than the
> >>>>>>>>> provided
> >>>>>>>>> arguments, and the former do not have the increased alignment from
> >>>>>>>>> build_ref_for_offset.
> >>>>>>>>
> >>>>>>>> It doesn't make sense to use the alignment of passed values.  That looks like bs.
> >>>>>>>>
> >>>>>>>> This means that
> >>>>>>>>
> >>>>>>>> Int I __aligned__(8);
> >>>>>>>>
> >>>>>>>> Is passed differently than int.
> >>>>>>>>
> >>>>>>>> Arm_function_arg needs to be fixed.
> >>>>>>>
> >>>>>>> That is,
> >>>>>>>
> >>>>>>> typedef int myint __attribute__((aligned(8)));
> >>>>>>>
> >>>>>>> int main()
> >>>>>>> {
> >>>>>>>   myint i = 1;
> >>>>>>>   int j = 2;
> >>>>>>>   __builtin_printf("%d %d\n", i, j);
> >>>>>>> }
> >>>>>>>
> >>>>>>> or
> >>>>>>>
> >>>>>>> myint i;
> >>>>>>> int j;
> >>>>>>> myint *p = &i;
> >>>>>>> int *q = &j;
> >>>>>>>
> >>>>>>> int main()
> >>>>>>> {
> >>>>>>>   __builtin_printf("%d %d", *p, *q);
> >>>>>>> }
> >>>>>>>
> >>>>>>> should behave the same.  There isn't a printf modifier for an "aligned int"
> >>>>>>> because that sort of thing doesn't make sense.  Special-casing aligned vs.
> >>>>>>> non-aligned values only makes sense for things passed by value on the stack.
> >>>>>>> And then obviously only dependent on the functuion type signature, not
> >>>>>>> on the type of the passed value.
> >>>>>>>
> >>>>>>
> >>>>>> I think the testcase is ill-formed.  Just because printf doesn't have
> >>>>>> such a modifier, doesn't mean that another variadic function might not
> >>>>>> have a means to detect when an object in the variadic list needs to be
> >>>>>> over-aligned.  As such, the test should really be written as:
> >>>>>
> >>>>> A value doesn't have "alignment".  A function may have alignment
> >>>>> requirements on its arguments, clearly printf doesn't.
> >>>>>
> >>>>
> >>>> Values don't.  But types do and variadic functions are special in that
> >>>> they derive their types from the types of the actual parameters passed
> >>>> not from the formals in the prototype.  Any manipulation of the types
> >>>> should be done in the front end, not in the back end.
> >>>
> >>> The following seems to help the testcase (by luck I'd say?).  It
> >>> makes us drop alignment information from the temporaries that
> >>> SRA creates as memory replacement.
> >>>
> >>> But I find it odd that on ARM passing *((aligned_int *)p) as
> >>> vararg (only as varargs?) changes calling conventions independent
> >>> of the functions type signature.
> >>>
> >>> Richard.
> >>>
> >>> Index: gcc/tree-sra.c
> >>> ===================================================================
> >>> --- gcc/tree-sra.c      (revision 221770)
> >>> +++ gcc/tree-sra.c      (working copy)
> >>> @@ -2012,7 +2012,9 @@ create_access_replacement (struct access
> >>>        DECL_CONTEXT (repl) = current_function_decl;
> >>>      }
> >>>    else
> >>> -    repl = create_tmp_var (access->type, "SR");
> >>> +    repl = create_tmp_var (build_qualified_type
> >>> +                            (TYPE_MAIN_VARIANT (access->type),
> >>> +                             TYPE_QUALS (access->type)), "SR");
> >>>    if (TREE_CODE (access->type) == COMPLEX_TYPE
> >>>        || TREE_CODE (access->type) == VECTOR_TYPE)
> >>>      {
> >>>
> >>
> >> I guess we could do something similar (use TYPE_MAIN_VARIANT) in the
> >> backend code - but then we'd always ignore any over-alignment
> >> requirements (or under, for that matter).
> > 
> > The question is whether those are even in the scope of AACPS...
> > 
> 
> Technically, they aren't.  The argument passing rules are all based on
> the alignment rules for fundamental types (and derived rules for
> composite types based on those fundamental types) written in the AAPCS
> itself.  There's no provision for over-aligning a data type, so all of
> this is off-piste.

So in arm_needs_doubleword_align you could do

/* Return true if mode/type need doubleword alignment.  */
static bool
arm_needs_doubleword_align (machine_mode mode, const_tree type)
{
  return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
          || (type
              && (AGGREGATE_TYPE_P (type) || mode == BLKmode)
              && TYPE_ALIGN (type) > PARM_BOUNDARY));
}

?  Thus always trust the 'mode' when it doesn't apply to an aggregate?

Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]