This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: increase alignment of global structs in increase_alignment pass
- From: Richard Biener <rguenther at suse dot de>
- To: Prathamesh Kulkarni <prathamesh dot kulkarni at linaro dot org>
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Thu, 19 May 2016 09:49:33 +0200 (CEST)
- Subject: Re: increase alignment of global structs in increase_alignment pass
- Authentication-results: sourceware.org; auth=none
- References: <CAAgBjMnzo0xM_+JAiRS7J3P9sDF-Uoe-k6Q_7fL1dZD+W+x3qA at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1602221304130 dot 31547 at t29 dot fhfr dot qr> <CAAgBjM=Oa6Ex9D7qjDq71DsaPvSqNgzM5xpxbsai0Eyuy+PfGg at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1602231300130 dot 31547 at t29 dot fhfr dot qr> <CAAgBjMkWYk9XKkugc1SXPqW2cWVWJ2MsT5Pb=QEGCxLgw9rC7A at mail dot gmail dot com> <CAAgBjMnxqw74MKL0vuFNdv1M3mpDo-FAW4ocWEv-rZnDgSPjTw at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1605061337470 dot 13384 at t29 dot fhfr dot qr> <CAAgBjM=2o1Hr5Np8UQ=731JTHeqbfx7LrD2mzhN4Uxq98wbmYw at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1605171455390 dot 18037 at t29 dot fhfr dot qr> <CAAgBjM=1i_fuz=vJWL2KMw+yUg-4naJEmZ7nHtAXGP0Ldj2Nrw at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1605181603211 dot 18037 at t29 dot fhfr dot qr> <CAAgBjMm9ckx-432DU_3aGzm8JrzdQhZq8wJfd0GMZswxdNGRPQ at mail dot gmail dot com>
On Thu, 19 May 2016, Prathamesh Kulkarni wrote:
> On 18 May 2016 at 19:38, Richard Biener <rguenther@suse.de> wrote:
> > On Wed, 18 May 2016, Prathamesh Kulkarni wrote:
> >
> >> On 17 May 2016 at 18:36, Richard Biener <rguenther@suse.de> wrote:
> >> > On Wed, 11 May 2016, Prathamesh Kulkarni wrote:
> >> >
> >> >> On 6 May 2016 at 17:20, Richard Biener <rguenther@suse.de> wrote:
> >> >> >
> >> >> > You can't simply use
> >> >> >
> >> >> > + offset = int_byte_position (field);
> >> >> >
> >> >> > as it can be placed at variable offset which will make int_byte_position
> >> >> > ICE. Note it also returns a truncated byte position (bit position
> >> >> > stripped) which may be undesirable here. I think you want to use
> >> >> > bit_position and if and only if DECL_FIELD_OFFSET and
> >> >> > DECL_FIELD_BIT_OFFSET are INTEGER_CST.
> >> >> oops, I didn't realize offsets could be variable.
> >> >> Will that be the case only for VLA member inside struct ?
> >> >
> >> > And non-VLA members after such member.
> >> >
> >> >> > Your observation about the expensiveness of the walk still stands I guess
> >> >> > and eventually you should at least cache the
> >> >> > get_vec_alignment_for_record_decl cases. Please make those workers
> >> >> > _type rather than _decl helpers.
> >> >> Done
> >> >> >
> >> >> > You seem to simply get at the maximum vectorized field/array element
> >> >> > alignment possible for all arrays - you could restrict that to
> >> >> > arrays with at least vector size (as followup).
> >> >> Um sorry, I didn't understand this part.
> >> >
> >> > It doesn't make sense to align
> >> >
> >> > struct { int a; int b; int c; int d; float b[3]; int e; };
> >> >
> >> > because we have a float[3] member. There is no vector size that
> >> > would cover the float[3] array.
> >> Thanks for the explanation.
> >> So we do not want to align struct if sizeof (array_field) < sizeof
> >> (vector_type).
> >> This issue is also present without patch for global arrays, so I modified
> >> get_vec_alignment_for_array_type, to return 0 if sizeof (array_type) <
> >> sizeof (vectype).
> >> >
> >> >> >
> >> >> > + /* Skip artificial decls like typeinfo decls or if
> >> >> > + record is packed. */
> >> >> > + if (DECL_ARTIFICIAL (record_decl) || TYPE_PACKED (type))
> >> >> > + return 0;
> >> >> >
> >> >> > I think we should honor DECL_USER_ALIGN as well and not mess with those
> >> >> > decls.
> >> >> Done
> >> >> >
> >> >> > Given the patch now does quite some extra work it might make sense
> >> >> > to split the symtab part out of the vect_can_force_dr_alignment_p
> >> >> > predicate and call that early.
> >> >> In the patch I call symtab_node::can_increase_alignment_p early. I tried
> >> >> moving that to it's callers - vect_compute_data_ref_alignment and
> >> >> increase_alignment::execute, however that failed some tests in vect, and
> >> >> hence I didn't add the following hunk in the patch. Did I miss some
> >> >> check ?
> >> >
> >> > Not sure.
> >> >
> >> >> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> >> >> index 7652e21..2c1acee 100644
> >> >> --- a/gcc/tree-vect-data-refs.c
> >> >> +++ b/gcc/tree-vect-data-refs.c
> >> >> @@ -795,7 +795,10 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> >> >> && TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
> >> >> base = TREE_OPERAND (TREE_OPERAND (base, 0), 0);
> >> >>
> >> >> - if (!vect_can_force_dr_alignment_p (base, TYPE_ALIGN (vectype)))
> >> >> + if (!(TREE_CODE (base) == VAR_DECL
> >> >> + && decl_in_symtab_p (base)
> >> >> + && symtab_node::get (base)->can_increase_alignment_p ()
> >> >> + && vect_can_force_dr_alignment_p (base, TYPE_ALIGN (vectype))))
> >> >> {
> >> >> if (dump_enabled_p ())
> >> >> {
> >> >
> >> > + for (tree field = first_field (type);
> >> > + field != NULL_TREE;
> >> > + field = DECL_CHAIN (field))
> >> > + {
> >> > + /* Skip if not FIELD_DECL or has variable offset. */
> >> > + if (TREE_CODE (field) != FIELD_DECL
> >> > + || TREE_CODE (DECL_FIELD_OFFSET (field)) != INTEGER_CST
> >> > + || TREE_CODE (DECL_FIELD_BIT_OFFSET (field)) != INTEGER_CST
> >> > + || DECL_USER_ALIGN (field)
> >> > + || DECL_ARTIFICIAL (field))
> >> > + continue;
> >> >
> >> > you can stop processing the type and return 0 here if the offset
> >> > is not an INTEGER_CST. All following fields will have the same issue.
> >> >
> >> > + /* FIXME: is this check necessary since we check for variable
> >> > offset above ? */
> >> > + if (TREE_CODE (offset_tree) != INTEGER_CST)
> >> > + continue;
> >> >
> >> > the check should not be necessary.
> >> >
> >> > offset = tree_to_uhwi (offset_tree);
> >> >
> >> > but in theory offset_tree might not fit a unsigned HOST_WIDE_INT, so
> >> > instead of for INTEGER_CST please check ! tree_fits_uhwi_p (offset_tree).
> >> > As above all following fields will also fail the check if this fails so
> >> > you can return 0 early.
> >> >
> >> > +static unsigned
> >> > +get_vec_alignment_for_type (tree type)
> >> > +{
> >> > + if (type == NULL_TREE)
> >> > + return 0;
> >> > +
> >> > + gcc_assert (TYPE_P (type));
> >> > +
> >> > + unsigned *slot = type_align_map->get (type);
> >> > + if (slot)
> >> > + return *slot;
> >> >
> >> > I suggest to apply the caching only for the RECORD_TYPE case to keep
> >> > the size of the map low.
> >> >
> >> > Otherwise the patch looks ok now.
> >> I have done the suggested changes in attached patch.
> >> Is this version OK to commit after bootstrap+test ?
> >> The patch survives cross-testing on arm*-*-* and aarch64*-*-*.
> >
> > + tree type_size_tree = TYPE_SIZE (type);
> > + if (!type_size_tree)
> > + return TYPE_ALIGN (vectype);
> > +
> > + if (TREE_CODE (type_size_tree) != INTEGER_CST)
> > + return TYPE_ALIGN (vectype);
> > +
> > + /* If array has constant size, ensure that it's at least equal to
> > + size of vector type. */
> > + if (!tree_fits_uhwi_p (type_size_tree))
> > + return TYPE_ALIGN (vectype);
> > + unsigned HOST_WIDE_INT type_size = tree_to_uhwi (type_size_tree);
> > +
> > + tree vectype_size_tree = TYPE_SIZE (vectype);
> > + if (!tree_fits_uhwi_p (vectype_size_tree))
> > + return TYPE_ALIGN (vectype);
> > +
> > + unsigned HOST_WIDE_INT vectype_size = tree_to_uhwi (vectype_size_tree);
> > + return (type_size > vectype_size) ? TYPE_ALIGN (vectype) : 0;
> >
> > please change this to
> >
> > if (! TYPE_SIZE (type)
> > || TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST
> > || tree_int_cst_lt (TYPE_SIZE (type), TYPE_SIZE (vectype)))
> > return 0;
> >
> > return TYPE_ALIGN (vectype);
> >
> > consistent with the handling for "VLA" records.
> >
> > + unsigned *slot = type_align_map->get (type);
> > + if (slot)
> > + return *slot;
> > +
> > + unsigned max_align = 0, alignment;
> > + HOST_WIDE_INT offset;
> > + tree offset_tree;
> > +
> > + if (TYPE_PACKED (type))
> > + return 0;
> > +
> >
> > move the tye_align_map query after the TYPE_PACKED check.
> >
> > + /* We don't need to process the type further if offset is variable,
> > + since the offsets of remaining members will also be variable. */
> > + if (TREE_CODE (DECL_FIELD_OFFSET (field)) != INTEGER_CST
> > + || TREE_CODE (DECL_FIELD_BIT_OFFSET (field)) != INTEGER_CST)
> > + return 0;
> >
> > just break; here (and in the other case returning zero). If a
> > previous record field said we'd want to align we still should align.
> > Also we want to store 0 into the type_align_map as well.
> >
> > Ok with those changes.
> Hi,
> Is this version OK ?
> Cross tested on arm*-*-* and aarch64*-*-*
Yes. Btw, please always do a regular bootstrap and regtest, cross-testing
alone is _not_ sufficient!
Thanks,
Richard.
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Richard.
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)