[PATCH] Fix PR 100944 - Array boundary misscalculation

Richard Biener richard.guenther@gmail.com
Tue Jun 15 06:23:43 GMT 2021


On Mon, Jun 14, 2021 at 7:32 PM Martin Sebor <msebor@gmail.com> wrote:
>
> On 6/14/21 12:36 AM, Richard Biener via Gcc-patches wrote:
> > On Sun, Jun 13, 2021 at 7:00 PM Giuliano Belinassi via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> This patch proposes a fix to PR 100944 by improving the array boundary
> >> computation when the flexible array has no clear constructor: if no
> >> constructor were found in the input code, we compute the size of the
> >> array as:
> >>
> >>    offset(array begin) - offset(next element in RECORD_TYPE)
> >>
> >> If no next element is found in the RECORD, simply fall back to:
> >>
> >>    offset(array begin) - offset(RECORD_TYPE ends)
> >>
> >> We avoid doing this calculation if the RECORD_TYPE is actually an
> >> UNION_TYPE, once things get very complicated in this case.
> >
> > I'm looking at Wzero-length-array-bounds.c and wonder why the patch
> > makes a difference here.
> >
> > struct X { int a[0]; int b, c; };
> >
> > a is clearly not a flex-array?
>
> The warning issued here shouldn't change.  -Wzero-length-bounds is
> documented to
>
>    Warn about Accesses to elements of zero-length array members that
>    might overlap other members of the same object.
>
> By "the object" the text means the complete object (if one can be
> determined), not necessarily just the subobject of the enclosing
> struct.  This is contrast to accesses to zero-length arrays that
> are out of the bounds of the complete object.  This distinction
> was made to help the kernel transition to the new warning (I'm
> not sure if they're done with the cleanup yet).
>
> >
> > The code in component_ref_size is truly weird.  IMHO using
> > get_ref_base_and_unit_offset for computing the size of a flexarray
> > member is the wrong utility - instead a much better choice
> > (also for consistency) would be array_at_struct_end_p itself,
> > for example by giving it an optional HOST_WIDE_INT *size
> > argument.
>
> array_at_struct_end_p() is permissive and treats all trailing arrays
> as effectively unbounded if it doesn't know the enclosing object.
> -Warray-bounds and all other access warnings are more conservative
> and treat only flexible array members and trailing arrays of zero
> length as unbounded (except in complete objects where they consider
> the initializer for the former).

Sure, but that distinction is easy to fix for the consumer by looking at
the array field in case array_at_struct_end_p returns true.  What I'm
saying is that array_at_struct_end_p already performs analysis
whether the array is the last field in the object and it already performs
analysis of the object size - meaning it computes the possible extent
of the array (and it matches that up with the array declaration).  It could
simply communicate the length [bound] to the caller.

And if looking at an initializer is really necessary (IMHO that smells like
the FE issue that fails to adjust DECL_SIZE of the object from it?) then
even array_at_struct_end_p could be improved by looking at it by
passing the (optional) initializer to the function.

Richard.

> Martin
>
> >
> > Richard.
> >
> >> gcc/ChangeLog:
> >> 2021-13-08  Giuliano Belinassi  <giuliano.belinassi@usp.br>
> >>
> >>          PR middle-end/100944
> >>          * tree-dfa.c (get_ref_base_and_unit_offset): Add option to compute size
> >>          of next field.
> >>          * (get_ref_base_and_unit_offset_1): Same as above.
> >>          * tree-dfa.h (get_ref_base_and_unit_offset): Same as above.
> >>          (get_ref_base_and_unit_offset_1): Same as above.
> >>          * tree.c (least_common_record_1): New.
> >>          (least_common_record): New.
> >>          (component_ref_size): Improve array size calculation.
> >>
> >> gcc/testsuite/ChangeLog:
> >> 2021-13-08  Giuliano Belinassi  <giuliano.belinassi@usp.br>
> >>
> >>          PR middle-end/100944
> >>          * gcc.dg/Wzero-length-array-bounds.c: Update diagnostic.
> >>          * gcc.dg/Warray-bounds-71.c: New test.
> >> ---
> >>   gcc/testsuite/gcc.dg/Warray-bounds-71.c       |  42 +++++++
> >>   .../gcc.dg/Wzero-length-array-bounds.c        |  18 +--
> >>   gcc/tree-dfa.c                                |  31 ++++-
> >>   gcc/tree-dfa.h                                |   6 +-
> >>   gcc/tree.c                                    | 115 ++++++++++++++----
> >>   5 files changed, 172 insertions(+), 40 deletions(-)
> >>   create mode 100644 gcc/testsuite/gcc.dg/Warray-bounds-71.c
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-71.c b/gcc/testsuite/gcc.dg/Warray-bounds-71.c
> >> new file mode 100644
> >> index 00000000000..cc5b083bc77
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/Warray-bounds-71.c
> >> @@ -0,0 +1,42 @@
> >> +/* PR middle-end/100944 - missing -Warray-bounds accessing a flexible array
> >> +   member of a nested struct
> >> +   { dg-do compile }
> >> +   { dg-options "-O2 -Wall" } */
> >> +
> >> +struct A0
> >> +{
> >> +  int i, a[0];
> >> +};
> >> +
> >> +struct B0
> >> +{
> >> +  struct A0 a;
> >> +  long x;
> >> +} b0;
> >> +
> >> +void f0 (int i)
> >> +{
> >> +  long t = b0.x;
> >> +  b0.a.a[i] = 0;    // { dg-warning "\\\[-Warray-bounds" }
> >> +  if (t != b0.x)    // folded to false
> >> +    __builtin_abort ();
> >> +}
> >> +
> >> +struct Ax
> >> +{
> >> +  int i, a[];
> >> +};
> >> +
> >> +struct Bx
> >> +{
> >> +  struct Ax a;
> >> +  long x;
> >> +} bx;
> >> +
> >> +void fx (int i)
> >> +{
> >> +  long t = bx.x;
> >> +  bx.a.a[i] = 0;    // { dg-warning "\\\[-Warray-bounds" }
> >> +  if (t != bx.x)    // folded to false
> >> +    __builtin_abort ();
> >> +}
> >> diff --git a/gcc/testsuite/gcc.dg/Wzero-length-array-bounds.c b/gcc/testsuite/gcc.dg/Wzero-length-array-bounds.c
> >> index 8e880d92dea..117b30ff294 100644
> >> --- a/gcc/testsuite/gcc.dg/Wzero-length-array-bounds.c
> >> +++ b/gcc/testsuite/gcc.dg/Wzero-length-array-bounds.c
> >> @@ -68,21 +68,21 @@ extern struct Y y;
> >>
> >>   void access_to_member (int i)
> >>   {
> >> -  y.a[0].a[0] = 0;      // { dg-warning "\\\[-Wzero-length-bounds" }
> >> -  y.a[0].a[1] = 0;      // { dg-warning "\\\[-Wzero-length-bounds" }
> >> -  y.a[0].a[2] = 0;      // { dg-warning "\\\[-Wzero-length-bounds" }
> >> +  y.a[0].a[0] = 0;      // { dg-warning "\\\[-Warray-bounds" }
> >> +  y.a[0].a[1] = 0;      // { dg-warning "\\\[-Warray-bounds" }
> >> +  y.a[0].a[2] = 0;      // { dg-warning "\\\[-Warray-bounds" }
> >>     sink (a);
> >>
> >> -  y.a[1].a[0] = 0;      // { dg-warning "\\\[-Wzero-length-bounds" }
> >> -  y.a[1].a[1] = 0;      // { dg-warning "\\\[-Wzero-length-bounds" }
> >> +  y.a[1].a[0] = 0;      // { dg-warning "\\\[-Warray-bounds" }
> >> +  y.a[1].a[1] = 0;      // { dg-warning "\\\[-Warray-bounds" }
> >>     /* Similar to the array case above, accesses to a subsequent member
> >>        of the "parent" struct seem like a more severe problem than those
> >>        to the next member of the same struct.  */
> >> -  y.a[1].a[2] = 0;      // { dg-warning "\\\[-Wzero-length-bounds" }
> >> +  y.a[1].a[2] = 0;      // { dg-warning "\\\[-Warray-bounds" }
> >>     sink (a);
> >>
> >> -  y.b.a[0] = 0;         // { dg-warning "\\\[-Wzero-length-bounds" }
> >> -  y.b.a[1] = 0;         // { dg-warning "\\\[-Wzero-length-bounds" }
> >> -  y.b.a[2] = 0;         // { dg-warning "\\\[-Wzero-length-bounds" }
> >> +  y.b.a[0] = 0;         // { dg-warning "\\\[-Warray-bounds" }
> >> +  y.b.a[1] = 0;         // { dg-warning "\\\[-Warray-bounds" }
> >> +  y.b.a[2] = 0;         // { dg-warning "\\\[-Warray-bounds" }
> >>     y.b.a[3] = 0;         // { dg-warning "\\\[-Warray-bounds" }
> >>   }
> >> diff --git a/gcc/tree-dfa.c b/gcc/tree-dfa.c
> >> index 1d20de0c400..dc3c15f11d5 100644
> >> --- a/gcc/tree-dfa.c
> >> +++ b/gcc/tree-dfa.c
> >> @@ -772,9 +772,11 @@ get_ref_base_and_extent_hwi (tree exp, HOST_WIDE_INT *poffset,
> >>
> >>   tree
> >>   get_addr_base_and_unit_offset_1 (tree exp, poly_int64_pod *poffset,
> >> -                                tree (*valueize) (tree))
> >> +                                tree (*valueize) (tree),
> >> +                                bool of_next_component /* = false.  */)
> >>   {
> >>     poly_int64 byte_offset = 0;
> >> +  tree next_field = NULL_TREE;
> >>
> >>     /* Compute cumulative byte-offset for nested component-refs and array-refs,
> >>        and find the ultimate containing object.  */
> >> @@ -797,11 +799,27 @@ get_addr_base_and_unit_offset_1 (tree exp, poly_int64_pod *poffset,
> >>          case COMPONENT_REF:
> >>            {
> >>              tree field = TREE_OPERAND (exp, 1);
> >> -           tree this_offset = component_ref_field_offset (exp);
> >> +           tree this_offset;
> >> +
> >>              poly_int64 hthis_offset;
> >>
> >> +           if (of_next_component && !next_field)
> >> +             {
> >> +               /* We are looking for the next component of the record.  */
> >> +               next_field = TREE_CHAIN (field);
> >> +               if (!next_field)
> >> +                 break;
> >> +
> >> +               /* We found a next component.  Flag that we found it and
> >> +                  update the target field.  */
> >> +               field = next_field;
> >> +             }
> >> +
> >> +           this_offset = component_ref_field_offset (exp);
> >> +
> >>              if (!this_offset
> >>                  || !poly_int_tree_p (this_offset, &hthis_offset)
> >> +               || TREE_CODE (field) != FIELD_DECL
> >>                  || (TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field))
> >>                      % BITS_PER_UNIT))
> >>                return NULL_TREE;
> >> @@ -904,6 +922,9 @@ get_addr_base_and_unit_offset_1 (tree exp, poly_int64_pod *poffset,
> >>   done:
> >>
> >>     *poffset = byte_offset;
> >> +
> >> +  if (of_next_component)
> >> +    return next_field;
> >>     return exp;
> >>   }
> >>
> >> @@ -913,9 +934,11 @@ done:
> >>      is not BITS_PER_UNIT-aligned.  */
> >>
> >>   tree
> >> -get_addr_base_and_unit_offset (tree exp, poly_int64_pod *poffset)
> >> +get_addr_base_and_unit_offset (tree exp, poly_int64_pod *poffset, bool
> >> +                              of_next_component /* = false.  */)
> >>   {
> >> -  return get_addr_base_and_unit_offset_1 (exp, poffset, NULL);
> >> +  return get_addr_base_and_unit_offset_1 (exp, poffset, NULL,
> >> +                                         of_next_component);
> >>   }
> >>
> >>   /* Returns true if STMT references an SSA_NAME that has
> >> diff --git a/gcc/tree-dfa.h b/gcc/tree-dfa.h
> >> index b1457ab065c..94e44d9c3f6 100644
> >> --- a/gcc/tree-dfa.h
> >> +++ b/gcc/tree-dfa.h
> >> @@ -34,8 +34,10 @@ extern tree get_ref_base_and_extent (tree, poly_int64_pod *, poly_int64_pod *,
> >>   extern tree get_ref_base_and_extent_hwi (tree, HOST_WIDE_INT *,
> >>                                           HOST_WIDE_INT *, bool *);
> >>   extern tree get_addr_base_and_unit_offset_1 (tree, poly_int64_pod *,
> >> -                                            tree (*) (tree));
> >> -extern tree get_addr_base_and_unit_offset (tree, poly_int64_pod *);
> >> +                                            tree (*) (tree),
> >> +                                            bool of_next_component = false);
> >> +extern tree get_addr_base_and_unit_offset (tree, poly_int64_pod *,
> >> +                                          bool = false);
> >>   extern bool stmt_references_abnormal_ssa_name (gimple *);
> >>   extern void replace_abnormal_ssa_names (gimple *);
> >>   extern void dump_enumerated_decls (FILE *, dump_flags_t);
> >> diff --git a/gcc/tree.c b/gcc/tree.c
> >> index 1aa6e557a04..45d7fa2ae92 100644
> >> --- a/gcc/tree.c
> >> +++ b/gcc/tree.c
> >> @@ -12649,6 +12649,47 @@ get_initializer_for (tree init, tree decl)
> >>     return NULL_TREE;
> >>   }
> >>
> >> +static int
> >> +least_common_record_1 (tree basetype, tree field1, tree field2,
> >> +                      tree *least_basetype)
> >> +{
> >> +  int ret = 0;
> >> +
> >> +  for (tree fld = TYPE_FIELDS (basetype); fld; fld = TREE_CHAIN (fld))
> >> +    {
> >> +      if (fld == field1 || fld == field2)
> >> +       ret++;
> >> +
> >> +      if (TREE_CODE (TREE_TYPE (fld)) == UNION_TYPE
> >> +         || TREE_CODE (TREE_TYPE (fld)) == RECORD_TYPE)
> >> +       ret += least_common_record_1 (TREE_TYPE (fld), field1, field2,
> >> +                                     least_basetype);
> >> +    }
> >> +
> >> +  if (ret == 2)
> >> +    {
> >> +      *least_basetype = basetype;
> >> +      ret++; /* Avoid getting in this block again if a common basetype were
> >> +               found.  */
> >> +    }
> >> +
> >> +  return ret;
> >> +}
> >> +
> >> +/* Find the least common RECORD type common to two FIELDS from base.  */
> >> +static tree
> >> +least_common_record (tree basetype, tree field1, tree field2)
> >> +{
> >> +  if (!(TREE_CODE (basetype) == RECORD_TYPE
> >> +       || TREE_CODE (basetype) == UNION_TYPE))
> >> +    return NULL_TREE;
> >> +
> >> +  tree least_basetype = NULL_TREE;
> >> +  least_common_record_1 (basetype, field1, field2, &least_basetype);
> >> +
> >> +  return least_basetype;
> >> +}
> >> +
> >>   /* Determines the size of the member referenced by the COMPONENT_REF
> >>      REF, using its initializer expression if necessary in order to
> >>      determine the size of an initialized flexible array member.
> >> @@ -12768,28 +12809,30 @@ component_ref_size (tree ref, special_array_member *sam /* = NULL */)
> >>     memsize = NULL_TREE;
> >>
> >>     if (typematch)
> >> -    /* MEMBER is a true flexible array member.  Compute its size from
> >> -       the initializer of the BASE object if it has one.  */
> >> -    if (tree init = DECL_P (base) ? DECL_INITIAL (base) : NULL_TREE)
> >> -      if (init != error_mark_node)
> >> -       {
> >> -         init = get_initializer_for (init, member);
> >> -         if (init)
> >> -           {
> >> -             memsize = TYPE_SIZE_UNIT (TREE_TYPE (init));
> >> -             if (tree refsize = TYPE_SIZE_UNIT (argtype))
> >> -               {
> >> -                 /* Use the larger of the initializer size and the tail
> >> -                    padding in the enclosing struct.  */
> >> -                 poly_int64 rsz = tree_to_poly_int64 (refsize);
> >> -                 rsz -= baseoff;
> >> -                 if (known_lt (tree_to_poly_int64 (memsize), rsz))
> >> -                   memsize = wide_int_to_tree (TREE_TYPE (memsize), rsz);
> >> -               }
> >> -
> >> -             baseoff = 0;
> >> -           }
> >> -       }
> >> +    {
> >> +      /* MEMBER is a true flexible array member.  Compute its size from
> >> +        the initializer of the BASE object if it has one.  */
> >> +      if (tree init = DECL_P (base) ? DECL_INITIAL (base) : NULL_TREE)
> >> +       if (init != error_mark_node)
> >> +         {
> >> +           init = get_initializer_for (init, member);
> >> +           if (init)
> >> +             {
> >> +               memsize = TYPE_SIZE_UNIT (TREE_TYPE (init));
> >> +               if (tree refsize = TYPE_SIZE_UNIT (argtype))
> >> +                 {
> >> +                   /* Use the larger of the initializer size and the tail
> >> +                      padding in the enclosing struct.  */
> >> +                   poly_int64 rsz = tree_to_poly_int64 (refsize);
> >> +                   rsz -= baseoff;
> >> +                   if (known_lt (tree_to_poly_int64 (memsize), rsz))
> >> +                     memsize = wide_int_to_tree (TREE_TYPE (memsize), rsz);
> >> +                 }
> >> +
> >> +               baseoff = 0;
> >> +             }
> >> +         }
> >> +    }
> >>
> >>     if (!memsize)
> >>       {
> >> @@ -12810,9 +12853,31 @@ component_ref_size (tree ref, special_array_member *sam /* = NULL */)
> >>            memsize = TYPE_SIZE_UNIT (bt);
> >>          }
> >>         else if (DECL_P (base))
> >> -       /* Use the size of the BASE object (possibly an array of some
> >> -          other type such as char used to store the struct).  */
> >> -       memsize = DECL_SIZE_UNIT (base);
> >> +       {
> >> +         /* It is possible that the flexible array has no constructor at all.
> >> +            In such cases, GCC will allocate the array in some weird way.
> >> +            Assume that the array size is the difference between the start
> >> +            address of the array and the next component, if it exists.  */
> >> +
> >> +         tree lcr;
> >> +
> >> +         poly_int64 baseoff2 = 0;
> >> +         tree next_field = get_addr_base_and_unit_offset (ref, &baseoff2,
> >> +                                                     /* next_elmnt =  */ true);
> >> +         if (next_field
> >> +             && (lcr = least_common_record (basetype, member, next_field))
> >> +             && TREE_CODE (lcr) == RECORD_TYPE
> >> +             && known_ge (baseoff2, baseoff))
> >> +           {
> >> +               poly_int64 size =  baseoff2 - baseoff;
> >> +               memsize = wide_int_to_tree (TREE_TYPE (DECL_SIZE_UNIT (base)),
> >> +                                           size);
> >> +           }
> >> +         else
> >> +           /* Use the size of the BASE object (possibly an array of some
> >> +              other type such as char used to store the struct).  */
> >> +           memsize = DECL_SIZE_UNIT (base);
> >> +       }
> >>         else
> >>          return NULL_TREE;
> >>       }
> >> --
> >> 2.32.0
> >>
>


More information about the Gcc-patches mailing list