[PATCH] issue consistent warning for past-the-end array stores (PR 91457)

Jeff Law law@redhat.com
Thu Aug 22 18:50:00 GMT 2019


On 8/20/19 1:26 AM, Richard Biener wrote:
> On Tue, Aug 20, 2019 at 4:32 AM Martin Sebor <msebor@gmail.com> wrote:
>>
>> On 8/19/19 8:10 AM, Richard Biener wrote:
>>> On Sat, Aug 17, 2019 at 12:43 AM Martin Sebor <msebor@gmail.com> wrote:
>>>>
>>>> With the recent enhancement to the strlen handling of multibyte
>>>> stores the g++.dg/warn/Warray-bounds-4.C for zero-length arrays
>>>> started failing on hppa (and probably elsewhere as well).  This
>>>> is partly the result of the added detection of past-the-end
>>>> writes into the strlen pass which detects more instances of
>>>> the problem than -Warray-bounds.  Since the IL each warning
>>>> works with varies between targets, the same invalid code can
>>>> be diagnosed by one warning one target and different warning
>>>> on another.
>>>>
>>>> The attached patch does three things:
>>>>
>>>> 1) It enhances compute_objsize to also determine the size of
>>>> a flexible array member (and its various variants), including
>>>> from its initializer if necessary.  (This resolves 91457 but
>>>> introduces another warning where was previously just one.)
>>>> 2) It guards the new instance of -Wstringop-overflow with
>>>> the no-warning bit on the assignment to avoid warning on code
>>>> that's already been diagnosed.
>>>> 3) It arranges for -Warray-bounds to set the no-warning bit on
>>>> the enclosing expression to keep -Wstringop-overflow from issuing
>>>> another warning for the same problem.
>>>>
>>>> Testing the compute_objsize enhancement to bring it up to par
>>>> with -Warray-bounds in turn exposed a weakness in the latter
>>>> warning for flexible array members.  Rather than snowballing
>>>> additional improvements into this one I decided to put that
>>>> off until later, so the new -Warray-bounds test has a bunch
>>>> of XFAILs.  I'll see if I can find the time to deal with those
>>>> either still in stage 1 or in stage 3 (one of them is actually
>>>> an ancient regression).
>>>
>>> +static tree
>>> +get_initializer_for (tree init, tree decl)
>>> +{
>>>
>>> can't you use fold_ctor_reference here?
>>
>> Yes, but only with an additional enhancement.  Char initializers
>> for flexible array members aren't transformed to STRING_CSTs yet,
>> so without the size of the initializer specified, the function
>> returns the initializer for the smallest subobject, or char in
>> this case.  I've enhanced the function to handle them.
> 
> So at the moment it returns an empty array constructor, correct?
> Isn't that the more canonical representation?  The STRING_CST
> index type doesn't really handle "empty" strings and I expect code
> more confused about that than about an empty CTOR?
> 
>>>
>>> +/* Determine the size of the flexible array FLD from the initializer
>>> +   expression for the struct object DECL in which the meber is declared
>>> +   (possibly recursively).  Return the size or zero constant if it isn't
>>> +   initialized.  */
>>> +
>>> +static tree
>>> +get_flexarray_size (tree decl, tree fld)
>>> +{
>>> +  if (tree init = DECL_INITIAL (decl))
>>> +    {
>>> +      init = get_initializer_for (init, fld);
>>> +      if (init)
>>> +       return TYPE_SIZE_UNIT (TREE_TYPE (init));
>>> +    }
>>> +
>>> +  return integer_zero_node;
>>>
>>> so you're hoping that the (sub-)CONSTRUCTOR get_initializer_for
>>> returns has a complete type but the initialized object didn't get it
>>> completed.  Isnt that wishful thinking?
>>
>> I don't know what you mean.  When might a CONSTRUCTOR not have
>> a complete type, and if/when it doesn't, why would that be
>> a problem here?  TYPE_SIZE_UNIT will evaluate to null meaning
>> "don't know" and that's fine.  Could you try to be more specific
>> about the problem you're pointing out?
>>
>>> And why return integer_zero_node
>>> rather than NULL_TREE here?
>>
>> Because the size of a flexible array member with no initializer
>> is zero.
>>
>>>
>>> +  if (TREE_CODE (dest) == COMPONENT_REF)
>>> +    {
>>> +      *pdecl = TREE_OPERAND (dest, 1);
>>> +
>>> +      /* If the member has a size return it.  Otherwise it's a flexible
>>> +        array member.  */
>>> +      if (tree size = DECL_SIZE_UNIT (*pdecl))
>>> +       return size;
>>>
>>> because here you do.
>>
>> Not sure what you mean here either.  (This code was also a bit
> 
> You return NULL_TREE.
> 
>> out of date WRT to the patch I had tested.  Not sure how that
>> happened.  The attached patch is up to date.)
>>
>>>
>>> Also once you have an underlying VAR_DECL you can compute
>>> the flexarray size by DECL_SIZE (var) - offset-of flexarray member.
>>> Isn't that way cheaper than walking the initializer (possibly many
>>> times?)
>>
>> It would be nice if it were this easy.  Is the value of DECL_SIZE
>> (var) supposed to include the size of the flexible array member?
> 
> Yes, DECL_SIZE of the VAR_DECL is the size we use for assembling.
> It is usually only the types that remain incomplete (or too small) since
> the FE doesn't create many variants of a struct S { int n; char x[]; }
> when "instantiating" it via struct S s = { 5, "abcde" };  that then holds
> true for the DECL_SIZE of the FIELD_DECL of x as well.
> 
>> I don't see it mentioned in the comments in tree.h and in my tests
>> it only does in C but not in C++.  Is that a bug that in C++ it
>> doesn't?
> 
> tree.h dcoumentation isn't complete.  And for me the C++ FE for
> the above simple example has
> 
>  <var_decl 0x7ffff7fedb40 s
>     type <record_type 0x7ffff6d94f18 S cxx-odr-p type_5 type_6 BLK
>         size <integer_cst 0x7ffff6c5e0a8 constant 32>
>         unit-size <integer_cst 0x7ffff6c5e0c0 constant 4>
>         align:32 warn_if_not_align:0 symtab:0 alias-set -1
> canonical-type 0x7ffff6d94f18
>         fields <function_decl 0x7ffff6d98b00 __dt  type <method_type
> 0x7ffff6da8150>
>             public abstract external autoinline decl_3 QI t.ii:1:8
> align:16 warn_if_not_align:0 context <record_type 0x7ffff6d94f18 S>
>             full-name "S::~S() noexcept (<uninstantiated>)"
>             not-really-extern chain <function_decl 0x7ffff6d98d00
> __dt_base >> context <translation_unit_decl 0x7ffff6c4a168 t.ii>
>         full-name "struct S"
>         X() X(constX&) this=(X&) n_parents=0 use_template=0 interface-unknown
>         pointer_to_this <pointer_type 0x7ffff6da85e8>
> reference_to_this <reference_type 0x7ffff6da8a80> chain <type_decl
> 0x7ffff6c6b850 S>>
>     public static tree_1 tree_2 tree_3 BLK t.ii:3:3 size <integer_cst
> 0x7ffff6c5e0a8 32> unit-size <integer_cst 0x7ffff6c5e0c0 4>
>     align:32 warn_if_not_align:0 context <translation_unit_decl
> 0x7ffff6c4a168 t.ii> initial <constructor 0x7ffff6da52a0> chain
> <type_decl 0x7ffff6c6b850 S>>
> 
> and we assemble it like
> 
>         .globl  s
>         .data
>         .align 4
>         .type   s, @object
>         .size   s, 4
> s:
>         .long   5
>         .string "abcde"
> 
> note how we have .size s, 4 here...
> 
> of course in the end nobody cares about a symbols size ...(?)
The linker may care.  I've certainly run across those which would
warn/error when the size of an object with global scope differed across
TUs.   I wouldn't expect to be running into that kind of issue in the
modern world though.

Jeff



More information about the Gcc-patches mailing list