This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] fold constant flexarrays to strings (PR 91490)


On 8/22/19 3:55 PM, Martin Sebor wrote:

>>> diff --git a/gcc/expr.c b/gcc/expr.c
>>> index 92979289e83..d16e0982dcf 100644
>>> --- a/gcc/expr.c
>>> +++ b/gcc/expr.c
>>> @@ -11402,6 +11402,16 @@ is_aligning_offset (const_tree offset,
>>> const_tree exp)
>>>   tree
>>>   string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree
>>> *decl)
>>>   {
>>> +  tree dummy = NULL_TREE;;
>>> +  if (!mem_size)
>>> +    mem_size = &dummy;
>>> +
>>> +  tree chartype = TREE_TYPE (arg);
>>> +  if (POINTER_TYPE_P (chartype))
>>> +    chartype = TREE_TYPE (chartype);
>>> +  while (TREE_CODE (chartype) == ARRAY_TYPE)
>>> +    chartype = TREE_TYPE (chartype);
>> I'm hesitant to ACK this bit.  I can understand that if we're given an
>> ADDR_EXPR, which should be a POINTER_TYPE_P, that we'll want to look at
>> what it's pointing to.  But shouldn't we verify that it's an ADDR_EXPR
>> before just blindly stripping away the outer type?
> 
> The character type is that of the expression, before any casts
> are stripped.  For example, given:
> 
>   struct S { char n, a[4]; } s[1] = { 0 };
> 
> and strlen (s->a), ARG is the POINTER_PLUS_EXPR
> '(const char *) &s + 1'  The type of the operand is a pointer to
> the struct S.
Given "chartype" is only used in the CONSTRUCTOR handling you're adding,
couldn't we just use "char_type_node" in there instead and drop the bits
noted above?


>>> +  if (TREE_CODE (init) == CONSTRUCTOR && !CONSTRUCTOR_ELTS (init))
>> Would initializer_zerop be better here?  That would catch
>> zero-initialized constructors as well.
> 
> I originally had initializer_zerop there and removed it at the last
> minute it because none of my tests needed it.  But my tests only
> exercised the flexible array members (which have to be initialized
> in constant objects) and not the case where initializer_zerop does
> make a difference such as:
> 
>   struct A { char a[4]; };
>   struct B { struct A a; };
> 
>   const struct B b[2] = { 0 };
> 
>   void f (void)
>   {
>     if (__builtin_strlen (b->a.a) == 3)
>       __builtin_abort ();
>   }
> 
> Here, fold_ctor_reference returns a non-empty CONSTRUCTOR that
> the code doesn't handle.  Adding initializer_zerop lets it create
> the empty string and the caller fold the call.  But it's just
> a special case of the more general problem where the CONSTRUCTOR
> is both non-empty and non-zero for the parts of the object before
> the array we're interested in, such as in
> 
>   const struct B b[1] = { 1 };
> 
> Here, strlen (b->a.a) again isn't folded.  Ironically, the equivalent
> strlen (b[0].a.a) is folded.  The difference between the two is that
> b[0].a.a is represented as itself (i.e., NOP (char*, ADDR (b.a.a))
> while b->a.a as (const char *) &b + 1.  In the former case
> fold_ctor_reference returns an empty CONSTRUCTOR for the char array.
> In the latter, it returns the non-empty, non-zero CTOR for all of
> b the we don't know how to extract the empty string from.
> 
> Incidentally, it's only the C front-end that "bastardizes"
> the expression like this.  The C++ FE preserves the original form
> of the expression and so it's able to fold the call.
> 
> (Uncovering and trying to fix these problems, by the way, is what
> I meant the other day when I said how patches in this area have
> a tendency to snownball into "projects."  Folding empty ctors of
> constant objects probably isn't terribly important but the lack
> of consistency is what makes writing tests that behave the same
> way across different front-ends and back-ends challenging.)
Yup.  I understand.  When presented with these kinds of snowballing
issues, the best advice I can give is to break things down into logical
units and make patches which are a series rather than a single unified
patch to address multiple issues.  git rebase can really help.

> 
>> If not, then you want to make sure to check !TREE_CLOBBER_P (init) since
>> that's something completely different than zero initialization.
> 
> I went with the initializer_zerop since it improves things, if
> negligibly, and added tests for it.  If you or Richard have a test
> case that I could add to the patch showing when TREE_CLOBBER_P is
> set on a CTOR and that not checking would cause trouble.
Note that initializer_zerop checks TREE_CLOBBER internally.  So if
you're using initializer_zerop, then you're good.


So all that's left is the "chartype" stuff in string_constant.  If we
can use char_type_node instead of trying to extract a character type,
then the problematical code would just go away.

jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]