[PATCH 1/4] introduce struct strlen_data_t into gimple-fold

Martin Sebor msebor@gmail.com
Mon Dec 10 21:01:00 GMT 2018

Jeff, is there something you are expecting me to change in
response to this or have you just not gotten around to reviewing
the rest?


On 11/28/18 9:26 PM, Jeff Law wrote:
> On 11/25/18 5:05 PM, Martin Sebor wrote:
>>> If so, then I think we need
>>> to look for a better name than MAXSIZE and MAXLEN.
>> I find these names quite fitting and I'm not sure what might work
>> better.  I renamed MAXSIZE to MAXBOUND but nothing comes to mind
>> as a replacement for MAXLEN.  Please suggest something you think
>> is better.
>>>> +  /* When non-null, NONSTR refers to the declaration known to store
>>>> +     an unterminated constant character array, as in:
>>>> +     const char s[] = { 'a', 'b', 'c' };
>>>> +     It is used to diagnose uses of such arrays in functions such as
>>>> +     strlen() that expect a nul-terminated string as an argument.  */
>>>> +  tree nonstr;
>>> So rather than NONSTR, DECL may make more sense -- if for no other
>>> reason than you don't have to think in terms of "not a string".
>> Done, but I think DECL is a poor choice for the reasons below.
>> The field is only set when the thing the object refers to is
>> a character array that is not a string.  It identifies the first
>> array the expression refers to that's not a terminated string
>> (there could be multiple).  I can't think of anything else one
>> might want to think of it as than "a declaration of an array
>> that is not a string."
>> As a name, DECL is generic and used all over the place for any
>> sort of a declaration so it's not a good choice also for that
>> reason.  It's only marginally more descriptive that the pervasive
>> NODE or T, but just as useless to grep for (which I have been
>> relying on when working with this patch).
>> I have been using the name NONSTR in all contexts where
>> I introduced the unterminated array handling, so renaming
>> the member to DECL makes this scheme inconsistent
> NONSTR requires you to think in the negative and it sounds more like a
> boolean property to me, but it's actually carrying more information than
> just a boolean.
> I'm certainly not wed to DECL.  If you've got a better name, please
> suggest one.
>>>> +  /* ELTSIZE is set to 1 for single byte character strings, and 2 or 4
>>>> +     for wide characer strings.  */
>>>> +  unsigned eltsize;
>>> Bernd's suggestion that we separate the input vs output paramters may be
>>> a reasonable one -- I think this is the only in-parameter you're passing
>>> with the structure, right?  And everything else is a pure output?  If so
>>> we may be better off continuing to pass the element size as a separate
>>> parameter.
>> I changed it in the updated patch.  I had chosen to make it
>> a member to reduce the number of arguments to these functions and
>> in anticipation of having them update it before returning if they
>> discover that the actual element size doesn't match the expected
>> size, as in:
>>    printf ("%ls", "narrow string");
>> Similarly to what I proposed here:
>>    https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01321.html
>> I don't see what has been gained by making it an argument again.
> It's the separation of inputs from outputs he was trying to achieve that
> I said *may* be a reasonable one.
> It's not always a good thing, nor is it always a bad thing.  If there
> are subsequent patches are going to have callees changing it, then it
> absolutely makes sense to include it.  But adding it to the structure
> for the mere sake of reducing arguments may not.
> And just to be clear, I'm not inherently against pulling stuff into
> structures and classes.   Quite the opposite.
>>>> +  /* FLEXARRAY is true if the range of the string lengths has been
>>>> +     obtained from the upper bound of an array at the end of a struct.
>>>> +     Such an array may hold a string that's longer than its upper bound
>>>> +     due to it being used as a poor-man's flexible array member.  */
>>>> +  bool flexarray;
>>>> +
>>>> +  /* Set ELTSIZE and value-initialize all other members.  */
>>>> +  strlen_data_t (unsigned eltbytes)
>>>> +    : minlen (), maxlen (), maxsize (), nonstr (), eltsize (eltbytes),
>>>> +      flexarray () { }
>>> I think if you pull ELTSIZE out and pass it as a distinct parameter,
>>> then you don't need a ctor and you can have a POD.  You can then
>>> initialize with memset rather than having to individually initialize
>>> each field -- meaning it's easier to add more output fields in the
>>> future.
>> Without ELTSIZE neither a ctor nor memset is necessary for
>> initialization.  This works too and is the preferred style
>> in C++ 98:
>>    c_strlen_data data = { };
> I thought that didn't work somewhere.  I certainly would have preferred
> that over memset when I was twiddling yours and Bernd's earlier patches.
>>> I don't think any of the suggestions above change the behavior of the
>>> patch.  Let's hold off committing though (I assume you've got a GIT
>>> topic branch where you can make these changes and update the subsequent
>>> patches independent of everything else...)
>> I do but I don't know how to make these changes without having
>> to resolve all the conflicts with the intervening changes on
>> trunk at each step so I have just a single patch now that resolves
>> the conflicts with trunk committed since I started this work and
>> that makes the changes you requested above.  The majority of
>> the changes in the patch are just minor adjustments (the meat
>> of the change is in gimple-fold.c; all the rest is just minor
>> adjustments) so hopefully this won't be a problem.
> That's a sign you're trying to do too much at once and too long a wait
> between iterations.
> Jeff

More information about the Gcc-patches mailing list