[PATCH] Add a character size parameter to c_strlen/get_range_strlen

Bernd Edlinger bernd.edlinger@hotmail.de
Thu Aug 16 21:56:00 GMT 2018


On 08/16/18 23:27, Jeff Law wrote:
> On 08/16/2018 12:47 PM, Bernd Edlinger wrote:
> 
>>>>> Parameterizing the function to return either the number of
>>>>> bytes or the number of elements makes sense as an enhancement.
>>>>> It makes less sense (and could be the source of bugs) to let
>>>>> callers pass in anything else.  A boolean flag to toggle
>>>>> between the two alternatives would make the API narrower
>>>>> and safer to use.  It doesn't seem necessary to fix any
>>>>> bugs.
>>> No bugs that we're currently aware of. But ensuring that routines that
>>> expect to be counting bytes are counting bytes is a good thing.  If some
>>> pass (say forwprop) changed the char * argument to an array of something
>>> other than chars then we'd likely end up giving a wrong result.
>>>
>>> I don't mind making the interface narrower with a boolean or an enum to
>>> specify how we count rather than having the caller pass in a type.
>>>
>>>
>>
>> Yes, the caller _must_ know what to expect.
>> The function should not try to be smarter than the caller.
> So I have a tweaked implementation here which uses a boolean.  I
> actually don't like it much.  I think it's largely because true/false is
> just a bad match for the objects and what we're doing.
> 
> It's not like something is on or off -- we're selecting between sizes.
> 
> An enum would work, but it just seems like overkill.  So I'm leaning
> back towards just having the caller pass it in as an integer.  We can
> have c_strlen verify the argument for sanity.
> 
>>>
>>>> get_string_length (arg, dir.specifier == 'S' dir.modifier == FMT_LEN_l ? 4 : 1);
>>> Presumably  "||" is missing here.
>>>
>>
>> Yes, and on another line, I confirmed that it works, and removes the xfail.
> Similarly.  I actually had all this working last night before sending
> out the message.
> 
> 
>>
>>
>>>>
>>>>
>>>> That should be easy, I will change the patch to match your patch in that
>>>> area.
>>>>
>>>>
>>>> The other xfail with memcmp needs to be fixed separately.
>>>> The issue is, callers of string_constant that do not pass the mem_size
>>>> parameter may access up to TREE_STRING_LENGTH byte, therefore
>>>> returning longer strings is not appropriate: the string constant
>>>> contains 9 bytes, mind the terminating nul.
>>> If we make the patch to add the size parameter to c_strlen independent
>>> of the changes related to 86711/86714, then we don't need to xfail or
>>> change the testsuite in any way shape or form.
>>>
>>> My overall thinking is to carve out this patch so that it's independent
>>> of anything else.  That's actually easy to do after addressing two very
>>> minor issues.
>>>
>>
>> I can give it a try, but I think that 86711/86714 is only a manifestation
>> of the design issues, and it is probably impossible to fix the design
>> without fixing 86711/86714 at the same time.  It is likely not the only
>> one IMHO.
> You don't need to do the work -- I've already got all the bits here.  I
> don't mind taking care of the final post/commit.
> 
> 
>>
>> They are due to string_constant, c_strlen and c_getstr returning data that is
>> invalid in some corner cases, like over-length string constants.  These were
>> previously only happening rarely, but due to the recent enhancements, much more
>> corner cases were added, and some of them trigger those issues now.
> Let's not go down this yet :-)
> 
> 
>>
>>
>>> We can then add Martin's patch to fix handling of wide chars in sprintf
>>> (86853).
>>>
>>
>> Yes, I think that seems likely an independent issue.  If done properly it
>> will probably not even be a merge conflict.
> It doesn't cause any conflicts.  I've already verified that.
> 
>>
>>> --
>>>
>>> This overall plan addresses the need to be able to tell c_strlen how to
>>> count and reverts its behavior for callers outside of the sprintf
>>> warning pass to historical (pre-gcc7) behavior of counting bytes by
>>> default.  It does this without losing any warnings and allow us to move
>>> forward with Martin's more complete fix for 86853.
>>>
>>
>> Yes, I have already started to write a similar patch on c_getstr, which
>> seems to fix the other xfail.
>>
>> The issue with c_getstr is similar, some call it with one parameter,
>> those expect a zero terminated string, that needs to be valid.
>> Others use a two parameter overload, which retuns the memory length,
>> those use memcmp or memchr, and need no zero termination.
>>
>> But I start to think that I should merge all three patches into one,
>> to avoid unnecessary confusion.
> Let's wait and please let's try to avoid mashing too many things
> together.  I expect to start looking at the next round of stuff tomorrow
> or Monday.
> 

Don't worry, I did not mean to merge that with the varasm patch(es).

Those are currently broken by Martin's PR 71625 :-( because this does
create lots of non-zero terminated string constants.

Rather the opposite, strictly speaking the change in c_getstr was
in anticipation of the guaranteed zero-termination, as I removed
the check "if (string[string_length - 1] != '\0')".

I will restore that check and make sure that the string is actually
a single byte string, otherwise this check would be meaningless.



Bernd.



More information about the Gcc-patches mailing list