This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Add a character size parameter to c_strlen/get_range_strlen
On 08/16/18 19:35, Jeff Law wrote:
> On 08/15/2018 12:51 PM, Bernd Edlinger wrote:
> [ snip -- comment attribution is likely lost... ]
>
>>>>>
>>>>> 2018-08-14 Bernd Edlinger <bernd.edlinger@hotmail.de>
>>>>>
>>>>> * builtins.c (c_strlen): Add new parameter eltsize.
>>>>> * builtins.h (c_strlen): Adjust prototype.
>>>>> * expr.c (string_constant): Add new parameter mem_size.
>>>>> * expr.h (string_constant): Adjust protoype.
>>>>> * gimple-fold.c (get_range_strlen): Add new parameter eltsize.
>>>>> * gimple-fold.h (get_range_strlen): Adjust prototype.
>>>>> * gimple-ssa-sprintf.c (get_string_length): Add new parameter eltsize.
>>>>> (format_string): Call get_string_length with eltsize.
>>>>>
>>>>> 2018-08-14 Bernd Edlinger <bernd.edlinger@hotmail.de>
>>>>>
>>>>> * gcc.dg/strlenopt-49.c: Adjust test case.
>>>>> * gcc.dg/tree-ssa/builtin-sprintf-warn-18.c: Likewise.
>>>>
>>>> Your patch appears to pass in an ELTSIZE which is used to determine how
>>>> to count. This has some advantages in that the routine can be used to
>>>> count bytes or other elements such as 2 or 4 byte wide characters.
> [ ... ]
> So yesterday I sent a message where I was totally offbase with the
> intent of this patch. After further discussions and re-review ISTM the
> real intent here is to restore prior behavior of counting bytes to the
> clients outside the sprintf pass. Terribly sorry for that and the
> conclusions I drew as a result.
>
>
no problem at all.
I really appreciate the time you take to review
this patch.
And I see how confusing this must be, but I think
some things went in a wrong direction for quite some time.
>
>
>>>
>>> Parameterizing the function to return either the number of
>>> bytes or the number of elements makes sense as an enhancement.
>>> It makes less sense (and could be the source of bugs) to let
>>> callers pass in anything else. A boolean flag to toggle
>>> between the two alternatives would make the API narrower
>>> and safer to use. It doesn't seem necessary to fix any
>>> bugs.
> No bugs that we're currently aware of. But ensuring that routines that
> expect to be counting bytes are counting bytes is a good thing. If some
> pass (say forwprop) changed the char * argument to an array of something
> other than chars then we'd likely end up giving a wrong result.
>
> I don't mind making the interface narrower with a boolean or an enum to
> specify how we count rather than having the caller pass in a type.
>
>
Yes, the caller _must_ know what to expect.
The function should not try to be smarter than the caller.
>>>
>>>> Your testsuite change appears to remove the xfails in strlenopt-49 which
>>>> is usually OK. So no concerns there.
> So it turns out we don't really need any of the testsuite changes anyway
> if we make this patch independent of the 86711/86714 patches. And
> breaking that dependency is IMHO a good thing.
>
> [ Another snip. ]
>
>> Anyway, I thought your patch would convert dir.specifier == 'S'
>> into dir.modifier == FMT_LEN_l, since %S is like %ls.
>>
>> the xfail will go away if I change this:
>>
>> get_string_length (arg, dir.modifier == FMT_LEN_l ? 4 : 1);
>>
>> to:
>
>> get_string_length (arg, dir.specifier == 'S' dir.modifier == FMT_LEN_l ? 4 : 1);
> Presumably "||" is missing here.
>
Yes, and on another line, I confirmed that it works, and removes the xfail.
>>
>>
>> That should be easy, I will change the patch to match your patch in that
>> area.
>>
>>
>> The other xfail with memcmp needs to be fixed separately.
>> The issue is, callers of string_constant that do not pass the mem_size
>> parameter may access up to TREE_STRING_LENGTH byte, therefore
>> returning longer strings is not appropriate: the string constant
>> contains 9 bytes, mind the terminating nul.
> If we make the patch to add the size parameter to c_strlen independent
> of the changes related to 86711/86714, then we don't need to xfail or
> change the testsuite in any way shape or form.
>
> My overall thinking is to carve out this patch so that it's independent
> of anything else. That's actually easy to do after addressing two very
> minor issues.
>
I can give it a try, but I think that 86711/86714 is only a manifestation
of the design issues, and it is probably impossible to fix the design
without fixing 86711/86714 at the same time. It is likely not the only
one IMHO.
They are due to string_constant, c_strlen and c_getstr returning data that is
invalid in some corner cases, like over-length string constants. These were
previously only happening rarely, but due to the recent enhancements, much more
corner cases were added, and some of them trigger those issues now.
> We can then add Martin's patch to fix handling of wide chars in sprintf
> (86853).
>
Yes, I think that seems likely an independent issue. If done properly it
will probably not even be a merge conflict.
> --
>
> This overall plan addresses the need to be able to tell c_strlen how to
> count and reverts its behavior for callers outside of the sprintf
> warning pass to historical (pre-gcc7) behavior of counting bytes by
> default. It does this without losing any warnings and allow us to move
> forward with Martin's more complete fix for 86853.
>
Yes, I have already started to write a similar patch on c_getstr, which
seems to fix the other xfail.
The issue with c_getstr is similar, some call it with one parameter,
those expect a zero terminated string, that needs to be valid.
Others use a two parameter overload, which retuns the memory length,
those use memcmp or memchr, and need no zero termination.
But I start to think that I should merge all three patches into one,
to avoid unnecessary confusion.
Bernd.
> With those two patches out of the way we can then dig further into the
> issues around 86711/86714.
>
>
> jeff
>
>
>