This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Add a character size parameter to c_strlen/get_range_strlen


On 08/23/18 00:50, Jeff Law wrote:
> On 08/22/2018 02:14 AM, Richard Biener wrote:
>> On Wed, 22 Aug 2018, Bernd Edlinger wrote:
>>
>>> On 08/22/18 09:26, Richard Biener wrote:
>>>> On Wed, 22 Aug 2018, Bernd Edlinger wrote:
>>>>
>>>>> On 08/21/18 10:59, Richard Biener wrote:
>>>>>> On Tue, 21 Aug 2018, Bernd Edlinger wrote:
>>>>>>
>>>>>>> gcc -S -O2 -Wall -Wformat-overflow -ftrack-macro-expansion=0 -fshort-wchar builtin-sprintf-warn-20.c
>>>>>>> builtin-sprintf-warn-20.c: In function 'test':
>>>>>>> builtin-sprintf-warn-20.c:19:39: warning: hex escape sequence out of range
>>>>>>> 19 |     ? (char*)L"\x4142\x4344" : (char*)L"\x41424344\x45464748";
>>>>>>>        |                                       ^~~~~~~~~~~~~~~~~~~~~~~
>>>>>>>
>>>>>>> Hmm, this test might create some noise on short-wchar targets.
>>>>>>>
>>>>>>> I would prefer a warning here, about the wrong type of the parameter.
>>>>>>> The buffer overflow is only a secondary thing.
>>>>>>>
>>>>>>> For constant objects like those, the GIMPLE type is still guaranteed to be reliable,
>>>>>>> right?
>>>>>>
>>>>>> TREE_TYPE of tcc_declaration and tcc_constant trees should more-or-less
>>>>>> (minus qualifications not affecting semantics) be those set by
>>>>>> frontends.
>>>>>>
>>>>>
>>>>> and in this case:
>>>>>
>>>>> const union
>>>>> { struct {
>>>>>        wchar_t x[4];
>>>>>      };
>>>>>      struct {
>>>>>        char z[8];
>>>>>      };
>>>>> } u = {{L"123"}};
>>>>>
>>>>> int test()
>>>>> {
>>>>>      return __builtin_strlen(u.z);
>>>>> }
>>>>>
>>>>>
>>>>> string_constant works out the initializer for u.x
>>>>> which has a different type than u.z
>>>>
>>>> Yes.  That's because it uses ctor-for-folding and friends.  It's
>>>> a question of the desired semantics of string_constant whether
>>>> it should better return NULL_TREE in this case or whether the
>>>> caller has to deal with type mismatches.
>>>>
>>>
>>> Yes, absolutely.
>>>
>>> c_getstr needs to bail out if the string is not zero-terminated
>>> within the limits given by the decl, or the string_cst-type or whatever
>>> may help.
>>>
>>> Furthermore I also consider it possible that the byteoffset
>>> is not a multiple of eltsize.  So fail in that case as well.
>>>
>>>
>>> I am currently boot-strapping a patch for this (pr87053):
>>> $ cat u.c
>>> const union
>>> { struct {
>>>       char x[4];
>>>       char y[4];
>>>     };
>>>     struct {
>>>       char z[8];
>>>     };
>>> } u = {{"1234", "567"}};
>>>
>>> int test()
>>> {
>>>     return __builtin_strlen(u.z);
>>> }
>>>
>>> gets folded to 4.
>>>
>>> ... but unfortunately it will depend on my pr86714 fix which fixes
>>> the mem_size parameter returned from string_constant.
>>>
>>> Frankly, in the moment I feel like I fell in a deep deep hole.   :-O
>>
>> /me too with > 100 mails in this and related threads unread ;)
>>
>> I thought Jeff applied the mem_size parameter thing but maybe it
>> was something else.  *fetches coffee*  Ah, Jeff is still "working"
>> on it.
> The patch that was applied adds a parameter to c_strlen to indicate how
> it should count (ie, bytes, 16bit wchars or 32bit wchars).
> 
> There was one hunk that was dependent upon an earlier, more
> controversial, patch from Bernd which is still under discussion.  That
> hunk was twiddled to preserve the overall goal of Bernd's patch which
> added the how to count parameter without being dependent upon the older,
> more controversial patch.
> 
> It's all confusing and it'd actually help if folks stopped issuing
> updated patches which try to handle more cases or which touch on the
> areas of code already being looked at.  Otherwise the patches just get
> more tangled and we're more likely to end up with developers getting
> more entrenched in a particular approach.
> 

Sorry, Jeff.

Actually I would not do all that work in my spare time, if I would not
really feel the need to help (in the end also help myself of course,
since I have basically a need of a working GCC to get my paid work done).

Unfortunately all those issues are somehow connected via
the unspecified semantical differences between STRING_CST
used for literals and used for initializers, but without working
on those patches I would not have found that out.


Bernd.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]