[PATCH] fix a couple of bugs in const string folding (PR 86532)

Richard Biener rguenther@suse.de
Mon Jul 23 18:33:00 GMT 2018


On July 23, 2018 7:46:08 PM GMT+02:00, Martin Sebor <msebor@gmail.com> wrote:
>On 07/23/2018 02:05 AM, Jakub Jelinek wrote:
>> On Sun, Jul 22, 2018 at 04:47:45PM -0600, Martin Sebor wrote:
>>>> No, I mean something like:
>>>>
>>>> $ cat y.c
>>>> const char a[2][3] = { "1234", "xyz" };
>>>> char b[6];
>>>>
>>>> int main ()
>>>> {
>>>>    __builtin_memcpy(b, a, 4);
>>>>    __builtin_memset(b + 4, 'a', 2);
>>>>    __builtin_printf("%.6s\n", b);
>>>> }
>>>> $ gcc y.c
>>>> y.c:1:24: warning: initializer-string for array of chars is too long
>>>>   const char a[2][3] = { "1234", "xyz" };
>>>>                          ^~~~~~
>>>> y.c:1:24: note: (near initialization for 'a[0]')
>>>> $ ./a.out
>>>> 1234aa
>>>>
>>>> but expected would be "123xaa".
>>>
>>> Hmm.  I assumed this was undefined in C but after double
>>> checking I'm not sure.  If it's in fact valid and the excess
>>> elements are required to be ignored I'll of course fix it in
>>> a subsequent patch.  Let me find out.
>>
>> If we just warn about the initializer and treat it some way, an
>> optimization should not change how the initializer is treated.
>> The memcpy and memset themselves must be valid and they should just
>> copy whatever is in the initializer without optimizations.
>
>The calls are valid and the initializer doesn't change with or
>without optimization.  The concern is that the string_constant
>folds this case and returns the whole initializer rather than
>taking care to avoid folding it at all, or returning just
>the leading portion(*).
>
>Since the code is undefined (and since there is a warning for
>it that's enabled by default) it shouldn't matter what happens
>in this case.  But if it's thought to be preferable to do either
>of the other two (avoid folding or returning what fits, if
>the caller asks for a non-nul terminated array) I'm fine making
>that change.
>
>It seems to me that the excessive characters should be stripped
>by the front-end.

That would indeed be preferable. Unfortunately GENERIC is even less specified semantically than GIMPLE. That means it is clear neither whether this is valid GENERIC nor what the desired semantics are. The best you can do for constructors is to look at what varasm and gimplification do with them.
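
To make the "strip in the front-end" option concrete, here is a minimal sketch in plain C of the object layout it would produce for the test case above. It is not GCC code: init_row and run_example are hypothetical names, and the sketch simply assumes each over-long initializer string is cut to the element size before anything else looks at it.

```c
#include <string.h>

enum { ROWS = 2, COLS = 3 };

/* Hypothetical helper: copy at most COLS bytes of INIT into ROW,
   zero-filling the rest, the way a front end that strips excess
   initializer characters would lay out one element.  */
static void
init_row (char row[COLS], const char *init)
{
  size_t n = strlen (init);
  if (n > COLS)
    n = COLS;            /* drop the excess characters */
  memset (row, 0, COLS);
  memcpy (row, init, n);
}

/* Build the 2x3 array from the test case and run the same
   memcpy/memset sequence, storing the six result bytes plus a
   terminating nul in OUT (which must hold at least 7 bytes).  */
static void
run_example (char out[7])
{
  char a[ROWS][COLS];
  init_row (a[0], "1234");
  init_row (a[1], "xyz");

  memcpy (out, a, 4);        /* bytes '1' '2' '3' 'x' */
  memset (out + 4, 'a', 2);
  out[6] = '\0';
}
```

With that layout run_example fills its buffer with "123xaa", the result the test case expects, because the fourth byte copied comes from a[1] rather than from the over-long string.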

>That way the middle-end won't have to worry
>about what to do with apparently contradictory data.
>
>Martin
>
>[*] The initial (non-nul terminated) portion of the string
>would only be returned to callers that explicitly request it
>and are prepared to handle it -- it's not on trunk yet but it is
>implemented in the followup patch for bug 86552 as a mechanism
>to detect uses of non-nul terminated arrays in string functions.
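
The opt-in behaviour described in the footnote could look roughly like the sketch below. This is an illustration only: const_string_prefix and its allow_unterminated parameter are made-up names, not the signature of string_constant or of anything in the followup patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a constant-string lookup; this is NOT
   GCC's string_constant.  INIT holds INIT_LEN initializer bytes for
   an array of ARRAY_SIZE chars.  Returns INIT and sets *LEN when
   folding is acceptable, otherwise NULL (meaning: do not fold).  */
static const char *
const_string_prefix (const char *init, size_t init_len,
                     size_t array_size, bool allow_unterminated,
                     size_t *len)
{
  /* Only the bytes that fit in the array are ever visible.  */
  size_t visible = init_len < array_size ? init_len : array_size;
  const char *nul = memchr (init, '\0', visible);

  if (nul)
    {
      *len = (size_t) (nul - init);
      return init;
    }

  /* No nul within the array: hand back the leading portion only to
     callers that explicitly asked for it.  */
  if (!allow_unterminated)
    return NULL;

  *len = visible;
  return init;
}
```

For the "1234" initializer of a 3-char element, a caller passing false gets NULL and does not fold, while a caller passing true gets the three leading bytes and is responsible for not treating them as a nul-terminated string.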


