[PATCH] handle local aggregate initialization in strlen (PR 83821)
Marc Glisse
marc.glisse@inria.fr
Fri May 25 22:01:00 GMT 2018
On Fri, 25 May 2018, Martin Sebor wrote:
>> Why couldn't nonzero_chars be constant in that case?
>>
>> void f(const char*s){
>> s[0]='a';
>> // I know that strlen(s) is at least 1 here
>> }
>
> I was responding specifically to your question about the strlen()
> CSE. Above there is no call to strlen(). What I understood you
> were asking about is something like:
>
> int f (char *s)
> {
> int n0 = strlen (s);
> s[7]='a';
> int n1 = strlen (s); // can this be replaced by n0?
> return n0 == n1;
> }
Yes, but I realized afterwards that calling strlen causes strinfo to be
updated, replacing a non-tight nonzero_chars with the lhs of the call. So
I need a different use of strinfo to demonstrate the issue.
>> The example may be completely off, but just to try and describe what I
>> am wondering about:
>>
>> int f(const char*s,char*restrict t,char c){
The const shouldn't have been there.
>> // we know nothing about s for now.
>> s[0]='a'; // nonzero_chars should now be 1
>> strcpy(t,s);
>> s[2]=c; // this is after nonzero_chars
>> strcpy(t,s);
>> }
>>
>> If we assume that writing to s[2] is after the end of the original
>> string s, we can remove one of the strcpy. But that's not a valid
>> transformation in general.
>
> Above, s[0] creates strinfo for s with nonzero_chars == 1 and
> full_string_p == false. Then, after the first call to strcpy,
> that same strinfo is invalidated/removed because the length of
> s is unknown.
strcpy writes to t, which cannot alias s, so I don't see why this should
invalidate the strinfo for s.
> The next assignment to s[2] doesn't do anything
> (in the strlen pass) because the right operand is not constant).
Well, it should at least invalidate stuff.
> In the second strcpy there is no strinfo for s.
This was a bad example again, because we don't have the relevant
optimization anyway (2 identical calls to strcpy(t,s) in a row are not
simplified to a single one).
>> I originally wanted to see if we could CSE calls to strlen(s) before and
>> after the write, but the lhs of the first call to strlen would replace
>> nonzero_chars. And I don't know so well what we use strinfo for and thus
>> how dangerous it is not to invalidate it.
>
> The pass does track non-constant strlen() results so I think
> the optimization you describe should be possible. For instance,
> here it should be safe to eliminate the second strlen:
>
> int f (char *s)
> {
> s[0] = 'a';
> s[1] = 'b';
> int n0 = __builtin_strlen (s);
>
> s[0] = 'x';
> int n1 = __builtin_strlen (s);
>
> return n0 == n1;
> }
>
> It isn't optimized today because (as I explained above) the first
> strlen() replaces the constant nonzero_chars == 2 with the non-
> constant result of strlen(s).
Let me try a completely different example
char*f(){
char*p=__builtin_calloc(12,1);
__builtin_memset(p+5,1,2);
__builtin_memset(p,0,12);
return p;
}
With your patch from 12/1 and -fno-tree-dse, this is miscompiled and
misses the final memset.
I also still fear it may be possible to come up with a more strlen-like
example along the lines of
s[0]='a'; // nonzero_chars=1
- do something with s that relies on its length and doesn't invalidate
s[2]='a'; // patch prevents it from invalidating
- do something else with s
where the fact that s has a different length in lines 2 and 4 is hidden by
the patch. But maybe all the transformations are carefully written to
avoid this problem...
--
Marc Glisse
More information about the Gcc-patches
mailing list