CLANG warns for the following code, GCC doesn't (with the options tried): foo.cc:4:24: warning: adding 'char' to a string does not append to the string [-Wstring-plus-int] const char *a = "aa" + 'a'; ~~~~~^~~~~ foo.cc:4:24: note: use array indexing to silence this warning const char *a = "aa" + 'a'; ^ & [ ] 1 warning generated. Interestingly, the warning does not trigger for integer literals - only for single-character literals and for long/int/char returning functions. That's probably because "abcd" + 1 is the pointer address shifted by one, i.e. pointing to "bcd". One can also argue whether the note is helpful or not. In the real-world code, the LHS was a string class and the conversion of "aa" to the string class was missing such that the "operator+" wasn't triggered. Test case, gives three warnings (and three notes) with CLANG: #include <stdio.h> char bar() { return 1; } int foobar() { return 1; } int main() { const char *a = "aa" + 'c'; const char *b = "bb" + bar(); const char *c = "cc" + foobar(); printf("%s, %s, %s\n", a, b, c); return 0; }
Confirmed. It does warn for "aaa" + x, when x is char/int.
In both FEs there is a function build_binary_op. Put a break-point inside, examine the operands with debug_tree and figure out how to detect this case and when to warn. Add the new warning option to c-family/c.opt and document it in gcc/doc/invoke.texi. I think the note is useful but we cannot give the fix-it hints yet. Add a FIXME in the code on top of the inform call.
I think a better fixit would be to suggest using strcat or strncat or strlcat or something, since that actually appends to the string like the programmer probably intended to do. Abusing array indexing like that just looks wrong.
In the recent version of clang this diagnostic was separated to two warnings, -Wstring-plus-char and -Wstring-plus-int. -Wstring-plus-char warns "adding <char type> to a string pointer does not append to the string" where <char type> may be char, unsigned char in C/C++, and char16_t, char32_t, wchar_t in C++ 2011 and above. -Wstring-plus-int warns "adding <int type> to a string does not append to the string" where the string is a literal and the integer is not a literal (like Tobias showed in the description). It's more clear to use two different options. But the two warnings may be reported at the same location. It seems a little redundant. Should we silence one while the other has been reported?
Some clang-5.0 warning case: ~~~ Wstring-plus-int.c:20:24: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int] const char *a = "aa" + 'a'; /* { dg-warning "does not append" } */ ~~~~~^~~~~ Wstring-plus-int.c:20:24: note: use array indexing to silence this warning const char *a = "aa" + 'a'; /* { dg-warning "does not append" } */ ^ & [ ] Wstring-plus-int.c:20:24: warning: adding 'char' to a string pointer does not append to the string [-Wstring-plus-char] const char *a = "aa" + 'a'; /* { dg-warning "does not append" } */ ~~~~~^~~~~ Wstring-plus-int.c:20:24: note: use array indexing to silence this warning const char *a = "aa" + 'a'; /* { dg-warning "does not append" } */ ^ & [ ] Wstring-plus-int.c:21:28: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int] const wchar_t *b = L"aa" + L'a'; /* { dg-warning "does not append" } */ ~~~~~~^~~~~~ Wstring-plus-int.c:21:28: note: use array indexing to silence this warning const wchar_t *b = L"aa" + L'a'; /* { dg-warning "does not append" } */ ^ & [ ] Wstring-plus-int.c:22:24: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int] const char *e = "aa" + getchar(); /* { dg-warning "does not append" } */ ~~~~~^~~~~~~~~~~ Wstring-plus-int.c:22:24: note: use array indexing to silence this warning const char *e = "aa" + getchar(); /* { dg-warning "does not append" } */ ^ & [ ] Wstring-plus-int.c:25:21: warning: adding 'char' to a string pointer does not append to the string [-Wstring-plus-char] const char *f = a + 'a'; /* { dg-warning "does not append" } */ ~~^~~~~ Wstring-plus-int.c:25:21: note: use array indexing to silence this warning const char *f = a + 'a'; /* { dg-warning "does not append" } */ ^ & [ ] Wstring-plus-int.c:26:23: warning: adding 'char' to a string pointer does not append to the string [-Wstring-plus-char] const char *g = 'a' + a; /* { dg-warning "does not append" } */ ~~~~^~~ Wstring-plus-int.c:26:23: note: use array indexing to silence this warning 6 warnings generated. ~~~ Note the both warnings are reported at line 22. By the way clang -Wstring-plus-char doesn't warn for `a += 'a'`. I think we should handle it. I've submitted a series of patches about this PR. I'll adjust them to be like clang-5.0 and resubmit them in stage 1.
With constant arguments (or those whose value or range is known), GCC should warn on the first declaration in comment #0 (copied below) not necessarily because the addition doesn't append 'c' to "aa" (i.e., with -Wstring-plus-char warns) but mainly because it results in an invalid pointer. const char *a = "aa" + 'c'; GCC should warn for a constant operand that results in an out-of-bounds pointer regardless of its type, such as the following: const char *a = "aa" + 4; GCC could (and, in my view, should) also warn on the following where the value of i isn't known but where its range is such that the addition will never result in a valid pointer: void f (int i) { if (i < 4) return; const char *a = "aa" + i; ... }
(In reply to Martin Sebor from comment #6) > With constant arguments (or those whose value or range is known), GCC should > warn on the first declaration in comment #0 (copied below) not necessarily > because the addition doesn't append 'c' to "aa" (i.e., with > -Wstring-plus-char warns) but mainly because it results in an invalid > pointer. > > const char *a = "aa" + 'c'; > > GCC should warn for a constant operand that results in an out-of-bounds > pointer regardless of its type, such as the following: > > const char *a = "aa" + 4; > Clang-5.0 is doing this now for -Wstring-plus-int. I'll try to do this. > GCC could (and, in my view, should) also warn on the following where the > value of i isn't known but where its range is such that the addition will > never result in a valid pointer: > > void f (int i) > { > if (i < 4) return; > const char *a = "aa" + i; > ... > } We should make a new PR requesting for clang -Warray-bounds as well. It's a part of meta-bug PR30334.
(In reply to Xi Ruoyao from comment #7) > We should make a new PR requesting for clang -Warray-bounds as well. It's > a part of meta-bug PR30334. Sorry. We have -Warray-bounds, but not as well as clang's. For example clang warns for "aa"[4], but GCC doesn't.
(In reply to Xi Ruoyao from comment #8) > (In reply to Xi Ruoyao from comment #7) > > > We should make a new PR requesting for clang -Warray-bounds as well. It's > > a part of meta-bug PR30334. > > Sorry. We have -Warray-bounds, but not as well as clang's. For example clang > warns for "aa"[4], but GCC doesn't. Right. It only warns on arrays, and only with optimization. That seems to be because array_at_struct_end_p() in tree.c (called from check_array_ref() in tree-vrp.c) doesn't handle ARRAY_REF with a STRING_CST operand correctly. It treats it as if it were a reference to an array at the end of a structure. This only superficially tested change lets GCC warn on that case as well (as long as the result is used). To detect this without optimization -Warray-bounds would need to also do some checking much earlier. diff --git a/gcc/tree.c b/gcc/tree.c index 8f87e7c..5fcbf65 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -13229,6 +13229,9 @@ array_at_struct_end_p (tree ref, bool allow_compref) ref = TREE_OPERAND (ref, 0); } + if (TREE_CODE (ref) == STRING_CST) + return false; + tree size = NULL; if (TREE_CODE (ref) == MEM_REF
A series of patch sent to gcc-patches: <https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00729.html>
(In reply to Xi Ruoyao from comment #8) > (In reply to Xi Ruoyao from comment #7) > > > We should make a new PR requesting for clang -Warray-bounds as well. It's > > a part of meta-bug PR30334. > > Sorry. We have -Warray-bounds, but not as well as clang's. For example clang > warns for "aa"[4], but GCC doesn't. In that case it's a different meta-bug, PR56456
*** Bug 49481 has been marked as a duplicate of this bug. ***
*** Bug 78679 has been marked as a duplicate of this bug. ***
Bump. This hit me recently, is there a reason the patch hasn't been merged? Any changes I can contribute to get it in faster? I've checked on 8.2.0 and it doesn't look like this is in either gcc or g++.
Was this question ever answered? https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01337.html
(In reply to Jonathan Wakely from comment #15) > Was this question ever answered? > https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01337.html Oh that's intentional. This would make this warning more useful, while most people won't use a "char" to address an array or something like. For example, if someone has wrote a string-like class but forgot to overload operator+=, this warning will detect it when he writes "char c = getchar(); buggy_string t = s + c;". And for something like "char *p = s; char x; scanf("%hhd", &x); p = p + x;" 1. In C++ __INT8_TYPE__ is not __CHAR_TYPE__ so there will be no warning. He can use "int8_t x;" instead of "char x;". 2. He can use p = &p[x], which is more clear and nobody will think this is an append. I remember I wrote a response, with an option to split this into -Wstring-plus-int=1 and -Wstring-plus-int=2. But why it wasn't sent? I can't remember.
(In reply to Xi Ruoyao from comment #16) > (In reply to Jonathan Wakely from comment #15) > > Was this question ever answered? > > https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01337.html > > Oh that's intentional. > > This would make this warning more useful, while most people won't use a > "char" to address an array or something like. For example, if someone has > wrote a string-like class but forgot to overload operator+=, this warning > will detect it when he writes "char c = getchar(); buggy_string t = s + c;". > > And for something like > > "char *p = s; char x; scanf("%hhd", &x); p = p + x;" > > 1. In C++ __INT8_TYPE__ is not __CHAR_TYPE__ so there will be no warning. > He can use "int8_t x;" instead of "char x;". > 2. He can use p = &p[x], which is more clear and nobody will think this is > an append. > > I remember I wrote a response, with an option to split this into > -Wstring-plus-int=1 and -Wstring-plus-int=2. But why it wasn't sent? I > can't remember. It was in other branches of the thread: https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01345.html https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01347.html Again, as I always do when people propose new numeric levels to warning flags, I'd prefer splitting into new named options instead of adding numeric warning levels, for individual controllability, and end-user clarity. So in this case, maybe the final set could be: -Wstring-plus-int -Wstring-plus-char (-Wstring-plus-any-char?) -Wstring-plus-char-literal
A large patch will often get lost in comments and revisions unless the submitter is very insistent and committed. If you want to get this moving, my advice would be to split out the smallest piece possible (string + char literal) and just submit that for review. Once that is committed, then look for the next smallest step and repeat.