Created attachment 47937 [details] Reduced testcase If the attached testcase is compiled with `-O1 -foptimize-strlen' on amd64, the function foo is miscompiled: the assignment to res.target[1] is omitted. It also happens with -O2, but not with -O1 alone or -O3. This bug is somewhat similar to bug 93213. It is a regression from 9.2.0. The generated assembly is: foo: subq $8, %rsp call my_alloc movq $.LC0, (%rax) movq $.LC1, 16(%rax) movq $.LC1, 24(%rax) movq $.LC1, 32(%rax) addq $8, %rsp ret Note the absence of `movq $.LC1, 8(%rax)'. I tested with trunk, latest pull from git, revision 117baab8. In particular the patch for bug 93213 (e13f37d9) is included. The program is compiled correctly by gcc 9.2.0 with the same options (and all others I tried). I did a git bisect and the offending commit is 34fcf41e. The cast in foo() at first looked questionable from a strict aliasing standpoint, but I believe the code is legal since the memory returned by calloc has no declared type, and we never access this memory except as objects of type `const char *'. Also, the miscompilation persists with -fno-strict-aliasing. I am no gcc expert, but I dug into the source a little bit, out of curiosity. It looks like the deletion happens in handle_store(), at gcc/tree-ssa-strlen.c:5021. It seems that in this function, the code is being treated as if the string "12345678" itself were being stored at address res.target, rather than the address of the string; as if the code were `strcpy(res.target, "12345678")'. In particular, it thinks the trailing null was stored at address res.target+8. The following statement, `res.target[1] = ""', is likewise treated as if it were `strcpy(res.target+8, "")', which would also just store a null byte at res.target+8, so it is seen as redundant and is removed. I would like to acknowledge StackOverflow user BrodieG for initially discovering this bug and helping to investigate, as well as users KamilCuk and John Bollinger for helpful comments. The original question is at https://stackoverflow.com/q/60406042/634919. Output of `gcc -v`: Using built-in specs. COLLECT_GCC=/home/nate/gcc/bin/gcc COLLECT_LTO_WRAPPER=/home/nate/do-not-backup/gcc-inst/bin/../libexec/gcc/x86_64-pc-linux-gnu/10.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --prefix=/home/nate/gcc --disable-bootstrap --enable-languages=c Thread model: posix Supported LTO compression algorithms: zlib gcc version 10.0.1 20200229 (experimental) (GCC)
Started with r10-2528-g34fcf41e30ff56155e996f5e04f6ca13948a19b6. I'll have a look.
Simplified testcase: struct A { const char **a; }; const char *buf[5]; __attribute__((noipa)) struct A foo (char *p) { struct A r = { (const char **) p }; r.a[0] = "12345678"; r.a[1] = ""; r.a[2] = ""; r.a[3] = ""; r.a[4] = ""; return r; } int main () { struct A r = foo ((char *) &buf[0]); if (!r.a[1] || r.a[1][0] != '\0') __builtin_abort (); return 0; }
bool is_char_store = is_char_type (type); if (!is_char_store && TREE_CODE (lhs) == MEM_REF) { /* To consider stores into char objects via integer types other than char but not those to non-character objects, determine the type of the destination rather than just the type of the access. */ for (int i = 0; i != 2; ++i) { tree ref = TREE_OPERAND (lhs, i); type = TREE_TYPE (ref); if (TREE_CODE (type) == POINTER_TYPE) type = TREE_TYPE (type); if (TREE_CODE (type) == ARRAY_TYPE) type = TREE_TYPE (type); if (is_char_type (type)) { is_char_store = true; break; } } } is completely bogus. With pointer conversions being useless, the type of the MEM_REF's first operand means nothing at all and the type of second MEM_REF operand is for alias analysis, again nothing that should matter for how the strlen pass optimizes code.
This was caused by the same mistake as PR 93829 which is now fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92982#c4 reports Martin added test coverage added for this bug, in: https://gcc.gnu.org/g:f26688fbe441375e907f0dd2f35837681870b1f4 commit r10-6978-gf26688fbe441375e907f0dd2f35837681870b1f4 (but appears to have been typoed as PR 92982 rather than PR 93982).
I don't really think this is fixed. Besides the bogus hunk mentioned above (either do it unconditionally, or not at all, what it does isn't even a good heuristics, the pointer type on the MEM_REF first operand really is irrelevant both for the VN reasons and for because user can as the testcase shows use other pointer types too, I'd say the count_nonzero_bytes function isn't well designed. The problem is that it doesn't differentiate between whether the EXP it is looking at are the bytes that are being stored, or is a pointer which points to those bytes. Each of those cases needs to be treated significantly differently. When not counting count_nonzero_bytes called recursively or from the other count_nonzero_bytes, I see it called from handle_store twice, each time the argument is the gimple_assign_rhs1 (store), i.e. either the constant that is being stored (e.g. INTEGER_CST), or e.g. a MEM_REF and similarly in handle_integral_assign where it is called with the MEM_REF that is being loaded. Now, count_nonzero_bytes starts with get_stridx (exp) call. This makes no sense, that doesn't give you anything related to the bytes being stored, but to bytes pointed by exp. Then it goes through into the ADDR_EXPR case and just uses its operand. Now, with the PR93829 change it at least sets nbytes to something, but shouldn't e.g. the get_stridx call be done only if nbytes is non-zero? A clean design would be to have separate functions, one that handles the length of the exp bytes itself and another one that handles the case where exp points to those bytes. Otherwise, you just need to hope zero sized objects (such as MEM_REFs) won't appear anywhere etc. E.g. in void foo (char *p) { char buf[16]; __builtin_memcpy (buf, "abcde", 6); *(char **) p = &buf[0]; ... } case get_stridx (exp) returns > 0, even when it is in any way relevant to the store. The native_encode_expr call should be guarded, the usual check is CHAR_BIT == 8 && BITS_PER_UNIT == 8. And the formatting is off, e.g. the hunk mentioned in #c3 is misindented as whole, const value_range_equiv *vr = CONST_CAST (class vr_values *, rvals) ->get_value_range (si->nonzero_chars); should have been something like: const value_range_equiv *vr = (CONST_CAST (class vr_values *, rvals) ->get_value_range (si->nonzero_chars)); or better vr_values *r = CONT_CAST (vr_values *, rvals); const value_range_equiv *vr = r->get_value_range (si->nonzero_chars);
The reported bug is fixed. Please open a new bug for the other problems.