I've just isolated following code snippet from tar package: $ cat buffer.c struct posix_header { char magic[2]; }; union Union { char buffer[512]; struct posix_header header; }; union Union a; union Union *ptr; int main(int argc, char **argv) { a.buffer[0] = 'a'; a.buffer[1] = 'b'; a.buffer[2] = '\0'; if (argc != 123) ptr = &a; if (__builtin_strcmp(ptr->header.magic, "x") == 0 || __builtin_strcmp(ptr->buffer + __builtin_offsetof(struct posix_header, magic), "ab") == 0) ; else __builtin_abort (); return 0; } $ gcc -Wstring-compare buffer.c -O2 && ./a.out buffer.c: In function ‘main’: buffer.c:23:10: warning: ‘__builtin_strcmp’ of a string of length 2 and an array of size 2 evaluates to nonzero [-Wstring-compare] 23 | || __builtin_strcmp(ptr->buffer + __builtin_offsetof(struct posix_header, magic), "ab") == 0) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Aborted (core dumped) $ gcc -Wstring-compare buffer.c && ./a.out [exits with 0]
You don't need -Wstring-compare for that, it is miscompiled even with just -O2, during strlen1 pass when when incorrect range for the return value of the second strcmp is determined.
I'd say the bug is in determine_min_objsize, which makes assumption that are simply not valid in GIMPLE after optimizations. Before fre3 we have: _2 = &ptr.0_1->header.magic; _3 = __builtin_strcmp (_2, "x"); ... _4 = &ptr.0_1->buffer; _5 = __builtin_strcmp (_4, "ab"); but fre3 changes that to: _2 = &ptr.0_1->header.magic; _3 = __builtin_strcmp (_2, "x"); ... _5 = __builtin_strcmp (_2, "ab"); because it determines that _2 and _4 have the same value. They do, but determine_min_objsize attempts to argue from this change that the strcmp will never be true, because &ptr.0_1->header.magic is a 2 byte array in a struct inside of union. The source didn't have such an expression there though. So, either we change GIMPLE and claim that such properties need to be preserved, in that case SCCVN would need to either not do that optimization (and other passes too) or say modify the expression that is kept say to ptr.0_1 or &ptr.0_1->buffer as one that has the larger minimum object size, or determine_min_objsize simply can't make such assumptions.
But isn't the case magic could be considered a variable length field since it is st the end of the struct?
(In reply to Jakub Jelinek from comment #2) > I'd say the bug is in determine_min_objsize, which makes assumption that are > simply not valid in GIMPLE after optimizations. > Before fre3 we have: > _2 = &ptr.0_1->header.magic; > _3 = __builtin_strcmp (_2, "x"); > ... > _4 = &ptr.0_1->buffer; > _5 = __builtin_strcmp (_4, "ab"); > but fre3 changes that to: > _2 = &ptr.0_1->header.magic; > _3 = __builtin_strcmp (_2, "x"); > ... > _5 = __builtin_strcmp (_2, "ab"); > because it determines that _2 and _4 have the same value. They do, but > determine_min_objsize attempts to argue from this change that the strcmp > will never be true, because &ptr.0_1->header.magic is a 2 byte array in a > struct inside of union. The source didn't have such an expression there > though. > So, either we change GIMPLE and claim that such properties need to be > preserved, in that case SCCVN would need to either not do that optimization > (and other passes too) or say modify the expression that is kept say to > ptr.0_1 or &ptr.0_1->buffer as one that has the larger minimum object size, > or determine_min_objsize simply can't make such assumptions. determine_min_objsize cannot make such assumptions (when producing answers used for optimization rather than just warnings). We're sort-of fencing off FRE to aid the warning machinery but only for zero-offset components and only before IPA inlining.
Note, determine_min_objsize calls compute_builtin_object_size with 2 rather than 3, which means it is in this regard conservative and uses whole object size rather than just subobject, we've been there in the past, punt on some optimizations before the first objsz pass and for subobject sizes compute them early now. But the rest of determine_min_objsize, if compute_builtin_object_size fails, ignores this and thinks it can use subobject sizes when it can't.
(In reply to Andrew Pinski from comment #3) > But isn't the case magic could be considered a variable length field since > it is st the end of the struct? At end of struct or not doesn't really matter. Consider another testcase that is miscompiled because of this: union U { struct S { char a[2]; char b[2]; char c[2]; } s; char d[6]; } u; __attribute__((noipa)) void bar (char *p) { asm volatile ("" : : "g" (p) : "memory"); } __attribute__((noipa)) void foo (union U *x) { char *p = (char *) &x->s.b; bar (p); if (__builtin_strcmp (&x->d[2], "cde")) __builtin_abort (); } int main () { __builtin_strcpy (u.d, "abcde"); foo (&u); return 0; } Note, it works if instead of using (char *) &x->s.b it uses &x->s.b[0], because determine_min_objsize has a hack to ignore sizes of one and zero, guess it failed too often that way.
Now, determine_min_objsize has been introduced for the strcmp_eq optimization and maybe doing something conservative for the strcmp -> ~[0, 0] or [0, 0] optimization there like get_base_address on the ADDR_EXPR argument would not be appropriate for that use case, if the original use case is "is there a guarantee at least that many bytes will be actually accessible". So maybe the bug is actually in the caller, that assumes that if determine_min_objsize returns something small and we need to strcmp that with some larger string that it can't be equal. That is wrong though. Say if determine_min_objsize successfully uses compute_builtin_object_size with kind 2, on _3 = PHI <_1, _2> where _1 = __builtin_malloc (2); and _2 = __builtin_malloc (16); Here, __builtin_object_size (_3, 2) is 2 and __builtin_object_size (_3, 0) is 16. If there is __builtin_strcmp (_3, "abcd"), we can't assume it is not equal simply because determine_min_objsize returned 2 for _3, we would need to use (non-existent) determine_max_objsize for that. Perhaps to determine if it should warn determine_min_objsize is acceptable, though it likely should use both determine_min_objsize and determine_max_objsize and differentiate between is vs. may be in the warning (perhaps have two levels of the warning). But for code generation, determine_min_objsize is wrong. __attribute__((noipa)) int bar (char *x, int y) { asm volatile ("" : : "g" (x), "g" (y) : "memory"); if (y == 0) __builtin_strcpy (x, "abcd"); return y; } __attribute__((noipa)) char * foo (int x) { char *p; if (x) p = __builtin_malloc (2); else p = __builtin_calloc (16, 1); if (bar (p, x)) return p; if (__builtin_strcmp (p, "abcd") != 0) __builtin_abort (); return p; } int main () { __builtin_free (foo (0)); __builtin_free (foo (1)); return 0; }
Perhaps related to: https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01874.html
(In reply to Jeffrey A. Law from comment #8) > Perhaps related to: > > https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01874.html Yes, that is pretty much the same thing. One thing is whether it is safe or unsafe to use what determine_min_objsize does for the purposes Qing introduced (probably unsafe, if we e.g. consider what GCC uses for tree/rtl/gimple, unions containing members with different sizes and where the mapped size is actually just the size of the corresponding member; if VN would see first &ptr->member.elt of some larger element and then &ptr->smaller_member which has the same address, Qing's code could assume at least that many bytes are mapped and perform *_eq). And another thing as the #c7 testcase shows that it is wrong to use minimum object size, whatever exact definition it has, at least for code generation purposes, it has to consider maximum object size instead, as the code tries to optimize the str{,n}cmp into not equal if one object is small enough and the other object is too large to fit in there.
Testcase showing wrong-code with the __builtin_strcmp_eq stuff (assuming the testcase is considered valid): /* { dg-do run { target mmap } } */ /* { dg-options "-O2" } */ #include <stdlib.h> #include <sys/mman.h> #ifndef MAP_ANONYMOUS #define MAP_ANONYMOUS MAP_ANON #endif #ifndef MAP_ANON #define MAP_ANON 0 #endif #ifndef MAP_FAILED #define MAP_FAILED ((void *)-1) #endif union U { struct T { char a[2]; char b[4094]; } c; struct S { char d[4]; int e; } f; }; __attribute__((noipa)) void bar (char *p) { asm volatile ("" : : "g" (p) : "memory"); } __attribute__((noipa)) int foo (union U *p) { bar ((char *) &p->c.b); return __builtin_strcmp (&p->f.d[2], "abcdefghijk") == 0; } int main () { char *p = mmap (NULL, 131072, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (p == MAP_FAILED) return 0; if (munmap (p + 65536, 65536) < 0) return 0; union U *u; u = (union U *) (p + 65536 - sizeof (u->f)); __builtin_memcpy (u->f.d, "abc", 4); u->f.e = 1; if (foo (u) != 0) __builtin_abort (); return 0; }
Untested patch to fix all these wrong-code issues would be something like: --- gcc/tree-ssa-strlen.c.jj 2019-11-28 09:35:32.443298424 +0100 +++ gcc/tree-ssa-strlen.c 2019-12-03 17:02:32.131658020 +0100 @@ -3473,54 +3473,30 @@ compute_string_length (int idx) static unsigned HOST_WIDE_INT determine_min_objsize (tree dest) { - unsigned HOST_WIDE_INT size = 0; + unsigned HOST_WIDE_INT size; init_object_sizes (); if (compute_builtin_object_size (dest, 2, &size)) return size; - /* Try to determine the size of the object through the RHS - of the assign statement. */ - if (TREE_CODE (dest) == SSA_NAME) - { - gimple *stmt = SSA_NAME_DEF_STMT (dest); - if (!is_gimple_assign (stmt)) - return HOST_WIDE_INT_M1U; - - if (!gimple_assign_single_p (stmt) - && !gimple_assign_unary_nop_p (stmt)) - return HOST_WIDE_INT_M1U; + return HOST_WIDE_INT_M1U; +} - dest = gimple_assign_rhs1 (stmt); - return determine_min_objsize (dest); - } +/* Similarly, but maximum instead of minimum, and return 0 when the + size cannot be determined. */ - /* Try to determine the size of the object from its type. */ - if (TREE_CODE (dest) != ADDR_EXPR) - return HOST_WIDE_INT_M1U; - - tree type = TREE_TYPE (dest); - if (TREE_CODE (type) == POINTER_TYPE) - type = TREE_TYPE (type); - - type = TYPE_MAIN_VARIANT (type); - - /* The size of a flexible array cannot be determined. Otherwise, - for arrays with more than one element, return the size of its - type. GCC itself misuses arrays of both zero and one elements - as flexible array members so they are excluded as well. */ - if (TREE_CODE (type) != ARRAY_TYPE - || !array_at_struct_end_p (dest)) - { - tree type_size = TYPE_SIZE_UNIT (type); - if (type_size && TREE_CODE (type_size) == INTEGER_CST - && !integer_onep (type_size) - && !integer_zerop (type_size)) - return tree_to_uhwi (type_size); - } +static unsigned HOST_WIDE_INT +determine_max_objsize (tree dest) +{ + unsigned HOST_WIDE_INT size; - return HOST_WIDE_INT_M1U; + init_object_sizes (); + + if (compute_builtin_object_size (dest, 0, &size)) + return size; + + return 0; } /* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths @@ -3603,7 +3579,9 @@ get_len_or_size (tree arg, int idx, unsi /* Set *SIZE to the size of the smallest object referenced by ARG if ARG denotes a single object, or to HWI_M1U otherwise. */ - *size = determine_min_objsize (arg); + *size = determine_max_objsize (arg); + if (*size == 0) + *size = HOST_WIDE_INT_M1U; *nulterm = false; } } plus add testsuite coverage from the testcases in this PR and deal with testsuite regressions: FAIL: gcc.dg/Wstring-compare.c (test for warnings, line 123) FAIL: gcc.dg/strcmpopt_2.c scan-tree-dump-times strlen1 "cmp_eq \\(" 8 FAIL: gcc.dg/strcmpopt_4.c scan-tree-dump-times strlen1 "cmp_eq \\(" 1 FAIL: gcc.dg/strlenopt-69.c scan-tree-dump-not optimized "abort|strcmp|strncmp" where strcmpopt_2.c now matches just 4 times instead of 8. Or do we consider the #c10 testcase valid, but what strcmpopt_2.c does like: typedef struct { char s[8]; int x; } S; __attribute__ ((noinline)) int f1 (S *s) { return __builtin_strcmp (s->s, "abc") != 0; } ok (in that if there is a pointer to S, at least sizeof (s->s) bytes will be mapped? We could look through COMPONENT_REFs in the access path and if there is any UNION_TYPE in there, be conservative and consider the same addresses in all the union members and take conservative answer from all of them? Wouldn't that still be a problem if we have union somewhere earlier and just take address of a union member?
The __builtin_strcmp(ptr->header.magic, "x") call in comment #0 is undefined because the two-element array ptr->header.magic is not a nul-terminated string. The warning was designed to point that out. Removing the call removes the bug from the test case and lets it run to completion.
(In reply to Martin Sebor from comment #12) > The __builtin_strcmp(ptr->header.magic, "x") call in comment #0 is undefined > because the two-element array ptr->header.magic is not a nul-terminated > string. The warning was designed to point that out. I don't agree with that view, but if you replace that first __builtin_strcmp with say __builtin_memcmp(ptr->header.magic, "x", 2) == 0 it fails the same way. Plus all the other testcases in this PR. The first call really doesn't matter, all that matters is that there is something taking address of ptr->header.magic earlier in the function. And a warning designed on something the GIMPLE IL does not preserve is wrong.
(In reply to Martin Sebor from comment #12) > The __builtin_strcmp(ptr->header.magic, "x") call in comment #0 is undefined > because the two-element array ptr->header.magic is not a nul-terminated > string. The warning was designed to point that out. If so, the warning needs to be done early. And it shouldn't affect code generation. See e.g. tree-ssa-sccvn.c: /* Probibit value-numbering zero offset components of addresses the same before the pass folding __builtin_object_size had a chance to run (checking cfun->after_inlining does the trick here). */ if (TREE_CODE (orig) != ADDR_EXPR || maybe_ne (off, 0) || cfun->after_inlining) off.to_shwi (&temp.off); I'm surprised non-zero offs can be handled this way even before objsz1 pass, before that we really should consider the same only if the offset is the same and FIELD_DECLs type has the same size, but we could have some cfun flag or property set by objsz1 pass. That said, after that what COMPONENT_REFs are used in ADDR_EXPR needs to be ignored.
As for __builtin_*_eq, we could have some analysis that looks at accesses from the same base, and e.g. if there is struct S { char name[16]; int whatever; } and we see strcmp (p->name, ...) and p->whatever stored or loaded before that or after that with no possibility to exit/abort etc. in between, we could still optimize into *_eq. Doesn't tree-if-conv.c have similar analysis (ifc_dr etc.)? Maybe it would need to know minimal page size and only trigger if the read/write of a field after it isn't too far away as a guarantee that in a valid program if the first byte of p->name doesn't trap, then others won't either.
The warning doesn't affect code generation. It's issued independent of it. We'll have to agree to disagree about the validity of the test case in comment #0, but I do agree that at least some of your test cases unfortunately are valid in C (although I don't think they're also valid in C++ where only the union member that was last written to can be read). I'll look into this when I'm back from PTO in January.
We explicitly document supporting type punning through unions as an extension, so it is valid in C++ too. And, what union member is active at which point really doesn't matter for the testcases, e.g. in #c6 you could very well pass the union address to bar too and store there to the union member you want active afterwards (or have the union address in some global var from before the function and do it in bar again). The #c7 testcase doesn't contain any union. The #c10 testcase could be changes similarly to the #c6 one.
Also, note that union doesn't have to be visible on the access path, e.g. union U { struct S { char a[2]; char b[2]; char c[2]; } s; struct T { char d[6]; } t; } u; __attribute__((noipa)) void bar (char *p) { if (__builtin_strcmp (p, "a") != 0) __builtin_abort (); __builtin_strcpy (u.t.d, "abcde"); } __attribute__((noipa)) void foo (void *x) { struct S *s = (struct S *) x; char *p = (char *) &s->b; bar (p); struct T *t = (struct T *) x; if (__builtin_strcmp (&t->d[2], "cde")) __builtin_abort (); } int main () { __builtin_strcpy (u.s.b, "a"); foo (&u); return 0; } is miscompiled too, there is no type punning through union, only active union member is ever read...
Note that while the code in this BZ may or may not be valid, there is certainly code in the wild which is incorrectly compiled as a result of Qing's patch which uses TYPE_SIZE. In particular the flatbuffers and acpica-tools package in Fedora is mis-compiled due to this issue (there may be others). See the link in c#8.
@Martin, Jakub: What's the status of the PR? Is there a consensus that it'a a wrong-code issue?
There is no consensus on the #c0 testcase, but there are several other testcases in this PR that are IMO unquestionably valid (or can be easily turned into).
I've been going through the test cases here. IIUC, the one in comment #10 is a separate issue and should get its own bug. (Arguably, so is the one in comment #7.) It's unfortunate that GIMPLE doesn't preserve the basic property mentioned in comment #2. Not only does it make working with it error-prone (vis-a-vis this bug and others like it), it also prevents interesting optimizations, and makes warnings that depend on the property regardless inconsistent with the guarantees GCC does provide. I'm hoping this can change in the future. The only way I can think of to get all the test cases to pass, including the one comment #18 in particular, would be to disable the optimization for all struct members. But that seems unnecessary. As discussed in pr14319, GCC does expect the union type to be visible in accesses to overlapping members. Treating the test case in comment #18 as unsupported, A change that lets the remaining tests pass (i.e., all but those in comment #10 and comment #18) is much more selective: a) disable the optimization references containing a union among the handled components along the "access path" to the member, and b) use the maximum object size for PHI nodes (to let the test case in comment #7 pass).
(In reply to Martin Sebor from comment #22) > I've been going through the test cases here. IIUC, the one in comment #10 > is a separate issue and should get its own bug. (Arguably, so is the one in > comment #7.) > > It's unfortunate that GIMPLE doesn't preserve the basic property mentioned > in comment #2. Not only does it make working with it error-prone (vis-a-vis > this bug and others like it), it also prevents interesting optimizations, > and makes warnings that depend on the property regardless inconsistent with > the guarantees GCC does provide. I'm hoping this can change in the future. I don't think this will change in future. You have to look at this from the angle of value-equivalences which is the number one thing a compiler is supposed to compute and utilize. All this analyzer work and also this stupid pointer provenance stuff makes value-equivalences no longer work. At the same time the compiler is still expected it exploit those equivalences unless it breaks something. That's never going to work.
Minimal patch: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00880.html
The master branch has been updated by Martin Sebor <msebor@gcc.gnu.org>: https://gcc.gnu.org/g:e7868dc6a79daec4f46552fa5c99f88e4eb8de4c commit r10-6465-ge7868dc6a79daec4f46552fa5c99f88e4eb8de4c Author: Martin Sebor <msebor@redhat.com> Date: Wed Feb 5 16:55:26 2020 -0700 PR tree-optimization/92765 - wrong code for strcmp of a union member gcc/ChangeLog: PR tree-optimization/92765 * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL. * tree-ssa-strlen.c (compute_string_length): Remove. (determine_min_objsize): Remove. (get_len_or_size): Add an argument. Call get_range_strlen_dynamic. Avoid using type size as the upper bound on string length. (handle_builtin_string_cmp): Add an argument. Adjust. (strlen_check_and_optimize_call): Pass additional argument to handle_builtin_string_cmp. gcc/testsuite/ChangeLog: PR tree-optimization/92765 * g++.dg/tree-ssa/strlenopt-1.C: New test. * g++.dg/tree-ssa/strlenopt-2.C: New test. * gcc.dg/Warray-bounds-58.c: New test. * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow. * gcc.dg/Wstring-compare.c: Xfail a test. * gcc.dg/strcmpopt_2.c: Disable tests. * gcc.dg/strcmpopt_4.c: Adjust tests. * gcc.dg/strcmpopt_10.c: New test. * gcc.dg/strcmpopt_11.c: New test. * gcc.dg/strlenopt-69.c: Disable tests. * gcc.dg/strlenopt-92.c: New test. * gcc.dg/strlenopt-93.c: New test. * gcc.dg/strlenopt.h: Declare calloc. * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved. * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).
Patch committed.