Bug 92765 - [10 Regression] wrong code for strcmp of a union member
Summary: [10 Regression] wrong code for strcmp of a union member
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 10.0
: P1 normal
Target Milestone: 10.0
Assignee: Martin Sebor
URL:
Keywords: patch, wrong-code
Depends on:
Blocks:
 
Reported: 2019-12-03 09:02 UTC by Martin Liška
Modified: 2020-01-15 13:24 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work: 9.2.0
Known to fail: 10.0
Last reconfirmed: 2019-12-03 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Liška 2019-12-03 09:02:15 UTC
I've just isolated following code snippet from tar package:

$ cat buffer.c
struct posix_header {
  char magic[2];
};

union Union {
  char buffer[512];
  struct posix_header header;
};

union Union a;
union Union *ptr;

int main(int argc, char **argv)
{
  a.buffer[0] = 'a';
  a.buffer[1] = 'b';
  a.buffer[2] = '\0';

  if (argc != 123)
    ptr = &a;

  if (__builtin_strcmp(ptr->header.magic, "x") == 0
      || __builtin_strcmp(ptr->buffer + __builtin_offsetof(struct posix_header, magic), "ab") == 0)
    ;
  else
    __builtin_abort ();

  return 0;
}

$ gcc -Wstring-compare buffer.c -O2 && ./a.out
buffer.c: In function ‘main’:
buffer.c:23:10: warning: ‘__builtin_strcmp’ of a string of length 2 and an array of size 2 evaluates to nonzero [-Wstring-compare]
   23 |       || __builtin_strcmp(ptr->buffer + __builtin_offsetof(struct posix_header, magic), "ab") == 0)
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Aborted (core dumped)

$ gcc -Wstring-compare buffer.c && ./a.out
[exits with 0]
Comment 1 Jakub Jelinek 2019-12-03 10:16:57 UTC
You don't need -Wstring-compare for that, it is miscompiled even with just -O2, during strlen1 pass when when incorrect range for the return value of the second strcmp is determined.
Comment 2 Jakub Jelinek 2019-12-03 10:38:20 UTC
I'd say the bug is in determine_min_objsize, which makes assumption that are simply not valid in GIMPLE after optimizations.
Before fre3 we have:
  _2 = &ptr.0_1->header.magic;
  _3 = __builtin_strcmp (_2, "x");
...
  _4 = &ptr.0_1->buffer;
  _5 = __builtin_strcmp (_4, "ab");
but fre3 changes that to:
  _2 = &ptr.0_1->header.magic;
  _3 = __builtin_strcmp (_2, "x");
...
  _5 = __builtin_strcmp (_2, "ab");
because it determines that _2 and _4 have the same value.  They do, but determine_min_objsize attempts to argue from this change that the strcmp will never be true, because &ptr.0_1->header.magic is a 2 byte array in a struct inside of union.  The source didn't have such an expression there though.
So, either we change GIMPLE and claim that such properties need to be preserved, in that case SCCVN would need to either not do that optimization (and other passes too) or say modify the expression that is kept say to ptr.0_1 or &ptr.0_1->buffer as one that has the larger minimum object size, or determine_min_objsize simply can't make such assumptions.
Comment 3 Andrew Pinski 2019-12-03 10:42:36 UTC
But isn't the case magic could be considered a variable length field since it is st the end of the struct?
Comment 4 Richard Biener 2019-12-03 10:43:09 UTC
(In reply to Jakub Jelinek from comment #2)
> I'd say the bug is in determine_min_objsize, which makes assumption that are
> simply not valid in GIMPLE after optimizations.
> Before fre3 we have:
>   _2 = &ptr.0_1->header.magic;
>   _3 = __builtin_strcmp (_2, "x");
> ...
>   _4 = &ptr.0_1->buffer;
>   _5 = __builtin_strcmp (_4, "ab");
> but fre3 changes that to:
>   _2 = &ptr.0_1->header.magic;
>   _3 = __builtin_strcmp (_2, "x");
> ...
>   _5 = __builtin_strcmp (_2, "ab");
> because it determines that _2 and _4 have the same value.  They do, but
> determine_min_objsize attempts to argue from this change that the strcmp
> will never be true, because &ptr.0_1->header.magic is a 2 byte array in a
> struct inside of union.  The source didn't have such an expression there
> though.
> So, either we change GIMPLE and claim that such properties need to be
> preserved, in that case SCCVN would need to either not do that optimization
> (and other passes too) or say modify the expression that is kept say to
> ptr.0_1 or &ptr.0_1->buffer as one that has the larger minimum object size,
> or determine_min_objsize simply can't make such assumptions.

determine_min_objsize cannot make such assumptions (when producing answers
used for optimization rather than just warnings).

We're sort-of fencing off FRE to aid the warning machinery but only for
zero-offset components and only before IPA inlining.
Comment 5 Jakub Jelinek 2019-12-03 10:45:15 UTC
Note, determine_min_objsize calls compute_builtin_object_size with 2 rather than 3, which means it is in this regard conservative and uses whole object size rather than just subobject, we've been there in the past, punt on some optimizations before the first objsz pass and for subobject sizes compute them early now.  But the rest of determine_min_objsize, if compute_builtin_object_size fails, ignores this and thinks it can use subobject sizes when it can't.
Comment 6 Jakub Jelinek 2019-12-03 11:00:27 UTC
(In reply to Andrew Pinski from comment #3)
> But isn't the case magic could be considered a variable length field since
> it is st the end of the struct?

At end of struct or not doesn't really matter.
Consider another testcase that is miscompiled because of this:
union U { struct S { char a[2]; char b[2]; char c[2]; } s; char d[6]; } u;

__attribute__((noipa)) void
bar (char *p)
{
  asm volatile ("" : : "g" (p) : "memory");
}

__attribute__((noipa)) void
foo (union U *x)
{
  char *p = (char *) &x->s.b;
  bar (p);
  if (__builtin_strcmp (&x->d[2], "cde"))
    __builtin_abort ();
}

int
main ()
{
  __builtin_strcpy (u.d, "abcde");
  foo (&u);
  return 0;
}

Note, it works if instead of using (char *) &x->s.b it uses &x->s.b[0], because determine_min_objsize has a hack to ignore sizes of one and zero, guess it failed too often that way.
Comment 7 Jakub Jelinek 2019-12-03 11:36:28 UTC
Now, determine_min_objsize has been introduced for the strcmp_eq optimization and maybe doing something conservative for the strcmp -> ~[0, 0] or [0, 0] optimization there like get_base_address on the ADDR_EXPR argument would not be appropriate for that use case, if the original use case is "is there a guarantee at least that many bytes will be actually accessible".  So maybe the bug is actually in the caller, that assumes that if determine_min_objsize returns something small and we need to strcmp that with some larger string that it can't be equal.  That is wrong though.  Say if determine_min_objsize successfully uses compute_builtin_object_size with kind 2, on
  _3 = PHI <_1, _2>
where _1 = __builtin_malloc (2); and _2 = __builtin_malloc (16);
Here, __builtin_object_size (_3, 2) is 2 and __builtin_object_size (_3, 0) is 16.
If there is __builtin_strcmp (_3, "abcd"), we can't assume it is not equal simply because determine_min_objsize returned 2 for _3, we would need to use (non-existent) determine_max_objsize for that.
Perhaps to determine if it should warn determine_min_objsize is acceptable, though it likely should use both determine_min_objsize and determine_max_objsize and differentiate between is vs. may be in the warning (perhaps have two levels of the warning).  But for code generation, determine_min_objsize is wrong.

__attribute__((noipa)) int
bar (char *x, int y)
{
  asm volatile ("" : : "g" (x), "g" (y) : "memory");
  if (y == 0)
    __builtin_strcpy (x, "abcd");
  return y;
}

__attribute__((noipa)) char *
foo (int x)
{
  char *p;
  if (x)
    p = __builtin_malloc (2);
  else
    p = __builtin_calloc (16, 1);
  if (bar (p, x))
    return p;
  if (__builtin_strcmp (p, "abcd") != 0)
    __builtin_abort ();
  return p;
}

int
main ()
{
  __builtin_free (foo (0));
  __builtin_free (foo (1));
  return 0;
}
Comment 8 Jeffrey A. Law 2019-12-03 14:34:32 UTC
Perhaps related to:

https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01874.html
Comment 9 Jakub Jelinek 2019-12-03 14:44:28 UTC
(In reply to Jeffrey A. Law from comment #8)
> Perhaps related to:
> 
> https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01874.html

Yes, that is pretty much the same thing.  One thing is whether it is safe or unsafe to use what determine_min_objsize does for the purposes Qing introduced (probably unsafe, if we e.g. consider what GCC uses for tree/rtl/gimple, unions
containing members with different sizes and where the mapped size is actually just the size of the corresponding member; if VN would see first &ptr->member.elt
of some larger element and then &ptr->smaller_member which has the same address,
Qing's code could assume at least that many bytes are mapped and perform *_eq).
And another thing as the #c7 testcase shows that it is wrong to use minimum object size, whatever exact definition it has, at least for code generation purposes, it has to consider maximum object size instead, as the code tries to optimize the str{,n}cmp into not equal if one object is small enough and the other object is too large to fit in there.
Comment 10 Jakub Jelinek 2019-12-03 15:03:27 UTC
Testcase showing wrong-code with the __builtin_strcmp_eq stuff (assuming the testcase is considered valid):
/* { dg-do run { target mmap } } */
/* { dg-options "-O2" } */

#include <stdlib.h>
#include <sys/mman.h>
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif
#ifndef MAP_ANON
#define MAP_ANON 0
#endif
#ifndef MAP_FAILED
#define MAP_FAILED ((void *)-1)
#endif

union U { struct T { char a[2]; char b[4094]; } c; struct S { char d[4]; int e; } f; };

__attribute__((noipa)) void
bar (char *p)
{
  asm volatile ("" : : "g" (p) : "memory");
}

__attribute__((noipa)) int
foo (union U *p)
{
  bar ((char *) &p->c.b);
  return __builtin_strcmp (&p->f.d[2], "abcdefghijk") == 0;
}

int
main ()
{
  char *p = mmap (NULL, 131072, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return 0;
  if (munmap (p + 65536, 65536) < 0)
    return 0;
  union U *u;
  u = (union U *) (p + 65536 - sizeof (u->f));
  __builtin_memcpy (u->f.d, "abc", 4);
  u->f.e = 1;
  if (foo (u) != 0)
    __builtin_abort ();
  return 0;
}
Comment 11 Jakub Jelinek 2019-12-03 16:12:41 UTC
Untested patch to fix all these wrong-code issues would be something like:
--- gcc/tree-ssa-strlen.c.jj	2019-11-28 09:35:32.443298424 +0100
+++ gcc/tree-ssa-strlen.c	2019-12-03 17:02:32.131658020 +0100
@@ -3473,54 +3473,30 @@ compute_string_length (int idx)
 static unsigned HOST_WIDE_INT
 determine_min_objsize (tree dest)
 {
-  unsigned HOST_WIDE_INT size = 0;
+  unsigned HOST_WIDE_INT size;
 
   init_object_sizes ();
 
   if (compute_builtin_object_size (dest, 2, &size))
     return size;
 
-  /* Try to determine the size of the object through the RHS
-     of the assign statement.  */
-  if (TREE_CODE (dest) == SSA_NAME)
-    {
-      gimple *stmt = SSA_NAME_DEF_STMT (dest);
-      if (!is_gimple_assign (stmt))
-	return HOST_WIDE_INT_M1U;
-
-      if (!gimple_assign_single_p (stmt)
-	  && !gimple_assign_unary_nop_p (stmt))
-	return HOST_WIDE_INT_M1U;
+  return HOST_WIDE_INT_M1U;
+}
 
-      dest = gimple_assign_rhs1 (stmt);
-      return determine_min_objsize (dest);
-    }
+/* Similarly, but maximum instead of minimum, and return 0 when the
+   size cannot be determined.  */
 
-  /* Try to determine the size of the object from its type.  */
-  if (TREE_CODE (dest) != ADDR_EXPR)
-    return HOST_WIDE_INT_M1U;
-
-  tree type = TREE_TYPE (dest);
-  if (TREE_CODE (type) == POINTER_TYPE)
-    type = TREE_TYPE (type);
-
-  type = TYPE_MAIN_VARIANT (type);
-
-  /* The size of a flexible array cannot be determined.  Otherwise,
-     for arrays with more than one element, return the size of its
-     type.  GCC itself misuses arrays of both zero and one elements
-     as flexible array members so they are excluded as well.  */
-  if (TREE_CODE (type) != ARRAY_TYPE
-      || !array_at_struct_end_p (dest))
-    {
-      tree type_size = TYPE_SIZE_UNIT (type);
-      if (type_size && TREE_CODE (type_size) == INTEGER_CST
-	  && !integer_onep (type_size)
-	  && !integer_zerop (type_size))
-        return tree_to_uhwi (type_size);
-    }
+static unsigned HOST_WIDE_INT
+determine_max_objsize (tree dest)
+{
+  unsigned HOST_WIDE_INT size;
 
-  return HOST_WIDE_INT_M1U;
+  init_object_sizes ();
+
+  if (compute_builtin_object_size (dest, 0, &size))
+    return size;
+
+  return 0;
 }
 
 /* Given strinfo IDX for ARG, set LENRNG[] to the range of lengths
@@ -3603,7 +3579,9 @@ get_len_or_size (tree arg, int idx, unsi
 	  /* Set *SIZE to the size of the smallest object referenced
 	     by ARG if ARG denotes a single object, or to HWI_M1U
 	     otherwise.  */
-	  *size = determine_min_objsize (arg);
+	  *size = determine_max_objsize (arg);
+	  if (*size == 0)
+	    *size = HOST_WIDE_INT_M1U;
 	  *nulterm = false;
 	}
     }
plus add testsuite coverage from the testcases in this PR and deal with testsuite regressions:
FAIL: gcc.dg/Wstring-compare.c  (test for warnings, line 123)
FAIL: gcc.dg/strcmpopt_2.c scan-tree-dump-times strlen1 "cmp_eq \\(" 8
FAIL: gcc.dg/strcmpopt_4.c scan-tree-dump-times strlen1 "cmp_eq \\(" 1
FAIL: gcc.dg/strlenopt-69.c scan-tree-dump-not optimized "abort|strcmp|strncmp"
where strcmpopt_2.c now matches just 4 times instead of 8.

Or do we consider the #c10 testcase valid, but what strcmpopt_2.c does like:
typedef struct { char s[8]; int x; } S;

__attribute__ ((noinline)) int
f1 (S *s)
{
  return __builtin_strcmp (s->s, "abc") != 0;
}

ok (in that if there is a pointer to S, at least sizeof (s->s) bytes will be mapped?  We could look through COMPONENT_REFs in the access path and if there is  any UNION_TYPE in there, be conservative and consider the same addresses in all the union members and take conservative answer from all of them?  Wouldn't that still be a problem if we have union somewhere earlier and just take address of a union member?
Comment 12 Martin Sebor 2019-12-03 16:35:10 UTC
The __builtin_strcmp(ptr->header.magic, "x") call in comment #0 is undefined because the two-element array ptr->header.magic is not a nul-terminated string.  The warning was designed to point that out.

Removing the call removes the bug from the test case and lets it run to completion.
Comment 13 Jakub Jelinek 2019-12-03 16:47:01 UTC
(In reply to Martin Sebor from comment #12)
> The __builtin_strcmp(ptr->header.magic, "x") call in comment #0 is undefined
> because the two-element array ptr->header.magic is not a nul-terminated
> string.  The warning was designed to point that out.

I don't agree with that view, but if you replace that first __builtin_strcmp with say __builtin_memcmp(ptr->header.magic, "x", 2) == 0 it fails the same way.  Plus all the other testcases in this PR.
The first call really doesn't matter, all that matters is that there is something taking address of ptr->header.magic earlier in the function.
And a warning designed on something the GIMPLE IL does not preserve is wrong.
Comment 14 Jakub Jelinek 2019-12-17 12:50:01 UTC
(In reply to Martin Sebor from comment #12)
> The __builtin_strcmp(ptr->header.magic, "x") call in comment #0 is undefined
> because the two-element array ptr->header.magic is not a nul-terminated
> string.  The warning was designed to point that out.

If so, the warning needs to be done early.  And it shouldn't affect code generation.

See e.g. tree-ssa-sccvn.c:
                    /* Probibit value-numbering zero offset components
                       of addresses the same before the pass folding
                       __builtin_object_size had a chance to run
                       (checking cfun->after_inlining does the
                       trick here).  */
                    if (TREE_CODE (orig) != ADDR_EXPR
                        || maybe_ne (off, 0)
                        || cfun->after_inlining)
                      off.to_shwi (&temp.off);
I'm surprised non-zero offs can be handled this way even before objsz1 pass, before that we really should consider the same only if the offset is the same and FIELD_DECLs type has the same size, but we could have some cfun flag or property set by objsz1 pass.  That said, after that what COMPONENT_REFs are used in ADDR_EXPR needs to be ignored.
Comment 15 Jakub Jelinek 2019-12-17 12:57:26 UTC
As for __builtin_*_eq, we could have some analysis that looks at accesses from the same base, and e.g. if there is struct S { char name[16]; int whatever; } and
we see strcmp (p->name, ...) and p->whatever stored or loaded before that or after that with no possibility to exit/abort etc. in between, we could still optimize into *_eq.  Doesn't tree-if-conv.c have similar analysis (ifc_dr etc.)?
Maybe it would need to know minimal page size and only trigger if the read/write of a field after it isn't too far away as a guarantee that in a valid program if the first byte of p->name doesn't trap, then others won't either.
Comment 16 Martin Sebor 2019-12-17 21:59:59 UTC
The warning doesn't affect code generation.  It's issued independent of it.  We'll have to agree to disagree about the validity of the test case in comment #0, but I do agree that at least some of your test cases unfortunately are valid in C (although I don't think they're also valid in C++ where only the union member that was last written to can be read).  I'll look into this when I'm back from PTO in January.
Comment 17 Jakub Jelinek 2019-12-18 08:14:47 UTC
We explicitly document supporting type punning through unions as an extension, so it is valid in C++ too.  And, what union member is active at which point really doesn't matter for the testcases, e.g. in #c6 you could very well pass the union address to bar too and store there to the union member you want active afterwards (or have the union address in some global var from before the function and do it in bar again).  The #c7 testcase doesn't contain any union.  The #c10 testcase could be changes similarly to the #c6 one.
Comment 18 Jakub Jelinek 2019-12-18 09:19:52 UTC
Also, note that union doesn't have to be visible on the access path, e.g.
union U { struct S { char a[2]; char b[2]; char c[2]; } s; struct T { char d[6]; } t; } u;

__attribute__((noipa)) void
bar (char *p)
{
  if (__builtin_strcmp (p, "a") != 0)
    __builtin_abort ();
  __builtin_strcpy (u.t.d, "abcde");
}

__attribute__((noipa)) void
foo (void *x)
{
  struct S *s = (struct S *) x;
  char *p = (char *) &s->b;
  bar (p);
  struct T *t = (struct T *) x;
  if (__builtin_strcmp (&t->d[2], "cde"))
    __builtin_abort ();
}

int
main ()
{
  __builtin_strcpy (u.s.b, "a");
  foo (&u);
  return 0;
}

is miscompiled too, there is no type punning through union, only active union member is ever read...
Comment 19 Jeffrey A. Law 2019-12-18 17:06:58 UTC
Note that while the code in this BZ may or may not be valid, there is certainly code in the wild which is incorrectly compiled as a result of Qing's patch which uses TYPE_SIZE.  In particular the flatbuffers and acpica-tools package in Fedora is mis-compiled due to this issue (there may be others).   See the link in c#8.
Comment 20 Martin Liška 2020-01-09 09:58:54 UTC
@Martin, Jakub: What's the status of the PR? Is there a consensus that it'a a wrong-code issue?
Comment 21 Jakub Jelinek 2020-01-09 10:04:16 UTC
There is no consensus on the #c0 testcase, but there are several other testcases in this PR that are IMO unquestionably valid (or can be easily turned into).
Comment 22 Martin Sebor 2020-01-13 16:32:15 UTC
I've been going through the test cases here.  IIUC, the one in comment #10 is a separate issue and should get its own bug.  (Arguably, so is the one in comment #7.)

It's unfortunate that GIMPLE doesn't preserve the basic property mentioned in comment #2.  Not only does it make working with it error-prone (vis-a-vis this bug and others like it), it also prevents interesting optimizations, and makes warnings that depend on the property regardless inconsistent with the guarantees GCC does provide.  I'm hoping this can change in the future.

The only way I can think of to get all the test cases to pass, including the one comment #18 in particular, would be to disable the optimization for all struct members.  But that seems unnecessary.  As discussed in pr14319, GCC does expect the union type to be visible in accesses to overlapping members.

Treating the test case in comment #18 as unsupported, A change that lets the remaining tests pass (i.e., all but those in comment #10 and comment #18) is much more selective: a) disable the optimization references containing a union among the handled components along the "access path" to the member, and b) use the maximum object size for PHI nodes (to let the test case in comment #7 pass).
Comment 23 Richard Biener 2020-01-14 12:36:44 UTC
(In reply to Martin Sebor from comment #22)
> I've been going through the test cases here.  IIUC, the one in comment #10
> is a separate issue and should get its own bug.  (Arguably, so is the one in
> comment #7.)
> 
> It's unfortunate that GIMPLE doesn't preserve the basic property mentioned
> in comment #2.  Not only does it make working with it error-prone (vis-a-vis
> this bug and others like it), it also prevents interesting optimizations,
> and makes warnings that depend on the property regardless inconsistent with
> the guarantees GCC does provide.  I'm hoping this can change in the future.

I don't think this will change in future.  You have to look at this from
the angle of value-equivalences which is the number one thing a compiler
is supposed to compute and utilize.  All this analyzer work and also
this stupid pointer provenance stuff makes value-equivalences no longer
work.  At the same time the compiler is still expected it exploit those
equivalences unless it breaks something.  That's never going to work.
Comment 24 Martin Sebor 2020-01-15 13:24:12 UTC
Minimal patch: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00880.html