Bug 93982 - [10 Regression] Assignment incorrectly omitted by -foptimize-strlen since r10-2528
Summary: [10 Regression] Assignment incorrectly omitted by -foptimize-strlen since r10...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 10.0
: P1 normal
Target Milestone: 10.0
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2020-02-29 16:19 UTC by Nate Eldredge
Modified: 2020-03-02 20:40 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work: 9.2.0
Known to fail:
Last reconfirmed: 2020-02-29 00:00:00


Attachments
Reduced testcase (252 bytes, text/plain)
2020-02-29 16:19 UTC, Nate Eldredge
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Nate Eldredge 2020-02-29 16:19:09 UTC
Created attachment 47937 [details]
Reduced testcase

If the attached testcase is compiled with `-O1 -foptimize-strlen' on amd64, the function foo is miscompiled: the assignment to res.target[1] is omitted.  It also happens with -O2, but not with -O1 alone or -O3.

This bug is somewhat similar to bug 93213.  It is a regression from 9.2.0.

The generated assembly is:

foo:
        subq    $8, %rsp
        call    my_alloc
        movq    $.LC0, (%rax)
        movq    $.LC1, 16(%rax)
        movq    $.LC1, 24(%rax)
        movq    $.LC1, 32(%rax)
        addq    $8, %rsp
        ret
 
Note the absence of `movq $.LC1, 8(%rax)'.

I tested with trunk, latest pull from git, revision 117baab8.  In particular the patch for bug 93213 (e13f37d9) is included.  The program is compiled correctly by gcc 9.2.0 with the same options (and all others I tried). I did a git bisect and the offending commit is 34fcf41e.

The cast in foo() at first looked questionable from a strict aliasing standpoint, but I believe the code is legal since the memory returned by calloc has no declared type, and we never access this memory except as objects of type `const char *'.  Also, the miscompilation persists with -fno-strict-aliasing.

I am no gcc expert, but I dug into the source a little bit, out of curiosity.  
It looks like the deletion happens in handle_store(), at gcc/tree-ssa-strlen.c:5021.  It seems that in this function, the code is being treated as if the string "12345678" itself were being stored at address res.target, rather than the address of the string; as if the code were `strcpy(res.target, "12345678")'.  In particular, it thinks the trailing null was stored at address res.target+8.  The following statement, `res.target[1] = ""', is likewise treated as if it were `strcpy(res.target+8, "")', which would also just store a null byte at res.target+8, so it is seen as redundant and is removed.

I would like to acknowledge StackOverflow user BrodieG for initially discovering this bug and helping to investigate, as well as users KamilCuk and John Bollinger for helpful comments. The original question is at https://stackoverflow.com/q/60406042/634919.

Output of `gcc -v`:

Using built-in specs.
COLLECT_GCC=/home/nate/gcc/bin/gcc
COLLECT_LTO_WRAPPER=/home/nate/do-not-backup/gcc-inst/bin/../libexec/gcc/x86_64-pc-linux-gnu/10.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/home/nate/gcc --disable-bootstrap --enable-languages=c
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.1 20200229 (experimental) (GCC)
Comment 1 Jakub Jelinek 2020-02-29 22:32:33 UTC
Started with r10-2528-g34fcf41e30ff56155e996f5e04f6ca13948a19b6.
I'll have a look.
Comment 2 Jakub Jelinek 2020-02-29 22:48:35 UTC
Simplified testcase:

struct A { const char **a; };
const char *buf[5];

__attribute__((noipa)) struct A
foo (char *p)
{
  struct A r = { (const char **) p };
  r.a[0] = "12345678";
  r.a[1] = "";
  r.a[2] = "";
  r.a[3] = "";
  r.a[4] = "";
  return r;
}

int
main ()
{
  struct A r = foo ((char *) &buf[0]);
  if (!r.a[1] || r.a[1][0] != '\0')
    __builtin_abort ();
  return 0;
}
Comment 3 Jakub Jelinek 2020-02-29 23:06:19 UTC
        bool is_char_store = is_char_type (type);
        if (!is_char_store && TREE_CODE (lhs) == MEM_REF)
          {
            /* To consider stores into char objects via integer types
               other than char but not those to non-character objects,
               determine the type of the destination rather than just
               the type of the access.  */
            for (int i = 0; i != 2; ++i)
              {
                tree ref = TREE_OPERAND (lhs, i);
                type = TREE_TYPE (ref);
                if (TREE_CODE (type) == POINTER_TYPE)
                  type = TREE_TYPE (type);
                if (TREE_CODE (type) == ARRAY_TYPE)
                  type = TREE_TYPE (type);
                if (is_char_type (type))
                  {
                    is_char_store = true;
                    break;
                  }
              }
          }
is completely bogus.  With pointer conversions being useless, the type of the MEM_REF's first operand means nothing at all and the type of second MEM_REF operand is for alias analysis, again nothing that should matter for how the strlen pass optimizes code.
Comment 4 Martin Sebor 2020-03-02 18:51:42 UTC
This was caused by the same mistake as PR 93829 which is now fixed.
Comment 5 David Malcolm 2020-03-02 19:34:58 UTC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92982#c4 reports Martin added test coverage added for this bug, in:

  https://gcc.gnu.org/g:f26688fbe441375e907f0dd2f35837681870b1f4
  commit r10-6978-gf26688fbe441375e907f0dd2f35837681870b1f4

(but appears to have been typoed as PR 92982 rather than PR 93982).
Comment 6 Jakub Jelinek 2020-03-02 20:06:12 UTC
I don't really think this is fixed.
Besides the bogus hunk mentioned above (either do it unconditionally, or not at all, what it does isn't even a good heuristics, the pointer type on the MEM_REF first operand really is irrelevant both for the VN reasons and for because user can as the testcase shows use other pointer types too, I'd say the count_nonzero_bytes function isn't well designed.
The problem is that it doesn't differentiate between whether the EXP it is looking at are the bytes that are being stored, or is a pointer which points to those bytes.  Each of those cases needs to be treated significantly differently.
When not counting count_nonzero_bytes called recursively or from the other count_nonzero_bytes, I see it called from handle_store twice, each time the argument is the gimple_assign_rhs1 (store), i.e. either the constant that is being stored (e.g. INTEGER_CST), or e.g. a MEM_REF and similarly in handle_integral_assign where it is called with the MEM_REF that is being loaded.
Now, count_nonzero_bytes starts with get_stridx (exp) call.  This makes no sense, that doesn't give you anything related to the bytes being stored, but to bytes pointed by exp.
Then it goes through into the ADDR_EXPR case and just uses its operand.
Now, with the PR93829 change it at least sets nbytes to something, but shouldn't e.g. the get_stridx call be done only if nbytes is non-zero?  A clean design would be to have separate functions, one that handles the length of the exp bytes itself and another one that handles the case where exp points to those bytes.  Otherwise, you just need to hope zero sized objects (such as MEM_REFs) won't appear anywhere etc.
E.g. in
void
foo (char *p)
{
  char buf[16];
  __builtin_memcpy (buf, "abcde", 6);
  *(char **) p = &buf[0];
...
}
case get_stridx (exp) returns > 0, even when it is in any way relevant to the store.
The native_encode_expr call should be guarded, the usual check is
CHAR_BIT == 8 && BITS_PER_UNIT == 8.
And the formatting is off, e.g. the hunk mentioned in #c3 is misindented as whole,
          const value_range_equiv *vr
            = CONST_CAST (class vr_values *, rvals)
            ->get_value_range (si->nonzero_chars);
should have been something like:
          const value_range_equiv *vr
            = (CONST_CAST (class vr_values *, rvals)
               ->get_value_range (si->nonzero_chars));
or better
          vr_values *r = CONT_CAST (vr_values *, rvals);
          const value_range_equiv *vr = r->get_value_range (si->nonzero_chars);
Comment 7 Martin Sebor 2020-03-02 20:40:45 UTC
The reported bug is fixed.  Please open a new bug for the other problems.