Bug 95189

Summary: [9 Regression] memcmp being wrongly stripped like strcmp
Product: gcc Reporter: ux <gcc>
Component: middle-endAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: acoplan, amonakov, blashyrkh, bugdal, bugs, davmac, dushistov, dwmw2, egallager, jakub, jerryfromearth, luke-jr+gccbugs, pbrobinson, public, rguenth, sjames, slyfox, sumbera, toolchain
Priority: P2 Keywords: patch, wrong-code
Version: 10.1.0   
Target Milestone: 10.3   
See Also: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95886
Host: Target:
Build: Known to work: 10.3.0, 8.5.0
Known to fail: 10.2.1, 9.3.0, 9.4.0, 9.5.0 Last reconfirmed: 2020-05-19 00:00:00
Attachments: Hacky patch that makes GCC print a diagnostic message if hitting the bug
Hacky patch to workaround the bug by disabling builtin memcmp always
My backport of fix to 9.3.0

Description ux 2020-05-18 13:56:37 UTC
Given the following C code:

    % cat a.c
    #include <string.h>
    static const float z[1] = {0};
    int f(float x) { return memcmp(&x, z, sizeof(x)); }

GCC 10 generates this on x86-64:

    % gcc -Wall -O2 -c a.c && objdump -d -Mintel a.o

    a.o:     file format elf64-x86-64


    Disassembly of section .text:

    0000000000000000 <f>:
       0:	f3 0f 11 44 24 fc    	movss  DWORD PTR [rsp-0x4],xmm0
       6:	0f b6 44 24 fc       	movzx  eax,BYTE PTR [rsp-0x4]
       b:	c3                   	ret

This doesn't happen if "= {0}" is removed from the z initialization (wtf?).
It also doesn't happen with -O1.
Comment 1 Richard Biener 2020-05-19 06:28:27 UTC
Confirmed.  It looks like string handling creeps in here somehow during RTL
expansion.
Comment 2 Jakub Jelinek 2020-05-23 21:01:32 UTC
Regressed in r10-2769-g14b7950f126f84fa585e3a057940ff10d4c5b3f8
Comment 3 Jakub Jelinek 2020-05-23 21:17:11 UTC
For memcmp, it looks completely wrong to use c_getstr, I think in that case it should use string_constant instead, or if it uses it, it should punt if the returned length by it is smaller than the length of the memcmp call and not use strnlen.
Comment 4 Martin Sebor 2020-05-24 20:10:10 UTC
Isn't the problem that c_getstr(arg2, &len2) sets len2 to 1 instead of 4 (i.e, sizeof (x)) as the function comment suggests should happen?

/* Return a pointer P to a NUL-terminated string representing the sequence
   of constant characters referred to by SRC (or a subsequence of such
   characters within it if SRC is a reference to a string plus some
   constant offset).  If STRLEN is non-null, store the number of bytes
   in the string constant including the terminating NUL char.  *STRLEN is
   typically strlen(P) + 1 in the absence of embedded NUL characters.  */

const char *
c_getstr (tree src, unsigned HOST_WIDE_INT *strlen /* = NULL */)

If that's what the function is supposed to do (and even if it isn't), using the name STRLEN for anything that's not actually the result of strlen() is terribly misleading, practically begging to be misused.  In fact, going back in time, it looks like starting with r262522 the function did take two pointers: STRLEN for the string length (i.e., the result of strlen) and STRSIZE for the size, until r264301 when it was changed to the current form.
Comment 5 ux 2020-06-09 13:38:45 UTC
I'd like to point out that this regression impacts badly a production app. We're using this pattern to compare an input vector of floats to a vector of zeros, but the comparison always returns 0 now, which basically breaks everything. We do have a workaround (removing the "= {0}"), but I don't think this is an isolate case.

Maybe the priority could be raised?
Comment 6 Martin Sebor 2020-06-09 23:53:42 UTC
The priority is up to the GCC release managers but raising it won't make the fix available before 10.2 (except on trunk).  Regardless, I should have a patch soon.
Comment 7 Martin Sebor 2020-06-24 21:24:16 UTC
The following is a more straightforward test case that's also miscompiled to return zero:

  int main ()
  { 
    char a[] = "\0abc";

    return __builtin_memcmp (a, "\0\0\0\0", 4);
  }

main:
.LFB0:
	.cfi_startproc
	xorl	%eax, %eax
	ret

There seems to a series of mistakes that led to this bug.  First, in the original test case, the initializer for the const float array of all zeros is just a single null byte, regardless of how big the array is.  c_getstr() returns the empty string for such an array and sets *STRLEN to 1.  As I mentioned in comment #4, this is very confusing given that the function can be called for non-strings and that the comment suggests embedded nuls are reflected in the STRLEN result.

The c_getstr() caller that ultimately interprets this result wrong is inline_expand_builtin_string_cmp() in builtins.c.  The comment above the function  reads:

/* Inline expansion a call to str(n)cmp, with result going to
   TARGET if that's convenient.
   If the call is not been inlined, return NULL_RTX.  */

but it's (also) called from expand_builtin_memcmp(), so yet again, the comment is incorrect or at best misleading.  The function then implements the correct behavior for strncmp() but not for memcmp().  The miscompilation was apparently introduced by r272993, most likely because of the misleading function name.  The call to inline_expand_builtin_string_cmp() from expand_builtin_memcmp() was introduced in r262636.
Comment 9 Martin Sebor 2020-07-01 14:46:39 UTC
*** Bug 96007 has been marked as a duplicate of this bug. ***
Comment 10 Jiarui Hong 2020-07-13 15:50:33 UTC
Please note that there seems to be two issues here that affect different versions of gcc:

(From the comment of gcc@pkh.me)
https://godbolt.org/z/xc59TM
This fails in 9.3.0 but works in 10.1.0.

(From the comment of Martin Sebor)
https://godbolt.org/z/9Kh1Tn
This fails in both 9.3.0 and 10.1.0.

Maybe the difference is caused by some other feature/bugfix that is present in 10.1.0 but not in 9.3.0.
Anyway, I just want to highlight this and hope it can help with getting a proper fix for this bug.
Thanks.
Comment 11 GCC Commits 2020-07-20 18:09:21 UTC
The master branch has been updated by Martin Sebor <msebor@gcc.gnu.org>:

https://gcc.gnu.org/g:d5803b9876b3d11c93d1a10fabb3fbb1c4a14bd6

commit r11-2231-gd5803b9876b3d11c93d1a10fabb3fbb1c4a14bd6
Author: Martin Sebor <msebor@redhat.com>
Date:   Mon Jul 20 12:06:18 2020 -0600

    Correct handling of constant representations containing embedded nuls.
    
    Resolves:
    PR middle-end/95189 - memcmp being wrongly stripped like strcm
    PR middle-end/95886 - suboptimal memcpy with embedded zero bytes
    
    gcc/ChangeLog:
    
            PR middle-end/95189
            PR middle-end/95886
            * builtins.c (inline_expand_builtin_string_cmp): Rename...
            (inline_expand_builtin_bytecmp): ...to this.
            (builtin_memcpy_read_str): Don't expect data to be nul-terminated.
            (expand_builtin_memory_copy_args): Handle object representations
            with embedded nul bytes.
            (expand_builtin_memcmp): Same.
            (expand_builtin_strcmp): Adjust call to naming change.
            (expand_builtin_strncmp): Same.
            * expr.c (string_constant): Create empty strings with nonzero size.
            * fold-const.c (c_getstr): Rename locals and update comments.
            * tree.c (build_string): Accept null pointer argument.
            (build_string_literal): Same.
            * tree.h (build_string): Provide a default.
            (build_string_literal): Same.
    
    gcc/testsuite/ChangeLog:
    
            PR middle-end/95189
            PR middle-end/95886
            * gcc.dg/memcmp-pr95189.c: New test.
            * gcc.dg/strncmp-3.c: New test.
            * gcc.target/i386/memcpy-pr95886.c: New test.
Comment 12 Martin Sebor 2020-07-20 18:10:13 UTC
Fixed in r11-2231 on trunk.
Comment 13 Richard Biener 2020-07-23 06:51:44 UTC
GCC 10.2 is released, adjusting target milestone.
Comment 14 Andrew Pinski 2020-08-16 20:26:35 UTC
*** Bug 96631 has been marked as a duplicate of this bug. ***
Comment 15 Alexander Monakov 2020-09-03 22:37:25 UTC
Is the patch eligible for backporting?

Users are hitting this as shown by dups and questions elsewhere like https://stackoverflow.com/questions/63724679/wrong-gcc-9-and-higher-optimization-of-memcmp-with-fno-inline
Comment 16 Martin Sebor 2020-09-06 23:28:26 UTC
I think it should be.  Let me do it.
Comment 17 Tim Ruffing 2020-09-27 14:22:44 UTC
Created attachment 49276 [details]
Hacky patch that makes GCC print a diagnostic message if hitting the bug

We hit this bug in a test case in secp256k1, the cryptography library used in Bitcoin Core. The bug appeared pretty scary to us at first glance because memcmp is widely used. After looking at the GCC patch that fixes this, we believe that this bug only occurs when
  * at least one of the compared byte arrays is constant and has a zero byte in position 0, 1, 2, or 3, *and*
  * the result of the memcmp isn't immediately used in a "== 0" or "!= 0" test (or equivalently "if(memcmp(...))" or "if(!memcmp(...))").

In particular the second condition makes this bug pretty rare and explains why it's mostly hit in non-inlined memcmp wrappers. (But in our case we hit it with a "<" comparison. )

For anyone else who's concerned about this bug, we've created a hacky patch (co-authored by Russell O'Connor) for the GCC versions with this bug that makes GCC print a diagnostic message if it emits wrong code due to this bug. We think the message is sound but it would be great if someone could confirm this.
Comment 18 Luke Dashjr 2020-10-05 23:41:50 UTC
Someone pointed out to me that the bug metadata says "Known to work: 9.3.0"
Comment 19 Alexander Monakov 2020-10-06 05:47:51 UTC
That was set early on and not adjusted when new testcases were found. Changed now.

The original testcase in comment #0 happened to work with 9.3, but breaks if 'float z[1]' is changed to 'char z[4]'. Testcases from the dup and from comment #7 also fail with 9.3.
Comment 20 Rich Felker 2020-10-07 22:49:20 UTC
For what it's worth, -fno-builtin is a workaround for this entire class of bug.
Comment 21 Luke Dashjr 2020-10-08 04:31:21 UTC
Not entirely, I think? At least according to the docs:

        if you wish to enable built-in functions selectively when using -fno-builtin or -ffreestanding, you may define macros such as:

                   #define abs(n)          __builtin_abs ((n))
                   #define strcpy(d, s)    __builtin_strcpy ((d), (s))

Which suggests code that calls __builtin_memcmp explicitly could still be broken.
Comment 22 GCC Commits 2020-10-08 18:38:54 UTC
The releases/gcc-10 branch has been updated by Martin Sebor <msebor@gcc.gnu.org>:

https://gcc.gnu.org/g:e4c9aac98611f63847ef6c57916808d9a2d7abcb

commit r10-8871-ge4c9aac98611f63847ef6c57916808d9a2d7abcb
Author: Martin Sebor <msebor@redhat.com>
Date:   Thu Oct 8 12:35:01 2020 -0600

    Correct handling of constant representations containing embedded nuls (backport from trunk).
    
    Resolves:
    PR middle-end/95189 - memcmp being wrongly stripped like strcm
    PR middle-end/95886 - suboptimal memcpy with embedded zero bytes
    
    gcc/ChangeLog:
    
            PR middle-end/95189
            PR middle-end/95886
            * builtins.c (inline_expand_builtin_string_cmp): Rename...
            (inline_expand_builtin_bytecmp): ...to this.
            (builtin_memcpy_read_str): Don't expect data to be nul-terminated.
            (expand_builtin_memory_copy_args): Handle object representations
            with embedded nul bytes.
            (expand_builtin_memcmp): Same.
            (expand_builtin_strcmp): Adjust call to naming change.
            (expand_builtin_strncmp): Same.
            * expr.c (string_constant): Create empty strings with nonzero size.
            * fold-const.c (c_getstr): Rename locals and update comments.
            * tree.c (build_string): Accept null pointer argument.
            (build_string_literal): Same.
            * tree.h (build_string): Provide a default.
            (build_string_literal): Same.
    
    gcc/testsuite/ChangeLog:
    
            PR middle-end/95189
            PR middle-end/95886
            * gcc.dg/memcmp-pr95189.c: New test.
            * gcc.dg/strncmp-3.c: New test.
            * gcc.target/i386/memcpy-pr95886.c: New test.
Comment 23 Martin Sebor 2020-10-08 18:40:19 UTC
Backported to 10-branch via r10-8871.
Comment 24 Rich Felker 2020-10-08 22:51:33 UTC
The fixes do not seem trivial to backport; lots of conflicts. It would be really helpful to have versions of the patch that are minified and applicable to all affected versions that might be shipping in distros (looks like 9.2, 9.3, 10.1, and 10.2) since this is a critical codegen regression.
Comment 25 Luke Dashjr 2020-10-09 00:12:09 UTC
Created attachment 49334 [details]
Hacky patch to workaround the bug by disabling builtin memcmp always

I wrote and am rebuilding my own systems with a very minimal patch that essentially disables the builtin memcmp entirely. It's probably overkill and bad for performance, but it should at least produce sane code and apply to pretty much any GCC (warning: including unaffected versions).
Comment 26 Rich Felker 2020-10-09 02:26:20 UTC
Is that complete, or is it unclear whether there are code paths other than builtin memcmp by which this is hit? Am I correct in assuming that with builtin memcmp expansion returning NULL_RTX, GCC always expands it to a function call?
Comment 27 Luke Dashjr 2020-10-09 13:09:46 UTC
AFAIK it's complete, but I am not a GCC developer.
Comment 28 Davin McCall 2021-01-02 05:05:18 UTC
Created attachment 49866 [details]
My backport of fix to 9.3.0

For the benefit of anyone that needs it, I've attached my own attempt to backport the fix directly to the 9.3.0 release (maybe it will apply against the current 9.4 branch, not sure, haven't tried). It seems to fix the issue but I'm not a GCC developer.
Comment 29 Richard Biener 2021-06-01 08:17:40 UTC
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
Comment 30 Rich Felker 2021-06-24 01:13:43 UTC
This is a critical codegen issue. Is it really still not fixed in 9.4.0?
Comment 31 Richard Biener 2021-06-24 06:29:48 UTC
The original reported case doesn't affect GCC 9 it seems but the following does:

int main()
{
  char a[] = "\0abc";
  volatile int res = __builtin_memcmp (a, "\0\0\0\0", 4);
  if (res == 0)
    __builtin_abort ();
  return 0;
}
Comment 32 Martin Sebor 2022-03-17 19:47:27 UTC
I'm not planning to backport the patch.
Comment 33 Richard Biener 2022-05-27 08:54:07 UTC
Fixed in GCC 10.3