Bug 115056 - [14/15 regression] False positive -Wstringop-overflow and -Warray-bounds
Summary: [14/15 regression] False positive -Wstringop-overflow and -Warray-bounds
Status: WAITING
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 14.1.1
: P3 normal
Target Milestone: 14.3
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: Warray-bounds Wstringop-overflow
  Show dependency treegraph
 
Reported: 2024-05-12 23:27 UTC by Dale Weiler
Modified: 2024-08-01 09:40 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2024-05-13 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dale Weiler 2024-05-12 23:27:30 UTC
I've isolated what appears to be an unsound __builtin_memset optimization applied by gcc 14.1.1 on a hash function in a cryptographic library where it writes one byte beyond the end of a buffer. The compiler thankfully reports two warnings when it happens.

The isolated test case is small so I'll provide it inline

#include <string.h>
#include <stdio.h>
typedef union {
	unsigned char as_bytes[64];
	unsigned long long as_chunks[64 / sizeof(unsigned long long)];
} Block;
int main(int argc, char **argv) {
	Block block;
	int i = strlen(argv[0]), j = 0;
	for (; j < i; j++) block.as_bytes[j] = argv[0][j];
	while (++j & 7) block.as_bytes[j] = 0;
	if (j > 56) while (j < 64) block.as_bytes[j++] = 0;
	while (j < 56) block.as_bytes[j++] = 0;
	for (j = 0; j < 8; j++) printf("%d\n", (int)block.as_chunks[j]);
}

Compiling this with -O2 produces the following warning

t.c: In function ‘main’:
t.c:12:56: warning: ‘__builtin_memset’ writing 8 bytes into a region of size 7 overflows the destination [-Wstringop-overflow=]
   12 |         if (j > 56) while (j < 64) block.as_bytes[j++] = 0;
      |                                    ~~~~~~~~~~~~~~~~~~~~^~~
t.c:8:15: note: at offset [57, 63] into destination object ‘block’ of size 64
    8 |         Block block;
      |               ^~~~~

Compiling this with -O2 and -Wall produces the following warning

t.c: In function ‘main’:
t.c:12:56: warning: ‘__builtin_memset’ forming offset 64 is out of the bounds [0, 64] of object ‘block’ with type ‘Block’ [-Warray-bounds=]
   12 |         if (j > 56) while (j < 64) block.as_bytes[j++] = 0;
      |                                    ~~~~~~~~~~~~~~~~~~~~^~~
t.c:8:15: note: ‘block’ declared here
    8 |         Block block;
      |               ^~~~~
Comment 1 Sam James 2024-05-12 23:29:43 UTC
Just to be clear as it might be easy to misread: the warnings are maybe a hint about what's going on, and not the complaint here.

I can reproduce diff. results w/ 13 vs 14, not looked further yet.
Comment 2 Sam James 2024-05-12 23:32:13 UTC
(In reply to Dale Weiler from comment #0)
> I've isolated what appears to be an unsound __builtin_memset optimization
> applied by gcc 14.1.1 on a hash function in a cryptographic library where it
> writes one byte beyond the end of a buffer.

You don't have to share, but I find it useful to know where stuff was reduced from. Was it a public library? If so, what? It is OK to not answer.
Comment 3 Dale Weiler 2024-05-12 23:34:45 UTC
> You don't have to share, but I find it useful to know where stuff was reduced from. Was it a public library? If so, what? It is OK to not answer.

It's the inner part of the Tiger Hash cryptographic hash function. This one is just my own implementation. This is a pretty standard construction of how the function works so it would presumably affect any Tiger Hash implementation.
Comment 4 Sam James 2024-05-12 23:35:53 UTC
With Clang, I get:
```
¢ clang /tmp/foo.c -o /tmp/foo
$ /tmp/foo
1886221359
0
0
0
0
0
0
-733536256
```
and
```
$ clang /tmp/foo.c -o /tmp/foo -fsanitize=address,undefined
$ /tmp/foo
1886221359
0
0
0
0
0
0
0
```
Comment 5 Dale Weiler 2024-05-12 23:38:51 UTC
I should note that there is a byte in-between these two pieces of code I removed

     for (; j < i; j++) block.as_bytes[j] = argv[0][j];
     block.as_bytes[j] = 0x01; // I removed this line
     while (++j & 7) block.as_bytes[j] = 0;

Just to make the repro smaller, but I guess that causes that one byte to be uninitialized in this case. Adding it back doesn't change anything, but it should be noted it exists here.
Comment 6 Sam James 2024-05-12 23:39:34 UTC
I was just about to comment that, thanks!
Comment 7 Sam James 2024-05-13 00:57:46 UTC
Isn't there still an uninitialised read?

```
$ valgrind /tmp/foo
[...]
==814922==
1886221359
1
0
0
0
0
0
==814922== Use of uninitialised value of size 8
==814922==    at 0x48F7D3A: _itoa_word (_itoa.c:183)
==814922==    by 0x49029A6: __printf_buffer (vfprintf-process-arg.c:155)
==814922==    by 0x4904BD0: __vfprintf_internal (vfprintf-internal.c:1544)
==814922==    by 0x49C55AE: __printf_chk (printf_chk.c:33)
==814922==    by 0x10938D: main (/tmp/foo.c:16)
==814922==
```

with:
```
#include <string.h>
#include <stdio.h>
typedef union {
        unsigned char as_bytes[64];
        unsigned long long as_chunks[64 / sizeof(unsigned long long)];
} Block;
int main(int argc, char **argv) {
        Block block;
        int i = strlen(argv[0]), j = 0;
        for (; j < i; j++) block.as_bytes[j] = argv[0][j];
        block.as_bytes[j] = 0x01; // I removed this line
        while (++j & 7) block.as_bytes[j] = 0;
        if (j > 56) while (j < 64) block.as_bytes[j++] = 0;
        while (j < 56) block.as_bytes[j++] = 0;
        for (j = 0; j < 8; j++) printf("%d\n", (int)block.as_chunks[j]);
}
```
Comment 8 Dale Weiler 2024-05-13 01:14:45 UTC
Yeah, you can add another `while (j < 64) block.as_bytes[j] = 0;` to the end if you want. I really should've done a better job reducing it so as not to create uninitialized memory. You can also just memset the block at the start to all zeros if you want.
Comment 9 Sam James 2024-05-13 01:18:10 UTC
The issue is we need a program which no UB which has the bad symptoms.

I can fix it up but that doesn't mean it has the symptoms you originally saw which made you report a bug.
Comment 10 Dale Weiler 2024-05-13 01:51:39 UTC
New test case without UB still exhibits the behavior

#include <string.h>
#include <stdio.h>
typedef union {
	unsigned char as_bytes[64];
	unsigned long long as_chunks[64 / sizeof(unsigned long long)];
} Block;
int main(int argc, char **argv) {
	Block block;
	int i = strlen(argv[0]), j = 0;
	for (; j < i; j++) block.as_bytes[j] = argv[0][j];
        block.as_bytes[j] = 0x01;
	while (++j & 7) block.as_bytes[j] = 0;
	if (j > 56) while (j < 64) block.as_bytes[j++] = 0;
	while (j < 56) block.as_bytes[j++] = 0;
        while (j < 64) block.as_bytes[j++] = 0x01;
	for (j = 0; j < 8; j++) printf("%d\n", (int)block.as_chunks[j]);
}
Comment 11 Sam James 2024-05-20 07:39:33 UTC
Coming back to this: have you actually seen unexpected results from runtime execution? (If so, please share what they were & what options you used to get them).

The warning is a problem but it doesn't necessarily imply bad codegen. Some warnings have FPs based on the IR that gets generated.

i.e. I think this is a FP warning bug instead.
Comment 12 Jakub Jelinek 2024-08-01 09:40:55 UTC
GCC 14.2 is being released, retargeting bugs to GCC 14.3.