Bug 106757 - [12/13 Regression] Incorrect "writing 1 byte into a region of size 0" on a vectorized loop
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 12.2.0
Importance: P2 normal
Target Milestone: 12.4
Assignee: Not yet assigned to anyone
URL:
Keywords: diagnostic, missed-optimization
Depends on:
Blocks: Wstringop-overflow
 
Reported: 2022-08-26 18:32 UTC by Jonathan Leffler
Modified: 2024-03-15 01:08 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2022-08-26 00:00:00


Attachments
Source code (gcc-bug.c) for the repro (286 bytes, text/plain)
2022-08-26 18:32 UTC, Jonathan Leffler

Description Jonathan Leffler 2022-08-26 18:32:13 UTC
Created attachment 53515
Source code (gcc-bug.c) for the repro

GCC 11.2.0 is happy with this code (and I believe it is correct).  Neither GCC 12.1.0 nor GCC 12.2.0 is happy with this code (and I believe this is a bug).  There are no preprocessor directives in the source code.

$ /usr/gcc/v12.2.0/bin/gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/work1/gcc/v12.2.0/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-12.2.0/configure --prefix=/usr/gcc/v12.2.0 CC=/usr/bin/gcc CXX=/usr/bin/g++
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.2.0 (GCC) 
$

Compilation:

$ /usr/gcc/v11.2.0/bin/gcc -c -std=c99 -O3 -Wall -Werror -pedantic -Wextra  gcc-bug.c
$ /usr/gcc/v12.2.0/bin/gcc -c -std=c99 -O3 -Wall -Werror -pedantic -Wextra  gcc-bug.c
gcc-bug.c: In function ‘pqr_scanner’:
gcc-bug.c:16:24: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
   16 |             tmpchar[k] = mbs[k];
      |             ~~~~~~~~~~~^~~~~~~~
gcc-bug.c:14:14: note: at offset 4 into destination object ‘tmpchar’ of size 4
   14 |         char tmpchar[MBC_MAX];
      |              ^~~~~~~
gcc-bug.c:16:24: error: writing 1 byte into a region of size 0 [-Werror=stringop-overflow=]
   16 |             tmpchar[k] = mbs[k];
      |             ~~~~~~~~~~~^~~~~~~~
gcc-bug.c:14:14: note: at offset 5 into destination object ‘tmpchar’ of size 4
   14 |         char tmpchar[MBC_MAX];
      |              ^~~~~~~
cc1: all warnings being treated as errors
$

The -Wall, -Wextra, -pedantic options are not necessary to generate the warning; the -Werror gives an error instead of a warning, of course.

$ cat gcc-bug.i
# 0 "gcc-bug.c"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "<command-line>" 2
# 1 "gcc-bug.c"
enum { MBC_MAX = 4 };

extern int pqr_scanner(char *mbs);
extern int pqr_mbc_len(char *mbs, int n);
extern void pqr_use_mbs(const char *mbs, int len);
extern char *pqr_mbs_nxt(char *mbs);

int
pqr_scanner(char *mbs)
{
    while (mbs != 0 && *mbs != '\0')
    {
        int len = pqr_mbc_len(mbs, MBC_MAX);
        char tmpchar[MBC_MAX];
        for (int k = 0; k < len; k++)
            tmpchar[k] = mbs[k];
        pqr_use_mbs(tmpchar, len);

        mbs = pqr_mbs_nxt(mbs);
    }

    return 0;
}
$

The source code contains a comment noting that if I replace `mbs = pqr_mbs_nxt(mbs);` with `mbs += len;`, the bug does not reproduce.

In the original code (which was doing work with multi-byte characters and strings), the analogue of pqr_mbc_len() returns either -1 or a value 1..MBC_MAX.  The code for the pqr_mbc_len() function was not part of the TU.  There was a test for `if (len < 0) return -1;` after the call to pqr_mbc_len(), but it wasn't needed for the repro.

Just in case - GCC 11.2.0 specs and output from uname -a:

$ /usr/gcc/v11.2.0/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/usr/gcc/v11.2.0/bin/gcc
COLLECT_LTO_WRAPPER=/work1/gcc/v11.2.0/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-11.2.0/configure --prefix=/usr/gcc/v11.2.0 CC=/usr/bin/gcc CXX=/usr/bin/g++
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.2.0 (GCC)
$ uname -a
Linux njdc-ldev04 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
$

The original function was 100 lines of code in a file of 2600 lines, including 20 headers directly.
Comment 1 Martin Sebor 2022-08-26 19:21:26 UTC
GCC unrolls the loop, and GCC 12 also vectorizes it.  The combination of the two isolates out-of-bounds stores from the loop that GCC cannot prove to be unreachable: it has no insight into what value pqr_mbc_len() might return, and if it's 5 or more the code would indeed write past the end.  The warning just points it out.  To "fix" this, the unroller could use the bounds of the destination array to avoid emitting code for iterations of the loop that end up accessing objects outside their bounds (there already is logic that does that, controlled by the -faggressive-loop-optimizations option).  Until then, if the function is guaranteed to return a value between 0 and 4, then adding the following assertion both avoids the warning and improves the emitted code.

        if (len < 0 || MBC_MAX < len)
          __builtin_unreachable ();
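
For context, a minimal sketch of where that assertion would sit in the reporter's function.  The three static helpers and the pqr_bytes_used counter are hypothetical stand-ins added only so the sketch compiles and links on its own; in the report they are external functions whose bodies the compiler cannot see, which is what matters for the warning.

```c
enum { MBC_MAX = 4 };

/* Hypothetical counter used only by the stub below. */
static int pqr_bytes_used;

/* Hypothetical stub: pretend every character is one byte long. */
static int pqr_mbc_len(char *mbs, int n)
{
    (void)mbs;
    (void)n;
    return 1;
}

/* Hypothetical stub: just count the bytes handed over. */
static void pqr_use_mbs(const char *mbs, int len)
{
    (void)mbs;
    pqr_bytes_used += len;
}

/* Hypothetical stub: advance by one byte. */
static char *pqr_mbs_nxt(char *mbs)
{
    return mbs + 1;
}

int
pqr_scanner(char *mbs)
{
    while (mbs != 0 && *mbs != '\0')
    {
        int len = pqr_mbc_len(mbs, MBC_MAX);
        /* Range assertion from this comment: tells the optimizer that
           len is always within 0..MBC_MAX.  If that promise is ever
           broken at runtime, the behavior is undefined. */
        if (len < 0 || MBC_MAX < len)
            __builtin_unreachable();
        char tmpchar[MBC_MAX];
        for (int k = 0; k < len; k++)
            tmpchar[k] = mbs[k];
        pqr_use_mbs(tmpchar, len);

        mbs = pqr_mbs_nxt(mbs);
    }

    return 0;
}
```

Note that __builtin_unreachable() is a GCC/Clang builtin, not standard C, and it performs no runtime check.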

The invalid stores can be seen in the IL output by the -fdump-tree-strlen=/dev/stdout developer option:

  <bb 7> [local count: 76354976]:
  bnd.6_47 = _26 >> 2;
  vect__3.11_53 = MEM <vector(4) char> [(char *)mbs_22];
  MEM <vector(4) char> [(char *)&tmpchar] = vect__3.11_53;
  vectp_mbs.9_52 = mbs_22 + 4;
  niters_vector_mult_vf.7_48 = bnd.6_47 << 2;
  tmp.8_49 = (int) niters_vector_mult_vf.7_48;
  if (_26 == niters_vector_mult_vf.7_48)
    goto <bb 15>; [25.00%]
  else
    goto <bb 8>; [75.00%]

  <bb 8> [local count: 57266232]:
  _75 = (sizetype) tmp.8_49;
  _76 = vectp_mbs.9_52;
  _77 = MEM[(char *)vectp_mbs.9_52];
  tmpchar[tmp.8_49] = _77;   <<< -Wstringop-overflow
  k_79 = tmp.8_49 + 1;
  if (len_12 > 5)
    goto <bb 9>; [80.00%]
  else
    goto <bb 15>; [20.00%]

  <bb 9> [local count: 45812986]:
  _82 = 5;
  _83 = mbs_22 + 5;
  _84 = *_83;
  tmpchar[5] = _84;          <<< -Wstringop-overflow
  k_86 = tmp.8_49 + 2;
  if (len_12 > k_86)
    goto <bb 10>; [80.00%]
  else
    goto <bb 15>; [20.00%]
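
To make the shape of that IL easier to follow, here is a rough model in C of which tmpchar indices get stored to for a given len.  This is a hand-written sketch, not GCC output: model_store_indices is a hypothetical name, and the path for len below 4 is an assumption because the quoted dump does not show it.

```c
enum { MBC_MAX = 4 };

/* Hypothetical model: fills idx[] with the tmpchar indices that the
   transformed code would store to for a given len, and returns how many
   were recorded.  cap bounds the number of entries written to idx[]. */
static int model_store_indices(int len, int idx[], int cap)
{
    int n = 0;

    if (len < MBC_MAX) {
        /* Assumption: short trip counts bypass the vector block and use
           scalar stores; this path is not part of the quoted dump. */
        for (int k = 0; k < len && n < cap; k++)
            idx[n++] = k;
        return n;
    }

    /* bb 7: one 4-byte vector store covers indices 0..3. */
    for (int k = 0; k < MBC_MAX && n < cap; k++)
        idx[n++] = k;

    /* niters_vector_mult_vf: trip count rounded down to a multiple of 4. */
    int vec_done = (len >> 2) << 2;
    if (len == vec_done)
        return n;               /* goto bb 15 */

    /* bb 8, bb 9, ...: peeled scalar epilogue starting at vec_done. */
    for (int k = vec_done; k < len && n < cap; k++)
        idx[n++] = k;
    return n;
}
```

In this model, len values of 0..4 never produce an index of 4 or more, while len of 5 or 6 reaches the stores at offsets 4 and 5 that the warning names.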
Comment 2 Richard Biener 2022-08-29 08:26:45 UTC
The unroller has code to put unreachable()s in paths like those but it's imperfect.
Comment 3 Peter Bergner 2022-10-03 23:12:19 UTC
Is this the same bug, so just a simpler test case?

bergner@fowler:LTC193379$ cat bug.c
int len = 16;
extern char *src;
char dst[16];

void
foo (void)
{
#ifdef OK
  for (int i = 0; i < 16; i++)
#else
  for (int i = 0; i < len; i++)
#endif
    dst[i] = src[i];
}

bergner@fowler:LTC193379$ /home/bergner/gcc/build/gcc-fsf-mainline-ltc193379-debug/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-ltc193379-debug/gcc -S -O3 -DOK -ftree-vectorize bug.c

bergner@fowler:LTC193379$ /home/bergner/gcc/build/gcc-fsf-mainline-ltc193379-debug/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-ltc193379-debug/gcc -S -O3 -UOK -fno-tree-vectorize bug.c

bergner@fowler:LTC193379$ /home/bergner/gcc/build/gcc-fsf-mainline-ltc193379-debug/gcc/xgcc -B/home/bergner/gcc/build/gcc-fsf-mainline-ltc193379-debug/gcc -S -O3 -UOK -ftree-vectorize bug.c
bug.c: In function ‘foo’:
bug.c:13:12: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
   13 |     dst[i] = src[i];
      |     ~~~~~~~^~~~~~~~
bug.c:3:6: note: at offset 16 into destination object ‘dst’ of size 16
    3 | char dst[16];
      |      ^~~
bug.c:13:12: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
   13 |     dst[i] = src[i];
      |     ~~~~~~~^~~~~~~~
bug.c:3:6: note: at offset 17 into destination object ‘dst’ of size 16
    3 | char dst[16];
      |      ^~~

I'll note that -fno-unroll-loops doesn't affect anything.
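
A possible source-level workaround for a reduced case like this one is to clamp the trip count to the size of dst, so that the optimizer can see that the index stays below 16.  This is only a sketch under that assumption and has not been checked against the affected GCC versions; the local variable n is an addition, and src is defined here (rather than declared extern) so that the sketch links on its own.

```c
int len = 16;
char *src;      /* extern in the test case; defined here so this links */
char dst[16];

void
foo (void)
{
  /* Clamp the trip count to the capacity of dst.  Unlike the original
     loop, this silently truncates the copy if len is too large. */
  int n = len;
  if (n > (int) sizeof dst)
    n = (int) sizeof dst;

  for (int i = 0; i < n; i++)
    dst[i] = src[i];
}
```

The clamp changes behavior only for values of len that would have overflowed dst in the original.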
Comment 4 Richard Biener 2022-12-05 20:35:08 UTC
(In reply to Peter Bergner from comment #3)
> Is this the same bug, so just a simpler test case?

It looks similar.  Note that the code we warn about is isolated by DOM threading
after loop opts here.  The unrolling done is also a bit excessive, but
that's because we estimate an upper bound on the epilogue based on
the size of the array accessed.

The IL we diagnose is definitely bogus, but it is unreachable at runtime, which
we don't see, so it's also a code size issue.
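
Since the diagnosed code is unreachable at runtime, code that has to build with -Werror on an affected release could suppress the diagnostic locally.  The following is a sketch using GCC's diagnostic pragmas around the reduced test case; whether it silences this particular instance is an assumption, because the warning is issued late in the middle end.  As before, src is defined here only so the sketch links.

```c
int len = 16;
char *src;      /* extern in the test case; defined here so this links */
char dst[16];

/* Disable -Wstringop-overflow for this function only, then restore
   the previous diagnostic state. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-overflow"
void
foo (void)
{
  for (int i = 0; i < len; i++)
    dst[i] = src[i];
}
#pragma GCC diagnostic pop
```

This hides the warning without changing the generated code, so the extra epilogue stores remain in the output.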
Comment 5 Richard Biener 2023-05-08 12:25:20 UTC
GCC 12.3 is being released, retargeting bugs to GCC 12.4.
Comment 6 Jeffrey A. Law 2024-03-15 01:08:49 UTC
Works correctly on the trunk.  Adjusting regression markers.