Bug 103215 - [12 regression] gcc generates unexpected warnings on libx11-1.7.2: error: array subscript -2 is outside array bounds of since r12-3124-g820f0940d7ace130
Summary: [12 regression] gcc generates unexpected warnings on libx11-1.7.2: error: arr...
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 12.0
: P3 normal
Target Milestone: 12.0
Assignee: Martin Sebor
Keywords: diagnostic, patch
Depends on:
Blocks: Warray-bounds
  Show dependency treegraph
Reported: 2021-11-12 19:52 UTC by Sergei Trofimovich
Modified: 2021-12-09 19:59 UTC (History)
3 users (show)

See Also:
Known to work:
Known to fail:
Last reconfirmed: 2021-11-12 00:00:00

orig.bug.c (34.10 KB, text/x-csrc)
2021-11-12 19:54 UTC, Sergei Trofimovich
gdb-bug.cc (370 bytes, text/x-csrc)
2021-11-14 17:21 UTC, Sergei Trofimovich

Note You need to log in before you can comment on or make changes to this bug.
Description Sergei Trofimovich 2021-11-12 19:52:18 UTC
Upstream core: https://gitlab.freedesktop.org/xorg/lib/libx11/-/blob/6d1dc1f6169ebf0ba71785d461bd98129c65c862/src/RdBitF.c#L156

Self-contained example:

    // $ cat bug.c
    #include <string.h>
    int extract(char*);
    int XReadBitmapFileData (void) {
        char name_and_type[255];
        for (;;) {
            extract (name_and_type);
            char * type = strrchr (name_and_type, '_');
            if (type) type++; else type = name_and_type;
            if (strcmp ("hot", type) == 0) {
            if (type-- == name_and_type || type-- == name_and_type) continue;
            if (strcmp ("ax_hot", type) == 0) return 1;
      return -1;

    $ gcc-11.2.0 -Wall -Werror=array-bounds -fno-strict-aliasing -O2 -c bug.c -o bug.o

    $ gcc-12.0.0 -Wall -Werror=array-bounds -fno-strict-aliasing -O2 -c bug.c -o bug.o
bug.c: In function 'XReadBitmapFileData':
bug.c:10:48: error: array subscript -2 is outside array bounds of 'char[9223372036854775807]' [-Werror=array-bounds]
   10 |             if (type-- == name_and_type || type-- == name_and_type) continue;
      |                                            ~~~~^~
bug.c:4:14: note: at offset [0, 253] into object 'name_and_type' of size 255
    4 |         char name_and_type[255];
      |              ^~~~~~~~~~~~~
cc1: some warnings being treated as errors

Does double post-increment `if (type-- == name_and_type || type-- == name_and_type)` have defined behaviour?

I think error message is at least incorrectly worded. I don't think there is an out-of-bounds access.

$ LANG=C ./result-2/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with:
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.0.0 20211107 (experimental) (GCC)
Comment 1 Sergei Trofimovich 2021-11-12 19:54:06 UTC
Created attachment 51781 [details]

Attaching orig.bug.c in case I corrupted original too much.
Comment 2 Andrew Pinski 2021-11-12 21:32:24 UTC
>Does double post-increment `if (type-- == name_and_type || type-- == name_and_type)` have defined behaviour?

Yes as the || acts as a sequence point.

My bet is there is a (bad) transformation going on that converts v-- == array to v == array-1, v-- and that is causing the warnings to show up.

> I think error message is at least incorrectly worded. 

the warning is correct iif array-2 was used and that seems to be the problem here is that is not used directly but the compiler decides to create it (incorrectly).
Comment 3 Andrew Pinski 2021-11-12 22:03:58 UTC
Confirmed, though I don't see anywhere in the IR &MEM <char> [(void *)&name_and_type + -2B];
I do see:
type_9 = &MEM <char> [(void *)type_3 + -2B];

But type_3 cannot be &name_and_type there because of a check before hand.

This is definitely VRP/CCP related with maybe jump threading involved.
Comment 4 Sergei Trofimovich 2021-11-14 17:21:09 UTC
Created attachment 51792 [details]

Found similar bug in gdb/c++ at gdb/language.c. It might have slightly better loop structure.
As a downside it relies on very complex std::sort definition. Extracted example:

    #include <algorithm>
    enum language_t {
    extern const language_t languages_[nr_languages] /* = { auto, foo, unk } */;
    bool lc (language_t, language_t);
    const language_t* add_set_language_command ()
        language_t *ls = new language_t[nr_languages];

        // pull auto, unk in front. sort the rest
        language_t* ls_p = ls;
        *ls_p++ = language_auto;
        *ls_p++ = language_unknown;

        language_t* sort_begin = ls_p;
        for (const auto lang : languages_)
            // already present
            if (lang == language_auto || lang == language_unknown)

           *ls_p++ = lang;
        std::sort (sort_begin, ls_p, lc);
        return ls;

$ ./gcc12/bin/g++-12.0.0 -O2 gdb-bug.cc -Werror=array-bounds

In file included from /...-gcc-12.0.0/include/c++/12.0.0/algorithm:61,
                 from gdb-bug.cc:1:
In function 'void std::__final_insertion_sort(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = language_t*; _Compare = __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(language_t, language_t)>]',
    inlined from 'void std::__sort(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = language_t*; _Compare = __gnu_cxx::__ops::_Iter_comp_iter<bool (*)(language_t, language_t)>]' at /...-gcc-12.0.0/include/c++/12.0.0/bits/stl_algo.h:1940:31,
    inlined from 'void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = language_t*; _Compare = bool (*)(language_t, language_t)]' at /nix/store/kckkq6280kixj8wxg4d0ks9lck8nai73-gcc-12.0.0/include/c++/12.0.0/bits/stl_algo.h:4853:18,
    inlined from 'const language_t* add_set_language_command()' at gdb-bug.cc:28:19:
/...-gcc-12.0.0/include/c++/12.0.0/bits/stl_algo.h:1849:32: error: array subscript 18 is outside array bounds of 'language_t [3]' [-Werror=array-bounds]
 1849 |           std::__insertion_sort(__first, __first + int(_S_threshold), __comp);
      |           ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gdb-bug.cc: In function 'const language_t* add_set_language_command()':
gdb-bug.cc:12:53: note: at offset 72 into object of size 12 allocated by 'operator new []'
   12 |         language_t *ls = new language_t[nr_languages];
      |                                                     ^
cc1plus: some warnings being treated as errors
Comment 5 Andrew Pinski 2021-11-14 21:00:45 UTC
(In reply to Sergei Trofimovich from comment #4)
> Created attachment 51792 [details]
> gdb-bug.cc

That was filed as PR 102216.
Comment 6 Martin Sebor 2021-11-15 17:37:08 UTC
The warning follows the type_3 pointer to determine the object it points to.  That leads it to either name_and_type or name_and_type + OFFSET where OFFSET is in [1, 254].  Between those two, it conservatively picks the former because it has more space (this is done to avoid false positives for stores).  What it neglects to do is adjust the bounds of the offset to reflect that of the other.  So the code ends up determining that type_43 points to name_and_type with a zero offset when it should instead arrive at name_and_type with an offset in [0, 254].  (If the code also considered the ASSERT_EXPR conditions it should end up with an offset in [1, 254].)

  <bb 4> [local count: 1073741824]:
  extract (&name_and_type);
  type_7 = __builtin_strrchr (&name_and_type, 95);
  if (type_7 != 0B)
    goto <bb 5>; [70.00%]
    goto <bb 6>; [30.00%]

  <bb 5> [local count: 751619281]:
  type_9 = type_7 + 1;

  <bb 6> [local count: 1073741824]:
  # type_3 = PHI <type_9(5), &name_and_type(4)>
  _1 = __builtin_strcmp ("hot", type_3);
  if (_1 == 0)
    goto <bb 7>; [50.00%]
    goto <bb 3>; [50.00%]

  <bb 7> [local count: 536870913]:
  type_10 = type_3 + 18446744073709551615;
  if (&name_and_type == type_3)
    goto <bb 3>; [17.43%]
    goto <bb 8>; [82.57%]

  <bb 8> [local count: 443294313]:
  type_11 = &MEM <char> [(void *)type_3 + -2B];
  if (&name_and_type == type_10)
    goto <bb 3>; [17.43%]
    goto <bb 9>; [82.57%]
Comment 8 CVS Commits 2021-12-09 19:51:52 UTC
The master branch has been updated by Martin Sebor <msebor@gcc.gnu.org>:


commit r12-5871-g243a980437b5e7fca56587bf86667005bdf343a7
Author: Martin Sebor <msebor@redhat.com>
Date:   Thu Dec 9 12:49:28 2021 -0700

    Extend the offset and size of merged object references [PR103215].
    PR tree-optimization/103215 - bogus -Warray-bounds with two pointers with different offsets each
            PR tree-optimization/103215
            * pointer-query.cc (access_ref::merge_ref): Extend the offset and
            size of the merged object instead of using the larger.
            PR tree-optimization/103215
            * gcc.dg/Wstringop-overflow-58.c: Adjust and xfail expected warnings.
            * gcc.dg/Wstringop-overflow-59.c: Same.
            * gcc.dg/warn-strnlen-no-nul.c: Same.
            * gcc.dg/Warray-bounds-91.c: New test.
            * gcc.dg/Warray-bounds-92.c: New test.
            * gcc.dg/Wstringop-overflow-85.c: New test.
            * gcc.dg/Wstringop-overflow-87.c: New test.
Comment 9 Martin Sebor 2021-12-09 19:59:23 UTC
Fixed, but not without a price: some false negatives (pr103637).