Bug 86889 - s390x gcc build fails when configured with --disable-checking
Summary: s390x gcc build fails when configured with --disable-checking
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 9.0
: P3 normal
Target Milestone: ---
Assignee: Richard Biener
URL:
Keywords: diagnostic, missed-optimization
Depends on:
Blocks: Warray-bounds
  Show dependency treegraph
 
Reported: 2018-08-08 13:43 UTC by Ilya Leoshkevich
Modified: 2020-06-10 17:45 UTC (History)
3 users (show)

See Also:
Host: s390x-linux-gnu
Target: s390x-linux-gnu
Build: s390x-linux-gnu
Known to work:
Known to fail: 10.1.0, 11.0, 9.2.0
Last reconfirmed: 2020-06-10 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ilya Leoshkevich 2018-08-08 13:43:01 UTC
Seen on master (18d371d3):

build$ ../configure --disable-checking
build$ make -j$(getconf _NPROCESSORS_ONLN)

../../gcc/bitmap.c: In function ‘unsigned int bitmap_last_set_bit(const_bitmap)’:
../../gcc/bitmap.c:841:26: error: array subscript -1 is below array bounds of ‘const BITMAP_WORD [2]’ {aka ‘const long unsigned int [2]’} [-Werror=array-bounds]
       word = elt->bits[ix];

The code in question is:

 839   for (ix = BITMAP_ELEMENT_WORDS - 1; ix >= 0; ix--)
 840     {
 841       word = elt->bits[ix];
 842       if (word)
 843         goto found_bit;
 844     }

BITMAP_ELEMENT_WORDS on s390x is 2.
I narrowed this down to cunrolli pass, which unrolls this loop 3 times instead of 2, so ix=[1, 0, -1] instead of just [1, 0].
And indeed, building this individual file with -fdisable-tree-cunrolli helps.
Comment 1 Martin Sebor 2018-08-09 02:27:29 UTC
I can't confirm it yet.  Could you attach a test case or a preprocessing translation unit?
Comment 2 Ilya Leoshkevich 2018-08-09 09:02:15 UTC
C-Reduced version:

struct a {
  long b[128 / (8 * 8)];
};
int c;
void d() {
  a *e;
  long f;
  int g = 1;
  for (; g >= 0; g--) {
    f = e->b[g];
    if (f)
      goto h;
  }
  __builtin_unreachable();
h:
  c = g;
}

Repro:

$ PATH=build/prev-gcc:$PATH xg++ -c -O2 -Warray-bounds bitmap.E.c
bitmap.E.c: In function ‘void d()’:
bitmap.E.c:10:15: warning: array subscript -1 is below array bounds of ‘long int [2]’ [-Warray-bounds]
     f = e->b[g];
         ~~~~~~^
Comment 3 Martin Sebor 2018-08-09 21:59:06 UTC
Thank you.  With the small test case I can confirm the warning even on x86_64-linux.  I've reduced it a tiny bit more.  Here's the smaller test case and the VRP output:

$ cat f.c && gcc -S -O2 -Wall -fdump-tree-vrp=/dev/stdout f.c
struct a { long b[2]; };

int d (struct a *p)
{
  int g = 1;

  for (; g >= 0; g--)
    if (p->b[g])
      return g;

  __builtin_unreachable ();
}

;; Function d (d, funcdef_no=0, decl_uid=1907, cgraph_uid=1, symbol_order=0)

;; 2 loops found
;;
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2 3 4 5
;; 2 succs { 5 3 }
;; 3 succs { 5 4 }
;; 4 succs { 5 }
;; 5 succs { 1 }

SSA replacement table
N_i -> { O_1 ... O_j } means that N_i replaces O_1, ..., O_j

p_9 -> { p_4(D) }
Incremental SSA update started at block: 2
Number of blocks in CFG: 6
Number of blocks to update: 3 ( 50%)



Value ranges after VRP:

_1: VARYING
p_4(D): VARYING
g_5: [0, 1]
_8: VARYING
p_9: ~[0B, 0B]  EQUIVALENCES: { p_4(D) } (1 elements)
_11: VARYING


f.c: In function ‘d’:
f.c:8:13: warning: array subscript -1 is below array bounds of ‘long int[2]’ [-Warray-bounds]
8 |     if (p->b[g])
  |         ~~~~^~~
d (struct a * p)
{
  int g;
  long int _1;
  long int _8;
  long int _11;

  <bb 2> [local count: 357878152]:
  _8 = p_4(D)->b[1];
  if (_8 != 0)
    goto <bb 5>; [33.33%]
  else
    goto <bb 3>; [66.67%]

  <bb 3> [local count: 238597364]:
  _11 = p_4(D)->b[0];
  if (_11 != 0)
    goto <bb 5>; [33.33%]
  else
    goto <bb 4>; [66.67%]

  <bb 4> [local count: 159072864]:
  _1 = p_4(D)->b[-1];

  <bb 5> [local count: 357878150]:
  # g_5 = PHI <-1(4), 1(2), 0(3)>
  return g_5;

}



;; Function d (d, funcdef_no=0, decl_uid=1907, cgraph_uid=1, symbol_order=0)

;; 2 loops found
;;
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2 3 4
;; 2 succs { 4 3 }
;; 3 succs { 4 }
;; 4 succs { 1 }

SSA replacement table
N_i -> { O_1 ... O_j } means that N_i replaces O_1, ..., O_j

p_2 -> { p_4(D) }
Incremental SSA update started at block: 2
Number of blocks in CFG: 5
Number of blocks to update: 2 ( 40%)



Value ranges after VRP:

_1: [0, +INF]
p_2: ~[0B, 0B]  EQUIVALENCES: { p_4(D) } (1 elements)
p_4(D): VARYING
g_5: [0, 1]
_8: VARYING
_9: [0, 1]
_11: VARYING
_12: [-1, 0]


d (struct a * p)
{
  int g;
  _Bool _1;
  long int _8;
  int _9;
  long int _11;
  int _12;

  <bb 2> [local count: 357878152]:
  _8 = p_4(D)->b[1];
  if (_8 != 0)
    goto <bb 4>; [33.33%]
  else
    goto <bb 3>; [66.67%]

  <bb 3> [local count: 238597364]:
  _11 = p_4(D)->b[0];
  _1 = _11 == 0;
  _9 = (int) _1;
  _12 = -_9;

  <bb 4> [local count: 357878150]:
  # g_5 = PHI <_12(3), 1(2)>
  return g_5;

}
Comment 4 Ilya Leoshkevich 2018-08-13 10:16:32 UTC
Bisect points to:

commit d19574debc50ff915c5d39c22a51cb0bc750e75d (HEAD, refs/bisect/bad)
Author: rguenth <rguenth@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Fri May 18 11:54:37 2018 +0000

    2018-05-18  Richard Biener  <rguenther@suse.de>

            * gimple-ssa-evrp.c (class evrp_folder): Add simplify_stmt_using_ranges
            method.
            (evrp_dom_walker::before_dom_children): Call it.

            * gcc.dg/tree-ssa/pr21559.c: Adjust.
            * gcc.dg/tree-ssa/pr45397.c: Likewise.
            * gcc.dg/tree-ssa/pr61839_1.c: Likewise.
            * gcc.dg/tree-ssa/pr61839_2.c: Likewise.
            * gcc.dg/tree-ssa/pr61839_4.c: Likewise.
            * gcc.dg/tree-ssa/vrp17.c: Likewise.
            * gcc.dg/tree-ssa/vrp18.c: Likewise.
            * gcc.dg/tree-ssa/vrp23.c: Likewise.
            * gcc.dg/tree-ssa/vrp24.c: Likewise.
            * gcc.dg/tree-ssa/vrp58.c: Likewise.
            * gcc.dg/vrp-min-max-1.c: Likewise.
            * gcc.dg/vrp-min-max-3.c: New testcase.


    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@260357 138bc75d-0d04-0410-961f-82ee72b054a4
Comment 5 Richard Biener 2018-08-17 10:54:33 UTC
So the issue is that the __builtin_unreachable () elides the g >= 0 test,
effectively making the loop endless.  If you disable EVRP and CUNROLLI then VRP1 will do this transform.

<bb 6> :
# g_2 = PHI <1(2), g_7(5)>
if (g_2 != -1)
  goto <bb 3>; [INV]
else
  goto <bb 7>; [INV]

It goes like EVRP computing the range of g_2 to [0, 1] (correctly, based on
the __builtin_unreachable) and then CFG cleanup simplifying the conditional
to false, removing the __builtin_unreachable () call.

cunrolli then computes the loop to run twice and inserts that stray [-1] array
reference, exposing it's old latent issue of sometimes emitting stray
iterations as the reporter already analyzed (it can at most handle the _last_
iteration when marking blocks as never executed, but in cases where the IV
increment is at unfortunate places it has to look at the previous peeled
copy as well).

Maybe I can get to improve this.  Don't hold your breath.

I think we have dups of this (but maybe closed because they no longer warn).
Comment 6 Martin Sebor 2020-06-10 17:45:01 UTC
Reconfirming with GCC 10 and 11.  The output for the test case in comment #2 is:

pr86889.c: In function ‘d’:
pr86889.c:8:13: warning: array subscript -1 is below array bounds of ‘long int[2]’ [-Warray-bounds]
    8 |     if (p->b[g])
      |         ~~~~^~~
pr86889.c:1:17: note: while referencing ‘b’
    1 | struct a { long b[2]; };
      |                 ^