Bug 112859

Summary: [12/13/14/15 Regression] wrong code at -O3 on x86_64-linux-gnu since r12-2097-g9f34b780b0461e
Product: gcc
Reporter: Zhendong Su <zhendong.su>
Component: tree-optimization
Assignee: Richard Biener <rguenth>
Status: ASSIGNED
Severity: normal
CC: sjames
Priority: P2
Keywords: wrong-code
Version: 14.0
Target Milestone: 12.5
See Also: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112281
          https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115347
Host:
Target: x86_64-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2023-12-05 00:00:00

Description Zhendong Su 2023-12-05 08:49:25 UTC
This appears to be a regression from 11.*, and affects 12.* and later.

Compiler Explorer: https://godbolt.org/z/aa8vrex9c

[555] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/home/suz/suz-local/software/local/gcc-trunk/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap --enable-checking=yes --prefix=/local/suz-local/software/local/gcc-trunk --enable-sanitizers --enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20231205 (experimental) (GCC) 
[556] % 
[556] % gcctk -O2 small.c; ./a.out
[557] % 
[557] % gcctk -O3 small.c
[558] % ./a.out
Aborted
[559] % 
[559] % cat small.c
struct a {
  char b;
  int c;
} f, *i = &f;
static struct a e[4];
int *d, **g = &d;
static int h, j;
int main() {
  for (; h < 1; h++) {
    struct a k = {1, 1};
    for (j = 0; j < 2; j++) {
      *i = e[h];
      e[h] = k;
    }
    *g = 0;
  }
  if (f.c != 1)
    __builtin_abort();
  return 0;
}
Comment 1 Richard Biener 2023-12-05 13:53:57 UTC
t.c:9:12: optimized: Loop nest 1 distributed: split to 2 loops and 0 library calls.

-fno-tree-loop-distribution fixes this

GCC 11 distributes the inner loop, but only splits out the assignment to 'j'
there, while with 12+ we distribute away the store to e[h].c.  This looks
similar to the other "aggregate copy" case where we have no evolution in
the inner loop.

The critical dependence is

(Data Dep: 
#(Data Ref: 
#  bb: 5 
#  stmt: *i.1_1 = e[h.7_24];
#  ref: e[h.7_24];
#  base_object: e;
#  Access function 0: {h.7_20, +, 1}_1
#)
#(Data Ref: 
#  bb: 5 
#  stmt: e[h.7_24].c = 1;
#  ref: e[h.7_24].c;
#  base_object: e;
#  Access function 0: 32
#  Access function 1: {h.7_20, +, 1}_1
#)
  access_fn_A: {h.7_20, +, 1}_1
  access_fn_B: {h.7_20, +, 1}_1

 (subscript 
  iterations_that_access_an_element_twice_in_A: [0]
  last_conflict: scev_not_known
  iterations_that_access_an_element_twice_in_B: [0]
  last_conflict: scev_not_known
  (Subscript distance: 0 ))
  loop nest: (1 2 )
  distance_vector: 0 0 
  direction_vector:     =    =
)

which then runs into

              /* If the overlap is exact preserve stmt order.  */
              else if (lambda_vector_zerop (DDR_DIST_VECT (ddr, 0),
                                            DDR_NB_LOOPS (ddr)))
                ;

so maybe that special-casing is indeed incorrect, and the special-casing
I added

                  /* When then dependence distance of the innermost common
                     loop of the DRs is zero we have a conflict.  */
                  auto l1 = gimple_bb (DR_STMT (dr1))->loop_father;
                  auto l2 = gimple_bb (DR_STMT (dr2))->loop_father;
                  int idx = index_in_loop_nest (find_common_loop (l1, l2)->num,
                                                DDR_LOOP_NEST (ddr));
                  if (DDR_DIST_VECT (ddr, 0)[idx] == 0)
                    this_dir = 2;

should instead be generalized to handle the situation where two refs
conflict in a loop in which the refs do not evolve.

Thanks for these testcases btw.
Comment 2 Zhendong Su 2023-12-05 19:14:39 UTC
> Thanks for these testcases btw.

Happy to be of help, Richard. 

And thanks to you folks for the incredible work and dedication!
Comment 3 Sam James 2023-12-07 00:45:36 UTC
bisected to r12-2097-g9f34b780b0461e
Comment 4 Richard Biener 2024-06-20 09:14:31 UTC
GCC 12.4 is being released, retargeting bugs to GCC 12.5.