Bug 98069 - [10 Regression] Miscompilation with -O3 since r8-2380-g2d7744d4ef93bfff
Summary: [10 Regression] Miscompilation with -O3 since r8-2380-g2d7744d4ef93bfff
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 11.0
: P2 normal
Target Milestone: 11.0
Assignee: Richard Sandiford
URL:
Keywords: wrong-code
: 95396 97960 (view as bug list)
Depends on:
Blocks: yarpgen
  Show dependency treegraph
 
Reported: 2020-11-30 15:04 UTC by Vsevolod Livinskii
Modified: 2023-07-07 09:15 UTC (History)
8 users (show)

See Also:
Host:
Target:
Build:
Known to work: 11.0, 7.4.0
Known to fail: 10.2.0, 10.5.0, 11.0, 8.4.0, 9.3.0
Last reconfirmed: 2020-11-30 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Vsevolod Livinskii 2020-11-30 15:04:53 UTC
Reproducer:
//func.cpp
extern long long var_3;
extern short var_8;
extern int var_17;
extern short arr_165[];
long c(long e, long f) { return f ? e : f; }
void test() {
    for (int b = 0; b < 19; b = var_17)
        for (int d = int(~c(-2147483647 - 1, var_3)) - 2147483647; d < 22; d++)
            arr_165[d] = var_8;
}

//driver.cpp 
#include <stdio.h>

long long int var_3 = -166416893043554447LL;
short var_8 = (short)27092;
unsigned int var_17 = 75036300U;
short arr_165[23];

void test();

int main() {
    for (size_t i_3 = 0; i_3 < 23; ++i_3)
        arr_165[i_3] = (short)-8885;
    test();
    printf("%d\n", arr_165[0]);
}

>$ g++ -O3 func.cpp driver.cpp && ./a.out
Segmentation fault (core dumped)
>$ g++ -O2 func.cpp driver.cpp && ./a.out
27092

gcc version 11.0.0 20201126 (beb9afcaf1466996a301c778596c5df209e7913c)
Comment 1 Martin Liška 2020-11-30 16:29:35 UTC
Started with r8-2380-g2d7744d4ef93bfff.
Comment 2 Richard Biener 2020-12-01 08:33:00 UTC
The following fails with -O2 -ftree-vectorize


long long int var_3 = -166416893043554447LL;
short var_8 = (short)27092;
unsigned int var_17 = 75036300U;
short arr_165[23];

static long c(long e, long f) { return f ? e : f; }
void __attribute((noipa)) test()
{
  for (int b = 0; b < 19; b = var_17)
    for (int d = (int)(~c(-2147483647 - 1, var_3)) - 2147483647; d < 22; d++)
      arr_165[d] = var_8;
}

int main()
{
  for (unsigned i_3 = 0; i_3 < 23; ++i_3)
    arr_165[i_3] = (short)-8885;
  test();
  if (arr_165[0] != 27092)
    __builtin_abort ();
  return 0;
}
Comment 3 Richard Biener 2020-12-01 09:41:23 UTC
So before vectorization we have

  var_3.0_1 = var_3;
  iftmp.3_10 = var_3.0_1 != 0 ? -2147483648 : 0;
  _2 = (int) iftmp.3_10;
  d_11 = -2147483648 - _2;

which should compute to 0.  Now somewhere (split_constant_offset I'd bet)
things go wrong and we produce in dataref analysis

        base_address: &arr_165
        offset from base address: (ssizetype) ((sizetype) -_2 * 2)
        constant offset from base address: -4294967296
        step: 2
        base alignment: 32
        base misalignment: 0
        offset alignment: 2
        step alignment: 2
        base_object: arr_165
        Access function 0: {d_11, +, 1}_2

from { ((sizetype) d_21) * 2, +, 2 }_2 as offset IV which split_constant_offset
makes into { (sizetype) -_2 * 2, +, 2 }_2 + -4294967296.

The logic is that we can dive into the (sizetype) extension from int since
the operation on int cannot overflow (undefined behavior).  We then
strip the -2147483648 constant, thinking we can associate this as
(sizetype)(0 - _2) + -2147483648 but of course 0 - _2 now overflows
and we happily fold it to -_2.  So here:

static bool
split_constant_offset_1 (tree type, tree op0, enum tree_code code, tree op1,
                         tree *var, tree *off,
                         hash_map<tree, std::pair<tree, tree> > &cache,
                         unsigned *limit)
{
...
    case PLUS_EXPR:
    case MINUS_EXPR:
      if (TREE_CODE (op1) == INTEGER_CST)
        {
          split_constant_offset (op0, &var0, &off0, cache, limit);
          *var = var0;
          *off = size_binop (ocode, off0, fold_convert (ssizetype, op1));
          return true;
        }
      split_constant_offset (op0, &var0, &off0, cache, limit);
      split_constant_offset (op1, &var1, &off1, cache, limit);
      *var = fold_build2 (code, type, var0, var1);

we have to make sure that the expression we fold cannot overflow or
use unsigned arithmetic.  In fact, using unsigned arithmetic looks like
the only reasonable thing since splitting out any constant is not
trivially valid.  We're doing

 (var0 + off0) +- (var1 + off1) -> (var0 +- var1) + (off0 + off1)

where we know the left side doesn't overflow.

This makes associating across the multiplication invalid where we do

 ((sizetype) (A +- B)) * 2 -> ((sizetype) A) * 2 +- B * 2

and thus from 0 * 2 we go to -2147483648 * 2 + -4294967296.  Using
unsigned arithmetic to make the inner op no longer undefined doesn't
help here.

So it looks like we cannot use TYPE_OVERFLOW_UNDEFINED to validate
looking across widening conversions.  Which points to the same area
as PR97960.
Comment 4 Richard Biener 2020-12-02 08:38:48 UTC
The following helps:

diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index 3bf460cccfd..a7606b3db99 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -754,9 +759,10 @@ split_constant_offset_1 (tree type, tree op0, enum tree_code code, tree op1,
            && TYPE_PRECISION (type) >= TYPE_PRECISION (itype)
            && (POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type)))
          {
-           if (INTEGRAL_TYPE_P (itype) && TYPE_OVERFLOW_WRAPS (itype)
+           if (INTEGRAL_TYPE_P (itype)
                && (TYPE_PRECISION (type) > TYPE_PRECISION (itype)
-                   || TYPE_UNSIGNED (itype) != TYPE_UNSIGNED (type)))
+                   || (TYPE_UNSIGNED (itype) != TYPE_UNSIGNED (type)
+                       && TYPE_OVERFLOW_WRAPS (itype))))
              {
                /* Split the unconverted operand and try to prove that
                   wrapping isn't a problem.  */

but I fear this will have quite some negative impact on dependence analysis.

I'm still not sure I have the issue nailed down properly :/
Comment 5 Richard Biener 2020-12-04 13:23:49 UTC
The proposed fix for PR98137 might also expose more cases like this (which maybe is a good thing for coverage).
Comment 6 Richard Biener 2021-01-08 14:39:31 UTC
This is now also fixed, since g:4cf70c20cb10acd6fb10166 presumably.  Assigning to Richard in case he's planning backports.
Comment 7 Richard Sandiford 2021-04-23 09:20:49 UTC
As discussed on irc, the fix was quite invasive, so it seems a bit
dangerous to backport further than GCC 10.  Will backport to GCC 10 in
the GCC 11.2 timeframe, once we've had more chance to see if there's
any fallout.
Comment 8 Richard Sandiford 2021-04-23 09:21:21 UTC
*** Bug 97960 has been marked as a duplicate of this bug. ***
Comment 9 Richard Sandiford 2021-04-23 09:21:56 UTC
*** Bug 95396 has been marked as a duplicate of this bug. ***
Comment 10 Jakub Jelinek 2022-06-28 10:42:46 UTC
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
Comment 11 Richard Biener 2023-07-07 09:15:22 UTC
Fixed in GCC 11.