Bug 81281 - [7 Regression] UBSAN: false positive, dropped promotion to long type.
Summary: [7 Regression] UBSAN: false positive, dropped promotion to long type.
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: sanitizer (show other bugs)
Version: 8.0
: P2 normal
Target Milestone: 8.0
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
Depends on:
Blocks: yarpgen
  Show dependency treegraph
 
Reported: 2017-07-02 20:36 UTC by Dmitry Babokin
Modified: 2021-11-01 23:07 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work: 8.0
Known to fail: 7.5.0
Last reconfirmed: 2017-07-03 00:00:00


Attachments
gcc8-pr81281-test.patch (522 bytes, patch)
2017-12-04 14:52 UTC, Jakub Jelinek
Details | Diff
gcc8-pr81281.patch (2.49 KB, patch)
2017-12-05 11:25 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Dmitry Babokin 2017-07-02 20:36:52 UTC
gcc x86_64, rev249877.

Looks like this one is deferent from other currently open ubsan bugs.

Promotion to long was dropped, so -2024172551 - (long)ci overflows. Also note, that removing const modifier from ci definition hides the problem.

> cat f_init.cpp
extern const int ci = 1716607962;
int i = -943738830;
long long ll = -43165919987465092LL;

> cat f.cpp
extern const int ci;
extern int i;
extern long long ll;

int foo() {
  int a = int(-2024172551 - i - (ci - ll)) -
        (int(-2024172551 - i - (ci - ll)) -
         int(-2024172551 - (long)ci));
  return a;
}

int main() {
  foo();
  return 0;
}

> g++ -fsanitize=undefined -O0 f.cpp f_init.cpp -o out
> ./out
f.cpp:6:7: runtime error: signed integer overflow: -2024172551 - 1716607962 cannot be represented in type 'int'
Comment 1 Martin Liška 2017-07-03 21:17:51 UTC
Confirmed.
Comment 2 Marek Polacek 2017-07-18 15:39:48 UTC
Started with r229167.  Would be nice to have one-file testcase but that doesn't seem to be that easy...
Comment 3 Marek Polacek 2017-07-18 15:52:47 UTC
Before that rev we had

int a = (int) (((((unsigned int) ll - (unsigned int) ci) - (unsigned int) i) + ((((unsigned int) ci - (unsigned int) ll) + (unsigned int) i) - (unsigned int) ci)) + 2270794745);

and now just

int a = -2024172551(OVF) - (int) ci;
Comment 4 Jakub Jelinek 2017-12-04 14:52:37 UTC
Created attachment 42785 [details]
gcc8-pr81281-test.patch

This was fixed by r251651 for -fsanitize=undefined.  Attaching testcase in patch form.  That said, without -fsanitize=unreachable the bug is still latent.
If we start just with:
  int a = (int) (-2024172551 - (long long)ci);
then we properly fold it into:
  int a = (int) (2270794745 - (unsigned int) ci);
Comment 5 rguenther@suse.de 2017-12-04 15:08:40 UTC
On Mon, 4 Dec 2017, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81281
> 
> --- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> Created attachment 42785 [details]
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42785&action=edit
> gcc8-pr81281-test.patch
> 
> This was fixed by r251651 for -fsanitize=undefined.  Attaching testcase in
> patch form.  That said, without -fsanitize=unreachable the bug is still latent.
> If we start just with:
>   int a = (int) (-2024172551 - (long long)ci);
> then we properly fold it into:
>   int a = (int) (2270794745 - (unsigned int) ci);

Yeah, those sanitize_flags_p (SANITIZE_SI_OVERFLOW) "fixes" are always
wrong...  The 2nd hunk looks ok though
Comment 6 Jakub Jelinek 2017-12-04 16:59:39 UTC
So it is indeed the
  /* (T)(P + A) - (T)(P + B) -> (T)A - (T)B */
  (for add (plus pointer_plus)
   (simplify
    (minus (convert (add @@0 @1))
     (convert (add @0 @2)))
    (if (element_precision (type) <= element_precision (TREE_TYPE (@1))
         /* For integer types, if A has a smaller type
            than T the result depends on the possible
            overflow in P + A.
            E.g. T=size_t, A=(unsigned)429497295, P>0.
            However, if an overflow in P + A would cause
            undefined behavior, we can assume that there
            is no overflow.  */
         || (INTEGRAL_TYPE_P (TREE_TYPE (@0))
             && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
         /* For pointer types, if the conversion of A to the
            final type requires a sign- or zero-extension,
            then we have to punt - it is not defined which
            one is correct.  */
         || (POINTER_TYPE_P (TREE_TYPE (@0))
             && TREE_CODE (@1) == INTEGER_CST
             && tree_int_cst_sign_bit (@1) == 0
             && TREE_CODE (@2) == INTEGER_CST
             && tree_int_cst_sign_bit (@2) == 0))
     (minus (convert @1) (convert @2)))))))

case, where we have:
op0 (int) ((((unsigned int) ll - (unsigned int) ci) - (unsigned int) i) + 2270794745)
op1 (int) ((((unsigned int) ll - (unsigned int) ci) - (unsigned int) i) + (unsigned int) ci);
type here is int, @0/@1/@2 all have unsigned type.

The (T)(P + A) - (T)(P + B) to (T)A - (T)B transformation is incorrect if TYPE_OVERFLOW_UNDEFINED (type) (or its element type)
and either !TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)) (or its element type), or TREE_TYPE (@1) has larger element precision.
Say if T is int, and TREE_TYPE (P) is unsigned, then e.g. random example of A = 0U+INT_MIN, B = 1U, P = -1U overflows only in the (int) INT_MIN - (int) 1 case, but doesn't overflow in the (int)(-1U+INT_MIN) - (int)(1U + -1U) case (many other examples).

We need to do the subtraction in an unsigned_type_for (type) instead and only cast to type at the end.

For the POINTER_PLUS_EXPR case, the explicitly enabled case is only if both @1 and @2 are INTEGER_CSTs with MSB clear which is fine for the widening conversion to type, but for equal or narrowing one not really sure.
Comment 7 Jakub Jelinek 2017-12-04 17:19:00 UTC
Even the
  (T)P - (T)(P + A) -> -(T) A
transformation looks wrong, consider A being 0U+INT_MIN, and P -1U.
(int)-1U - (int)(-1U+INT_MIN) is INT_MIN without overflow, while -(int)INT_MIN
overflows.  Note it doesn't look like fold-const.c, at least the removed spot from it, was doing what the match.pd is doing at least for non-pointers.

Note r251651 was actually the right fix, just not for this issue, but for the general issue that through performing say (int) (long_long_a - long_long_b)
to (int) ((unsigned) long_long_a - (unsigned) long_long_b)) optimization we won't be able to detect UB user code had, but the optimized form doesn't.
That just means that the sanitizer is less useful in finding bugs in the compiler transformations.  Perhaps we could have a debug counter or param or something similar where we'd accept detecting fewer UBs on user code but gain the possibility to detect more invalid transformations like this one.
Comment 8 Jakub Jelinek 2017-12-04 18:11:50 UTC
void
foo (unsigned p, unsigned a, unsigned b)
{
  unsigned q = p + 7;
  if (a - (1U + __INT_MAX__) >= 2)
    __builtin_unreachable ();
  int d = p + b;
  int c = p + a;
  if (c - d != __INT_MAX__)
    __builtin_abort ();
}

void
bar (unsigned p, unsigned a)
{
  unsigned q = p + 7;
  if (a - (1U + __INT_MAX__) >= 2)
    __builtin_unreachable ();
  int c = p;
  int d = p + a;
  if (c - d != -__INT_MAX__ - 1)
    __builtin_abort ();
}

int
main ()
{
  foo (-1U, 1U + __INT_MAX__, 1U);
  bar (-1U, 1U + __INT_MAX__);
  return 0;
}

is a testcase that fails on the trunk with -O2 because of this (without any sanitization).

On the other side, when add is a plus and not pointer_plus, the pattern doesn't contain :c as I found when writing the testcase, so it matches only if the SSA_NAMEs are in the right order.
Comment 9 Jakub Jelinek 2017-12-04 18:17:19 UTC
Note the #c8 testcase started failing with r247578 - before that we weren't doing that good job in evrp to optimize it.
Comment 10 Jakub Jelinek 2017-12-05 11:25:02 UTC
Created attachment 42794 [details]
gcc8-pr81281.patch

Untested fix.
Comment 11 Jakub Jelinek 2017-12-05 11:26:47 UTC
Corresponding diff -upbd for better readability:
--- gcc/match.pd.jj	2017-11-28 09:40:08.000000000 +0100
+++ gcc/match.pd	2017-12-05 11:36:58.855074420 +0100
@@ -1783,9 +1783,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (bit_not @0))
 
   /* (T)(P + A) - (T)P -> (T) A */
-  (for add (plus pointer_plus)
    (simplify
-    (minus (convert (add @@0 @1))
+   (minus (convert (plus:c @@0 @1))
      (convert @0))
     (if (element_precision (type) <= element_precision (TREE_TYPE (@1))
 	 /* For integer types, if A has a smaller type
@@ -1796,7 +1795,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	    undefined behavior, we can assume that there
 	    is no overflow.  */
 	 || (INTEGRAL_TYPE_P (TREE_TYPE (@0))
-	     && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
+	    && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))))
+    (convert @1)))
+  (simplify
+   (minus (convert (pointer_plus @@0 @1))
+    (convert @0))
+   (if (element_precision (type) <= element_precision (TREE_TYPE (@1))
 	 /* For pointer types, if the conversion of A to the
 	    final type requires a sign- or zero-extension,
 	    then we have to punt - it is not defined which
@@ -1804,7 +1808,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	 || (POINTER_TYPE_P (TREE_TYPE (@0))
 	     && TREE_CODE (@1) == INTEGER_CST
 	     && tree_int_cst_sign_bit (@1) == 0))
-     (convert @1))))
+    (convert @1)))
    (simplify
     (pointer_diff (pointer_plus @@0 @1) @0)
     /* The second argument of pointer_plus must be interpreted as signed, and
@@ -1813,10 +1817,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
      (convert (convert:stype @1))))
 
   /* (T)P - (T)(P + A) -> -(T) A */
-  (for add (plus pointer_plus)
    (simplify
     (minus (convert @0)
-     (convert (add @@0 @1)))
+    (convert (plus:c @@0 @1)))
+   (if (INTEGRAL_TYPE_P (type)
+	&& TYPE_OVERFLOW_UNDEFINED (type)
+        && element_precision (type) <= element_precision (TREE_TYPE (@1)))
+    (with { tree utype = unsigned_type_for (type); }
+     (convert (negate (convert:utype @1))))
     (if (element_precision (type) <= element_precision (TREE_TYPE (@1))
 	 /* For integer types, if A has a smaller type
 	    than T the result depends on the possible
@@ -1826,7 +1834,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	    undefined behavior, we can assume that there
 	    is no overflow.  */
 	 || (INTEGRAL_TYPE_P (TREE_TYPE (@0))
-	     && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
+	     && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))))
+     (negate (convert @1)))))
+  (simplify
+   (minus (convert @0)
+    (convert (pointer_plus @@0 @1)))
+   (if (INTEGRAL_TYPE_P (type)
+	&& TYPE_OVERFLOW_UNDEFINED (type)
+        && element_precision (type) <= element_precision (TREE_TYPE (@1)))
+    (with { tree utype = unsigned_type_for (type); }
+     (convert (negate (convert:utype @1))))
+    (if (element_precision (type) <= element_precision (TREE_TYPE (@1))
 	 /* For pointer types, if the conversion of A to the
 	    final type requires a sign- or zero-extension,
 	    then we have to punt - it is not defined which
@@ -1843,10 +1861,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
      (negate (convert (convert:stype @1)))))
 
   /* (T)(P + A) - (T)(P + B) -> (T)A - (T)B */
-  (for add (plus pointer_plus)
    (simplify
-    (minus (convert (add @@0 @1))
-     (convert (add @0 @2)))
+   (minus (convert (plus:c @@0 @1))
+    (convert (plus:c @0 @2)))
+   (if (INTEGRAL_TYPE_P (type)
+	&& TYPE_OVERFLOW_UNDEFINED (type)
+        && element_precision (type) <= element_precision (TREE_TYPE (@1)))
+    (with { tree utype = unsigned_type_for (type); }
+     (convert (minus (convert:utype @1) (convert:utype @2))))
     (if (element_precision (type) <= element_precision (TREE_TYPE (@1))
 	 /* For integer types, if A has a smaller type
 	    than T the result depends on the possible
@@ -1856,7 +1878,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	    undefined behavior, we can assume that there
 	    is no overflow.  */
 	 || (INTEGRAL_TYPE_P (TREE_TYPE (@0))
-	     && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
+	     && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0))))
+     (minus (convert @1) (convert @2)))))
+  (simplify
+   (minus (convert (pointer_plus @@0 @1))
+    (convert (pointer_plus @0 @2)))
+   (if (INTEGRAL_TYPE_P (type)
+	&& TYPE_OVERFLOW_UNDEFINED (type)
+        && element_precision (type) <= element_precision (TREE_TYPE (@1)))
+    (with { tree utype = unsigned_type_for (type); }
+     (convert (minus (convert:utype @1) (convert:utype @2))))
+    (if (element_precision (type) <= element_precision (TREE_TYPE (@1))
 	 /* For pointer types, if the conversion of A to the
 	    final type requires a sign- or zero-extension,
 	    then we have to punt - it is not defined which
@@ -1866,13 +1898,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	     && tree_int_cst_sign_bit (@1) == 0
 	     && TREE_CODE (@2) == INTEGER_CST
 	     && tree_int_cst_sign_bit (@2) == 0))
-     (minus (convert @1) (convert @2)))))))
+     (minus (convert @1) (convert @2)))))
    (simplify
     (pointer_diff (pointer_plus @@0 @1) (pointer_plus @0 @2))
     /* The second argument of pointer_plus must be interpreted as signed, and
        thus sign-extended if necessary.  */
     (with { tree stype = signed_type_for (TREE_TYPE (@1)); }
-     (minus (convert (convert:stype @1)) (convert (convert:stype @2)))))
+     (minus (convert (convert:stype @1)) (convert (convert:stype @2)))))))
 
 
 /* Simplifications of MIN_EXPR, MAX_EXPR, fmin() and fmax().  */
Comment 12 Jakub Jelinek 2017-12-06 19:22:38 UTC
Author: jakub
Date: Wed Dec  6 19:22:06 2017
New Revision: 255449

URL: https://gcc.gnu.org/viewcvs?rev=255449&root=gcc&view=rev
Log:
	PR sanitizer/81281
	* match.pd ((T)(P + A) - (T)P -> (T) A): Split into separate
	simplify for plus with :c added, and pointer_plus without that.
	((T)P - (T)(P + A) -> -(T) A): Likewise.  If type is integral
	with undefined overflow and the conversion is not widening,
	perform negation in utype and only convert to type afterwards.
	((T)(P + A) - (T)(P + B) -> (T)A - (T)B): Split into separate
	simplify for plus with :c added, and pointer_plus without that.
	If type is integral with undefined overflow and the conversion is
	not widening, perform minus in utype and only convert to type
	afterwards.  Move the last pointer_diff_expr simplify into the
	two outermost ifs.

	* gcc.c-torture/execute/pr81281.c: New test.
	* gcc.dg/pr81281-1.c: New test.
	* gcc.dg/pr81281-2.c: New test.
	* g++.dg/ubsan/pr81281.C: New test.
	* g++.dg/ubsan/pr81281-aux.cc: New test.

Added:
    trunk/gcc/testsuite/g++.dg/ubsan/pr81281-aux.cc
    trunk/gcc/testsuite/g++.dg/ubsan/pr81281.C
    trunk/gcc/testsuite/gcc.c-torture/execute/pr81281.c
    trunk/gcc/testsuite/gcc.dg/pr81281-1.c
    trunk/gcc/testsuite/gcc.dg/pr81281-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/match.pd
    trunk/gcc/testsuite/ChangeLog
Comment 13 Jakub Jelinek 2017-12-06 22:50:23 UTC
Fixed on the trunk so far.
Comment 14 Jakub Jelinek 2017-12-15 14:36:57 UTC
Author: jakub
Date: Fri Dec 15 14:36:26 2017
New Revision: 255696

URL: https://gcc.gnu.org/viewcvs?rev=255696&root=gcc&view=rev
Log:
	PR sanitizer/81281
	* match.pd ((T)(P + A) - (T)P -> (T) A): Use @@0 instead of @0 and
	convert? on @0 instead of convert.  Check type of @1, not @0.
	((T)P - (T)(P + A) -> -(T) A): Use @@0 instead of @0 and
	convert? on @0 instead of convert.  Check type of @1, not @0.
	((T)(P + A) - (T)(P + B) -> (T)A - (T)B): Use @@0 instead of @0,
	only optimize if either both @1 and @2 types are narrower
	precision, or both are wider or equal precision, and in the former
	case only if both have undefined overflow.

	* gcc.dg/pr81281-3.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr81281-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/match.pd
    trunk/gcc/testsuite/ChangeLog
Comment 15 Jakub Jelinek 2018-10-26 10:11:44 UTC
GCC 6 branch is being closed
Comment 16 Richard Biener 2019-11-14 10:00:00 UTC
Fixed in GCC8.