[PATCH] c++: Further address_compare fixes [PR89074]

Jakub Jelinek jakub@redhat.com
Tue Jan 18 10:17:41 GMT 2022


On Thu, Jan 13, 2022 at 04:18:33PM -0500, Jason Merrill wrote:
> > Note, address_compare has some special cases, e.g. it assumes that
> > static vars are never adjacent to automatic vars, which is the case
> > for the usual layout where automatic vars are on the stack and after
> > .rodata/.data sections there is heap:
> >    /* Assume that automatic variables can't be adjacent to global
> >       variables.  */
> >    else if (is_global_var (base0) != is_global_var (base1))
> >      ;
> > Is it ok that during constant evaluation we don't treat those as undefined
> > behavior, or shall that be with !folding_initializer && too?
> 
> I guess that's undefined as well.

Ok, following patch seems to guard that too

> > Another special case is:
> >    if ((DECL_P (base0) && TREE_CODE (base1) == STRING_CST)
> >         || (TREE_CODE (base0) == STRING_CST && DECL_P (base1))
> >         || (TREE_CODE (base0) == STRING_CST
> >             && TREE_CODE (base1) == STRING_CST
> >             && ioff0 >= 0 && ioff1 >= 0
> >             && ioff0 < TREE_STRING_LENGTH (base0)
> >             && ioff1 < TREE_STRING_LENGTH (base1)
> >            /* This is a too conservative test that the STRING_CSTs
> >               will not end up being string-merged.  */
> >             && strncmp (TREE_STRING_POINTER (base0) + ioff0,
> >                         TREE_STRING_POINTER (base1) + ioff1,
> >                         MIN (TREE_STRING_LENGTH (base0) - ioff0,
> >                              TREE_STRING_LENGTH (base1) - ioff1)) != 0))
> >      ;
> >    else if (!DECL_P (base0) || !DECL_P (base1))
> >      return 2;
> > Here we similarly assume that vars aren't adjacent to string literals
> > or vice versa.  Do we need to stick !folding_initializer && to those
> > DECL_P vs. STRING_CST cases?
> 
> Seems so.

and this too

> > Though, because of the return 2; for
> > non-DECL_P that would mean rejecting comparisons like &var == &"foobar"[3]
> > etc. which ought to be fine, no?  So perhaps we need to watch for
> > decls. vs. STRING_CSTs like for DECLs whether the address is at the start
> > or at the end of the string literal or somewhere in between (at least
> > for folding_initializer)?
> 
> Agreed.

and this as well.
Furthermore I've fixed constexpr-compare2.C by assuming if
folding_initializer that addresses of non-aliased (visibly to the compiler)
FUNCTION_DECLs are different and that functions are non-zero sized for the
purpose of var vs. function comparisons.

> > And yet another chapter but probably unsolvable is comparison of
> > string literal addresses.  I think pedantically in C++
> > &"foo"[0] == &"foo"[0] is undefined behavior, different occurences of
> > the same string literals might still not be merged in some implementations.
> 
> I disagree; it's unspecified whether string literals are merged, but I think
> the comparison result is well specified depending on that implementation
> behavior.

Can you please comment on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86369#c1
then?

Anyway, the following has been successfully bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?

2022-01-18  Jakub Jelinek  <jakub@redhat.com>

	PR c++/89074
	PR c++/104033
	* fold-const.c (address_compare): Restrict the decl vs. STRING_CST
	or vice versa or STRING_CST vs. STRING_CST or
	is_global_var != is_global_var optimizations to !folding_initializer.
	Punt for FUNCTION_DECLs with non-zero offsets.  If folding_initializer,
	assume non-aliased functions have non-zero size and have different
	addresses.  For folding_initializer, punt on comparisons of start
	of some object and end of another one, regardless whether it is a decl
	or string literal.  Also punt for folding_initializer of
	STRING_CST vs. STRING_CST comparisons if the two literals could be
	overlapping.

	* g++.dg/cpp1y/constexpr-89074-3.C: New test.

--- gcc/fold-const.c.jj	2022-01-17 14:19:08.817376382 +0100
+++ gcc/fold-const.c	2022-01-17 15:50:16.687211071 +0100
@@ -16608,21 +16608,27 @@ address_compare (tree_code code, tree ty
   HOST_WIDE_INT ioff0 = -1, ioff1 = -1;
   off0.is_constant (&ioff0);
   off1.is_constant (&ioff1);
-  if ((DECL_P (base0) && TREE_CODE (base1) == STRING_CST)
-       || (TREE_CODE (base0) == STRING_CST && DECL_P (base1))
-       || (TREE_CODE (base0) == STRING_CST
-	   && TREE_CODE (base1) == STRING_CST
-	   && ioff0 >= 0 && ioff1 >= 0
-	   && ioff0 < TREE_STRING_LENGTH (base0)
-	   && ioff1 < TREE_STRING_LENGTH (base1)
-	  /* This is a too conservative test that the STRING_CSTs
-	     will not end up being string-merged.  */
-	   && strncmp (TREE_STRING_POINTER (base0) + ioff0,
-		       TREE_STRING_POINTER (base1) + ioff1,
-		       MIN (TREE_STRING_LENGTH (base0) - ioff0,
-			    TREE_STRING_LENGTH (base1) - ioff1)) != 0))
+  if (!folding_initializer
+      && ((DECL_P (base0) && TREE_CODE (base1) == STRING_CST)
+	  || (TREE_CODE (base0) == STRING_CST && DECL_P (base1))
+	  || (TREE_CODE (base0) == STRING_CST
+	      && TREE_CODE (base1) == STRING_CST
+	      && ioff0 >= 0 && ioff1 >= 0
+	      && ioff0 < TREE_STRING_LENGTH (base0)
+	      && ioff1 < TREE_STRING_LENGTH (base1)
+	      /* This is a too conservative test that the STRING_CSTs
+		 will not end up being string-merged.  */
+	      && strncmp (TREE_STRING_POINTER (base0) + ioff0,
+			  TREE_STRING_POINTER (base1) + ioff1,
+			  MIN (TREE_STRING_LENGTH (base0) - ioff0,
+			       TREE_STRING_LENGTH (base1) - ioff1)) != 0)))
     ;
-  else if (!DECL_P (base0) || !DECL_P (base1))
+  /* Punt on non-zero offsets from functions.  */
+  else if ((TREE_CODE (base0) == FUNCTION_DECL && ioff0)
+	   || (TREE_CODE (base1) == FUNCTION_DECL && ioff1))
+    return 2;
+  else if ((!DECL_P (base0) && TREE_CODE (base0) != STRING_CST)
+	   || (!DECL_P (base1) && TREE_CODE (base1) != STRING_CST))
     return 2;
   /* If this is a pointer comparison, ignore for now even
      valid equalities where one pointer is the offset zero
@@ -16631,18 +16637,62 @@ address_compare (tree_code code, tree ty
     ;
   /* Assume that automatic variables can't be adjacent to global
      variables.  */
-  else if (is_global_var (base0) != is_global_var (base1))
+  else if (!folding_initializer
+	   && is_global_var (base0) != is_global_var (base1))
+    ;
+  /* For initializers, assume addresses of different functions are
+     different.  */
+  else if (folding_initializer
+	   && TREE_CODE (base0) == FUNCTION_DECL
+	   && TREE_CODE (base1) == FUNCTION_DECL)
     ;
   else
     {
-      tree sz0 = DECL_SIZE_UNIT (base0);
-      tree sz1 = DECL_SIZE_UNIT (base1);
-      /* If sizes are unknown, e.g. VLA or not representable, punt.  */
-      if (!tree_fits_poly_int64_p (sz0) || !tree_fits_poly_int64_p (sz1))
-	return 2;
+      poly_int64 size0, size1;
+      if (TREE_CODE (base0) == STRING_CST)
+	{
+	  if (!folding_initializer
+	      || ioff0 < 0
+	      || ioff0 > TREE_STRING_LENGTH (base0))
+	    return 2;
+	  size0 = TREE_STRING_LENGTH (base0);
+	}
+      /* For initializers, assume function decls don't overlap and have
+	 non-empty size.  */
+      else if (folding_initializer && TREE_CODE (base0) == FUNCTION_DECL)
+	size0 = 1;
+      else
+	{
+	  tree sz0 = DECL_SIZE_UNIT (base0);
+	  /* If sizes are unknown, e.g. VLA or not representable, punt.  */
+	  if (!tree_fits_poly_int64_p (sz0))
+	    return 2;
+
+	  size0 = tree_to_poly_int64 (sz0);
+	}
+
+      if (TREE_CODE (base1) == STRING_CST)
+	{
+	  if (!folding_initializer
+	      || ioff1 < 0
+	      || ioff1 > TREE_STRING_LENGTH (base1))
+	    return 2;
+	  size1 = TREE_STRING_LENGTH (base1);
+	}
+      /* For initializers, assume function decls don't overlap and have
+	 non-empty size.  */
+      else if (folding_initializer && TREE_CODE (base1) == FUNCTION_DECL)
+	size1 = 1;
+      else
+	{
+	  tree sz1 = DECL_SIZE_UNIT (base1);
+	  /* If sizes are unknown, e.g. VLA or not representable, punt.  */
+	  if (!tree_fits_poly_int64_p (sz1))
+	    return 2;
+
+	  size1 = tree_to_poly_int64 (sz1);
+	}
 
-      poly_int64 size0 = tree_to_poly_int64 (sz0);
-      poly_int64 size1 = tree_to_poly_int64 (sz1);
       /* If one offset is pointing (or could be) to the beginning of one
 	 object and the other is pointing to one past the last byte of the
 	 other object, punt.  */
@@ -16658,6 +16708,27 @@ address_compare (tree_code code, tree ty
 	  && (known_ne (off0, 0)
 	      || (known_ne (size0, 0) && known_ne (size1, 0))))
 	equal = 0;
+      if (equal == 0
+	  && TREE_CODE (base0) == STRING_CST
+	  && TREE_CODE (base1) == STRING_CST)
+	{
+	  /* If the bytes in the string literals starting at the pointers
+	     differ, the pointers need to be different.  */
+	  if (memcmp (TREE_STRING_POINTER (base0) + ioff0,
+		      TREE_STRING_POINTER (base1) + ioff1,
+		      MIN (TREE_STRING_LENGTH (base0) - ioff0,
+			   TREE_STRING_LENGTH (base1) - ioff1)) == 0)
+	    {
+	      HOST_WIDE_INT ioffmin = MIN (ioff0, ioff1);
+	      if (memcmp (TREE_STRING_POINTER (base0) + ioff0 - ioffmin,
+			  TREE_STRING_POINTER (base1) + ioff1 - ioffmin,
+			  ioffmin) == 0)
+		/* If even the bytes in the string literal before the
+		   pointers are the same, the string literals could be
+		   tail merged.  */
+		return 2;
+	    }
+	}
      }
   return equal;
 }
--- gcc/testsuite/g++.dg/cpp1y/constexpr-89074-3.C.jj	2022-01-17 15:22:43.743566175 +0100
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-89074-3.C	2022-01-17 16:10:19.182230570 +0100
@@ -0,0 +1,132 @@
+// PR c++/89074
+// { dg-do compile { target c++14 } }
+
+int fn1 (void) { return 0; }
+int fn2 (void) { return 1; }
+
+constexpr bool
+f1 ()
+{
+  char a[] = { 1, 2, 3, 4 };
+
+  if (&a[1] == "foo")
+    return false;
+
+  if (&a[1] == &"foo"[4])
+    return false;
+
+  if (&"foo"[1] == &a[0])
+    return false;
+
+  if (&"foo"[3] == &a[4])
+    return false;
+
+  if (&a[0] == "foo")
+    return false;
+
+  // Pointer to start of one object (var) and end of another one (literal)
+  if (&a[0] == &"foo"[4])	// { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool
+f2 ()
+{
+  char a[] = { 1, 2, 3, 4 };
+
+  // Pointer to end of one object (var) and start of another one (literal)
+  if (&a[4] == "foo")		// { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+char v[] = { 1, 2, 3, 4 };
+
+constexpr bool
+f3 ()
+{
+  char a[] = { 1, 2, 3, 4 };
+
+  if (&a[1] == &v[1])
+    return false;
+
+  if (&a[0] == &v[3])
+    return false;
+
+  if (&a[2] == &v[4])
+    return false;
+
+  // Pointer to start of one object (automatic var) and end of another one (non-automagic var)
+  if (&a[0] == &v[4])		// { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool
+f4 ()
+{
+  char a[] = { 1, 2, 3, 4, 5 };
+
+  // Pointer to end of one object (automatic var) and start of another one (non-automagic var)
+  if (&a[5] == &v[0])		// { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool
+f5 ()
+{
+  if (fn1 != fn1)
+    return false;
+
+  if (fn1 == fn2)
+    return false;
+
+  if (&"abcde"[0] == &"edcba"[1])
+    return false;
+
+  if (&"abcde"[1] == &"edcba"[6])
+    return false;
+
+  // Pointer to start of one object (literal) and end of another one (literal)
+  if (&"abcde"[0] == &"edcba"[6])	// { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool
+f6 ()
+{
+  // Pointer to start of one object (literal) and end of another one (literal)
+  if (&"abcde"[6] == &"edcba"[0])	// { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool
+f7 ()
+{
+  if (&"abcde"[3] == &"fabcde"[3])
+    return false;
+
+  // These could be suffix merged, with &"abcde"[0] == &"fabcde"[1].
+  if (&"abcde"[3] == &"fabcde"[4])	// { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();
+constexpr bool g = f7 ();

	Jakub



More information about the Gcc-patches mailing list