[PATCH] PR3609

Jakub Jelinek jakub@redhat.com
Fri Aug 10 09:55:00 GMT 2001


On Fri, Aug 10, 2001 at 08:34:54AM -0700, Mark Mitchell wrote:
> 
> > But c_strlen is called here as a subroutine of the builtin_strcpy
> > optimization (and this one wasn't there in 2.95).
> 
> I don't see how that's relevant.  I think it would be even weirder
> for the compiler to treat those expressions as valid for strcpy,
> but not for strlen.
> 
> But, in case it's not clear, I *agree* with you that GCC is buggy;
> unfortunately, I seem to be the only one.
> 
> Note that `"foo" + 10' is *undefined*, but
> `(char*)((size_t)"foo" + 10)' is *implementation-defined*.
> 
> I think the resolution everyone wants is that the implementation-defined
> behavior for string-constant + variable offset is that the result
> is undefined, even when there are casts involved.
> 
> I think that is not in the spirit of the standard.  For example, the C++
> standard defines implementation-defined behavior as:
> 
>   behavior, for a well-formed program construct and correct data, that
>   depends on the implementation and that each implementation shall
>   document.

But I think the most natural implementation-defined behaviour for this is
that it is the same thing as *("foo" + 10): whatever casts you apply, you are
still referencing something beyond the end of an object.

Anyway, so that we don't spend too much energy on this, the following patch
only optimizes if the string was not cast to a non-pointer type before the
addition, so Franz's testcase works and no gcc testcases need to be changed.
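
For reference, a minimal sketch of the two source-level forms being
distinguished: plain pointer arithmetic on a string literal, and the same
addition done after casting the literal's address to an integer type. The
function names are hypothetical and are not taken from Franz's testcase. The
offsets are kept in bounds, and the sketch assumes an implementation where the
pointer/integer round trip preserves the address (which is
implementation-defined). It does not show whether any particular GCC version
folds either call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Form 1: pointer arithmetic directly on the string literal.
   "foobar" has 7 bytes including the terminating NUL, so this is
   only defined for i <= 6; a larger i is undefined behaviour.  */
size_t
len_plain (size_t i)
{
  return strlen ("foobar" + i);
}

/* Form 2: the literal's address is cast to an integer type before
   the addition.  The conversions are implementation-defined; this
   sketch assumes the usual flat-address-space behaviour.  */
size_t
len_via_integer (size_t i)
{
  return strlen ((const char *) ((uintptr_t) "foobar" + i));
}

/* Hypothetical helper: for in-bounds offsets both forms are
   expected to agree on the assumed implementation.  */
void
check_in_bounds_offsets (void)
{
  size_t i;

  for (i = 0; i <= 6; i++)
    {
      assert (len_plain (i) == 6 - i);
      assert (len_via_integer (i) == 6 - i);
    }
}
```

Calling check_in_bounds_offsets () from a test driver exercises every valid
offset. The disagreement in this thread is only about what happens once the
offset goes past the end of the literal.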

2001-08-10  Jakub Jelinek  <jakub@redhat.com>

	* builtins.c (c_strlen): Only optimize string literal + non-constant
	if the string literal was not cast to a non-pointer type.

--- gcc/builtins.c.jj	Sun Jul 22 21:33:43 2001
+++ gcc/builtins.c	Fri Aug 10 19:00:25 2001
@@ -226,16 +226,16 @@ static tree
 c_strlen (src)
      tree src;
 {
-  tree offset_node;
+  tree offset_node, str;
   int offset, max;
   const char *ptr;
 
-  src = string_constant (src, &offset_node);
-  if (src == 0)
+  str = string_constant (src, &offset_node);
+  if (str == 0)
     return 0;
 
-  max = TREE_STRING_LENGTH (src) - 1;
-  ptr = TREE_STRING_POINTER (src);
+  max = TREE_STRING_LENGTH (str) - 1;
+  ptr = TREE_STRING_POINTER (str);
 
   if (offset_node && TREE_CODE (offset_node) != INTEGER_CST)
     {
@@ -243,10 +243,45 @@ c_strlen (src)
 	 compute the offset to the following null if we don't know where to
 	 start searching for it.  */
       int i;
+      tree arg0, arg1;
 
       for (i = 0; i < max; i++)
 	if (ptr[i] == 0)
 	  return 0;
+
+#define STRIP_POINTER_NOPS(EXP)					\
+  while ((TREE_CODE (EXP) == NOP_EXPR				\
+	  || TREE_CODE (EXP) == CONVERT_EXPR			\
+	  || TREE_CODE (EXP) == NON_LVALUE_EXPR)		\
+	 && TREE_OPERAND (EXP, 0) != error_mark_node		\
+	 && TREE_CODE (TREE_TYPE (TREE_OPERAND (EXP, 0)))	\
+	    == POINTER_TYPE)					\
+    (EXP) = TREE_OPERAND (EXP, 0)
+
+      STRIP_NOPS (src);
+
+      if (TREE_CODE (src) != PLUS_EXPR)
+	return 0;
+
+      arg0 = TREE_OPERAND (src, 0);
+      arg1 = TREE_OPERAND (src, 1);
+
+      STRIP_POINTER_NOPS (arg0);
+      STRIP_POINTER_NOPS (arg1);
+
+      /* To use size_diffop below, we need to ensure that the offset
+	 is less than the size of the string.
+	 If there is only pointer arithmetic and the offset is bigger
+	 than the size of the string, it triggers undefined behaviour
+	 per ISO C99 6.5.6p8.
+	 But if the string has been cast to some non-pointer type,
+	 the cast has implementation-defined behaviour and then things
+	 are unclear.  */
+      if ((TREE_CODE (arg0) != ADDR_EXPR
+	   || TREE_CODE (TREE_OPERAND (arg0, 0)) != STRING_CST)
+	  && (TREE_CODE (arg1) != ADDR_EXPR
+	      || TREE_CODE (TREE_OPERAND (arg1, 0)) != STRING_CST))
+	return 0;
 
       /* We don't know the starting offset, but we do know that the string
 	 has no internal zero bytes.  We can assume that the offset falls


	Jakub


