This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] fix and improve strlen conditional handling of merged stores (PR 91183, 91294, 91315)

On 7/31/19 6:36 PM, Martin Sebor wrote:
> More extensive testing of the last week's strlen patch for
> PR91183 on various non-mainstream targets and with better tests
> has exposed a few gaps and a couple of bugs.  The attached patch
> addresses all in one change.  I considered splitting it up but
> in the end decided the changes were small and sufficiently
> isolated that it wasn't necessary.  (If someone feels strongly
> otherwise it can be easily split t up.)
> The wrong-code bug (PR 91294) is due to handle_store neglecting
> to fully consider the case when a single multi-byte store
> involving a PHI of two "strings" the same size (so they are merged
> into a single int store) but of unequal length.  The function
> simply takes the length of the shorter string as the resulting
> length when it needs to only set the lower bound of the length
> (it does that treating the result as potentially not nul-
> terminated).
> The gaps are in not handling some MEM_REF forms that come up
> in multi-byte assignments (this is the rest of PR 91183 and was
> exposed on strictly aligned targets), and in handle_builtin_strlen
> discarding the lower bound on a string length instead of exposing
> it to downstream passes.  This is PR 91315 that was exposed by
> a few cases in the existing tests for PR 91294 failing after
> the fix for PR 91294.
> Tested on x86_64-linux and spot-checked with a sparc-solaris2.11
> cross-compiler.
> Martin
> gcc-91294-2.diff
> PR tree-optimization/91315 - missing strlen lower bound of a string known to be at least N characters
> PR tree-optimization/91294 - wrong strlen result of a conditional with an offset
> PR tree-optimization/91183 - strlen of a strcpy result with a conditional source not folded
> gcc/testsuite/ChangeLog:
> 	PR tree-optimization/91315
> 	PR tree-optimization/91294
> 	PR tree-optimization/91183
> 	* gcc.dg/strlenopt-44.c: Avoid using length of 1.
> 	* gcc.dg/strlenopt-70.c: Disable overly optimistic tests.
> 	* gcc.dg/strlenopt-73.c: New test.
> 	* gcc.dg/strlenopt-74.c: New test.
> 	* gcc.dg/strlenopt-75.c: New test.
> 	* gcc.dg/strlenopt-76.c: New test.
> 	* gcc.dg/strlenopt-77.c: New test.
> gcc/ChangeLog:
> 	PR tree-optimization/91315
> 	PR tree-optimization/91294
> 	PR tree-optimization/91183
> 	* gimple-fold.c (gimple_fold_builtin_strlen): Add expected argument.
> 	* tree-ssa-strlen.c (set_strlen_range): Add function argument.
> 	(maybe_set_strlen_range): Add expected argument.
> 	(handle_builtin_strlen): Call set_strlen_range.
> 	(count_nonzero_bytes): Add function arguments.  Handle strinfo
> 	first.  Handle "single" assignment.
> 	(handle_store): Set the lower bound of length for multibyte stores
> 	of unequal lengths.
> 	* tree-ssa-strlen.h (set_strlen_range): Add function argument.
> Index: gcc/tree-ssa-strlen.c
> ===================================================================
> --- gcc/tree-ssa-strlen.c	(revision 273914)
> +++ gcc/tree-ssa-strlen.c	(working copy)
> @@ -1434,6 +1434,12 @@ handle_builtin_strlen (gimple_stmt_iterator *gsi)
>  		  tree adj = fold_build2_loc (loc, MINUS_EXPR,
>  					      TREE_TYPE (lhs), lhs, old);
>  		  adjust_related_strinfos (loc, si, adj);
> +		  /* Use the constant minimim length as the lower bound

> @@ -3408,7 +3457,13 @@ static bool
>  	}
>        gimple *stmt = SSA_NAME_DEF_STMT (exp);
> -      if (gimple_code (stmt) != GIMPLE_PHI)
> +      if (gimple_assign_single_p (stmt))
> +	{
> +	  tree rhs = gimple_assign_rhs1 (stmt);
> +	  return count_nonzero_bytes (rhs, offset, nbytes, lenrange, nulterm,
> +				      allnul, allnonnul, snlim);
> +	}
> +      else if (gimple_code (stmt) != GIMPLE_PHI)
>  	return false;
What cases are you handling here?  Are there any cases where a single
operand expression on the RHS affects the result.  For example, if we've
got a NOP_EXPR which zero extends RHS?  Does that change the nonzero
bytes in a way that is significant?

I'm not opposed to handling single operand objects here, I'm just
concerned that we're being very lenient in just stripping away the
operator and looking at the underlying object.

> @@ -3795,7 +3824,14 @@ handle_store (gimple_stmt_iterator *gsi)
>  	    }
>  	  else
>  	    si->nonzero_chars = build_int_cst (size_type_node, offset);
> -	  si->full_string_p = full_string_p;
> +
> +	  /* Set FULL_STRING_P only if the length of the strings being
> +	     written is the same, and clear it if the strings have
> +	     different lengths.  In the latter case the length stored
> +	     in si->NONZERO_CHARS becomes the lower bound.
> +	     FIXME: Handle the upper bound of the length if possible.  */
> +	  si->full_string_p = full_string_p && lenrange[0] == lenrange[1];
So there seems to be a disconnect between the comment and the code.

The comment indicates you care about the lengths of the two strings
being the same.  But what you're really comparing when the lenrange[0]
== lenrange[1] test is that the min and max of RHS are the same.

It generally looks reasonable, so I think we just need to reach a
conclusion on the gimple_assign_single_p cases we're trying to handle
and the possible mismatch between the comment and the code.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]