[PATCH] improve memcmp and memchr constant folding (PR 78257)

Martin Sebor msebor@gmail.com
Thu Aug 13 17:44:46 GMT 2020


On 8/13/20 10:21 AM, Jeff Law wrote:
> On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote:
>> The folders for these functions (and some others) call c_getsr
>> which relies on string_constant to return the representation of
>> constant strings.  Because the function doesn't handle constants
>> of other types, including aggregates, memcmp or memchr calls
>> involving those are not folded when they could be.
>>
>> The attached patch extends the algorithm used by string_constant
>> to also handle constant aggregates involving elements or members
>> of the same types as native_encode_expr.  (The change restores
>> the empty initializer optimization inadvertently disabled in
>> the fix for pr96058.)
>>
>> To avoid accidentally misusing either string_constant or c_getstr
>> with non-strings I have introduced a pair of new functions to get
>> the representation of those: byte_representation and getbyterep.
>>
>> Tested on x86_64-linux.
>>
>> Martin
> 
>> PR tree-optimization/78257 - missing memcmp optimization with constant arrays
>>
>> gcc/ChangeLog:
>>
>> 	PR middle-end/78257
>> 	* builtins.c (expand_builtin_memory_copy_args): Rename called function.
>> 	(expand_builtin_stpcpy_1): Remove argument from call.
>> 	(expand_builtin_memcmp): Rename called function.
>> 	(inline_expand_builtin_bytecmp): Same.
>> 	* expr.c (convert_to_bytes): New function.
>> 	(constant_byte_string): New function (formerly string_constant).
>> 	(string_constant): Call constant_byte_string.
>> 	(byte_representation): New function.
>> 	* expr.h (byte_representation): Declare.
>> 	* fold-const-call.c (fold_const_call): Rename called function.
>> 	* fold-const.c (c_getstr): Remove an argument.
>> 	(getbyterep): Define a new function.
>> 	* fold-const.h (c_getstr): Remove an argument.
>> 	(getbyterep): Declare a new function.
>> 	* gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee.
>> 	(gimple_fold_builtin_string_compare): Same.
>> 	(gimple_fold_builtin_memchr): Same.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	PR middle-end/78257
>> 	* gcc.dg/memchr.c: New test.
>> 	* gcc.dg/memcmp-2.c: New test.
>> 	* gcc.dg/memcmp-3.c: New test.
>> 	* gcc.dg/memcmp-4.c: New test.
>>
>> diff --git a/gcc/expr.c b/gcc/expr.c
>> index a150fa0d3b5..a124df54655 100644
>> --- a/gcc/expr.c
>> +++ b/gcc/expr.c
>> @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset, const_tree exp)
>>     /* This must now be the address of EXP.  */
>>     return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset, 0) == exp;
>>   }
>> -

>> -/* Return the tree node if an ARG corresponds to a string constant or zero
>> -   if it doesn't.  If we return nonzero, set *PTR_OFFSET to the (possibly
>> -   non-constant) offset in bytes within the string that ARG is accessing.
>> -   If MEM_SIZE is non-zero the storage size of the memory is returned.
>> -   If DECL is non-zero the constant declaration is returned if available.  */
>>   
>> -tree
>> -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
>> +/* If EXPR is a constant initializer (either an expression or CONSTRUCTOR),
>> +   attempt to obtain its native representation as an array of nonzero BYTES.
>> +   Return true on success and false on failure (the latter without modifying
>> +   BYTES).  */
>> +
>> +static bool
>> +convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)
>> +{
>> +  if (TREE_CODE (expr) == CONSTRUCTOR)
>> +    {
>> +      /* Set to the size of the CONSTRUCTOR elements.  */
>> +      unsigned HOST_WIDE_INT ctor_size = bytes->length ();
>> +
>> +      if (TREE_CODE (type) == ARRAY_TYPE)
>> +	{
>> +	  tree val, idx;
>> +	  tree eltype = TREE_TYPE (type);
>> +	  unsigned HOST_WIDE_INT elsize =
>> +	    tree_to_uhwi (TYPE_SIZE_UNIT (eltype));
>> +	  unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U;
>> +	  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val)
>> +	    {
>> +	      /* Append zeros for elements with no initializers.  */
>> +	      if (!tree_fits_uhwi_p (idx))
>> +		return false;
>> +	      unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx);
>> +	      if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1))
>> +		{
>> +		  size = size * elsize + bytes->length ();
>> +		  bytes->safe_grow_cleared (size);
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

>> +		}
>> +
>> +	      if (!convert_to_bytes (eltype, val, bytes))
>> +		return false;
>> +
>> +	      last_idx = cur_idx;
>> +	    }
>> +	}
>> +      else if (TREE_CODE (type) == RECORD_TYPE)
>> +	{
>> +	  tree val, fld;
>> +	  unsigned HOST_WIDE_INT i;
>> +	  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val)
>> +	    {
>> +	      /* Append zeros for members with no initializers and
>> +		 any padding.  */
>> +	      unsigned HOST_WIDE_INT cur_off = int_byte_position (fld);
>> +	      if (bytes->length () < cur_off)
>> +		bytes->safe_grow_cleared (cur_off);
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +	      if (!convert_to_bytes (TREE_TYPE (val), val, bytes))
>> +		return false;
>> +	    }
>> +	}
>> +      else
>> +	return false;
>> +
>> +      /* Compute the size of the COSNTRUCTOR elements.  */
>> +      ctor_size = bytes->length () - ctor_size;
>> +
>> +      /* Append zeros to the byte vector to the full size of the type.
>> +	 The type size can be less than the size of the CONSTRUCTOR
>> +	 if the latter contains initializers for a flexible array
>> +	 member.  */
>> +      tree size = TYPE_SIZE_UNIT (type);
>> +      unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size);
>> +      if (ctor_size < type_size)
>> +	if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size)
>> +	  bytes->safe_grow_cleared (bytes->length () + size_grow);
>> +
>> +      return true;
>> +    }
> So I think you need to be more careful with CONSTRUCTOR nodes here.  Not all
> elements of an object need to appear in the CONSTRUCTOR.  Elements which do not
> appear in the CONSTRUCTOR node are considered zero-initialized, unless
> CONSTRUCTOR_NO_CLEARING is set.
> 
> I don't see anything in the code above which deals with those oddities of
> CONSTRUCTOR nodes.  Did I miss it?

Just capturing for reference what we just discussed off list:

The underlined code above zeroes out the bytes of elements with
no initializers as well as any padding between fields.  It doesn't
consider CONSTRUCTOR_NO_CLEARING.  I didn't know about that bit so
I looked it up.  According to the internals manual:

   Unrepresented fields will be cleared (zeroed), unless the
   CONSTRUCTOR_NO_CLEARING flag is set, in which case their value
   becomes undefined.

So assuming they're zero should be fine, as would doing nothing.
We agreed on the former so I will go ahead with the patch as is.

Thanks
Martin


More information about the Gcc-patches mailing list