[PATCH] improve memcmp and memchr constant folding (PR 78257)

Martin Sebor msebor@gmail.com
Fri Aug 14 23:14:19 GMT 2020


On 8/13/20 11:44 AM, Martin Sebor wrote:
> On 8/13/20 10:21 AM, Jeff Law wrote:
>> On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote:
>>> The folders for these functions (and some others) call c_getsr
>>> which relies on string_constant to return the representation of
>>> constant strings.  Because the function doesn't handle constants
>>> of other types, including aggregates, memcmp or memchr calls
>>> involving those are not folded when they could be.
>>>
>>> The attached patch extends the algorithm used by string_constant
>>> to also handle constant aggregates involving elements or members
>>> of the same types as native_encode_expr.  (The change restores
>>> the empty initializer optimization inadvertently disabled in
>>> the fix for pr96058.)
>>>
>>> To avoid accidentally misusing either string_constant or c_getstr
>>> with non-strings I have introduced a pair of new functions to get
>>> the representation of those: byte_representation and getbyterep.
>>>
>>> Tested on x86_64-linux.
>>>
>>> Martin
>>
>>> PR tree-optimization/78257 - missing memcmp optimization with 
>>> constant arrays
>>>
>>> gcc/ChangeLog:
>>>
>>>     PR middle-end/78257
>>>     * builtins.c (expand_builtin_memory_copy_args): Rename called 
>>> function.
>>>     (expand_builtin_stpcpy_1): Remove argument from call.
>>>     (expand_builtin_memcmp): Rename called function.
>>>     (inline_expand_builtin_bytecmp): Same.
>>>     * expr.c (convert_to_bytes): New function.
>>>     (constant_byte_string): New function (formerly string_constant).
>>>     (string_constant): Call constant_byte_string.
>>>     (byte_representation): New function.
>>>     * expr.h (byte_representation): Declare.
>>>     * fold-const-call.c (fold_const_call): Rename called function.
>>>     * fold-const.c (c_getstr): Remove an argument.
>>>     (getbyterep): Define a new function.
>>>     * fold-const.h (c_getstr): Remove an argument.
>>>     (getbyterep): Declare a new function.
>>>     * gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee.
>>>     (gimple_fold_builtin_string_compare): Same.
>>>     (gimple_fold_builtin_memchr): Same.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     PR middle-end/78257
>>>     * gcc.dg/memchr.c: New test.
>>>     * gcc.dg/memcmp-2.c: New test.
>>>     * gcc.dg/memcmp-3.c: New test.
>>>     * gcc.dg/memcmp-4.c: New test.
>>>
>>> diff --git a/gcc/expr.c b/gcc/expr.c
>>> index a150fa0d3b5..a124df54655 100644
>>> --- a/gcc/expr.c
>>> +++ b/gcc/expr.c
>>> @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset, 
>>> const_tree exp)
>>>     /* This must now be the address of EXP.  */
>>>     return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset, 
>>> 0) == exp;
>>>   }
>>> -

>>> -/* Return the tree node if an ARG corresponds to a string constant 
>>> or zero
>>> -   if it doesn't.  If we return nonzero, set *PTR_OFFSET to the 
>>> (possibly
>>> -   non-constant) offset in bytes within the string that ARG is 
>>> accessing.
>>> -   If MEM_SIZE is non-zero the storage size of the memory is returned.
>>> -   If DECL is non-zero the constant declaration is returned if 
>>> available.  */
>>> -tree
>>> -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree 
>>> *decl)
>>> +/* If EXPR is a constant initializer (either an expression or 
>>> CONSTRUCTOR),
>>> +   attempt to obtain its native representation as an array of 
>>> nonzero BYTES.
>>> +   Return true on success and false on failure (the latter without 
>>> modifying
>>> +   BYTES).  */
>>> +
>>> +static bool
>>> +convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)
>>> +{
>>> +  if (TREE_CODE (expr) == CONSTRUCTOR)
>>> +    {
>>> +      /* Set to the size of the CONSTRUCTOR elements.  */
>>> +      unsigned HOST_WIDE_INT ctor_size = bytes->length ();
>>> +
>>> +      if (TREE_CODE (type) == ARRAY_TYPE)
>>> +    {
>>> +      tree val, idx;
>>> +      tree eltype = TREE_TYPE (type);
>>> +      unsigned HOST_WIDE_INT elsize =
>>> +        tree_to_uhwi (TYPE_SIZE_UNIT (eltype));
>>> +      unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U;
>>> +      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val)
>>> +        {
>>> +          /* Append zeros for elements with no initializers.  */
>>> +          if (!tree_fits_uhwi_p (idx))
>>> +        return false;
>>> +          unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx);
>>> +          if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1))
>>> +        {
>>> +          size = size * elsize + bytes->length ();
>>> +          bytes->safe_grow_cleared (size);
>                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
>>> +        }
>>> +
>>> +          if (!convert_to_bytes (eltype, val, bytes))
>>> +        return false;
>>> +
>>> +          last_idx = cur_idx;
>>> +        }
>>> +    }
>>> +      else if (TREE_CODE (type) == RECORD_TYPE)
>>> +    {
>>> +      tree val, fld;
>>> +      unsigned HOST_WIDE_INT i;
>>> +      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val)
>>> +        {
>>> +          /* Append zeros for members with no initializers and
>>> +         any padding.  */
>>> +          unsigned HOST_WIDE_INT cur_off = int_byte_position (fld);
>>> +          if (bytes->length () < cur_off)
>>> +        bytes->safe_grow_cleared (cur_off);
>                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> +
>>> +          if (!convert_to_bytes (TREE_TYPE (val), val, bytes))
>>> +        return false;
>>> +        }
>>> +    }
>>> +      else
>>> +    return false;
>>> +
>>> +      /* Compute the size of the COSNTRUCTOR elements.  */
>>> +      ctor_size = bytes->length () - ctor_size;
>>> +
>>> +      /* Append zeros to the byte vector to the full size of the type.
>>> +     The type size can be less than the size of the CONSTRUCTOR
>>> +     if the latter contains initializers for a flexible array
>>> +     member.  */
>>> +      tree size = TYPE_SIZE_UNIT (type);
>>> +      unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size);
>>> +      if (ctor_size < type_size)
>>> +    if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size)
>>> +      bytes->safe_grow_cleared (bytes->length () + size_grow);
>>> +
>>> +      return true;
>>> +    }
>> So I think you need to be more careful with CONSTRUCTOR nodes here.  
>> Not all
>> elements of an object need to appear in the CONSTRUCTOR.  Elements 
>> which do not
>> appear in the CONSTRUCTOR node are considered zero-initialized, unless
>> CONSTRUCTOR_NO_CLEARING is set.
>>
>> I don't see anything in the code above which deals with those oddities of
>> CONSTRUCTOR nodes.  Did I miss it?
> 
> Just capturing for reference what we just discussed off list:
> 
> The underlined code above zeroes out the bytes of elements with
> no initializers as well as any padding between fields.  It doesn't
> consider CONSTRUCTOR_NO_CLEARING.  I didn't know about that bit so
> I looked it up.  According to the internals manual:
> 
>    Unrepresented fields will be cleared (zeroed), unless the
>    CONSTRUCTOR_NO_CLEARING flag is set, in which case their value
>    becomes undefined.
> 
> So assuming they're zero should be fine, as would doing nothing.
> We agreed on the former so I will go ahead with the patch as is.

I had missed a few Ada failures.  Apparently, Ada can specify
arbitrary array bounds (not just upper but also lower) but
the code assumed the lower bound would always be zero.  I've
adjusted it to avoid making that assumption and committed
the updated revision in r11-2709.

Martin


More information about the Gcc-patches mailing list