[PATCH] improve memcmp and memchr constant folding (PR 78257)
Martin Sebor
msebor@gmail.com
Thu Aug 13 17:44:46 GMT 2020
On 8/13/20 10:21 AM, Jeff Law wrote:
> On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote:
>> The folders for these functions (and some others) call c_getsr
>> which relies on string_constant to return the representation of
>> constant strings. Because the function doesn't handle constants
>> of other types, including aggregates, memcmp or memchr calls
>> involving those are not folded when they could be.
>>
>> The attached patch extends the algorithm used by string_constant
>> to also handle constant aggregates involving elements or members
>> of the same types as native_encode_expr. (The change restores
>> the empty initializer optimization inadvertently disabled in
>> the fix for pr96058.)
>>
>> To avoid accidentally misusing either string_constant or c_getstr
>> with non-strings I have introduced a pair of new functions to get
>> the representation of those: byte_representation and getbyterep.
>>
>> Tested on x86_64-linux.
>>
>> Martin
>
>> PR tree-optimization/78257 - missing memcmp optimization with constant arrays
>>
>> gcc/ChangeLog:
>>
>> PR middle-end/78257
>> * builtins.c (expand_builtin_memory_copy_args): Rename called function.
>> (expand_builtin_stpcpy_1): Remove argument from call.
>> (expand_builtin_memcmp): Rename called function.
>> (inline_expand_builtin_bytecmp): Same.
>> * expr.c (convert_to_bytes): New function.
>> (constant_byte_string): New function (formerly string_constant).
>> (string_constant): Call constant_byte_string.
>> (byte_representation): New function.
>> * expr.h (byte_representation): Declare.
>> * fold-const-call.c (fold_const_call): Rename called function.
>> * fold-const.c (c_getstr): Remove an argument.
>> (getbyterep): Define a new function.
>> * fold-const.h (c_getstr): Remove an argument.
>> (getbyterep): Declare a new function.
>> * gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee.
>> (gimple_fold_builtin_string_compare): Same.
>> (gimple_fold_builtin_memchr): Same.
>>
>> gcc/testsuite/ChangeLog:
>>
>> PR middle-end/78257
>> * gcc.dg/memchr.c: New test.
>> * gcc.dg/memcmp-2.c: New test.
>> * gcc.dg/memcmp-3.c: New test.
>> * gcc.dg/memcmp-4.c: New test.
>>
>> diff --git a/gcc/expr.c b/gcc/expr.c
>> index a150fa0d3b5..a124df54655 100644
>> --- a/gcc/expr.c
>> +++ b/gcc/expr.c
>> @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset, const_tree exp)
>> /* This must now be the address of EXP. */
>> return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset, 0) == exp;
>> }
>> -
>> -/* Return the tree node if an ARG corresponds to a string constant or zero
>> - if it doesn't. If we return nonzero, set *PTR_OFFSET to the (possibly
>> - non-constant) offset in bytes within the string that ARG is accessing.
>> - If MEM_SIZE is non-zero the storage size of the memory is returned.
>> - If DECL is non-zero the constant declaration is returned if available. */
>>
>> -tree
>> -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
>> +/* If EXPR is a constant initializer (either an expression or CONSTRUCTOR),
>> + attempt to obtain its native representation as an array of nonzero BYTES.
>> + Return true on success and false on failure (the latter without modifying
>> + BYTES). */
>> +
>> +static bool
>> +convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)
>> +{
>> + if (TREE_CODE (expr) == CONSTRUCTOR)
>> + {
>> + /* Set to the size of the CONSTRUCTOR elements. */
>> + unsigned HOST_WIDE_INT ctor_size = bytes->length ();
>> +
>> + if (TREE_CODE (type) == ARRAY_TYPE)
>> + {
>> + tree val, idx;
>> + tree eltype = TREE_TYPE (type);
>> + unsigned HOST_WIDE_INT elsize =
>> + tree_to_uhwi (TYPE_SIZE_UNIT (eltype));
>> + unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U;
>> + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val)
>> + {
>> + /* Append zeros for elements with no initializers. */
>> + if (!tree_fits_uhwi_p (idx))
>> + return false;
>> + unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx);
>> + if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1))
>> + {
>> + size = size * elsize + bytes->length ();
>> + bytes->safe_grow_cleared (size);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> + }
>> +
>> + if (!convert_to_bytes (eltype, val, bytes))
>> + return false;
>> +
>> + last_idx = cur_idx;
>> + }
>> + }
>> + else if (TREE_CODE (type) == RECORD_TYPE)
>> + {
>> + tree val, fld;
>> + unsigned HOST_WIDE_INT i;
>> + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val)
>> + {
>> + /* Append zeros for members with no initializers and
>> + any padding. */
>> + unsigned HOST_WIDE_INT cur_off = int_byte_position (fld);
>> + if (bytes->length () < cur_off)
>> + bytes->safe_grow_cleared (cur_off);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> + if (!convert_to_bytes (TREE_TYPE (val), val, bytes))
>> + return false;
>> + }
>> + }
>> + else
>> + return false;
>> +
>> + /* Compute the size of the COSNTRUCTOR elements. */
>> + ctor_size = bytes->length () - ctor_size;
>> +
>> + /* Append zeros to the byte vector to the full size of the type.
>> + The type size can be less than the size of the CONSTRUCTOR
>> + if the latter contains initializers for a flexible array
>> + member. */
>> + tree size = TYPE_SIZE_UNIT (type);
>> + unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size);
>> + if (ctor_size < type_size)
>> + if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size)
>> + bytes->safe_grow_cleared (bytes->length () + size_grow);
>> +
>> + return true;
>> + }
> So I think you need to be more careful with CONSTRUCTOR nodes here. Not all
> elements of an object need to appear in the CONSTRUCTOR. Elements which do not
> appear in the CONSTRUCTOR node are considered zero-initialized, unless
> CONSTRUCTOR_NO_CLEARING is set.
>
> I don't see anything in the code above which deals with those oddities of
> CONSTRUCTOR nodes. Did I miss it?
Just capturing for reference what we just discussed off list:
The underlined code above zeroes out the bytes of elements with
no initializers as well as any padding between fields. It doesn't
consider CONSTRUCTOR_NO_CLEARING. I didn't know about that bit so
I looked it up. According to the internals manual:
Unrepresented fields will be cleared (zeroed), unless the
CONSTRUCTOR_NO_CLEARING flag is set, in which case their value
becomes undefined.
So assuming they're zero should be fine, as would doing nothing.
We agreed on the former so I will go ahead with the patch as is.
Thanks
Martin
More information about the Gcc-patches
mailing list