[PATCH] improve memcmp and memchr constant folding (PR 78257)

Christophe Lyon christophe.lyon@linaro.org
Sat Aug 15 14:19:17 GMT 2020


Hi Martin,


On Sat, 15 Aug 2020 at 01:14, Martin Sebor via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On 8/13/20 11:44 AM, Martin Sebor wrote:
> > On 8/13/20 10:21 AM, Jeff Law wrote:
> >> On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote:
> >>> The folders for these functions (and some others) call c_getsr
> >>> which relies on string_constant to return the representation of
> >>> constant strings.  Because the function doesn't handle constants
> >>> of other types, including aggregates, memcmp or memchr calls
> >>> involving those are not folded when they could be.
> >>>
> >>> The attached patch extends the algorithm used by string_constant
> >>> to also handle constant aggregates involving elements or members
> >>> of the same types as native_encode_expr.  (The change restores
> >>> the empty initializer optimization inadvertently disabled in
> >>> the fix for pr96058.)
> >>>
> >>> To avoid accidentally misusing either string_constant or c_getstr
> >>> with non-strings I have introduced a pair of new functions to get
> >>> the representation of those: byte_representation and getbyterep.
> >>>
> >>> Tested on x86_64-linux.
> >>>
> >>> Martin
> >>
> >>> PR tree-optimization/78257 - missing memcmp optimization with
> >>> constant arrays
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>>     PR middle-end/78257
> >>>     * builtins.c (expand_builtin_memory_copy_args): Rename called
> >>> function.
> >>>     (expand_builtin_stpcpy_1): Remove argument from call.
> >>>     (expand_builtin_memcmp): Rename called function.
> >>>     (inline_expand_builtin_bytecmp): Same.
> >>>     * expr.c (convert_to_bytes): New function.
> >>>     (constant_byte_string): New function (formerly string_constant).
> >>>     (string_constant): Call constant_byte_string.
> >>>     (byte_representation): New function.
> >>>     * expr.h (byte_representation): Declare.
> >>>     * fold-const-call.c (fold_const_call): Rename called function.
> >>>     * fold-const.c (c_getstr): Remove an argument.
> >>>     (getbyterep): Define a new function.
> >>>     * fold-const.h (c_getstr): Remove an argument.
> >>>     (getbyterep): Declare a new function.
> >>>     * gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee.
> >>>     (gimple_fold_builtin_string_compare): Same.
> >>>     (gimple_fold_builtin_memchr): Same.
> >>>
> >>> gcc/testsuite/ChangeLog:
> >>>
> >>>     PR middle-end/78257
> >>>     * gcc.dg/memchr.c: New test.
> >>>     * gcc.dg/memcmp-2.c: New test.
> >>>     * gcc.dg/memcmp-3.c: New test.
> >>>     * gcc.dg/memcmp-4.c: New test.
> >>>
> >>> diff --git a/gcc/expr.c b/gcc/expr.c
> >>> index a150fa0d3b5..a124df54655 100644
> >>> --- a/gcc/expr.c
> >>> +++ b/gcc/expr.c
> >>> @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset,
> >>> const_tree exp)
> >>>     /* This must now be the address of EXP.  */
> >>>     return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset,
> >>> 0) == exp;
> >>>   }
> >>> -
> >>> -/* Return the tree node if an ARG corresponds to a string constant
> >>> or zero
> >>> -   if it doesn't.  If we return nonzero, set *PTR_OFFSET to the
> >>> (possibly
> >>> -   non-constant) offset in bytes within the string that ARG is
> >>> accessing.
> >>> -   If MEM_SIZE is non-zero the storage size of the memory is returned.
> >>> -   If DECL is non-zero the constant declaration is returned if
> >>> available.  */
> >>> -tree
> >>> -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree
> >>> *decl)
> >>> +/* If EXPR is a constant initializer (either an expression or
> >>> CONSTRUCTOR),
> >>> +   attempt to obtain its native representation as an array of
> >>> nonzero BYTES.
> >>> +   Return true on success and false on failure (the latter without
> >>> modifying
> >>> +   BYTES).  */
> >>> +
> >>> +static bool
> >>> +convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)
> >>> +{
> >>> +  if (TREE_CODE (expr) == CONSTRUCTOR)
> >>> +    {
> >>> +      /* Set to the size of the CONSTRUCTOR elements.  */
> >>> +      unsigned HOST_WIDE_INT ctor_size = bytes->length ();
> >>> +
> >>> +      if (TREE_CODE (type) == ARRAY_TYPE)
> >>> +    {
> >>> +      tree val, idx;
> >>> +      tree eltype = TREE_TYPE (type);
> >>> +      unsigned HOST_WIDE_INT elsize =
> >>> +        tree_to_uhwi (TYPE_SIZE_UNIT (eltype));
> >>> +      unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U;
> >>> +      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val)
> >>> +        {
> >>> +          /* Append zeros for elements with no initializers.  */
> >>> +          if (!tree_fits_uhwi_p (idx))
> >>> +        return false;
> >>> +          unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx);
> >>> +          if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1))
> >>> +        {
> >>> +          size = size * elsize + bytes->length ();
> >>> +          bytes->safe_grow_cleared (size);
> >                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> >>> +        }
> >>> +
> >>> +          if (!convert_to_bytes (eltype, val, bytes))
> >>> +        return false;
> >>> +
> >>> +          last_idx = cur_idx;
> >>> +        }
> >>> +    }
> >>> +      else if (TREE_CODE (type) == RECORD_TYPE)
> >>> +    {
> >>> +      tree val, fld;
> >>> +      unsigned HOST_WIDE_INT i;
> >>> +      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val)
> >>> +        {
> >>> +          /* Append zeros for members with no initializers and
> >>> +         any padding.  */
> >>> +          unsigned HOST_WIDE_INT cur_off = int_byte_position (fld);
> >>> +          if (bytes->length () < cur_off)
> >>> +        bytes->safe_grow_cleared (cur_off);
> >                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>> +
> >>> +          if (!convert_to_bytes (TREE_TYPE (val), val, bytes))
> >>> +        return false;
> >>> +        }
> >>> +    }
> >>> +      else
> >>> +    return false;
> >>> +
> >>> +      /* Compute the size of the COSNTRUCTOR elements.  */
> >>> +      ctor_size = bytes->length () - ctor_size;
> >>> +
> >>> +      /* Append zeros to the byte vector to the full size of the type.
> >>> +     The type size can be less than the size of the CONSTRUCTOR
> >>> +     if the latter contains initializers for a flexible array
> >>> +     member.  */
> >>> +      tree size = TYPE_SIZE_UNIT (type);
> >>> +      unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size);
> >>> +      if (ctor_size < type_size)
> >>> +    if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size)
> >>> +      bytes->safe_grow_cleared (bytes->length () + size_grow);
> >>> +
> >>> +      return true;
> >>> +    }
> >> So I think you need to be more careful with CONSTRUCTOR nodes here.
> >> Not all
> >> elements of an object need to appear in the CONSTRUCTOR.  Elements
> >> which do not
> >> appear in the CONSTRUCTOR node are considered zero-initialized, unless
> >> CONSTRUCTOR_NO_CLEARING is set.
> >>
> >> I don't see anything in the code above which deals with those oddities of
> >> CONSTRUCTOR nodes.  Did I miss it?
> >
> > Just capturing for reference what we just discussed off list:
> >
> > The underlined code above zeroes out the bytes of elements with
> > no initializers as well as any padding between fields.  It doesn't
> > consider CONSTRUCTOR_NO_CLEARING.  I didn't know about that bit so
> > I looked it up.  According to the internals manual:
> >
> >    Unrepresented fields will be cleared (zeroed), unless the
> >    CONSTRUCTOR_NO_CLEARING flag is set, in which case their value
> >    becomes undefined.
> >
> > So assuming they're zero should be fine, as would doing nothing.
> > We agreed on the former so I will go ahead with the patch as is.
>
> I had missed a few Ada failures.  Apparently, Ada can specify
> arbitrary array bounds (not just upper but also lower) but
> the code assumed the lower bound would always be zero.  I've
> adjusted it to avoid making that assumption and committed
> the updated revision in r11-2709.
>

This commit is causing a regression on arm:
FAIL:    gcc.dg/strlenopt-55.c scan-tree-dump-times gimple "memcmp" 0
FAIL:    gcc.dg/strlenopt-55.c scan-tree-dump-times optimized
"call_in_true_branch_not_eliminated" 0

Christophe


> Martin


More information about the Gcc-patches mailing list