[PATCH] improve memcmp and memchr constant folding (PR 78257)

Jeff Law law@redhat.com
Mon Aug 17 14:53:22 GMT 2020


On Sat, 2020-08-15 at 16:19 +0200, Christophe Lyon wrote:
> Hi Martin,
> 
> 
> On Sat, 15 Aug 2020 at 01:14, Martin Sebor via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> > On 8/13/20 11:44 AM, Martin Sebor wrote:
> > > On 8/13/20 10:21 AM, Jeff Law wrote:
> > > > On Fri, 2020-07-31 at 17:55 -0600, Martin Sebor via Gcc-patches wrote:
> > > > > The folders for these functions (and some others) call c_getsr
> > > > > which relies on string_constant to return the representation of
> > > > > constant strings.  Because the function doesn't handle constants
> > > > > of other types, including aggregates, memcmp or memchr calls
> > > > > involving those are not folded when they could be.
> > > > > 
> > > > > The attached patch extends the algorithm used by string_constant
> > > > > to also handle constant aggregates involving elements or members
> > > > > of the same types as native_encode_expr.  (The change restores
> > > > > the empty initializer optimization inadvertently disabled in
> > > > > the fix for pr96058.)
> > > > > 
> > > > > To avoid accidentally misusing either string_constant or c_getstr
> > > > > with non-strings I have introduced a pair of new functions to get
> > > > > the representation of those: byte_representation and getbyterep.
> > > > > 
> > > > > Tested on x86_64-linux.
> > > > > 
> > > > > Martin
> > > > > PR tree-optimization/78257 - missing memcmp optimization with
> > > > > constant arrays
> > > > > 
> > > > > gcc/ChangeLog:
> > > > > 
> > > > >     PR middle-end/78257
> > > > >     * builtins.c (expand_builtin_memory_copy_args): Rename called
> > > > > function.
> > > > >     (expand_builtin_stpcpy_1): Remove argument from call.
> > > > >     (expand_builtin_memcmp): Rename called function.
> > > > >     (inline_expand_builtin_bytecmp): Same.
> > > > >     * expr.c (convert_to_bytes): New function.
> > > > >     (constant_byte_string): New function (formerly string_constant).
> > > > >     (string_constant): Call constant_byte_string.
> > > > >     (byte_representation): New function.
> > > > >     * expr.h (byte_representation): Declare.
> > > > >     * fold-const-call.c (fold_const_call): Rename called function.
> > > > >     * fold-const.c (c_getstr): Remove an argument.
> > > > >     (getbyterep): Define a new function.
> > > > >     * fold-const.h (c_getstr): Remove an argument.
> > > > >     (getbyterep): Declare a new function.
> > > > >     * gimple-fold.c (gimple_fold_builtin_memory_op): Rename callee.
> > > > >     (gimple_fold_builtin_string_compare): Same.
> > > > >     (gimple_fold_builtin_memchr): Same.
> > > > > 
> > > > > gcc/testsuite/ChangeLog:
> > > > > 
> > > > >     PR middle-end/78257
> > > > >     * gcc.dg/memchr.c: New test.
> > > > >     * gcc.dg/memcmp-2.c: New test.
> > > > >     * gcc.dg/memcmp-3.c: New test.
> > > > >     * gcc.dg/memcmp-4.c: New test.
> > > > > 
> > > > > diff --git a/gcc/expr.c b/gcc/expr.c
> > > > > index a150fa0d3b5..a124df54655 100644
> > > > > --- a/gcc/expr.c
> > > > > +++ b/gcc/expr.c
> > > > > @@ -11594,15 +11594,103 @@ is_aligning_offset (const_tree offset,
> > > > > const_tree exp)
> > > > >     /* This must now be the address of EXP.  */
> > > > >     return TREE_CODE (offset) == ADDR_EXPR && TREE_OPERAND (offset,
> > > > > 0) == exp;
> > > > >   }
> > > > > -
> > > > > -/* Return the tree node if an ARG corresponds to a string constant
> > > > > or zero
> > > > > -   if it doesn't.  If we return nonzero, set *PTR_OFFSET to the
> > > > > (possibly
> > > > > -   non-constant) offset in bytes within the string that ARG is
> > > > > accessing.
> > > > > -   If MEM_SIZE is non-zero the storage size of the memory is returned.
> > > > > -   If DECL is non-zero the constant declaration is returned if
> > > > > available.  */
> > > > > -tree
> > > > > -string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree
> > > > > *decl)
> > > > > +/* If EXPR is a constant initializer (either an expression or
> > > > > CONSTRUCTOR),
> > > > > +   attempt to obtain its native representation as an array of
> > > > > nonzero BYTES.
> > > > > +   Return true on success and false on failure (the latter without
> > > > > modifying
> > > > > +   BYTES).  */
> > > > > +
> > > > > +static bool
> > > > > +convert_to_bytes (tree type, tree expr, vec<unsigned char> *bytes)
> > > > > +{
> > > > > +  if (TREE_CODE (expr) == CONSTRUCTOR)
> > > > > +    {
> > > > > +      /* Set to the size of the CONSTRUCTOR elements.  */
> > > > > +      unsigned HOST_WIDE_INT ctor_size = bytes->length ();
> > > > > +
> > > > > +      if (TREE_CODE (type) == ARRAY_TYPE)
> > > > > +    {
> > > > > +      tree val, idx;
> > > > > +      tree eltype = TREE_TYPE (type);
> > > > > +      unsigned HOST_WIDE_INT elsize =
> > > > > +        tree_to_uhwi (TYPE_SIZE_UNIT (eltype));
> > > > > +      unsigned HOST_WIDE_INT i, last_idx = HOST_WIDE_INT_M1U;
> > > > > +      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, idx, val)
> > > > > +        {
> > > > > +          /* Append zeros for elements with no initializers.  */
> > > > > +          if (!tree_fits_uhwi_p (idx))
> > > > > +        return false;
> > > > > +          unsigned HOST_WIDE_INT cur_idx = tree_to_uhwi (idx);
> > > > > +          if (unsigned HOST_WIDE_INT size = cur_idx - (last_idx + 1))
> > > > > +        {
> > > > > +          size = size * elsize + bytes->length ();
> > > > > +          bytes->safe_grow_cleared (size);
> > >                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > 
> > > > > +        }
> > > > > +
> > > > > +          if (!convert_to_bytes (eltype, val, bytes))
> > > > > +        return false;
> > > > > +
> > > > > +          last_idx = cur_idx;
> > > > > +        }
> > > > > +    }
> > > > > +      else if (TREE_CODE (type) == RECORD_TYPE)
> > > > > +    {
> > > > > +      tree val, fld;
> > > > > +      unsigned HOST_WIDE_INT i;
> > > > > +      FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (expr), i, fld, val)
> > > > > +        {
> > > > > +          /* Append zeros for members with no initializers and
> > > > > +         any padding.  */
> > > > > +          unsigned HOST_WIDE_INT cur_off = int_byte_position (fld);
> > > > > +          if (bytes->length () < cur_off)
> > > > > +        bytes->safe_grow_cleared (cur_off);
> > >                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > +
> > > > > +          if (!convert_to_bytes (TREE_TYPE (val), val, bytes))
> > > > > +        return false;
> > > > > +        }
> > > > > +    }
> > > > > +      else
> > > > > +    return false;
> > > > > +
> > > > > +      /* Compute the size of the COSNTRUCTOR elements.  */
> > > > > +      ctor_size = bytes->length () - ctor_size;
> > > > > +
> > > > > +      /* Append zeros to the byte vector to the full size of the type.
> > > > > +     The type size can be less than the size of the CONSTRUCTOR
> > > > > +     if the latter contains initializers for a flexible array
> > > > > +     member.  */
> > > > > +      tree size = TYPE_SIZE_UNIT (type);
> > > > > +      unsigned HOST_WIDE_INT type_size = tree_to_uhwi (size);
> > > > > +      if (ctor_size < type_size)
> > > > > +    if (unsigned HOST_WIDE_INT size_grow = type_size - ctor_size)
> > > > > +      bytes->safe_grow_cleared (bytes->length () + size_grow);
> > > > > +
> > > > > +      return true;
> > > > > +    }
> > > > So I think you need to be more careful with CONSTRUCTOR nodes here.
> > > > Not all
> > > > elements of an object need to appear in the CONSTRUCTOR.  Elements
> > > > which do not
> > > > appear in the CONSTRUCTOR node are considered zero-initialized, unless
> > > > CONSTRUCTOR_NO_CLEARING is set.
> > > > 
> > > > I don't see anything in the code above which deals with those oddities of
> > > > CONSTRUCTOR nodes.  Did I miss it?
> > > 
> > > Just capturing for reference what we just discussed off list:
> > > 
> > > The underlined code above zeroes out the bytes of elements with
> > > no initializers as well as any padding between fields.  It doesn't
> > > consider CONSTRUCTOR_NO_CLEARING.  I didn't know about that bit so
> > > I looked it up.  According to the internals manual:
> > > 
> > >    Unrepresented fields will be cleared (zeroed), unless the
> > >    CONSTRUCTOR_NO_CLEARING flag is set, in which case their value
> > >    becomes undefined.
> > > 
> > > So assuming they're zero should be fine, as would doing nothing.
> > > We agreed on the former so I will go ahead with the patch as is.
> > 
> > I had missed a few Ada failures.  Apparently, Ada can specify
> > arbitrary array bounds (not just upper but also lower) but
> > the code assumed the lower bound would always be zero.  I've
> > adjusted it to avoid making that assumption and committed
> > the updated revision in r11-2709.
> > 
> 
> This commit is causing a regression on arm:
> FAIL:    gcc.dg/strlenopt-55.c scan-tree-dump-times gimple "memcmp" 0
> FAIL:    gcc.dg/strlenopt-55.c scan-tree-dump-times optimized
> "call_in_true_branch_not_eliminated" 0
I'm seeing this on a variety of targets as well.

jeff



More information about the Gcc-patches mailing list