This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Optimiza aggregate a = b = c = {} (PR c/78408, take 2)
On Fri, 16 Dec 2016, Jakub Jelinek wrote:
> On Fri, Dec 16, 2016 at 01:50:54PM +0100, Richard Biener wrote:
> > > +static void
> > > +optimize_memcpy (gimple_stmt_iterator *gsip, tree dest, tree src, tree len)
> > > +{
> > > + gimple *stmt = gsi_stmt (*gsip);
> > > + if (gimple_has_volatile_ops (stmt)
> > > + || TREE_THIS_VOLATILE (dest)
> > > + || TREE_THIS_VOLATILE (src))
> >
> > The last two should be redundant and/or not necessary.
>
> I've been doing this if stmt is __builtin_memcpy and the actual
> objects are TREE_THIS_VOLATILE, then I wanted to punt.
> But I guess if we never transform __builtin_memcpy into anything
> other than __builtin_memset, then it isn't a problem.
Yeah, and a memcpy of a volatile object isn't a volatile access anyway.
> > > + return;
> > > +
> > > + tree vuse = gimple_vuse (stmt);
> > > + if (vuse == NULL)
> > > + return;
> > > +
> > > + gimple *defstmt = SSA_NAME_DEF_STMT (vuse);
> > > + tree src2 = NULL_TREE, len2 = NULL_TREE;
> > > + HOST_WIDE_INT offset, offset2;
> > > + tree val = integer_zero_node;
> > > + if (gimple_store_p (defstmt)
> > > + && gimple_assign_single_p (defstmt)
> > > + && TREE_CODE (gimple_assign_rhs1 (defstmt)) == CONSTRUCTOR
> > > + && CONSTRUCTOR_NELTS (gimple_assign_rhs1 (defstmt)) == 0
> >
> > Should be always true for stores from constructors.
>
> Ok, can I just gcc_checking_assert verify it or is that not worth it?
Not worth it -- tree-cfg.c gimple verification contains that already
(ok, not exactly - it allows vector CONSTRUCTORs as RHS of
!DECL_GIMPLE_REG_P vectors -- see a bit above (MEM_REF:) for a
missed check.
>
> > > + && !gimple_clobber_p (defstmt))
> > > + src2 = gimple_assign_lhs (defstmt);
> > > + else if (gimple_call_builtin_p (defstmt, BUILT_IN_MEMSET)
> > > + && TREE_CODE (gimple_call_arg (defstmt, 0)) == ADDR_EXPR
> > > + && TREE_CODE (gimple_call_arg (defstmt, 1)) == INTEGER_CST)
> > > + {
> > > + src2 = TREE_OPERAND (gimple_call_arg (defstmt, 0), 0);
> > > + len2 = gimple_call_arg (defstmt, 2);
> > > + val = gimple_call_arg (defstmt, 1);
> > > + if (!integer_zerop (val) && is_gimple_assign (stmt))
> >
> > Please add a comment here. I think below ... (*)
> >
> > > + src2 = NULL_TREE;
> > > + }
> > > +
> > > + if (src2 == NULL_TREE)
> > > + return;
> > > +
> > > + if (len == NULL_TREE)
> > > + len = (TREE_CODE (src) == COMPONENT_REF
> > > + ? DECL_SIZE_UNIT (TREE_OPERAND (src, 1))
> > > + : TYPE_SIZE_UNIT (TREE_TYPE (src)));
> > > + if (len2 == NULL_TREE)
> > > + len2 = (TREE_CODE (src2) == COMPONENT_REF
> > > + ? DECL_SIZE_UNIT (TREE_OPERAND (src2, 1))
> > > + : TYPE_SIZE_UNIT (TREE_TYPE (src2)));
> > > + if (len == NULL_TREE
> > > + || TREE_CODE (len) != INTEGER_CST
> > > + || len2 == NULL_TREE
> > > + || TREE_CODE (len2) != INTEGER_CST)
> > > + return;
> > > +
> > > + src = get_addr_base_and_unit_offset (src, &offset);
> > > + src2 = get_addr_base_and_unit_offset (src2, &offset2);
> > > + if (src == NULL_TREE
> > > + || src2 == NULL_TREE
> > > + || offset < offset2)
> > > + return;
> > > +
> > > + if (!operand_equal_p (src, src2, 0))
> > > + return;
> > > +
> > > + /* [ src + offset2, src + offset2 + len2 - 1 ] is set to val.
> > > + Make sure that
> > > + [ src + offset, src + offset + len - 1 ] is a subset of that. */
> > > + if (wi::to_widest (len) + (offset - offset2) > wi::to_widest (len2))
> >
> > I think you can use ::to_offset which is more efficient.
>
> That is reasonable.
>
> > > + return;
> > > +
> > > + if (dump_file && (dump_flags & TDF_DETAILS))
> > > + {
> > > + fprintf (dump_file, "Simplified\n ");
> > > + print_gimple_stmt (dump_file, stmt, 0, dump_flags);
> > > + fprintf (dump_file, "after previous\n ");
> > > + print_gimple_stmt (dump_file, defstmt, 0, dump_flags);
> > > + }
> > > +
> > > + if (is_gimple_assign (stmt))
> >
> > (*) a better check would be if val is zero because even a memcpy
> > could be optimized to an assignment from {}. I suppose you're
> > doing it this way because of simplicity and because we're
> > late in the compilation and thus it doesn't matter in practice
> > for optimization? (the 2nd destination could become
> > non-TREE_ADDRESSABLE, enabling better alias disambiguation)
>
> Is memset and = {} equivalent even for aggregates with paddings?
When lowering memset to = {} you should use
MEM<char[]> (&decl, off-of-original-alias-type) = {}-of-type-char[];
similar for lowering of memcpy (I think what gimple-fold.c does
for example is slightly unsafe). But in practice I think nothing
in GCC assumes you can lower accesses and ignore padding if that
pass (like SRA) does not see all accesses to prove that is safe.
> If val is not zero, then I can only handle it if stmt is memcpy,
> (or in theory if stmt is assignment, and dest is addressable it
> could be transformed into memset). If val is zero, and dest isn't
> volatile and the size matches the size of dest, then perhaps it
> could be transformed into = {} (if there are no paddings/bitfields
> or if it doesn't matter), I haven't done that for simplicity indeed.
Yes, please add a comment so a curious bystander doesn't have to figure
that out himself.
Richard.