This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFA] [tree-optimization/80576] Handle non-constant sizes in DSE


On 8/21/19 4:34 PM, Jeff Law wrote:
On 8/16/19 4:21 PM, Martin Sebor wrote:
On 8/16/19 12:15 PM, Jeff Law wrote:
On 8/16/19 12:09 PM, Marc Glisse wrote:
On Fri, 16 Aug 2019, Jeff Law wrote:

This patch improves our ability to detect dead stores by handling cases
where the size memcpy, memset, strncpy, etc call is not constant.  This
addresses some, but not all, of the issues in 80576.

The key here is when the size is not constant we can make conservative
decisions that still give us a chance to analyze the code for dead
stores.

Remember that for dead store elimination, we're trying to prove that
given two stores, the second store overwrites (partially or fully) the
same memory locations as the first store.  That makes the first store
either partially or fully dead.

When we encounter the first store, we set up a bitmap of bytes written
by that store (live_bytes).  We then look at subsequent stores and
clear
the appropriate entries in the bitmap.

If the first store has a nonconstant length argument we can use the
range of the length argument (max) and the size of the destination
object to make a conservative estimation of how many bytes are written.

For the second store the conservative thing to do for a non-constant
length is to use the minimum of the range of the length argument.

So I guess it won't handle things like

void f(char*p,int n){
    __builtin_memset(p,3,n);
    __builtin_memset(p,7,n);
}

where we know nothing about the length, except that it is the same? Or
do you look at symbolic ranges?
Nope.  I think ao_ref can represent that, so it'd just be a matter of
recording "n" as the length, then verifying that the second call's
length is "n" as well.  That makes the first call dead.  We'd have to
bypass the byte tracking in that case, but I think that's trivial
because we already have a means to do that when the sizes are too large.


This doesn't come up a lot in practice.  But it also happens to put
some
of the infrastructure in place to handle strcpy and strcpy_chk which
are
needed to fully resolve 80576.

Bootstrapped and regression tested on x86, x86_64, ppc64le, ppc64,
ppc32, aarch64, sparc, s390x and probably others.  Also verified that
the tests work on the various *-elf targets in my tester.

OK for the trunk?

ENOPATCH
Opps.    Attached.

It's an improvement and I realize you said it doesn't handle
everything and that you don't think it comes up a lot, but...
I would actually expect the following example (from the bug)
not to be that uncommon:

   void g (char *s)
   {
     char a[8];
     __builtin_strcpy (a, s);
     __builtin_memset (a, 0, sizeof a);
     f (a);
   }

or at least to be more common than the equivalent alternative
the improvement does optimize:

   void g (char *s)
   {
     char a[8];
     __builtin_memcpy (a, s,  __builtin_strlen (s));
     __builtin_memset (a, 0, 8);
     f (a);
   }

It seems that making the first one work should be just a matter
of handling strcpy analogously (the string length doesn't matter).

As an aside, the new tests make me realize that -Wstringop-overflow
should be enhanced to detect this problem (i.e., a consider
the largest size in a PHI).
Certainly not seeing much of these in the code I've looked at.  It may
be a case that aliasing gets in the way often.

Also note that we've got the same issues in this space that we did with
the strlen optimization improvements for last year.  I totally spaced
that and the net result may be we have to avoid using the type size for
anything in DSE which may make it impossible to really do a good job.

The issue with the strlen optimization was that it relied on
the type of subobjects of multidimensional arrays and structs
to determine the maximum valid size of the access to them.
This ran afoul of assumptions in code that doesn't respect
subobject boundaries.

This is not a concern for outermost objects like in the example
above.  strlen and other parts of GCC still make use of their
size for optimization.  The middle-end diagnostics I've been
adding still expect code to respect subobject boundaries.  I've
been doing that for this very reason: to let us do a better job
not just optimizing code, but also diagnosing bugs: find more
real problems with fewer false positives.

Martin


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]