This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)

From: Martin Sebor <msebor at gmail dot com>
To: Jeff Law <law at redhat dot com>, Gcc Patch List <gcc-patches at gcc dot gnu dot org>
Date: Tue, 17 Jan 2017 20:04:04 -0700
Subject: Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)
Authentication-results: sourceware.org; auth=none
References: <497e7848-5690-2c4e-7277-cab674a60a35@gmail.com> <d886be73-3b9e-011f-156b-b461f1ea994b@redhat.com> <e79b9029-8914-5a3f-3423-04ba5fd2900c@gmail.com> <661df20e-3446-e303-a125-80adf3a74359@redhat.com>

On 01/17/2017 10:57 AM, Jeff Law wrote:

On 01/17/2017 09:12 AM, Martin Sebor wrote:

On 01/17/2017 08:26 AM, Jeff Law wrote:

On 01/16/2017 05:06 PM, Martin Sebor wrote:

The test case submitted in bug 79095 - [7 regression] spurious
stringop-overflow warning shows that GCC optimizes some loops
into calls to memset with size arguments in excess of the object
size limit.  Since such calls will unavoidably lead to a buffer
overflow and memory corruption the attached patch detects them
and replaces them with a trap.  That both prevents the buffer
overflow and eliminates the warning.

But doesn't the creation of the bogus memset signal an invalid
transformation in the loop optimizer?  ie, if we're going to convert a
loop into a memset, then we'd damn well better be sure the loop bounds
are reasonable.


I'm not sure that emitting the memset call is necessarily a bug in
the loop optimizer (which in all likelihood wasn't written with
the goal of preventing or detecting possible buffer overflows).
The loop with the excessive bound is in the source code and can
be reached given the right inputs (calling v.resize(v.size() - 1)
on an empty vector.  It's a lurking bug in the program that, if
triggered, will overflow the vector and crash the program (or worse)
with or without the optimization.

Right, but that doesn't mean that the loop optimizer can turn it into a
memset.  If the bounds are such that we're going to invoke undefined
behaviour from memset, then the loop optimizer must leave the loop alone.


What else could the loop optimizer could do in this instance?
I suppose it could just leave the loop alone and avoid emitting
the memset call.  That would avoid the warning but mask the
problem with the overflow.  In my mind, preventing the overflow
given that we have the opportunity is the right thing to do.
That is, after all, the goal of the warning.

The right warning in this case is WRT the loop iteration space
independent of mem*.


I agree that warning for the user code would be appropriate if
the loop with the excessive bound were unavoidable or at least
reachable.  But in the submitted test case it isn't because
the call to vector::resize() is guarded.  In fact, the out of-
bounds memset is also emitted with this modified test case:

  void f (std::vector<int> &v)
  {
    size_t n = v.size ();

    if (n > 1 && n < 5)
      v.resize (n - 1);
  }

I believe the root cause of the the out-of-bounds memset is the
lack of support for pointer ranges or at least some notion of
their relationships and constraints.  A std::vector is defined
by three pointers that satisfy the following relation:

  start <= finish <= end_of_storage

Queries about the capacity and size of a vector are done in terms
of expressions involving these three pointers:

  size_type capacity () {
    return end_of_storage - start;
  }

  size_type size () {
    return finish - start;
  }

Space remaining is hand-coded as

  end_of_storage - finish

The trouble is that GCC has no idea about the constraints on
the three pointers and so it treats them essentially as unrelated
integers with no ranges (except for their scale that depends on
the type the pointer points to).

Absent support for pointer ranges, GCC would be able to generate
much better code with just some help from annotations telling it
about at least some of the constraints.  In this case, for example,
adding the following optimistic invariant:

   size_type space_left = end_of_storage - finish;

   if (space_left > SIZE_MAX / 2 / sizeof (T))
     __builtin_unreachable ();

to vector::_M_default_append() lets GCC avoid the memset and emit
significantly more efficient code.  On x86_64 it reduces the number
instructions for the test case by 40%.

Martin

References:
- [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)
  - From: Martin Sebor
- Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)
  - From: Jeff Law
- Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)
  - From: Martin Sebor
- Re: [PATCH] avoid calling memset et al. with excessively large sizes (PR 79095)
  - From: Jeff Law

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]