This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] warn on mem calls modifying objects of non-trivial types (PR 80560)
- From: Pedro Alves <palves at redhat dot com>
- To: Martin Sebor <msebor at gmail dot com>, Gcc Patch List <gcc-patches at gcc dot gnu dot org>, Jason Merrill <jason at redhat dot com>
- Date: Sun, 30 Apr 2017 21:02:53 +0100
- Subject: Re: [PATCH] warn on mem calls modifying objects of non-trivial types (PR 80560)
- Authentication-results: sourceware.org; auth=none
- References: <b729a519-fdef-1ad7-3e33-4b3d451d1a1a@gmail.com>
Hi Martin,
Thanks much for doing this. A few comments below, in light of my
experience doing the equivalent checks in the gdb patch linked below,
using standard C++11.
On 04/29/2017 09:09 PM, Martin Sebor wrote:
> Calling memset, memcpy, or similar to write to an object of
> a non-trivial type (such as one that defines a ctor or dtor,
> or has such a member) can break the invariants otherwise
> maintained by the class and cause undefined behavior.
>
> The motivating example that prompted this work was a review of
> a change that added to a plain old struct a new member with a ctor
> and dtor (in this instance the member was of type std::vector).
>
> To help catch problems of this sort some projects (such as GDB)
> have apparently even devised their own clever solutions to detect
> them: https://sourceware.org/ml/gdb-patches/2017-04/msg00378.html.
>
> The attached patch adds a new warning, -Wnon-trivial-memaccess,
> that has GCC detect these mistakes. The patch also fixes up
> a handful of instances of the problem in GCC. These instances
> correspond to the two patterns below:
>
> struct A
> {
> void *p;
> void foo (int n) { p = malloc (n); }
> ~A () { free (p); }
> };
>
> void init (A *a)
> {
> memset (a, 0, sizeof *a);
> }
>
> and
>
> struct B
> {
> int i;
> ~A ();
> };
(typo: "~B ();")
>
> void copy (B *p, const B *q)
> {
> memcpy (p, q, sizeof *p);
> ...
> }
>
IMO the check should be relaxed from "type is trivial" to "type is
trivially copyable" (which is what the gdb detection at
https://sourceware.org/ml/gdb-patches/2017-04/msg00378.html
uses for memcpy/memmove). Checking that the destination is trivial is
going to generate false positives -- specifically, [basic-types]/3
specifies that it's fine to memcpy trivially _copyable_ types, not
trivial types. A type can be both non-trivial and trivially copyable
at the same time. For example, this compiles, but triggers
your new warning:
#include <stdlib.h>
#include <string.h>
#include <type_traits>
struct NonTrivialButTriviallyCopyable
{
NonTrivialButTriviallyCopyable () : i (0) {}
int i;
};
static_assert (!std::is_trivial<NonTrivialButTriviallyCopyable>::value, "");
static_assert (std::is_trivially_copyable<NonTrivialButTriviallyCopyable>::value, "");
void copy (NonTrivialButTriviallyCopyable *dst, NonTrivialButTriviallyCopyable *src)
{
memcpy (dst, src, sizeof (*src));
}
$ /opt/gcc/bin/g++ -std=gnu++11 trivial-warn.cc -o trivial-warn -g3 -O0 -Wall -Wextra -c
trivial-warn.cc: In function ‘void copy(NonTrivialButTriviallyCopyable*, NonTrivialButTriviallyCopyable*)’:
trivial-warn.cc:16:34: warning: calling ‘void* memcpy(void*, const void*, size_t)’ with a pointer to a non-trivial type ‘struct NonTrivialButTriviallyCopyable’ [-Wnon-trivial-memaccess]
memcpy (dst, src, sizeof (*src));
^
$
Implementations of vector-like classes can very well (and are
encouraged) to make use of std::is_trivially_copyable to know whether
they can copy a range of elements to new storage
using memcpy/memmove/mempcpy.
Running your patch against GDB trips on such a case:
src/gdb/btrace.h: In function ‘btrace_insn_s* VEC_btrace_insn_s_quick_insert(VEC_btrace_insn_s*, unsigned int, const btrace_insn_s*, const char*, unsigned int)’:
src/gdb/common/vec.h:948:62: error: calling ‘void* memmove(void*, const void*, size_t)’ with a pointer to a non-trivial type ‘btrace_insn_s {aka struct btrace_insn}’ [-Werror=non-trivial-memaccess]
memmove (slot_ + 1, slot_, (vec_->num++ - ix_) * sizeof (T)); \
^
There is nothing wrong with the code being warned here.
While "struct btrace_insn" is trivial (has a user-provided default
ctor), it is still trivially copyable.
Now, this gdb code is using the old VEC (originated from
gcc's C days, it's not the current C++fied VEC implementation),
but the point is that any other random vector-like container out there
is free to optimize copy of a range of non-trivial but trivially
copyable types using memcpy/memmove.
Note that libstdc++ does not actually do that optimization, but
that's just a missed optimization, see PR libstdc++/68350 [1]
"std::uninitialized_copy overly restrictive for
trivially_copyable types". (libstdc++'s std::vector defers
copy to std::unitialized_copy.)
[1] - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68350
> These aren't undefined and the patch could be tweaked to allow
> them.
I think they're undefined because the types are not trivially
copyable (non-trivial destructor with side effects).
> I decided not to invest effort into it because, although
> not strictly erroneous, I think they represent poor practice.
I think that for my suggestion, all you really need is use
call trivially_copyable_p instead of trivial_type_p, for
memcpy/memmove/mempcpy at least.
For memset, I'd suggest to go the other direction actually, and
instead of relaxing the trivial check, to tighten it, by warning about
memset'ting objects of non-standard-layout type too. I.e., warn for
memset of all non-POD (non-trivial + non-standard-layout) types. That's
what I did in the gdb patch. That would also produce warnings for memset
of trivial types that via refactoring end up with references, like:
struct Trivial
{
Trivial () = default;
Trivial (int &i) : m_i (i) {}
void add (int howmuch) { m_i += howmuch; }
private:
int &m_i;
};
void reset (Trivial *triv)
{
memset (triv, 0, sizeof (Trivial));
}
void recompute (Trivial *triv)
{
reset (triv); // start over
triv->add (10); // whoops, null reference.
}
It's also warn for memset of trivial types that aren't standard
layout due to having a mix of public/protected/private fields,
which is likely not a real problem in practice, but I'd call
those a code smell that warrants a warning too:
struct S
{
private:
int a;
public:
int b;
};
S s;
memset (&s, 0, sizeof (S));
Playing with the patch, I noticed that you can't silence
it by casting the pointer to void, but you can with
casting to char, like:
void copy (B *dst, B *src)
{
memcpy (dst, src, sizeof (*src)); // warns
memcpy ((void*) dst, (void *) src, sizeof (*src)); // still warns
memcpy ((char*) dst, (char *) src, sizeof (*src)); // doesn't warn
memcpy ((unsigned char*) dst, (unsigned char *) src, sizeof (*src)); // doesn't warn
}
I can understand how we end up like that, given char's magic properties, but still
I think many will reach for "void" first if they (really really) need to add a cast to
silence the warning. In any case, I think it'd be very nice to add cases with such
casts to the new tests, and maybe mention it in the documentation too.
Thanks,
Pedro Alves