[PATCH 1/2] add -Wuse-after-free

Martin Sebor msebor@gmail.com
Wed Nov 3 00:22:06 GMT 2021

On 11/2/21 4:29 PM, David Malcolm wrote:
> On Mon, 2021-11-01 at 16:17 -0600, Martin Sebor via Gcc-patches wrote:
>> Patch 1 in the series detects a small subset of uses of pointers
>> made indeterminate by calls to deallocation functions like free
>> or C++ operator delete.  To control the conditions the warnings
>> are issued under the new -Wuse-after-free= option provides three
>> levels.  At the lowest level the warning triggers only for
>> unconditional uses of freed pointers and doesn't warn for uses
>> in equality expressions.  Level 2 warns also for come conditional
>> uses, and level 3 also for uses in equality expressions.
>> I debated whether to make level 2 or 3 the default included in
>> -Wall.  I decided on 3 for two reasons: 1) to raise awareness
>> of both the problem and GCC's new ability to detect it: using
>> a pointer after it's been freed, even only in principle, by
>> a successful call to realloc, is undefined, and 2) because
>> it's trivial to lower the level either globally, or locally
>> by suppressing the warning around such misuses.
>> I've tested the patch on x86_64-linux and by building Glibc
>> and Binutils/GDB.  It triggers a number of times in each, all
>> due to comparing invalidated pointers for equality (i.e., level
>> 3).  I have suppressed these in GCC (libiberty) by a #pragma,
>> and will see how the Glibc folks want to deal with theirs (I
>> track them in BZ #28521).
> For reference, this is:
>    https://sourceware.org/bugzilla/show_bug.cgi?id=28521
>> The tests contain a number of xfails due to limitations I'm
>> aware of.  I marked them pr?????? until the patch is approved.
>> I will open bugs for them before committing if I don't resolve
>> them in a followup.
> [...snip...]
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index c5730228821..eb4ecb56dcc 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -4341,6 +4341,60 @@ annotations.
>>   Warn about overriding virtual functions that are not marked with the
>>   @code{override} keyword.
>> +@item -Wuse-after-free
>> +@itemx -Wuse-after-free=@var{n}
>> +@opindex Wuse-after-free
>> +@opindex Wno-use-after-free
>> +Warn about uses of pointers to dynamically allocated objects that
> have
>> +been rendered indeterminate by a call to a deallocation function.
>> +
>> +@table @gcctabopt
>> +@item -Wuse-after-free=1
>> +At level 1 the warning attempts to diagnose only unconditional uses
> of
>> +pointers made indeterminate by a deallocation call.  This includes
>> +double-@code{free} calls.  Although undefined, uses of indeterminate
>> +pointers in equality (or inequality) expressions are not diagnosed
> at
>> +this level.
>> +@item -Wuse-after-free=2
>> +At level 2, in addition to unconditional uses the warning also
> diagnoses
>> +conditional uses of pointers made indeterminate by a deallocation
> call.
>> +As at level 1, uses in (or inequality) equality expressions are not
>> +diagnosed.  For example, the second call to @code{free} in the
> following
>> +function is diagnosed at this level:
>> +@smallexample
>> +struct A @{ int refcount; void *data; @};
>> +
>> +void release (struct A *p)
>> +@{
>> +  int refcount = --p->refcount;
>> +  free (p);
>> +  if (refcount == 0)
>> +    free (p->data);   // warning: p may be used after free
>> +@}
>> +@end smallexample
>> +@item -Wuse-after-free=3
>> +At level 3, the warning also diagnoses uses of indeterminate
> pointers in
>> +equality expressions.  All uses of indeterminate pointers are
> undefined
>> +but equality tests sometimes appear after calls to @code{realloc} as
>> +an attempt to determine whether the call resulted in relocating the
> object
>> +to a different address.  They are diagnosed at a separate level to
> aid
>> +legacy code gradually transition to safe alternatives.  For example,
>> +the equality test in the function below is diagnosed at this level:
>> +@smallexample
>> +void adjust_pointers (int**, int);
>> +
>> +void grow (int **p, int n)
>> +@{
>> +  int **q = (int**)realloc (p, n *= 2);
>> +  if (q == p)
>> +    return;
>> +  adjust_pointers ((int**)q, n);
>> +@}
>> +@end smallexample
>> +@end table
>> +
>> +@option{-Wuse-after-free=3} is included in @option{-Wall}.
> Recapping our chat earlier today, I confess to not being familiar with
> this aspect of the C standard, but IIRC you were saying that the
> pointer passed in to "realloc" is always "indeterminate" after a
> successful call, and indeed I see on:
>    https://en.cppreference.com/w/c/memory/realloc
> "The original pointer ptr is invalidated and any access to it is
> undefined behavior (even if reallocation was in-place)."
> Does this really mean that it's undefined to use the original "p" after
> an in-place reallocation (as opposed to "*p"?).  I find this
> surprising, and I think many of our users would too.

Yes, it's undefined to use the pointer after a successful
reallocation (it's okay to use it after a failed one, although
as discussed in WG14 DR400, even that can be problematic due
to a lack of clarity in the standards).

One rationale for it is to make it possible to implement realloc
with no in-place extension, as a sequence of three calls: malloc,
memcpy, and free.  Another is that using invalidated pointers is
inherently unsafe and the C committee wanted to leave room for
debugging hardware that traps on the use of such pointers
(US7966480 describes one such solution).  Such solutions are
only feasible when all invalid pointers are treated the same.

> If that's the case, is there a standards-conforming way for code to
> distinguish between a successful in-place vs a successful moving
> realloc?  How would a user "transition [their code] to safe
> alternatives"?  (the docs should probably spell that out).  i.e. what's
> an idiomatic way for the user to write this logic correctly?

There's no portable mechanism to detect whether a realloc call
moved an object.  Strictly conforming programs must avoid it.
The idiomatic way of avoiding it is to store offsets rather
than pointers to reallocated memory.

> How is this behavior unsafe, or how could it become unsafe?  If it's
> merely a theoretical concern, I think level 2 would be better for -Wall
> (or the default).

It's difficult to quantify how unsafe this or that undefined
construct is.  I think it's probably comparable to passing null
pointers and zero size to memcpy (diagnosed by -Wnonnull, also
in -Wall).  Or to passing T* to %p in a call to printf where T
isn't void (diagnosed by -Wformat, in -Wall).  Or to [ab]using
uninitialized variables as a source of entropy.

It's a judgment call what checkers to include at what level of
what option.  I made this choice not just because it's undefined
but also because it could help find mistakes in nearby code (as
in the Binutils case below or ideally more impactful).  Because
it's little known I'm hoping even the benign instances will help
raise awareness of the issue.  But I made it a separate level
to make it easy for users to suppres, and for us to adjust if

> Have you seen level 3 catch anything that *isn't* a usage of realloc
> checking success for in-place resizing vs moving?

The Glibc use in ldconfig.c is a use after free.  But it's just
sloppy coding, not a "real" bug.  It also found one instance in
Binutils (libctf/ctf-open-bfd.c).  That one is also not really
a bug, more like a thinko:

       isymbuf = bfd_elf_get_elf_syms (abfd, symhdr, symcount, 0,
				      NULL, symtab, NULL);
       free (isymbuf);
       if (isymbuf == NULL)
	  bfderrstr = N_("cannot read symbol table");
	  goto err_free_sym;

The call to free should follow the if statement.

I expect it to find plenty more benign misues like this in other
code, but what I'm really hoping of course is that it will help
find actual bugs, even if indirectly.

> [...snip...]
> Hope this is constructive; sorry about my ignorance here.
> Dave



More information about the Gcc-patches mailing list