Bug 78174 - out of bounds array subscript in rtl.h NOTE_DATA macro
Summary: out of bounds array subscript in rtl.h NOTE_DATA macro
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 7.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-10-31 22:29 UTC by Martin Sebor
Modified: 2016-11-04 20:20 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2016-11-04 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Sebor 2016-10-31 22:29:59 UTC
As discussed in https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02498.html:

A patch to enhance buffer overflow warnings (and c/53562) exposed a problem in the definition of the NOTE_DATA macro defined in the GCC rtl.h header.  The macro expands to a reference to a non-existent element 3 of the one-element rtx_note::rtx_insn::rtx_def::u.fld array, like so:

  memset (&((note)->u.fld[3]), 0, sizeof (((note)->u.fld[3])));

note is a pointer to rtx_note.

The computed address, while outside the array, is within the boundaries of the larger rtx_note object in which the array is declared and so it is not invalid in and of itself, but deriving the address this way is undefined.  The patch, which enhances memcpy and other functions to detect and warn on writes past the end of an object (similarly to __builtin___memcpy_chk et al.), detects that the subscript is out of bounds of the array from which it was derived and issues a warning.  To avoid the warning the address should be computed/derived not from the array but rather from the surrounding object, for example like so:

  char *p = (char*) &(note)->u.fld[0];
  p += sizeof (((note)->u.fld[0])) * 3;
  memset (p, 0, sizeof *p);
Comment 1 Andrew Pinski 2016-10-31 22:45:05 UTC
I think the warning is not called for and here is why.  There is no way in both C89 and C++ (in C99 there is) to say the array at the end of a struct (even inside an union) is a variable length.  So GCC decided that any array at the end of a structure (even inside an union or another stucture) can be considered a variable length array.

See https://gcc.gnu.org/onlinedocs/gcc-6.2.0/gcc/Zero-Length.html .
Comment 2 Martin Sebor 2016-11-01 00:04:24 UTC
The problem can be reduced to the following test case that triggers the warning even with unpatched GCC (and prior releases).

$ cat b.c && g++ -O2 -S -Wall b.c
struct A { int i, j; };
struct B { int i0, j0, i1, j1, i2, j2, i3, j3, i4, j4; };

struct C {
  union {
    struct A a[1];
    struct B b;
  } u;
};

struct D: C { };

void f (struct D *d)
{
  struct A *p = &d->u.a[3];
  __builtin___memset_chk (p, 0, sizeof *p, __builtin_object_size (p, 1));
}
b.c: In function ‘void f(D*)’:
b.c:17:73: warning: call to void* __builtin___memset_chk(void*, int, long unsigned int, long unsigned int) will always overflow destination buffer
   __builtin___memset_chk (p, 0, sizeof *p, __builtin_object_size (p, 1));
                                                                         ^
Such code is undefined under the C++ (and C) standard rules.  (Curiously, GCC doesn't issue the warning for an equivalent function that takes struct C* as an argument.)

The same warning is also emitted when the memset call in emit-rtl.c is replaced with a __builtin___memset_chk call corresponding to the one above.  The patch that triggers the warning simply enables the same compile-time checking for memset as is done by __builtin___memset_chk with type-1 object size checking.

I see that Glibc uses type-0 checking for memset even with _FORTIFY_SOURCE=2, so the patch could either be changed to do the same or enhanced to let users specify the __builtin_object_size checking type, and perhaps even a different type for raw memory functions like memset and for string functions, to make it possible to fine-tune the checking to the state of the code base and avoid issuing warnings for this kind of code in legacy software.  I think those would be enhancements worth considering, but I also think that GCC itself should try to avoid relying on undefined behavior.

GCC provides zero-length arrays as a substitute for flexible array members.  In this case, the array has one (not zero) element.  It would be nice if changing it to a zero-element array would let GCC treat it as a flexible array member on steroids so that users would have to opt in to relying on an extension to avoid the warning here, but that's not the case.  I realize that GCC sometimes treats even one element arrays (and even bigger if they're last) as zero-length even though that's not documented anywhere (AFAIK), but it would be nice to get away from that.
Comment 3 Andrew Pinski 2016-11-01 00:12:16 UTC
(In reply to Martin Sebor from comment #2)
>  I realize
> that GCC sometimes treats even one element arrays (and even bigger if
> they're last) as zero-length even though that's not documented anywhere
> (AFAIK), but it would be nice to get away from that.

It was a decision not to document it as it was more of an extension which does not need to be documented (maybe it should be documented in an internals GCC documentation but it should not be documented in an external facing one as it is not recommended one to do).  Also the other reason why it needs to be work still is that old code needs to work and we support many other code.


Do we really have a class structure here in GCC's code?
Comment 4 Martin Sebor 2016-11-01 02:13:45 UTC
Yes, rtx_note derives from rtx_insn which derives from rtx_def which is where the union containing the one-element fld array is defined.
Comment 5 Jakub Jelinek 2016-11-01 14:08:34 UTC
Using __builtin_object_size (x, 1) for memset/memcpy etc. is just incorrect, don't do that.
Comment 6 Martin Sebor 2016-11-01 16:19:56 UTC
(In reply to Jakub Jelinek from comment #5)

Do you have an explanation of why it's "just incorrect?" or an example where it results in warning on valid code?

I have found another compiler that issues a warning for the same code.  When the test case from comment #2 is slightly modified and compiled with IBM XLC++ it produces the warning below.  A translation unit obtained from the emit-rtl.c file causes tons of such warnings with this compiler.

$ cat u.c && xlC -O2 -xc++ u.c
struct A { int i, j; };
struct B { int i0, j0, i1, j1, i2, j2, i3, j3, i4, j4; };

struct C {
  union {
    struct A a[1];
    struct B b;
  } u;
};

struct D: C { };

extern "C" void* memset (void*, const void*, unsigned long);
void f (struct D *d)
{
  struct A *p = &d->u.a[3];
  memset (p, 0, sizeof *p);
}
"u.c", line 16.25: 1540-2907 (W) The subscript 3 is out of range. The valid range is 0 to 0.

The IBM compiler also emits code that assumes there are no elements in the array beyond the first and programs that assume otherwise tend to behave unexpectedly.  For example, the following function only reads the value p->u.b.k once and not in every iteration of the loop, computing a different result than when it's compiled with GCC.

int f (struct D *p)
{
  int n = 0;

  for (int i = 0; i != p->u.b.k - 1; ++i, ++n)
    p->u.a [i + 1].a = 3;

  return n;
}

Since GCC is intended to be compiled by other compilers besides itself it should be written in portable C++ without relying on its own extensions, especially those that are undocumented like the one that's the subject of this bug.

In any event, since you seem to know abvout __builtin_object_size what the rest of us don't it would be helpful if you could document these restrictions so we know how use the built-in correctly.  Alternatively, if you write them up either here or in response to one of the other bugs I and others have opened for it over the years I'd be happy to update the manual myself.  That said, without a rationale for these restrictions comments like "it's just wrong, don't do it" aren't helpful.
Comment 7 Jakub Jelinek 2016-11-01 16:45:05 UTC
(In reply to Martin Sebor from comment #6)
> (In reply to Jakub Jelinek from comment #5)
> 
> Do you have an explanation of why it's "just incorrect?" or an example where
> it results in warning on valid code?

Only -D_FORTIFY_SOURCE=1 which always uses __bos (, 0) is fully complaint mode,
-D_FORTIFY_SOURCE=2 is a mode that imposes additional restrictions (both that
%n in *printf family can't be used in writable format strings and that str*/stp* functions can't cross field boundaries.
Doing memset (field_address, 0, sizeof (whole_struct) - offsetof (struct, field)); and similar is just so common that breaking it is not desirable even in that extra mode.  As in the warning you want to add (which I'm still not very happy about, because the important distinction where you just warn about something that may happen or not vs. warning where you warn that if you execute
this particular code path, you'll always __chk_fail () and abort the process is lost, plus there is no separate option or warning levels for those must fail and "must invoke UB" and "may invoke UB") doesn't know if the additional restrictions are imposed or not, except when __builtin_*_chk builtins are used,
IMNSHO you should use __bos (, 0) for all those other cases.

It is true that ideally we'd use flexible array members for u.fld, except that
C++ doesn't have them, and somebody decided it is a good idea to wrap them into further classes.  I know coverity is unhappy about that, perhaps a way out of this would be to just always use fld[10]; or whatever is the highest number of RTL operands (similarly for tree_exp).  But that doesn't change anything on that this is a very common technique used by tons of other programs and we do not want to warn about that.
Comment 8 Martin Sebor 2016-11-02 03:13:56 UTC
Okay, thanks.  Your comments seem to be focused on my patch and not so much on this problem that was exposed by it.  I realize I invited those comments with my response and do want to continue to have that discussion, but I want to have it separately from this issue.  Here I'd like to get a confirmation of just the NOTE_DATA problem so I can prepare patch for it.
Comment 9 Jakub Jelinek 2016-11-02 08:02:36 UTC
No, I've also explained what to do about the u.fld thing.
sed -n '/^DEF_RTL_EXPR/{s/DEF_RTL_EXPR([^,]*,[^,]*, "//;s/".*$//;p}' rtl.def | sort -u
shows we have at most 8 fields right now, so we could define some macro to 8 and replace
    rtunion fld[1];
with
    rtunion fld[RTL_FLD_MAX_COUNT];
plus some verification (selftest) that no RTL format not containing w is longer than that.
Similarly for
    HOST_WIDE_INT hwint[1];
We probably can't use real.h here, so again we have to live with some maximum, real.h shows
the longest supported REAL_WIDTH is 6.
Then in struct tree_exp we have:
  tree GTY ((special ("tree_exp"),
             desc ("TREE_CODE ((tree) &%0)")))
    operands[1];
which have I think maximum of 7 right now (again, we'd need macro and self test verification).
Then struct tree_omp_clause has:
   tree GTY ((length ("omp_clause_num_ops[OMP_CLAUSE_CODE ((tree)&%h)]")))
    ops[1];
and in that case the maximum is 5.
The question is if we ever just have a struct rtx_def, struct tree_exp etc. or some struct
that contains those as a field or class that has those as a base as a global variable or
automatic variable, such changes would change it obviously.  Most of it is allocated
through corresponding tree or rtl allocators and allocates offsetof (..., field) + length.

We'd be trading one kind of pedantic undefined behavior for another one I bet, right now
we often have larger objects than what the compiler can see and thus use the "flexible-array-like"
extension we probably don't document very well, but support nevertheless - too much code
in the wild would break otherwise.  And with the above proposed changes we'd turn it into something
different, the compiler would be told the structures are large, but in reality they would be shorter,
the compiler couldn't rely in all of the structure to be readable etc. (that actually is the case
now too, e.g. for RTLs with zero operands or OpenMP clauses with zero operands).

But it would shut up coverity and other static analyzers.  Perhaps we should do it only
conditionally (in some compilation modes) that would be enabled for those static analyzers.

GCC isn't alone that uses techniques like this and so warning about it is not really useful.
Comment 10 Jakub Jelinek 2016-11-02 08:06:15 UTC
Other such spots are in gimple.h (e.g. gimple_statement_with_ops, gimple_statement_with_memory_ops, gcall, gphi, gasm.  Most likely we have other spots with that.
Comment 11 Martin Sebor 2016-11-04 20:20:15 UTC
(In reply to Jakub Jelinek from comment #9)
> No, I've also explained what to do about the u.fld thing.

Okay, I take that as your confirmation that the bug should be fixed.  I'll leave the "how" to whomever is going to work on it.  The latest version of the patch referenced in comment #0 doesn't trigger the warning so fixing it isn't as pressing as when it did.