This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Make strlen range computations more conservative


On 08/05/2018 11:27 AM, Richard Biener wrote:
On August 4, 2018 10:52:02 PM GMT+02:00, Martin Sebor <msebor@gmail.com> wrote:
On 08/03/2018 01:43 AM, Jakub Jelinek wrote:
On Thu, Aug 02, 2018 at 09:59:13PM -0600, Martin Sebor wrote:
If I call this with foo (2, 1), do you still claim it is not valid
C?

String functions like strlen operate on character strings stored
in character arrays.  Calling strlen (&s[1]) is invalid because
&s[1] is not the address of a character array.  The fact that
objects can be represented as arrays of bytes doesn't change
that.  The standard may be somewhat loose with words on this
distinction but the intent certainly isn't for strlen to traverse
arbitrary sequences of bytes that cross subobject boundaries.
(That is the intent behind the raw memory functions, but
the current text doesn't make the distinction clear.)
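To make the string-function versus raw-memory distinction concrete, here is a minimal sketch. The struct `rec` and both helper names are hypothetical and do not come from the patch under discussion. The first helper stays within a single member array; the second scans the whole object representation with memchr, which is the kind of byte-wise traversal the raw memory functions are meant for.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
struct rec { char a[4]; char b[4]; };

/* Length of the string stored in one array, never reading past
   its n elements.  Returns n when the array holds no NUL. */
size_t bounded_len(const char *arr, size_t n)
{
    const char *nul = memchr(arr, '\0', n);
    return nul ? (size_t)(nul - arr) : n;
}

/* Offset of the first zero byte in the whole object, treating the
   struct as raw bytes rather than as a character string. */
size_t first_nul_in_object(const struct rec *r)
{
    const unsigned char *bytes = (const unsigned char *)r;
    const unsigned char *nul = memchr(bytes, 0, sizeof *r);
    return nul ? (size_t)(nul - bytes) : sizeof *r;
}
```

With r.a holding four non-NUL characters, bounded_len (r.a, sizeof r.a) reports 4 instead of running on into r.b the way strlen (r.a) would.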

But the standard doesn't say that right now.

It does, in the restriction on multi-dimensional array accesses.
Given the array 'char a[2][2];' it's only valid to access a[0][0]
and a[0][1], and a[1][0], and a[1][1].  It's not valid to access
a[0][2] or a[0][3], even though they happen to be located at
the same addresses as a[1][0] and a[1][1].

There is no exception for distinct struct members.  So in
a struct { char a[2], b[2]; }, even though a and b are laid
out the same way as char[2][2] would be, it's not valid to
treat a as such.  There is no distinction between array
subscripting and pointer arithmetic, so it doesn't matter
what form the access takes.
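A small sketch of the accesses that stay within bounds under this reading. The names `two_arrays`, `sum_2d` and `sum_members` are made up for illustration. Each loop is bounded by the array it indexes, so neither function ever forms an access such as m[0][2] or s->a[2].

```c
#include <stddef.h>

/* Hypothetical struct with the same member layout as in the text. */
struct two_arrays { char a[2], b[2]; };

/* Sums a 2x2 array using only indices that are in bounds for
   each dimension: m[0][0], m[0][1], m[1][0], m[1][1]. */
int sum_2d(char m[2][2])
{
    int total = 0;
    for (size_t i = 0; i < 2; i++)
        for (size_t j = 0; j < 2; j++)
            total += m[i][j];
    return total;
}

/* Sums both members, reaching each one through its own name
   rather than by running an index past the end of a. */
int sum_members(const struct two_arrays *s)
{
    int total = 0;
    for (size_t i = 0; i < sizeof s->a; i++)
        total += s->a[i];
    for (size_t i = 0; i < sizeof s->b; i++)
        total += s->b[i];
    return total;
}
```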

What does the standard say about comparing &s.a[2] and &s.b[0], and what does that mean when you consider converting those to uintptr_t and back and then accessing the data pointed to?
Points-to analysis considers the first pointer to point to both subobjects while the second points only to the second. (Just pointing out another possible inconsistency within GIMPLE's handling of subobjects in points-to analysis.)

The text (since C99) says that such pointers compare equal.

This doesn't imply that it's intended to be valid to access
the adjacent object using the past-the-end pointer.  Making
this clear is one of the main goals of the (evolving)
provenance proposal.  Converting to uintptr_t isn't meant
to change that either (the provenance is preserved through
such conversions).
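The comparison and the integer round trip can be sketched as follows. This is only an illustration: `struct pair` and both function names are hypothetical, uintptr_t is an optional type in C, and the equality holds only when the implementation places b immediately after a with no padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct matching the members named in the question. */
struct pair { char a[2], b[2]; };

/* Compares the past-the-end pointer of a with the address of the
   first element of b.  Forming &s->a[2] is fine; the function
   never dereferences it. */
int past_end_equals_next(const struct pair *s)
{
    const char *end_a = &s->a[2];
    const char *begin_b = &s->b[0];
    return end_a == begin_b;
}

/* Converts a pointer to uintptr_t and back.  The result compares
   equal to the original pointer. */
const char *round_trip(const char *p)
{
    uintptr_t bits = (uintptr_t)(const void *)p;
    return (const char *)(const void *)bits;
}
```

Under the reading given above, a pointer derived from &s.a[2] still must not be used to read s.b[0], even though it compares equal to &s.b[0] and even after passing through round_trip.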

Martin

