Bug 86259

Summary: [8 Regression] min(4, strlen(s)) optimized to strlen(s) with -flto
Product: gcc Reporter: Vladimir Panteleev <gcc>
Component: tree-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: NEW ---    
Severity: normal CC: bernd.edlinger, davmac, dimhen, egallager, jsm28, marxin, msebor, rguenth
Priority: P2 Keywords: lto, wrong-code
Version: 8.1.1   
Target Milestone: 8.4   
Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu
Build: x86_64-pc-linux-gnu Known to work: 9.0
Known to fail: Last reconfirmed: 2019-04-11 00:00:00

Description Vladimir Panteleev 2018-06-21 06:03:19 UTC
////////////////// test.c /////////////////
#include <stdio.h>
#include <string.h>

#define min(a, b) (((a) < (b)) ? (a) : (b))

char buf[32];

void fun1(char *s)
{
    memcpy(buf, s, min(4, strlen(s)));
    memcpy(buf, s, min(4, strlen(s)));
}

typedef struct
{
    char s[4];
    char s2;
} T;

void fun2(char* s)
{
    T *t = (T *) s;
    fun1(t->s);
}

int main()
{
    fun2("abcdefghijklmnopqrstuvwxyz");
    puts(buf);
    return 0;
}
///////////////////////////////////////////

Gives different results with `gcc test.c` and `gcc -O2 -flto test.c`.

The buffer in the example above fits the entire string in either case, but in the non-reduced application, this causes a heap buffer overflow.

Can be reproduced with 8.1.1 and current trunk (r261830).
Comment 1 Andrew Pinski 2018-06-21 06:21:40 UTC
This code is undefined. Try -fno-strict-aliasing .

But that might not even cause the undefined code to be resolved to bring defined as you are accessing outside the bounds of an array.
Comment 2 Andrew Pinski 2018-06-21 06:25:56 UTC
Note gcc thinks strlen(s) is less than or equal to 3 as s is really T.s which is an array of 4 in size and there for the last element has to be a null char.
Comment 3 Vladimir Panteleev 2018-06-21 06:42:39 UTC
(In reply to Andrew Pinski from comment #1)
> This code is undefined.

What's the problem? Might be just a bad reduction. Original code is Xorg:

https://cgit.freedesktop.org/xorg/xserver/tree/xkb/XKBGAlloc.c#n818
https://cgit.freedesktop.org/xorg/xserver/tree/xkb/xkb.c#n5140

>  Try -fno-strict-aliasing .

No effect.

(In reply to Andrew Pinski from comment #2)
> Note gcc thinks strlen(s) is less than or equal to 3 as s is really T.s
> which is an array of 4 in size and there for the last element has to be a
> null char.

(In reply to Andrew Pinski from comment #2)
> Note gcc thinks strlen(s) is less than or equal to 3 as s is really T.s
> which is an array of 4 in size and there for the last element has to be a
> null char.

The original code looks closer to this:

typedef struct
{
    unsigned char s[4];
    unsigned char t[4];
} T;

void fun2(char* s)
{
    T *t = (T *) s;
    fun1((char*)t->s);
}

> there for the last element has to be a null char

Why is that? Is that specific to "char"? Is there a signed 8-bit type without that property, then?

I can try to re-reduce, but I'm not sure what the restrictions are.
Comment 4 Vladimir Panteleev 2018-06-21 06:44:34 UTC
(In reply to Andrew Pinski from comment #2)
> Note gcc thinks strlen(s) is less than or equal to 3 as s is really T.s
> which is an array of 4 in size and there for the last element has to be a
> null char.

Why does that apply only to the second strcpy, then?
Comment 5 Vladimir Panteleev 2018-06-21 07:09:56 UTC
(In reply to Andrew Pinski from comment #2)
> Note gcc thinks strlen(s) is less than or equal to 3 as s is really T.s
> which is an array of 4 in size and there for the last element has to be a
> null char.

Hmm. Here is a simpler example which illustrates this:

/////////////////// test.c ///////////////////
#include <stdio.h>
#include <string.h>

struct S
{
    int x[1];
};

union U
{
    struct S arr[64];
    char s[256];
};

int main()
{
    union U u;
    strcpy(u.s, "abcdefghijklmnopqrstuvwxyz");
    size_t len = strlen((char*)&u.arr[1].x);
    puts(len > 10 ? "YES" : "NO");
    return 0;
}
//////////////////////////////////////////////

This prints "NO" with -O1 and above. clang always prints "YES".

Are you sure this is an optimization the compiler is allowed to make, though? I would think that the explicit cast to char* removes all bets as to how long the string really is.
Comment 6 Richard Biener 2018-06-21 09:34:16 UTC
(In reply to Vladimir Panteleev from comment #5)
> (In reply to Andrew Pinski from comment #2)
> > Note gcc thinks strlen(s) is less than or equal to 3 as s is really T.s
> > which is an array of 4 in size and there for the last element has to be a
> > null char.
> 
> Hmm. Here is a simpler example which illustrates this:
> 
> /////////////////// test.c ///////////////////
> #include <stdio.h>
> #include <string.h>
> 
> struct S
> {
>     int x[1];
> };
> 
> union U
> {
>     struct S arr[64];
>     char s[256];
> };
> 
> int main()
> {
>     union U u;
>     strcpy(u.s, "abcdefghijklmnopqrstuvwxyz");
>     size_t len = strlen((char*)&u.arr[1].x);
>     puts(len > 10 ? "YES" : "NO");
>     return 0;
> }
> //////////////////////////////////////////////
> 
> This prints "NO" with -O1 and above. clang always prints "YES".
> 
> Are you sure this is an optimization the compiler is allowed to make,
> though? I would think that the explicit cast to char* removes all bets as to
> how long the string really is.

the explicit conversion to char * is unimportant (strlen formal argument
is of type const char * already).  Indeed strlen may read any memory
and thus is not bound to type layout.

GCC optimizes this during CCP which nowadays uses get_range_strlen (),
IMHO indeed a questionable optimization we shouldn't perform.  The
optimization happens because of

  if (tree lhs = gimple_call_lhs (stmt))
    if (TREE_CODE (lhs) == SSA_NAME
        && INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
      set_range_info (lhs, VR_RANGE, minlen, maxlen);

and we compute maxlen to 3.

That function was designed for warnings we may not use it for optimization.
Comment 7 Richard Biener 2018-06-21 12:50:53 UTC
*** Bug 86265 has been marked as a duplicate of this bug. ***
Comment 8 Martin Sebor 2018-06-21 14:38:46 UTC
The code in comment #0 is undefined.  It's not valid to advance a pointer to A to point to B and dereference the pointer.  It doesn't matter if the pointer points to the outermost object or to one of its subobjects.  This is true for hand-rolled loops as much as for functions like strlen that take strings as arguments.  Likewise, the test case in comment #5 is undefined for the same reason (a char array of size one can only hold the empty string).
Comment 9 Martin Sebor 2018-06-21 14:54:35 UTC
There is an ongoing effort to clarify the provenance of pointers in C.  A recent proposal for such clarification is N2219:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2219.htm
I don't see anything in the paper that speaks directly to the test cases here but 2.3.1 and 2.3.2 come close.  There is a separate question of whether strings (as in arguments to string functions like strlen) extend to arbitrary seq
Comment 10 Martin Sebor 2018-06-21 14:56:43 UTC
(to complete the sentence from comment #9)

There is a separate question of whether strings (as in arguments to string functions like strlen) extend to arbitrary sequences of bytes regardless of the object type or whether they should be required to have a type of array of character type.
Comment 11 Richard Biener 2018-06-22 07:19:28 UTC
(In reply to Martin Sebor from comment #8)
> The code in comment #0 is undefined.  It's not valid to advance a pointer to
> A to point to B and dereference the pointer.  It doesn't matter if the
> pointer points to the outermost object or to one of its subobjects.  This is
> true for hand-rolled loops as much as for functions like strlen that take
> strings as arguments.  Likewise, the test case in comment #5 is undefined
> for the same reason (a char array of size one can only hold the empty
> string).

I am not convinced.  You say that

 struct { int a; int b; } s, s2;
 memcpy (&s2.a, &s.a, sizeof (s));

is invalid, aka not copying the whole structure since you pass in a
pointer to s2.a rather than s2?  Alternatively

 struct { int x; int a; int b; } s, s2;
 memcpy (&s2.a, &s.a, sizeof (s) - sizeof (int));

is similarly invalid, not copying the last two elements?

I don't see how str* are in anyway less special than mem* here in case you
come up with an exception for mem*.

At least from a QOI perspective (and GIMPLE representation perspective!)
claiming the above are invoking undefined behavior is plain wrong.
Comment 12 Richard Biener 2018-06-22 07:24:01 UTC
Btw, GCC still claims to support C89, so please cite appropriate C89 wording.

To GCC internals while a direct access like

 a.b.c[i].d

has to remain inside bounds the variant via a pointer

 p = &a.b.c[i].d;
 ... = *p;

does not have such restrictions.  Thus once a function call is involved
like strlen (&a.b.c[0].d), the actual access is via a pointer (in the
strlen implementation) and thus GCC doesn't (and may not) assume the
access stays inside bounds.

This is the very reason that we avoid propagating the pointer example above
to

  ... = *(&a.b.c[i].d) = a.b.c[i].d;

because to GCC that is not an identity transform semantically.
Comment 13 joseph@codesourcery.com 2018-06-22 16:38:41 UTC
Well, C90 TC1 adds (to the informative Annex listing undefined behavior) 
the case of array subscripts being out of range even if they wouldn't be 
simply based on the top-level object, such as a[1][7] given the 
declaration int a[4][5].  And of course array references are defined in 
terms of equivalent pointer arithmetic, so that's *(*(a+1) + 7).  So 
certainly for array objects, once you've got to a[1] or *(a+1) - as 
opposed to &a[1] or &*(a+1) - you can't then get something from a[2] from 
there.  (But e.g. &a[1] + 1 is a perfectly valid way of writing &a[2].  
And at least some of the examples here may be closer to that.)
Comment 14 Martin Sebor 2018-06-24 20:38:47 UTC
> You say that
> 
>  struct { int a; int b; } s, s2;
>  memcpy (&s2.a, &s.a, sizeof (s));
> 
> is invalid, aka not copying the whole structure since you pass in a
> pointer to s2.a rather than s2?

Yes.  It's invalid for the same reason as the following:

  int *p = &s.a;
  int *q = &s2.a;

  *q = *p;    // okay
  ++q, ++p;   // okay
  *q = *p;    // undefined #1
  ++q, ++p;   // undefined #2

In s.a and s.b are effectively arrays of one element, i.e., int[1].  As Joseph explained in comment #13, pointer arithmetic is defined only for and within each array object, irrespective of whether the array is itself an element of another array or a member of a struct.  So above, #1 is undefined because it accesses a past-the-end element of an array, and #2 is undefined because the only valid arithmetic on a past-the-end pointer is to subtract one from it.

This is specified in 6.5.6 Additive operators, p8 of C11.  In C89, the corresponding text is in section 3.3.6.  The same restriction applies to all library functions, including memcpy and strcpy.

> Alternatively
> 
>  struct { int x; int a; int b; } s, s2;
>  memcpy (&s2.a, &s.a, sizeof (s) - sizeof (int));
> 
> is similarly invalid, not copying the last two elements?

Yes.  It's invalid for the same reasons as above.  This would be valid:

  struct S { int x; int a; int b; } s, s2;
  memcpy ((char*)&s2 + offseof (struct S, a),
          (char*)&2 + offseof (struct S, a),
          sizeof (s) - offseof (struct S, a));

This is valid because &s and &s2 point to the objects s and s2.  Any object can be interpreted as an array of char with a size equal to the object itself and the pointer arithmetic is valid anywhere within the object.

> 
> I don't see how str* are in anyway less special than mem* here in case you
> come up with an exception for mem*.

In the standard, the restrictions above apply equally to expressions as well as library functions.  For better or worse, GCC treats memxxx() functions differently than strxxx() functions because of legacy invalid code that relies on the second memcpy example working.  This is evident from the effects of _FORTIFY_SOURCE.  For example, the following gets a warning when compiled with _FORTIFY_SOURCE=1 and aborts with a buffer overflow error at runtime:

  #include <string.h>

  int main (void)
  {
    struct {
      char a[4];
      char b[3];
    } a;

    a[0] = '0';
    strcpy (a.a + 1, "12345");

    __builtin_puts (a.a);
  }

In function ‘strcpy’,
    inlined from ‘main’ at c.c:12:5:
/usr/include/bits/string3.h:110:10: warning: ‘__builtin___memcpy_chk’ writing 6 bytes into a region of size 3 overflows the destination [-Wstringop-overflow=]
...
*** buffer overflow detected ***: ./a.out terminated

(The region of 3 is the amount of space in the a.a array at offset 1.)

but because the special treatment applies only to the memxxx functions and not to string functions, the equivalent program using memcpy() compiles without a warning and runs successfully to completion:

  #include <string.h>

  int main (void)
  {
    struct {
      char a[4];
      char b[3];
    } a;

    a[0] = '0';
    memcpy (a.a + 1, "12345", 6);

    __builtin_puts (a.a);
  }

Because the special treatment applies only to raw memory functions like memcpy and memmove and not to string functions like strcpy or strlen, none of the test cases here is valid even under GCC's relaxed rules.
Comment 15 Marc Glisse 2018-06-25 04:26:03 UTC
(In reply to Martin Sebor from comment #14)
> > You say that
> > 
> >  struct { int a; int b; } s, s2;
> >  memcpy (&s2.a, &s.a, sizeof (s));
> > 
> > is invalid, aka not copying the whole structure since you pass in a
> > pointer to s2.a rather than s2?
> 
> Yes.

Let's give the struct a name and introduce some casts

typedef struct { int a; int b; } S;
S s, s2;
memcpy ((S*)&s2.a, (S*)&s.a, sizeof(s));

The standard makes it valid to convert &s.a to S* and makes it equivalent to &s. Because of the call to memcpy, it gets converted to void* afterwards. So you are saying that (void*)&s.a is not the same as (void*)(S*)&s.a (note that no arithmetic is involved at this point). This looks like it is going to cause trouble...
Comment 16 rguenther@suse.de 2018-06-25 08:16:40 UTC
On Mon, 25 Jun 2018, glisse at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86259
> 
> --- Comment #15 from Marc Glisse <glisse at gcc dot gnu.org> ---
> (In reply to Martin Sebor from comment #14)
> > > You say that
> > > 
> > >  struct { int a; int b; } s, s2;
> > >  memcpy (&s2.a, &s.a, sizeof (s));
> > > 
> > > is invalid, aka not copying the whole structure since you pass in a
> > > pointer to s2.a rather than s2?
> > 
> > Yes.
> 
> Let's give the struct a name and introduce some casts
> 
> typedef struct { int a; int b; } S;
> S s, s2;
> memcpy ((S*)&s2.a, (S*)&s.a, sizeof(s));
> 
> The standard makes it valid to convert &s.a to S* and makes it equivalent to
> &s. Because of the call to memcpy, it gets converted to void* afterwards. So
> you are saying that (void*)&s.a is not the same as (void*)(S*)&s.a (note that
> no arithmetic is involved at this point). This looks like it is going to cause
> trouble...

Similarly for (char *) conversions as performed by str* calls,
6.3.2.3/7 applies which says "When a pointer to _an object_ is
converted to a pointer to a character type, the result points
to the lowest addressed byte of the object.  Successive increments
of the result, up to the size of the object, yield pointers to the
remaining bytes of the object."  Now, "object" is defined as
"region of data storage in the execution environment, the contents of
which can represent values" and 6.5.3.2 suggests that & of an lvalue
points to the "object" designated by that lvalue.  That again
makes (char *)&*(S *)&s2.a not the same as (char *)&s2.a
semantically.

As a general comment I find it disturbing that the user
is required to write (char *)&s2 + offsetof(S, a) instead
of plain &s2.a (whatever offset a actually has, thus considering
my second example as well) when calling str* or mem* routines.
At least it re-inforces my doubt on any address-computation to
component-ref folding we still have (plenty!).

That said, as of QOI I seriously doubt taking advantage of
the above for optimization purposes is a good idea given
there isn't even a flag like -fwrapv or -fno-strict-aliasing
to disable it!
Comment 17 Martin Sebor 2018-06-25 16:57:34 UTC
> Let's give the struct a name and introduce some casts
> 
> typedef struct { int a; int b; } S;
> S s, s2;
> memcpy ((S*)&s2.a, (S*)&s.a, sizeof(s));
> 
> The standard makes it valid to convert &s.a to S* and makes it equivalent to
> &s. Because of the call to memcpy, it gets converted to void* afterwards. So
> you are saying that (void*)&s.a is not the same as (void*)(S*)&s.a (note
> that no arithmetic is involved at this point). This looks like it is going
> to cause trouble...

That code like this exists out there is one of the reasons why GCC allows memcpy() to cross subobject boundaries, so there is no trouble there(*).  That the code is invalid shouldn't be surprising: replacing the members with one-element arrays (as C treats singleton objects) and replacing the memcpy with direct assignments to the array elememts has triggered a -Warray-bounds since GCC 4.6:

  typedef struct { int a[1], b[1]; } S;
  S s, s2;

  void f (void)
  {
    s2.a[1] = s.a[1];   // warning: array subscript 1 is above array bounds of ‘int[1]’ [-Warray-bounds]
  }

But this report (or the related 86265) isn't about memcpy.  It's about strlen() accessing elements of multiple consecutive subobjects.  As I showed, using string functions to cross subobject boundaries has been diagnosed by GCC since 2005 (GCC 4.3) and has affected the generated code with _FORTIFY_SOURCE since then.  Whenever possible (and except for the memxxx functions), GCC diagnoses not just writes but also reads that cross subobject boundaries, either by -Wstringop-overflow (for built-in functions) or by -Warray-bounds (for direct accesses and for some uses of built-ins).  It doesn't diagnose such invalid reads by strlen() yet but starting with GCC 8 the -Wstringop-truncation warning makes an effort to catch some before they happen.  GCC 9 will try to detect some such reads even by strlen and strnlen (see for example bug 86199).

The appropriate mechanism for accessing memory across subobject boundaries and irrespective of object types are the raw memory functions like memcpy and memchr (with memchr(..., 0, ...) being the function to use to compute the same result as strlen).  I would recommend respecting subobject boundaries and starting with the address of the enclosing object when using those as well but for now GCC does not diagnose violations.  Uses of all other (typed) functions should respect object sizes and types.

[*] I'd like to see GCC start diagnosing questionable code like that to drive code changes that will make it easier to do better analysis, detect more bugs, and ultimately emit even more efficient object code.  It is one of the goals of the effort by Peter Sewell and others to clarify C object model to make this possible.
Comment 18 Martin Sebor 2018-06-25 18:03:56 UTC
> Similarly for (char *) conversions as performed by str* calls,
> 6.3.2.3/7 applies which says "When a pointer to _an object_ is
> converted to a pointer to a character type, the result points
> to the lowest addressed byte of the object.  Successive increments
> of the result, up to the size of the object, yield pointers to the
> remaining bytes of the object."

The intent of this text is to require implementations to store objects at consecutive memory locations (as opposed to, for instance, interleaving them somehow).  It's not a license for programs to disregard restrictions on accessing objects and their subobjects outlined later in the text, such as those in 6.5.6 for additive operators.

> As a general comment I find it disturbing that the user
> is required to write (char *)&s2 + offsetof(S, a) instead
> of plain &s2.a (whatever offset a actually has, thus considering
> my second example as well) when calling str* or mem* routines.

It's a simple rule: to access a part/member of an object via a pointer, the pointer must be derived from the address of the object (and not from one of its subobjects).  This applies recursively to subobjects.

> That said, as of QOI I seriously doubt taking advantage of
> the above for optimization purposes is a good idea given
> there isn't even a flag like -fwrapv or -fno-strict-aliasing
> to disable it!

There are many micro-optimizations in GCC that rely on the absence of undefined behavior in a program that cannot be disabled by an option (such as those in gimple-fold).  I have made a lot of effort to detect and diagnose the undefined behavior when possible, but more often than not, optimization is given preference to detecting undefined behavior, even if it means generating invalid/undefined code.  For example, even at -O0 GCC emits code that writes past the end of the array below even though the buffer overflow is trivially detectable.  There is no option to avoid it.  It wasn't until GCC 8 and -Wrestrict that this has been diagnosed (only with optimization), and more as a happy accident than as a result of a conscious effort.  If GCC avoided folding the invalid memcpy() call into a MEM_REF the overflow would be detected and prevented by _FORTIFY_SOURCE.

  char a[1];

  void f (const void *s)
  {
    __builtin_memcpy (a, s, 4);
  }

A different example that isn't diagnosed even today as a result of folding is the one below.  GCC issues no warning here because the call to strcpy() is folded into memcpy() which is allowed to cross subobject boundaries.  If the folder instead took member sizes into consideration and avoided folding strcpy() into memcpy() the bug would be detected.  Despite that, there is no option to disable this obviously questionable choice.

  struct S {
    char d[4];
    void (*f)();
  } s;

  void f (void)
  {
    __builtin_strcpy (s.d, "1234567");
  }

These examples aren't meant as an argument against changing things to detect (and prevent) these kinds of bugs.  The point is to illustrate that the strlen() case we have been discussing is by no means unique.  The big difference between these examples and the strlen() case is that the former are trivial to do something about, while the latter will be much harder.
Comment 19 Davin McCall 2018-07-11 10:19:36 UTC
(In reply to rguenther@suse.de from comment #16)
> On Mon, 25 Jun 2018, glisse at gcc dot gnu.org wrote:
> 
> As a general comment I find it disturbing that the user
> is required to write (char *)&s2 + offsetof(S, a) instead
> of plain &s2.a

Even worse, the proposal doesn't mention the provenance of "offsetof" at all. If the result of offsetof has no provenance even the long form won't work. That seems like a big oversight to me. Also, code doing something like the following can't be terribly uncommon:

    type struct { int a; int b; } s;
    s s_o;
    // start with address of contained member:
    char *b_cp = &s_o.b;
    // subtract its offset:
    char *s_o_cp = b_cp - offsetof(s,b);
    s * s_o_p = (s *) s_o_cp;
    // access the containing object:
    s s2 = *s_o_p;

It would only work generally if an offsetof result is given wildcard provenance.
Comment 20 Davin McCall 2018-07-11 10:46:41 UTC
(In reply to Davin McCall from comment #19)
> [...] If the result of offsetof has no provenance even the long form won't
> work.

"no provenance" meaning "empty provenance", and of course this is not actually correct; shouldn't have posted before coffee. However, if it had provenance of the member, that would be problematic. Having provenance of the compound object (s2') would be ok for Richard Biener's example of ((char *)&s2 + offsetof(S, a)) but not for my extended example of determining the address of the containing object using a pointer to a member object and offsetof.
Comment 21 Davin McCall 2018-07-12 22:55:10 UTC
Looking at this further, the proposal actually states, for the address-of operator:

> When the operand designates an object, the result has the single provenance of the outermost object containing that object.

That's "outermost" object; it implies that taking the address of an inner/contained object, and manipulating it to point at other parts of the containing object, should actually be fine (adding an integer offset with empty provenance should not affect the provenance of the pointer, according to the proposal). Martin Sebor: doesn't that contradict what you said in comment #8 ? In any case it seems it should allow the case I was concerned about, i.e calculating the containing object address from a contained object address.

While we can agree that it is anyway not allowed to advance a pointer past the end of an array, including an "array" consisting of a single object not actually declared as an array, surely casting the pointer to an integer type should get around that problem - but doesn't, in the program below, for which GCC 8.1 bizarrely generates code that prints "NO" (indicating that it has determined that len != 7) and then returns 7 (indicating that len == 7). Clearly this could only be "correct" if there is undefined behaviour - though it is somewhat bad handling even then - however I cannot see the U.B. in this program and no warnings are generated (which is at least a QOI issue). Note that by the provenance proposal the 'sp_ip' variable should have the provenance of the containing object, 'u', and so when cast to char * should be perfectly capable of navigating the entire union object:

---8>---
#include <stdio.h>
#include <string.h>
#include <stddef.h>
#include <stdint.h>

struct S {
    char a[4];
    char b[4];
    char c[4];
};

union U {
    struct S s;
    char xx[12];
};

int main()
{
    union U u;
    u.s = (struct S){0, 0, 0};
    char *bp = u.s.b;
    uintptr_t sp_ip = (uintptr_t)bp - offsetof(struct S,b);
    strcpy(u.xx, "abcdefghijk");
    size_t len = strlen((char *)(union U *)sp_ip + 4);
    puts(len == 7 ? "YES" : "NO");
    return len;
}
---<8---
Comment 22 Martin Sebor 2018-07-13 00:21:08 UTC
In areas where the authors of the proposal find the standard open to interpretation and when they feel it doesn't contradict the surveyed implementation practice they tend to suggest to tighten the requirements on implementations (I think they surveyed mainly Clang and GCC) to make code valid that may be questionable today.  Their implementation survey isn't comprehensive and so in some cases they may be suggesting changes that would invalidate some optimizations.  It's not entirely clear to me that this is one such case -- they may only be thinking of allocated storage and not auto/static objects as suggested in 2.3.3 Q9b in N2263.

WG14 takes a different view from the authors: where we agree that the standard is unclear we would like to tighten the requirements on programs to allow better analysis, better optimization, and better detection of bugs.  WG14 has formed a study group to try to come up with the next revision of the proposal that's closer to WG14's goal.

With respect to objects and their subobjects, the existing requirements are sufficiently clear and existing practice shows that compilers have been relying on those requirements for some time (GCC well over a decade).  For example:

struct S { char a[4], b[4]; };

void f (struct S *p, int i)
{
  if (i < 4) i = 4;
  char b = p->b[0];
  p->a[i] = 0;            // assumed not to change p->b (undefined otherwise)
  if (p->b[0] != b)       // folded to false
    __builtin_abort ();   // eliminated
}
In function ‘f’:
warning: array subscript 4 is above array bounds of ‘char[4]’ [-Warray-bounds]
   p->a[i] = 0;            // assumed not to change p->b (undefined if it did)
   ~~~~^~~

;; Function f (f, funcdef_no=0, decl_uid=1960, cgraph_uid=0, symbol_order=0)

f (struct S * p, int i)
{
  <bb 2> [local count: 1073741825]:
  i_6 = MAX_EXPR <i_2(D), 4>;
  p_4(D)->a[i_6] = 0;
  return;

}

Besides GCC, Intel ICC also performs the same optimization.

The test cases in this report are variations on this theme.  The only difference is that they use built-in functions to access the elements of the distinct subobjects rather than accessing them directly.  GCC has just extended the optimization above to a subset of calls of built-in functions.  Besides strlen(), here's another example from GCC 7:

struct S { char a[4], b[4]; };

void f (struct S *p, int i)
{
  int n = __builtin_snprintf (0, 0, "%s", p->a);   // n must be between 0 and 3
  if (n > 3)                                       // folded to false
    __builtin_abort ();                            // eliminated
}
Comment 23 Davin McCall 2018-07-13 08:15:19 UTC
(In reply to Martin Sebor from comment #22)
> The test cases in this report are variations on this theme. [...]

Ok, except that the one I posted in comment #21 specifically copies the string into a union member which is long enough to contain it, and while it takes the address of a subobject from the other union member, it does so while that member is active, and it casts to uintptr_t before subtracting the offset (so as to obtain a pointer to the containing object in a way that doesn't involve advancing a pointer beyond the bounds of the object it points into). It even casts this back to the union type before casting to (char *) again. At that stage it either:

- points at the union object itself and its active member, which is a char[12], or
- points at the union object but not its active member
- points at the union object (and possibly its active member) but dereference is illegal due to provenance rules.

The 3rd case would be greatly disturbing to myself and, I think, to many others; it would mean that you cannot meaningfully obtain a pointer to a containing object from a contained member other than the first one.

The 1st case would mean that GCC is in error in compiling that code, since it gives the wrong result.

Only the 2nd case avoids both those issues, but only if we allow that strlen on (part of) a non-char[] object has undefined behaviour even if the relevant portion of that object contains a suitably-sized char[] as a subobject in the relevant range. That seems tenuous and certainly not directly supported by the wording of the current specification, unless I've missed something.
Comment 24 Martin Sebor 2018-07-14 00:26:06 UTC
The code in example #21 has the same bug:

    union U u;
    u.s = (struct S){0, 0, 0};

    char *bp = u.s.b;   // <<< bp points to u.s.b

    uintptr_t sp_ip = (uintptr_t)bp - offsetof(struct S,b);   // sp_ip has u.s.b's provenance

    strcpy(u.xx, "abcdefghijk");
    size_t len = strlen((char *)(union U *)sp_ip + 4);   // still the same provenance

    puts(len == 7 ? "YES" : "NO");

The strlen call is undefined because (char*)sp_ip is known to point just past the last element of u.s.b.  It wouldn't matter if there happened to be a valid string at that address -- there isn't in this case because what's there is a char[4] with no terminating NUL.  The pointer wasn't derived from that address.  The pointer was derived from u.s.b and points to u.s.b + sizeof u.s.b, and there can never be anything valid beyond the end of an object. 

Compile the test case with -fdump-tree-fre1=/dev/stdout to see what GCC sees:

  bp.0_1 = (long unsigned int) &u.s.b;
  sp_ip_9 = bp.0_1 + 18446744073709551612;
  MEM[(char * {ref-all})&u] = MEM[(char * {ref-all})"abcdefghijk"];
  _4 = __builtin_strlen (&u.s.b);

The rule to keep in mind is that pointer arithmetic is only valid within the boundaries of the smallest subobject it points to.  This applies to structs as much as arrays.  Just like it's not valid to increment a pointer from a[0][1] to a[1][0] and dereference the latter in 'char a[2][2]; it's not valid to increment a pointer to one struct member to point to another and dereference it.
Comment 25 rguenther@suse.de 2018-07-14 09:02:24 UTC
On July 14, 2018 2:26:06 AM GMT+02:00, "msebor at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86259
>
>--- Comment #24 from Martin Sebor <msebor at gcc dot gnu.org> ---
>The code in example #21 has the same bug:
>
>    union U u;
>    u.s = (struct S){0, 0, 0};
>
>    char *bp = u.s.b;   // <<< bp points to u.s.b
>
> uintptr_t sp_ip = (uintptr_t)bp - offsetof(struct S,b);   // sp_ip has
>u.s.b's provenance
>
>    strcpy(u.xx, "abcdefghijk");
> size_t len = strlen((char *)(union U *)sp_ip + 4);   // still the same
>provenance
>
>    puts(len == 7 ? "YES" : "NO");
>
>The strlen call is undefined because (char*)sp_ip is known to point
>just past
>the last element of u.s.b.  It wouldn't matter if there happened to be
>a valid
>string at that address -- there isn't in this case because what's there
>is a
>char[4] with no terminating NUL.  The pointer wasn't derived from that
>address.
>The pointer was derived from u.s.b and points to u.s.b + sizeof u.s.b,
>and
>there can never be anything valid beyond the end of an object. 
>
>Compile the test case with -fdump-tree-fre1=/dev/stdout to see what GCC
>sees:
>
>  bp.0_1 = (long unsigned int) &u.s.b;
>  sp_ip_9 = bp.0_1 + 18446744073709551612;
>  MEM[(char * {ref-all})&u] = MEM[(char * {ref-all})"abcdefghijk"];
>  _4 = __builtin_strlen (&u.s.b);
>
>The rule to keep in mind is that pointer arithmetic is only valid
>within the
>boundaries of the smallest subobject it points to.  This applies to
>structs as
>much as arrays.  Just like it's not valid to increment a pointer from
>a[0][1]
>to a[1][0] and dereference the latter in 'char a[2][2]; it's not valid
>to
>increment a pointer to one struct member to point to another and
>dereference
>it.

Istr the proposal suggests a -fno-provenance option. How would we handle these cases with that?
Comment 26 Bernd Edlinger 2018-07-14 10:18:36 UTC
Hmmm, does this imply that
the "container_of" macro in linux/include/kernel.h will be broken:

/**
 * container_of - cast a member of a structure out to the containing structure
 * @ptr:        the pointer to the member.
 * @type:       the type of the container struct this is embedded in.
 * @member:     the name of the member within the struct.
 *
 */
#define container_of(ptr, type, member) ({                              \
        void *__mptr = (void *)(ptr);                                   \
        BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&   \
                         !__same_type(*(ptr), void),                    \
                         "pointer type mismatch in container_of()");    \
        ((type *)(__mptr - offsetof(type, member))); })


Or is the arithmetic on void * exempt from this undefined behavior?
Comment 27 Davin McCall 2018-07-14 10:43:18 UTC
(In reply to Martin Sebor from comment #24)
> The code in example #21 has the same bug:
> [...]

... due to provenance, you are claiming, if I understand correctly. But I don't see anything in the current language standard that allows or even supports this reasoning (perhaps I'm missing it). For the other examples you can say that the result of the pointer arithmetic is not defined (because it is not specified by 6.5.6). But for this case, the pointer was cast to an integer type before any arithmetic was performed.

> The strlen call is undefined because (char*)sp_ip is known to point just
> past the last element of u.s.b.

It actually points at the first element of u.s.b - we start with &u.s.b, subtract the offset of that element from the container object (the offset will be 4), then add 4. I don't think this by itself invalidates what you have said, though.

>  It wouldn't matter if there happened to be
> a valid string at that address -- there isn't in this case because what's
> there is a char[4] with no terminating NUL.

That is true only if "address" means something more than "pointer value". I can assert that ((char *)sp_ip + 4) and (u.xx + 4) are equal before the strlen, and the compiler optimises away the assert. Furthermore, there is definitely a valid string at u.xx + 4 and therefore at ((char *)&u) + 4. The provenance rules you're suggesting lead to the conclusion that I can check (via an '==' comparison) if a pointer refers to a particular object, and find that it does, but then invoke undefined behaviour when dereferencing it [*]. While there may be changes in the committee pipeline that would make this the case, in the language as defined now I don't see how this interpretation can be justified.

[*] or if such a pointer comparison would also be undefined, i could anyway cast both pointers to an integer type and compare them then.

>  The pointer wasn't derived from
> that address.  The pointer was derived from u.s.b and points to u.s.b +
> sizeof u.s.b, and there can never be anything valid beyond the end of an
> object.

(It points at u.s.b, actually).

> 
> [...]  Just like it's not valid to increment a pointer from
> a[0][1] to a[1][0] and dereference the latter in 'char a[2][2]; it's not
> valid to increment a pointer to one struct member to point to another and
> dereference it.

Again, there was no pointer arithmetic (other than the line containing 'strlen', but that particular case the pointer has the address of the union object, which has been cast to (char *), and the '+ 4' should be valid then, surely, by 6.3.2.7 paragraph 7 (ignoring that it requires 'successive increments' rather than arbitrary addition, or is that supposed to be significant?).

I believe I understand the point of the provenance rules, but I do not think it is right to implement provenance as transferring to integers, on-by-default, in a compiler for the current language specification.
Comment 28 Bernd Edlinger 2018-07-14 13:58:35 UTC
(In reply to Davin McCall from comment #27)
> Again, there was no pointer arithmetic (other than the line containing
> 'strlen', but that particular case the pointer has the address of the union
> object, which has been cast to (char *), and the '+ 4' should be valid then,
> surely, by 6.3.2.7 paragraph 7 (ignoring that it requires 'successive
> increments' rather than arbitrary addition, or is that supposed to be
> significant?).

Can someone explain why the example in comment #21 works when
pointer arithmentic instead of integer arithmetic is used?

char *sp_ip = (char *)bp - offsetof(struct S,b);
strcpy(u.xx, "abcdefghijk");
size_t len = strlen((char *)(union U*)sp_ip + 4);
puts(len == 7 ? "YES" : "NO");

prints "YES"
Comment 29 Martin Sebor 2018-07-14 17:25:27 UTC
(In reply to rguenther@suse.de from comment #25)
> 
> Istr the proposal suggests a -fno-provenance option. How would we handle
> these cases with that?

The proposal is still being discussed and it's not clear how it will evolve or if the option will survive.  I'm not sure if the handling of these cases would be consistent under the option.  I imagine some would be expected to work as if whole objects were just arrays of bytes once provenance were not considered after some non-trivial pointer expression were involved, while others would still be undefined because of the array rule which is unrelated to provenance (e.g., indexing past the end of an array).  I haven't absorbed the proposals enough yet to say for sure where the line is, and even the authors are still forming their opinion on some of these cases.

(In reply to Bernd Edlinger from comment #26)
> Hmmm, does this imply that
> the "container_of" macro in linux/include/kernel.h will be broken:

The macro is broken today because it relies on undefined behavior: advancing a pointer from one subobject to another.  The latest revision of the proposal discusses some ideas that might make this and other similar examples work (e.g., a "function" such as a built-in that would make the compiler either lose the provenance of a pointer or assign it a different provenance without changing its value).  Some people have suggested that casts might make it work.

This isn't new.  Just like it's not valid to take a pointer to one array and advance it to the next and dereference it, it's not valid to take a pointer to a struct member, advance it to point to another member, and then derefernce it.  Given the following definition, the call f(2) is undefined and GCC eliminates the test on that basis.  The same rule applies to struct members.

  char a[2][2];

  void f (int i)
  {
    char c = a[1][0];
    char *p = &a[0][i];
    *p = 1;             // can only change the array a[0], not a[1]
    if (c != a[1][0])   // folded to false because a[0][i] is only defined when i is zero
      __builtin_abort ();
  }

To make it "work" this way you need to convince the compiler the two-dimensional 2 X 2 matrix is really a one dimensional 4-element array.  The following works with GCC but it's still undefined so I wouldn't recommend relying on it.  I imagine making the pointer volatile would always work (but it's still undefined).

void g (int i)
{
  char (*p)[4] = a;   // -Wincompatible-pointer-types here

  char c = (*p)[2];
  char *q = &(*p)[i];
  *q = 1;
  if (c != (*p)[2])   // not folded
    __builtin_abort ();
}
Comment 30 Martin Sebor 2018-07-14 20:07:09 UTC
(In reply to Bernd Edlinger from comment #28)
> 
> Can someone explain why the example in comment #21 works when
> pointer arithmentic instead of integer arithmetic is used?

Because the optimization (making use of the size of the referenced array) doesn't apply in the pointer case.  In the integer case, ccp simplifies the strlen argument to a COMPONENT_REF:

  Visiting statement:
  _2 = sp_ip_9 + 4;
  which is likely CONSTANT
  Match-and-simplified sp_ip_9 + 4 to bp.0_1
  Lattice value changed to CONSTANT bp.0_1.  Adding SSA edges to worklist.
  marking stmt to be not simulated again

  Visiting statement:
  _3 = (const char *) _2;
  which is likely CONSTANT
  Match-and-simplified (const char *) _2 to &u.s.b
  Lattice value changed to CONSTANT &u.s.b.  Adding SSA edges to worklist.
  marking stmt to be not simulated again

The COMPONENT_REF fully describes the structure of an access to a member and so lends itself to interesting analysis which then opens up opportunities for both optimization and bug detection (e.g., buffer overflow).

In the pointer case ccp replaces the argument with with a MEM_REF:

  Visiting statement:
  _1 = sp_ip_7 + 4;
  which is likely CONSTANT
  Lattice value changed to CONSTANT &MEM[(void *)&u + 4B].  Adding SSA edges to worklist.
  marking stmt to be not simulated again

A ME_REF is a concise but low-level way of referencing memory via a base address an an offset.  It doesn't include reliable information about the structure of the referenced memory.  It's easier to do some basic things with but much harder to use for interesting, higher level analysis.  By folding expressions into MEM_REF early on, GCC effectively disables subsequent optimizations that are designed to do interesting things at a higher level.
Comment 31 Jakub Jelinek 2018-07-26 10:58:59 UTC
GCC 8.2 has been released.
Comment 32 Richard Biener 2018-11-20 15:03:41 UTC
I am testing

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 34c3799bc39..8dc93354dbf 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1373,7 +1373,7 @@ get_range_strlen (tree arg, tree length[2], bitmap *visited, int type,
        {
          if (TREE_CODE (arg) == ADDR_EXPR)
            return get_range_strlen (TREE_OPERAND (arg, 0), length,
-                                    visited, type, fuzzy, flexp,
+                                    visited, type, 0 /* fuzzy */, flexp,
                                     eltsize, nonstr);
 
          if (TREE_CODE (arg) == ARRAY_REF)

and similar hunks for the SSA edge following.  I'll XFAIL diagnostic/optimization fallout.
Comment 33 Martin Sebor 2018-11-20 16:30:05 UTC
The following patch relaxes the optimization and handles the test suite fallout:
  https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00129.html
It just needs to be rebased on top of the current trunk and some minor tweaking to respond to Jeff's concerns.  I plan to update and resubmit it soon.
Comment 34 Jakub Jelinek 2019-02-22 15:19:09 UTC
GCC 8.3 has been released.
Comment 35 Bernd Edlinger 2019-02-22 16:58:13 UTC
at least gcc 9 has been fixed.
Comment 36 Richard Biener 2019-04-11 11:44:24 UTC
Reconfirmed on the branch.  Not sure what rev. fixed it on the trunk.  Not working on this.
Comment 37 Martin Liška 2019-04-11 12:10:54 UTC
Fixed on trunk in r262522.