Input: extern int x; extern int a; extern int b; int f() { int y = x; return *(&y + (a + b)); } With -O3, trunk outputs: f(): movl x(%rip), %eax movl %eax, -4(%rsp) movl b(%rip), %eax addl a(%rip), %eax cltq movl -4(%rsp,%rax,4), %eax ret Clang, on the other hand, infers that (a + b) == 0: f(): # @f() movl x(%rip), %eax retq From richi on IRC: """ we don't do this kind of optimization at the moment a related one would be to place a if (a+b != 0) link_error (); after the memory access similarly for array accesses an if (i not-in-range) link_error () thus, we do not derive ranges for address-computation parts [at dereference sites] """
Confirmed.
Note for int x; ... *(&x + (a + b)) if x is common we need to take -funconstrained-commons into account, similarly for vars that end with flexible array members or similar arrays.
A simpler test case that shows the same optimization opportunity is: int f (int i) { extern int x; int y; return &y + i == &x; } A test case that both Clang and GCC get wrong is below. (This will be discussed at the next WG14 meeting -- see http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2222.htm#q23-can-one-do-comparison-between-pointers-to-objects-of-compatible-types-with-different-provenances-that-are-not-strictly-within-their-original-allocations) int g (int i) { int x, y, z; int *p = &x + i; int *q = &y; return p == q; }
On April 10, 2018 4:50:49 PM GMT+02:00, "msebor at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote: >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85315 > >Martin Sebor <msebor at gcc dot gnu.org> changed: > > What |Removed |Added >---------------------------------------------------------------------------- > CC| |msebor at gcc dot gnu.org > >--- Comment #3 from Martin Sebor <msebor at gcc dot gnu.org> --- >A simpler test case that shows the same optimization opportunity is: > > int f (int i) > { > extern int x; > int y; > return &y + i == &x; > } > >A test case that both Clang and GCC get wrong is below. (This will be >discussed at the next WG14 meeting -- see >http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2222.htm#q23-can-one-do-comparison-between-pointers-to-objects-of-compatible-types-with-different-provenances-that-are-not-strictly-within-their-original-allocations) > > int g (int i) > { > int x, y, z; > > int *p = &x + i; > int *q = &y; > > return p == q; > } If you eventually expect a true result then please no - this should be undefined.
(In reply to rguenther@suse.de from comment #4) ... > If you eventually expect a true result then please no - this should be > undefined. The second test case in comment #4 is currently well-defined in C17 (by 6.5.9, p6) and requires the function to return true if x and y are adjacent and i is +/-1. (I'm not defending the requirement.) There is a proposal for C2X to make the result of the equality comparison between a past-the-end pointer and another pointer unspecified instead (N2219). Would you prefer it to be undefined? (Making it undefined would be a rather drastic departure from the historical requirement and likely objectionable to the authors.) See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2219.htm (Scroll down to the Proposed Technical Corrigendum section at the end of the paper to read the proposed changes.)
On Tue, 10 Apr 2018, msebor at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85315 > > --- Comment #5 from Martin Sebor <msebor at gcc dot gnu.org> --- > (In reply to rguenther@suse.de from comment #4) > ... > > If you eventually expect a true result then please no - this should be > > undefined. > > The second test case in comment #4 is currently well-defined in C17 (by 6.5.9, > p6) and requires the function to return true if x and y are adjacent and i is > +/-1. (I'm not defending the requirement.) > > There is a proposal for C2X to make the result of the equality comparison > between a past-the-end pointer and another pointer unspecified instead (N2219). > Would you prefer it to be undefined? (Making it undefined would be a rather > drastic departure from the historical requirement and likely objectionable to > the authors.) Yes, it should be undefined. GCC treats it as undefined at the moment because otherwise points-to analysis cannot be used to optimize pointer equality tests at all which would be a major showstopper. Also consider the following: int main() { int a, b = 0; int *p = &a+1; if (p == &b) *p = 1; return b; } is that now valid and required to return 1 if b appens to be adjacent to a? points-to says p points to a and thus cannot be used to modify b. And IIRC *(&a+1) isn't valid but if &a+1 == &b isn't undefined but is required to return true if they are adjacent what sense does it make to _not_ allow a pointer that is equal to &b to access b? > See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2219.htm > (Scroll down to the Proposed Technical Corrigendum section at the end of the > paper to read the proposed changes.) It's not clear to me how "happens to immediately follow the first array object in the address space" not always contradicts "single provenance". &a+1 and &b have different provenance, no? Given GCC has leeway to pad declarations an implementation could choose to always pad variables by at least one byte and thus never making "happens to immediately follow the first array object in the address space" true? What's the intended use of that special case? Is it to make the access via p above valid? If not how is it useful to allow this kind of comparison? Users can always go via [u]intptr_t if they want to compare pointers of different "provenance" so why's it necessary to add an ability to do so without??? That just needessly complicates implementations IMHO. So that -f[no-]provenance stuff is bollocks IMHO, whoever invented it must be making sth up on a sheet of paper rather than addressing a real problem.
I asked Peter about that yesterday. The access to *p in your example is still meant to be undefined even under the proposed provenance rules. Here's his response: http://www.open-std.org/jtc1/sc22/wg14/15051 It's not yet entirely clear to me how to derive this answer from the proposed wording but if it can't be it's presumably just a bug that will be fixed. I think the intent of the "happens to immediately follow the first array object in the address space" was to allow implementations to return true for the (&a + i == &b) kind of equalities and the text just came out wrong (as a requirement rather than a permission). The text was added in C99 but I haven't yet been able to find the paper that added it. The implications of the "immediately follows the first array" sentence for equality having to return true would be at odds with the "single provenance" and so the result of the equality is only specified with -fno-provenance (otherwise it's unspecified). Regarding your suggestion for the equality being undefined, since it's well-defined in C and explicitly unspecified in C++, changing it to undefined in C would introduce an incompatibility with C++. That would go against WG14's goal (part of C2X charter) to align more closely with C++. To make it pass in WG14, C++ would most likely have to agree to change first. But I'll bring it up during the meeting. The rationale for the-f[no-]provenance option is discussed in N2119. It sounds like the authors think it disabling the provenance rules might make for more intuitive behavior and help avoid bugs. I'll also bring it at the meeting.
On Wed, 11 Apr 2018, msebor at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85315 > > --- Comment #7 from Martin Sebor <msebor at gcc dot gnu.org> --- > I asked Peter about that yesterday. The access to *p in your example is still > meant to be undefined even under the proposed provenance rules. Here's his Even with -fno-provenance? > response: http://www.open-std.org/jtc1/sc22/wg14/15051 It's not yet entirely > clear to me how to derive this answer from the proposed wording but if it can't > be it's presumably just a bug that will be fixed. > > I think the intent of the "happens to immediately follow the first array > object in the address space" was to allow implementations to return true for > the (&a + i == &b) kind of equalities and the text just came out wrong (as a > requirement rather than a permission). The text was added in C99 but I haven't > yet been able to find the paper that added it. OK, even GCC might end up computing true if we don't optimize but translate to a cmp instruction and var layout happens to be. Like for -O0 and int *p = &a + i; int *q = &b; if (p == q) ... but at -O1 we'd always compute it as false. > The implications of the "immediately follows the first array" sentence for > equality having to return true would be at odds with the "single provenance" > and so the result of the equality is only specified with -fno-provenance > (otherwise it's unspecified). > > Regarding your suggestion for the equality being undefined, since it's > well-defined in C and explicitly unspecified in C++, changing it to undefined > in C would introduce an incompatibility with C++. That would go against WG14's I think for the compiler "unspecified" works as well - we can still unconditionally optimize it to return false, correct? > goal (part of C2X charter) to align more closely with C++. To make it pass in > WG14, C++ would most likely have to agree to change first. But I'll bring it > up during the meeting. > > The rationale for the-f[no-]provenance option is discussed in N2119. It sounds > like the authors think it disabling the provenance rules might make for more > intuitive behavior and help avoid bugs. I'll also bring it at the meeting. The issue I see is that implementing -fno-provenance is impossible (well, we might be able to alias it to -O0 and disable all folding). But I don't have time to read up its full specification/rationale right now.
(In reply to rguenther@suse.de from comment #8) > > I asked Peter about that yesterday. The access to *p in your example is still > > meant to be undefined even under the proposed provenance rules. Here's his > > Even with -fno-provenance? With -fno-provenance the access would be valid under the proposed rules. > I think for the compiler "unspecified" works as well - we can still > unconditionally optimize it to return false, correct? Correct.
OK, so whats the deal here. I can't really follow what the final request, or action is. Is there a conclusion on what needs to be done? if anything?
(In reply to Andrew Macleod from comment #10) > OK, so whats the deal here. I can't really follow what the final request, or > action is. > > Is there a conclusion on what needs to be done? if anything? See the original description - the request was to derive ranges for the offset operand in pointer arithmetic based on the size of the object offsetted. In the simple example a plain 'int' which is (in GIMPLE) offsetted by 4 * (a + b) where we should derive that a + b is zero.
Maybe I'm a little dense. if we are presuming that &x + (a + b) implies a + b == 0, then we also should assume that &x + a implies a == 0 and if we can make those assumptions, then &x + 1 is garbage because we can assume 1 == 0. And if a and b are both unsigned, then I guess we can also assume a == b == MAX_UINT / 2 ? Now, if we decided to actually do this... I see IL: <bb 2> : x.0_1 = x; y = x.0_1; a.1_2 = a; b.2_3 = b; _4 = a.1_2 + b.2_3; _5 = (long unsigned int) _4; _6 = _5 * 4; _7 = &y + _6; The clear implications is that _6 == 0 in this expression? If we implemented that in the operator_pointer_plus::op1_range routine, and then were to back substitute, we'd get (_6)[0,0] = _5 * 4 -> _5 = [0,0] (_5)[0,0] = (long unsigned int) _4; -> _4 == [0,0] (_4)[0,0] = a.1_2 + b.2_3 which gives us nothing additional... Other than a potential relationship to track I suppose a.1_2 == -B.2_3 for signed, but it would record that _4 is [0,0] when we calculate an outgoing range. but regardless, its seems that another straightforward place to do this would be in statement folding? Isn't the basic assumption: _7 = &y + _6; implies _6 is always 0, which would enable us to fold this to _7 = &y then _6 is unused and the other statements would ultimately just go away. So why not make folding simply throw away the "+ _6" part because it is now being forced to be 0? We can't really assume that it is [0,0], but then not use that information to optimize?
(In reply to Andrew Macleod from comment #12) > Maybe I'm a little dense. > > if we are presuming that > &x + (a + b) > implies a + b == 0, then we also should assume that &x + (a + b) for scalar x doesn't imply a + b == 0, it implies a + b <= 1. Only when it is dereferenced, i.e. (&x)[a + b] is accessed a + b has to be 0.
Andrew, we discussed the same idea for built-in functions at Couldron. For instance, in: void f (const char *s, int n) { char a[8]; memcpy (a, s, n); // n must be in [0, 8] if (n < 0 || n > 8) // fold to false abort (); }
On November 18, 2020 3:55:44 PM GMT+01:00, amacleod at redhat dot com <gcc-bugzilla@gcc.gnu.org> wrote: >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85315 > >--- Comment #12 from Andrew Macleod <amacleod at redhat dot com> --- >Maybe I'm a little dense. > >if we are presuming that > &x + (a + b) >implies a + b == 0, then we also should assume that > > &x + a implies a == 0 > >and if we can make those assumptions, then >&x + 1 is garbage because we can assume 1 == 0. > >And if a and b are both unsigned, then I guess we can also assume a == >b == >MAX_UINT / 2 ? > > >Now, if we decided to actually do this... I see IL: > ><bb 2> : > x.0_1 = x; > y = x.0_1; > a.1_2 = a; > b.2_3 = b; > _4 = a.1_2 + b.2_3; > _5 = (long unsigned int) _4; > _6 = _5 * 4; > _7 = &y + _6; > >The clear implications is that _6 == 0 in this expression? > >If we implemented that in the operator_pointer_plus::op1_range routine, >and >then were to back substitute, we'd get >(_6)[0,0] = _5 * 4 -> _5 = [0,0] >(_5)[0,0] = (long unsigned int) _4; -> _4 == [0,0] >(_4)[0,0] = a.1_2 + b.2_3 which gives us nothing additional... Other >than a >potential relationship to track I suppose a.1_2 == -B.2_3 for signed, >but it >would record that _4 is [0,0] when we calculate an outgoing range. > >but regardless, its seems that another straightforward place to do this >would >be in statement folding? Isn't the basic assumption: > >_7 = &y + _6; >implies _6 is always 0, which would enable us to fold this to >_7 = &y >then _6 is unused and the other statements would ultimately just go >away. > >So why not make folding simply throw away the "+ _6" part because it is >now >being forced to be 0? We can't really assume that it is [0,0], but >then not >use that information to optimize? Well, clearly the zero case is degenerate but it extends to sth like int a[2] ; and &a + n. I guess you're already handling ARRAY_REF indices.
(In reply to Jakub Jelinek from comment #13) > (In reply to Andrew Macleod from comment #12) > > Maybe I'm a little dense. > > > > if we are presuming that > > &x + (a + b) > > implies a + b == 0, then we also should assume that > > &x + (a + b) for scalar x doesn't imply a + b == 0, it implies a + b <= 1. > Only when it is dereferenced, i.e. (&x)[a + b] is accessed a + b has to be 0. OK. certain things about this still confuse me, but thats OK for now. we'll come back to them. There seems to be 2 things at play here: 1) lhs = ptr + X has certain implications on X, it ptr is &scalar then X = [0, 1] * sizeof (&scalar)? 2) if lhs is later de-referenced, then X is known to have been [0,0]? We're ignoring the nonscalar cases of ptr for now, but Im guessing they are similar just the values for X are determined differently. 1) a) could be handled with something like a previously mentioned range_after_stmt() API for operands which are affected by statements. b) It can also be impacted by op2_range() during a wind back, but would likely require some tweaking of gimple_range_calc_op2() to determine that 'ptr' satisfied the criteria of this scalar condition. the a) solution would eliminate the necessity of this. 2) This one is a bit trickier. We cant use the non-null property since &x + blah will already make it non-null... so you are looking for an actual dereference of itself or an equivalence? I can probably tweak the non-null property manager to indicate whether the non-null property also contains a dereference.. but I guess you would need to know that ALL paths contain a dereference. We'd probably get most of what we're looking for if we simply check for a dereference of LHS in the same block? and then assert that X is 0. Is that he basic gist? Regardless, I'll come back to this eventually and get someone to clarify all the intricacies for me, IM just trying to understand the general requirements.