#include <stdint.h> int x, y; int main(void) { uintptr_t px = (uintptr_t) &x; uintptr_t py = (uintptr_t) &y; volatile uintptr_t d = px - py; uintptr_t p = py + d; x = 1; *(int *) p = 2; return x; } gcc 4.6(20110603) returns 1 at -O1 or higher. configure options: --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --prefix=/usr --sysconfdir=/etc --program-suffix=-4.6 --enable-languages=c,c++ --enable-checking --enable-build-with-cxx As far as I can see, this program is perfectly valid and is required to return 2. gcc seems to be optimising on the assumption that an addition to &y will not result in a pointer to a distinct object (and so stores 2 in y), but that assumption is only correct for a pointer addition, which the above is not.
A similar example, but one which does not convert the integer back to a pointer: #include <stdio.h> #include <stdlib.h> int a; int main() { unsigned long b = (unsigned long) &a - 134518548; volatile unsigned long c = b; if (c == 0) { if (b != 0) abort (); } printf("%lu\n", c); return a; } This should never abort. It aborts on my system (with -m32).
I can confirm the original testcase with r173419 (not the one from comment#1). The tree level optimizers handle this ok and we expand from <bb 2>: px_1 = (uintptr_t) &x; py_2 = (uintptr_t) &y; d.0_3 = px_1 - py_2; d ={v} d.0_3; d.1_4 ={v} d; p_5 = d.1_4 + py_2; x = 1; # PT = { x y } (glob) p.2_6 = (int *) p_5; *p.2_6 = 2; D.2722_7 = x; return D.2722_7; where you can also see that we correctly assume that p.2_6 may point either x or y. CSE1 optimizes the load from x to 1.
Ugh. base_alias_check for (symbol_ref:DI ("x") <var_decl 0x7ffff1a32000 x>) and (plus:DI (reg:DI 62 [ d.1 ]) (symbol_ref:DI ("y") <var_decl 0x7ffff1a320a0 y>)) returns 0. I'm afraid dropping the base_alias_check call would significantly penalize generated code, after all we still sometimes have MEM accesses without MEM_EXPR. Perhaps we could rely on REG_POINTER bits, extend them to SYMBOL_REF too and only do return NULL from find_base_term if SYMBOL_REF doesn't have SYMBOL_REF_POINTER bit set. During CSE etc., when replacing a pseudo with its definition REG_POINTER/SYMBOL_REF_POINTER would be kept only if it is set on the pseudo being optimized away as well. Thus, any casts from pointers to correspondingly sized integers and back would be seen in the RTL.
On Thu, 9 Jun 2011, jakub at gcc dot gnu.org wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49330 > > Jakub Jelinek <jakub at gcc dot gnu.org> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |jakub at gcc dot gnu.org > > --- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-06-09 10:04:28 UTC --- > Ugh. > base_alias_check for > (symbol_ref:DI ("x") <var_decl 0x7ffff1a32000 x>) > and > (plus:DI (reg:DI 62 [ d.1 ]) > (symbol_ref:DI ("y") <var_decl 0x7ffff1a320a0 y>)) > returns 0. I'm afraid dropping the base_alias_check call would significantly > penalize generated code, after all we still sometimes have MEM accesses without > MEM_EXPR. Perhaps we could rely on REG_POINTER bits, extend them to SYMBOL_REF > too and only do return NULL from find_base_term if SYMBOL_REF doesn't have > SYMBOL_REF_POINTER bit set. During CSE etc., when replacing a pseudo with its > definition REG_POINTER/SYMBOL_REF_POINTER would be kept only if it is set on > the pseudo being optimized away as well. Thus, any casts from pointers to > correspondingly sized integers and back would be seen in the RTL. Maybe restrict the base_alias_check to var-decls that do not have their address taken? Points-to analysis should conver them. Richard.
The testcase also fails similarly without the volatile qualification of d when compiling with -O -fno-tree-fre -fno-tree-forwprop -fno-tree-reassoc
find_base_value/find_base_term seem to be overly optimistic, prefer to return something over being conservative. E.g. if both operands of PLUS or MINUS return non-NULL find_base_term, we just return the first one, etc. It doesn't differentiate between "certainly not based on something" - e.g. CONST_INT, regs only initialized by them and only incremented by constant etc. would be such and unknown. While the overly optimistic implementation might be useful in some cases, at least for base_alias_check it is inappropriate.
The alias.c machinery is clearly based on the fundamental assumption of pointer arithmetics, i.e. that you aren't allowed to compute a difference unless both pointers point to the same enclosing object. The testcase is a nice attempt at smuggling the violation of this rule behind a uintptr_t based manipulation. Not clear what to do IMO. Richard is proposing to leverage the escape sites: px_1 = (uintptr_t) &x; py_2 = (uintptr_t) &y; but this will pessimize. Ideally we should leverage the one problematic line: p.2_6 = (int *) p_5; which creates the pointer out of the integer, but it has already disappeared in the very first RTL representation.
(In reply to comment #7) > The alias.c machinery is clearly based on the fundamental assumption of pointer > arithmetics, i.e. that you aren't allowed to compute a difference unless both > pointers point to the same enclosing object. The testcase is a nice attempt at > smuggling the violation of this rule behind a uintptr_t based manipulation. > > Not clear what to do IMO. Richard is proposing to leverage the escape sites: > > px_1 = (uintptr_t) &x; > py_2 = (uintptr_t) &y; > > but this will pessimize. Ideally we should leverage the one problematic line: > > p.2_6 = (int *) p_5; > > which creates the pointer out of the integer, but it has already disappeared in > the very first RTL representation. Creating that pointer is perfectly valid - you are allowed to cast a pointer to an uintptr_t and back, which is what the code does (in some obfuscated way of course). p.2_6 = (int *) ((uintptr_t) &y + ((uintptr_t) &x - (uintptr_t) &y))) which is, as is trivially obvious, (int *) (uintptr_t) &x. Consider int foo(uintptr_t a) { uintptr_t px = (uintptr_t) &x; uintptr_t py = a; volatile uintptr_t d = px - py; uintptr_t p = py + d; x = 1; *(int *) p = 2; return x; } int main() { foo(&y); } which is equivalent.
> Creating that pointer is perfectly valid - you are allowed to cast a > pointer to an uintptr_t and back, which is what the code does (in some > obfuscated way of course). No disagreement, "problematic" only in the sense that it breaks the assumption of the aliasing machinery, as you're creating a pointer out of nothing.
(In reply to comment #9) > > Creating that pointer is perfectly valid - you are allowed to cast a > > pointer to an uintptr_t and back, which is what the code does (in some > > obfuscated way of course). > > No disagreement, "problematic" only in the sense that it breaks the assumption > of the aliasing machinery, as you're creating a pointer out of nothing. Indeed. I fixed similar bugs in tree points-to analysis, like int x, y; int foo (int *p) { int *q = &x; int *p = &y; int i; for (i=0; i<sizeof(int *); ++i) ((char *)&q)[i] = ((char *)&p)[i]; *p = 1; *q = 2; return *p; } which I suspect might work on RTL only by accident (at -O3 with the loop unrolled).
*** Bug 52517 has been marked as a duplicate of this bug. ***
Re-confirmed on trunk.
*** Bug 86026 has been marked as a duplicate of this bug. ***
Reconfirmed.
IMHO as RTL drops the difference between pointers and integers (of the same mode) it has to drop the assumption that pointer arithmetic has to stay inside a pointed-to object similar to how it has to drop reliance on undefined overflow for optimization since it drops the notion of signedness. The other choice may seem to take REG_POINTER setting conservative and only have the assumption on REG_POINTER regs. (and thus make sure to set REG_POINTER and MEM_POINTER conservatively, which may be a difficult task on its own)
What do you think about the suggestion made in the most recent duplicate, namely expanding GIMPLE pointer-to-integer casts to non-transparent RTL assignments, i.e. going from val = (intptr_t) ptr; to asm ("" : "=g" (rval) : "0" (rptr)); Wouldn't this plug the hole in one shot instead of chasing down missing REG_POINTERs in multiple RTL passes?
On Mon, 4 Jun 2018, amonakov at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49330 > > Alexander Monakov <amonakov at gcc dot gnu.org> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |amonakov at gcc dot gnu.org > > --- Comment #16 from Alexander Monakov <amonakov at gcc dot gnu.org> --- > What do you think about the suggestion made in the most recent duplicate, > namely expanding GIMPLE pointer-to-integer casts to non-transparent RTL > assignments, i.e. going from > > val = (intptr_t) ptr; > > to > > asm ("" : "=g" (rval) : "0" (rptr)); > > Wouldn't this plug the hole in one shot instead of chasing down missing > REG_POINTERs in multiple RTL passes? I suspect such assignments are simply too common and doing this would be worse than not assuming pointer arithmetic cannot cross different objects at the RTL level. How do you for example treat an aggregate memcpy of a structure containing pointers RTL expanded to say word-wise (integer-mode!) copies and then RTL afterwards figuring out reg-reg dependence chains from that. You have to realize that you'd have to introduce those "barriers" during RTL optimization itself... in which case you're back at sth like REG_POINTER. I think it would be nice to isolate this assumption on RTL and put it behind a user-crontrollable flag. That would allow benchmarking this.
So for find_base_term to compute sth conservative we'd need to track RTX_SURELY_NON_POINTER (what RTX is surely _not_ based on a pointer and thus can be ignored). And when find_base_term ever figures two bases in say a PLUS it has to conservatively return 0. I fear the existing REG_POINTER does not help at all. For the testcase we have (plus:DI (reg:DI 83 [ d.0_2 ]) (symbol_ref:DI ("y") [flags 0x2] <var_decl 0x7ffff7fefb40 y>)) where reg:DI 83 is not marked with REG_POINTER and find_base_term doesn't find it to be an alternate base. For the testcase the offending MEM has a MEM_EXPR and we have proper points-to info. IMHO the proper solution is to kill base_alias_check or all problematic cases in find_base_term (binary ops with more than one non-CONST_INT operand). And eventually make sure to more properly preserve MEM_EXPRs. Maybe sth as "simple" as the following which of course fixes the testcase but will make find_base_term fail on any variable-indexed thing. diff --git a/gcc/alias.c b/gcc/alias.c index 93f53543d12..3a66e10b431 100644 --- a/gcc/alias.c +++ b/gcc/alias.c @@ -2009,12 +2009,14 @@ find_base_term (rtx x, vec<std::pair<cselib_val *, rtx base = find_base_term (tmp1, visited_vals); if (base != NULL_RTX && ((REG_P (tmp1) && REG_POINTER (tmp1)) - || known_base_value_p (base))) + || known_base_value_p (base)) + && CONST_INT_P (tmp2)) return base; base = find_base_term (tmp2, visited_vals); if (base != NULL_RTX && ((REG_P (tmp2) && REG_POINTER (tmp2)) - || known_base_value_p (base))) + || known_base_value_p (base)) + && CONST_INT_P (tmp1)) return base; /* We could not determine which of the two operands was the
(In reply to Richard Biener from comment #18) > So for find_base_term to compute sth conservative we'd need to track > RTX_SURELY_NON_POINTER (what RTX is surely _not_ based on a pointer > and thus can be ignored). And when find_base_term ever figures > two bases in say a PLUS it has to conservatively return 0. > > I fear the existing REG_POINTER does not help at all. For the testcase > we have > > (plus:DI (reg:DI 83 [ d.0_2 ]) > (symbol_ref:DI ("y") [flags 0x2] <var_decl 0x7ffff7fefb40 y>)) > > where reg:DI 83 is not marked with REG_POINTER and find_base_term > doesn't find it to be an alternate base. For the testcase the > offending MEM has a MEM_EXPR and we have proper points-to info. > > IMHO the proper solution is to kill base_alias_check or all problematic > cases in find_base_term (binary ops with more than one non-CONST_INT > operand). > > And eventually make sure to more properly preserve MEM_EXPRs. > > Maybe sth as "simple" as the following which of course fixes the > testcase but will make find_base_term fail on any variable-indexed > thing. > > diff --git a/gcc/alias.c b/gcc/alias.c > index 93f53543d12..3a66e10b431 100644 > --- a/gcc/alias.c > +++ b/gcc/alias.c > @@ -2009,12 +2009,14 @@ find_base_term (rtx x, vec<std::pair<cselib_val *, > rtx base = find_base_term (tmp1, visited_vals); > if (base != NULL_RTX > && ((REG_P (tmp1) && REG_POINTER (tmp1)) > - || known_base_value_p (base))) > + || known_base_value_p (base)) > + && CONST_INT_P (tmp2)) > return base; > base = find_base_term (tmp2, visited_vals); > if (base != NULL_RTX > && ((REG_P (tmp2) && REG_POINTER (tmp2)) > - || known_base_value_p (base))) > + || known_base_value_p (base)) > + && CONST_INT_P (tmp1)) > return base; > > /* We could not determine which of the two operands was the "benchmarking" this by comparing cc1 with/without shows a difference mostly in scheduling (but the number of differences is comparatively small!). Also overall text size shrinks with the patch (whatever that means). On GIMPLE we try hard to not construct addresses "based" on the wrong object, in fact IVOPTs has code to avoid building IVs based on things like &a - &b and propagation avoids turning unintptr_t arithmetic into pointer arithmetic even if it can see the converted from addresses. All those things cannot be done on RTL since we lost the distinction between pointers and integers and there's only PLUS. So I have a _very_ hard time seeing how RTL can ever be fixed to discover bases for alias analysis purposes without just resorting to MEM_EXPRs. That is, unless we want to live with this kind of wrong-code bugs. Similarly fishy is may_be_sp_based_p.
For stage3/gcc/*.o statistics show we perform 21051052 base_alias_check calls and in the end 706852 times it is the one that would have disambiguated things compared to if we remove it (thus as if we do base_alias_check last). Note there's also base = find_base_term (x_addr); if (base && (GET_CODE (base) == LABEL_REF || (GET_CODE (base) == SYMBOL_REF && CONSTANT_POOL_ADDRESS_P (base)))) return 0; which is suspicious but I guess harder to hit in practice so things go wrong. base_alias_check is not exactly the first thing we check (but nearly) so we'd roughly lose 3% disambiguations from RTL alias analysis if we scrap base_alias_check completely. That's probably too much. Note the CONSTANT_POOL_ADDRESS_P thing isn't necessary and subsumed by following checks so we could remove that without losing anything (it hits only 84 times at all in the above set and later checks subsume it).
Created attachment 45389 [details] statistic patch patch I added to record statistics
Things we fail to disambiguate are (mem:TF (pre_dec:SI (reg/f:SI 7 sp)) [0 S16 A8]) vs. (mem/c:TF (plus:SI (reg/f:SI 19 frame) (const_int -16 [0xfffffffffffffff0])) [1 S16 A128]) or (mem:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32]) vs. (mem/f/c:SI (symbol_ref:SI ("argv") [flags 0x2] <var_decl 0x7f782d1e6360 argv>) [2 argv+0 S4 A32]) where I don't find anything besides CSELIB cselib_sp_based_value_p handling in find_base_term that could be the one handling it? I guess we should be able to somehow handle both sp and frame based accesses in a more conservative way?
(In reply to Richard Biener from comment #22) > Things we fail to disambiguate are > > (mem:TF (pre_dec:SI (reg/f:SI 7 sp)) [0 S16 A8]) > vs. > (mem/c:TF (plus:SI (reg/f:SI 19 frame) > (const_int -16 [0xfffffffffffffff0])) [1 S16 A128]) > > or > > (mem:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32]) > vs. > (mem/f/c:SI (symbol_ref:SI ("argv") [flags 0x2] <var_decl 0x7f782d1e6360 > argv>) [2 argv+0 S4 A32]) > > where I don't find anything besides CSELIB cselib_sp_based_value_p handling > in find_base_term that could be the one handling it? > > I guess we should be able to somehow handle both sp and frame based > accesses in a more conservative way? it's really 99% like this which is why eventually that CONST_INT restriction worked so "well". Can we easily identify spill slot accesses somehow? The parameter accesses (frame references?) should simply get appropriate MEM_EXPRs IMHO.
On GCC testcases one large group of MEMs only disambiguated through base_alias_check is disambiguations agains DSEs group_info->base_mem which is BLKmode mems based on some "base" pointer. This base_mem lacks a MEM_EXPR but I think it shouldn't be difficult to add one, like with (completely lacking sanity testing): diff --git a/gcc/dse.c b/gcc/dse.c index 389c52d4284..098c77165de 100644 --- a/gcc/dse.c +++ b/gcc/dse.c @@ -1097,6 +1097,7 @@ canon_address (rtx mem, { machine_mode address_mode = get_address_mode (mem); rtx mem_address = XEXP (mem, 0); + tree mem_expr = MEM_EXPR (mem); rtx expanded_address, address; int expanded; @@ -1165,6 +1166,9 @@ canon_address (rtx mem, && const_or_frame_p (address)) { group_info *group = get_group_info (address); + if (!MEM_EXPR (group->base_mem) + && mem_expr) + set_mem_expr (group->base_mem, get_base_address (mem_expr)); if (dump_file && (dump_flags & TDF_DETAILS)) { btw, the disambiguations like (mem/c:SI (symbol_ref:DI ("g") [flags 0x2] <var_decl 0x7fe01f78a510 g>) [1 g+0 S4 A32]) vs. (mem:DI (pre_dec:DI (reg/f:DI 7 sp)) [0 S8 A8]) are handled through REG_BASE_VALUE which assigns 'sp' (address:DI -1). I believe we should be working towards adding proper MEM_EXPRs to more places and simply make find_base_term more conservative which means simplifying the PLUS/MINUS cases.
When considering the patch from comment#18 additional data is that only 95802 out of 636160 disambiguations that ultimately require base_alias_check involve non-CONST_INT_P "other" operand. That is out of 9531871 total cases that would run into base_alias_check, or 1%. This is w/o "fixing" DSE (the simple patch of course miscompiles things).
(In reply to Richard Biener from comment #25) > When considering the patch from comment#18 additional data is that only > 95802 out of 636160 disambiguations that ultimately require base_alias_check > involve non-CONST_INT_P "other" operand. That is out of 9531871 total > cases that would run into base_alias_check, or 1%. > > This is w/o "fixing" DSE (the simple patch of course miscompiles things). Sorting by invoking pass this is 236 combine 8138 cse1 8404 cse2 1423 cse_local 144964 dse1 83650 dse2 350 ira 32000 postreload 32758 sched2 75866 vartrack so the majority comes from DSE (note the above numbers are for the full boostrap so not comparable to the numbers quoted which are just for stage3-gcc/*.o) From CSE I also see the weird (mem:TF (plus:SI (reg/f:SI 7 sp) (scratch:SI)) [0 S16 A8]) vs. (mem/c:TF (plus:SI (reg/f:SI 19 frame) (const_int -16 [0xfffffffffffffff0])) [1 S16 A128]) what's this (scratch:SI)!? Anyway, I wonder how with unknown scratch value base_alias_check can figure out the above do not alias.
*** Bug 92462 has been marked as a duplicate of this bug. ***
I see the same even with pure pointers. I guess RTL doesn't care about such differences but it means the problem could bite a relatively innocent code. ---------------------------------------------------------------------- #include <stdio.h> __attribute__((noipa)) // imagine it in a separate TU static int *opaque(int *p) { return p; } int main() { static int x, y; int *r = opaque(&x) + (opaque(&y) - &y); x = 1; *r = 2; printf("x = %d\n", x); } ---------------------------------------------------------------------- $ gcc -std=c11 -pedantic -Wall -Wextra test.c && ./a.out x = 2 $ gcc -std=c11 -pedantic -Wall -Wextra -O3 test.c && ./a.out x = 1 ---------------------------------------------------------------------- gcc x86-64 version: gcc (GCC) 10.0.0 20191230 (experimental) ----------------------------------------------------------------------
(In reply to Alexander Cherepanov from comment #28) > I see the same even with pure pointers. I guess RTL doesn't care about such > differences but it means the problem could bite a relatively innocent code. Can you please open a separate bugreport for this and reference the new bug # here? It's a separate issue, and it's also a regression, gcc-4.7 did not miscompile this. The responsible pass seems to be RTL DSE.
Sure, I've filed pr93105. Thanks for the analysis!
*** Bug 93105 has been marked as a duplicate of this bug. ***
*** Bug 88433 has been marked as a duplicate of this bug. ***
*** Bug 114404 has been marked as a duplicate of this bug. ***
I'd like to provide a more subtle example: #include <stdio.h> #include <stdint.h> int main() { int a = -1; int b = -1; uintptr_t ia = (uintptr_t)&a; uintptr_t ib = (uintptr_t)(&b + 1); if (ia == ib) { *(int*)(ia == ib ? ia : ib) = 42; printf("%d\n", a); } return 0; } It prints -1 compiled by gcc -O1 (https://godbolt.org/z/ecf4Kjc16).
(In reply to Namniav from comment #34) > I'd like to provide a more subtle example: > > #include <stdio.h> > #include <stdint.h> > > int main() { > int a = -1; > int b = -1; > uintptr_t ia = (uintptr_t)&a; > uintptr_t ib = (uintptr_t)(&b + 1); > if (ia == ib) { > *(int*)(ia == ib ? ia : ib) = 42; > printf("%d\n", a); > } > return 0; > } > > It prints -1 compiled by gcc -O1 (https://godbolt.org/z/ecf4Kjc16). It's "subtle" only in a way showing that we simplify ia == ib ? ia : ib to ib and that we simplify (int *)(uintptr_t)(&b + 1) to (&b + 1). Not doing the latter wouldn't help as points-to analysis also tracks provenance through integers. I'm quite sure we do the simplification even iff ia and ib were pointers. The example shows we not only perform copy substitution through equivalences during optimization passes but also in expression folding.
(In reply to Richard Biener from comment #35) > It's "subtle" only in a way showing that we simplify ia == ib ? ia : ib > to ib and that we simplify (int *)(uintptr_t)(&b + 1) to (&b + 1). But it's in the branch of `if (ia == ib)`. Did you mean that the result of `ia == ib` is allowed to be inconsistent?
(In reply to Namniav from comment #36) > (In reply to Richard Biener from comment #35) > > It's "subtle" only in a way showing that we simplify ia == ib ? ia : ib > > to ib and that we simplify (int *)(uintptr_t)(&b + 1) to (&b + 1). > > But it's in the branch of `if (ia == ib)`. Did you mean that the result of > `ia == ib` is allowed to be inconsistent? No, I pointed out the expression 'ia == ib ? ia : ib' is simplified to 'ib' independent on context because that simplification is always valid. This is just another way where provenance is wrongly associated with equivalences.
Hmm, something changed in GCC 14 .... The original testcase now works so does the testcase in PR 93105 .
(In reply to Andrew Pinski from comment #38) > Hmm, something changed in GCC 14 .... The original testcase now works so > does the testcase in PR 93105 . Oh it was the fix for PR 113255 .
(In reply to Andrew Pinski from comment #39) > (In reply to Andrew Pinski from comment #38) > > Hmm, something changed in GCC 14 .... The original testcase now works so > > does the testcase in PR 93105 . > > Oh it was the fix for PR 113255 . Yeah, I finally fixed the RTL side of things ...