Consider the following MWE: extern void* malloc (long unsigned int); void fun (unsigned *x) { unsigned *a = malloc (*x); if (a == 0) return; if (a != x) // (A) *a = *x; *x = *a; } If compiled with GCC 10, then no warning is emitted. If compiled with GCC HEAD (currently 84005b8abf9), then the following warning is emitted: gcc -W -c -O2 mwe.c mwe.c: In function 'fun': mwe.c:8:8: warning: '*(a)' may be used uninitialized [-Wmaybe-uninitialized] 8 | *x = *a; | ^~ Rational why this example is strictly speaking fine: 1) Assume x is a valid pointer. Then malloc will either return a nullpointer and we return, or a pointer to a fresh object which address is different from any other existing object. Thus (A) always evaluates to true which means *a is initialized. 2) Assume x is an invalid pointer. Then dereferencing x prior to the call to malloc already results in UB. Thus the only case in which condition (A) may evaluate to false is when x is an invalid pointer (e.g. previously malloc'd and then free'd such that a further call to malloc returns the very same address rendering (A) false) results in UB. Since this is a maybe warning I'm wondering whether this is considered a bug or is acceptable. Any thoughts?
I think there are duplicates about the fact that while gcc knows that a and x cannot alias (if you read *x, write to *a, then read from *x again, gcc reuses the first value), it does not use that information to fold comparisons between x and a.
Started with r11-959-gb825a22890740f341eae566af27e18e528cd29a7
r11-959 enabled -Wuninitialized for allocated objects (alloca/VLA and malloc). The underlying problem is the failure to take advantage of the non-aliasing guarantee of the return value of these functions. One report that describes this limitation is pr87313. A slightly simpler but even more obvious test case is: $ cat pr96564.c && gcc -O2 -S -Wall -Wextra -fdump-tree-optimized=/dev/stdout pr96564.c void f (int *x) { int a[*x]; if (a != x) *a = 0; *x = *a; } pr96564.c: In function ‘f’: pr96564.c:8:6: warning: ‘*(<unknown>)[0]’ may be used uninitialized [-Wmaybe-uninitialized] 8 | *x = *a; | ~~~^~~~ ;; Function f (f, funcdef_no=0, decl_uid=1931, cgraph_uid=1, symbol_order=0) Removing basic block 5 f (int * x) { int[0:D.1936] * a.0; sizetype _2; int _8; sizetype _9; int prephitmp_18; int pretmp_20; <bb 2> [local count: 1073741824]: _8 = *x_1(D); _2 = (sizetype) _8; _9 = _2 * 4; a.0_11 = __builtin_alloca_with_align (_9, 32); if (x_1(D) != a.0_11) goto <bb 4>; [70.00%] else goto <bb 3>; [30.00%] <bb 3> [local count: 322122544]: pretmp_20 = (*a.0_11)[0]; <bb 4> [local count: 1073741824]: # prephitmp_18 = PHI <pretmp_20(3), 0(2)> *x_1(D) = prephitmp_18; return; }
Another (much older and borader) request to make use of the aliasing guarantee for pointer comparisons is pr13962.
note alias cannot disabiguate (A) on its own since both a and x could be NULL, it additionally needs the flow-based info that a is not NULL at (A).
While SSA_NAME_PTR_INFO/get_ptr_nonnull can't do this, either vrp pass or any other pass that would be using ranger can do that. So what pass would be the best match for that?
GCC 11.1 has been released, retargeting bugs to GCC 11.2.
GCC 11.2 is being released, retargeting bugs to GCC 11.3
GCC 11.3 is being released, retargeting bugs to GCC 11.4.
Note I don't think you really need exact aliasing information to optimize this though. See PR 109119 for an example where you just need to know on the path where PRE adds the load, the two addresses are equal due to the condition.
GCC 11.4 is being released, retargeting bugs to GCC 11.5.
So I think we could solve this with a bit of help from the alias oracle. We have the routine ptrs_compare_unequal, but points-to-null is going to get in the way. I think VRP and DOM have enough information to rule out NULL for both objects in question. So if we could query the points-to information, ignoring NULL then we could likely solve this particular bug. Essentially VRP or DOM would prove NULL isn't in the set of possible values at the comparison point. Then we query the alias information ignoring NULL. Voila we compute a static result for the comparison of the two pointers and the problematical block becomes unreachable and the bogus warning goes away. Richi, any thoughts in viability of such an API?
(In reply to Jeffrey A. Law from comment #12) > So I think we could solve this with a bit of help from the alias oracle. We > have the routine ptrs_compare_unequal, but points-to-null is going to get > in the way. > > I think VRP and DOM have enough information to rule out NULL for both > objects in question. So if we could query the points-to information, > ignoring NULL then we could likely solve this particular bug. > > Essentially VRP or DOM would prove NULL isn't in the set of possible values > at the comparison point. Then we query the alias information ignoring NULL. > Voila we compute a static result for the comparison of the two pointers and > the problematical block becomes unreachable and the bogus warning goes away. > > Richi, any thoughts in viability of such an API? We now treat pt.null conservatively and track non-null-ness derived from range-info in it. That means when VRP/DOM can prove a pointer is always not NULL they can do set_ptr_nonnull (p) on it. This means the /* ??? We'd like to handle ptr1 != NULL and ptr1 != ptr2 but those require pt.null to be conservatively correct. */ is no longer true and we could finally implement it, like with diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc index e7c1c1aa624..5b6d9e0aa6a 100644 --- a/gcc/tree-ssa-alias.cc +++ b/gcc/tree-ssa-alias.cc @@ -479,9 +479,25 @@ ptrs_compare_unequal (tree ptr1, tree ptr2) } return !pt_solution_includes (&pi->pt, obj1); } - - /* ??? We'd like to handle ptr1 != NULL and ptr1 != ptr2 - but those require pt.null to be conservatively correct. */ + else if (TREE_CODE (ptr1) == SSA_NAME) + { + struct ptr_info_def *pi1 = SSA_NAME_PTR_INFO (ptr1); + if (!pi1 + || pi1->pt.vars_contains_restrict + || pi1->pt.vars_contains_interposable) + return false; + if (integer_zerop (ptr2) && !pi1->pt.null) + return true; + if (TREE_CODE (ptr2) == SSA_NAME) + { + struct ptr_info_def *pi2 = SSA_NAME_PTR_INFO (ptr2); + if (!pi2 + || pi2->pt.vars_contains_restrict + || pi2->pt.vars_contains_interposable) + if (!pi1->pt.null || !pi2->pt.null) + return !pt_solutions_intersect (&pi1->pt, &pi2->pt); + } + } return false; } but the testcase shows the non-null-ness is only conditional which means we'd have to use a range query above which necessarily falls back to the global ranges given we don't have any context available here. The old EVRP adjusted global ranges during the walk but this is no longer done. Note it's enough that one pointer is nonnull, so for your idea the API could be extended with a bool one_ptr_nonnull parameter.
gcc.c-torture/execute/pr64242.c is one of the testcases FAILing (guess it's invalid)
(In reply to Richard Biener from comment #13) > (In reply to Jeffrey A. Law from comment #12) > > So I think we could solve this with a bit of help from the alias oracle. We > > have the routine ptrs_compare_unequal, but points-to-null is going to get > > in the way. > > > > I think VRP and DOM have enough information to rule out NULL for both > > objects in question. So if we could query the points-to information, > > ignoring NULL then we could likely solve this particular bug. > > > > Essentially VRP or DOM would prove NULL isn't in the set of possible values > > at the comparison point. Then we query the alias information ignoring NULL. > > Voila we compute a static result for the comparison of the two pointers and > > the problematical block becomes unreachable and the bogus warning goes away. > > > > Richi, any thoughts in viability of such an API? > > We now treat pt.null conservatively and track non-null-ness derived from > range-info in it. That means when VRP/DOM can prove a pointer is always > not NULL they can do set_ptr_nonnull (p) on it. > > This means the > > /* ??? We'd like to handle ptr1 != NULL and ptr1 != ptr2 > but those require pt.null to be conservatively correct. */ > > is no longer true and we could finally implement it, like with > > diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc > index e7c1c1aa624..5b6d9e0aa6a 100644 > --- a/gcc/tree-ssa-alias.cc > +++ b/gcc/tree-ssa-alias.cc > @@ -479,9 +479,25 @@ ptrs_compare_unequal (tree ptr1, tree ptr2) > } > return !pt_solution_includes (&pi->pt, obj1); > } > - > - /* ??? We'd like to handle ptr1 != NULL and ptr1 != ptr2 > - but those require pt.null to be conservatively correct. */ > + else if (TREE_CODE (ptr1) == SSA_NAME) > + { > + struct ptr_info_def *pi1 = SSA_NAME_PTR_INFO (ptr1); > + if (!pi1 > + || pi1->pt.vars_contains_restrict > + || pi1->pt.vars_contains_interposable) > + return false; > + if (integer_zerop (ptr2) && !pi1->pt.null) > + return true; > + if (TREE_CODE (ptr2) == SSA_NAME) > + { > + struct ptr_info_def *pi2 = SSA_NAME_PTR_INFO (ptr2); > + if (!pi2 > + || pi2->pt.vars_contains_restrict > + || pi2->pt.vars_contains_interposable) > + if (!pi1->pt.null || !pi2->pt.null) > + return !pt_solutions_intersect (&pi1->pt, &pi2->pt); > + } > + } > > return false; > } > > > but the testcase shows the non-null-ness is only conditional which means > we'd have to use a range query above which necessarily falls back to > the global ranges given we don't have any context available here. The > old EVRP adjusted global ranges during the walk but this is no longer done. > You mean it lied? because x_1 is not NULL until after _8 = *x_1(D); executes. It can still be NULL on that stmt can it not? Did it reset the global value afterwards? Contextually ranger knows both are non-null at EVRP time: a.0_27 : [irange] int[0:D.xxxx] * [1, +INF] 2->3 (T) x_1(D) : [irange] int * [1, +INF] 2->3 (T) a.0_27 : [irange] int[0:D.xxxx] * [1, +INF] 2->4 (F) x_1(D) : [irange] int * [1, +INF] 2->4 (F) a.0_27 : [irange] int[0:D.xxxx] * [1, +INF] So we know x_1 is non-NULL after the de-reference for the rest of the block (and function). It also sets a.0_27 globally to be [1, +INF]. > Note it's enough that one pointer is nonnull, so for your idea the > API could be extended with a bool one_ptr_nonnull parameter. ranger currently sets a.0 globally to be non-null in EVRP.
(In reply to Andrew Macleod from comment #15) > (In reply to Richard Biener from comment #13) > > (In reply to Jeffrey A. Law from comment #12) > > > So I think we could solve this with a bit of help from the alias oracle. We > > > have the routine ptrs_compare_unequal, but points-to-null is going to get > > > in the way. > > > > > > I think VRP and DOM have enough information to rule out NULL for both > > > objects in question. So if we could query the points-to information, > > > ignoring NULL then we could likely solve this particular bug. > > > > > > Essentially VRP or DOM would prove NULL isn't in the set of possible values > > > at the comparison point. Then we query the alias information ignoring NULL. > > > Voila we compute a static result for the comparison of the two pointers and > > > the problematical block becomes unreachable and the bogus warning goes away. > > > > > > Richi, any thoughts in viability of such an API? > > > > We now treat pt.null conservatively and track non-null-ness derived from > > range-info in it. That means when VRP/DOM can prove a pointer is always > > not NULL they can do set_ptr_nonnull (p) on it. > > > > This means the > > > > /* ??? We'd like to handle ptr1 != NULL and ptr1 != ptr2 > > but those require pt.null to be conservatively correct. */ > > > > is no longer true and we could finally implement it, like with > > > > diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc > > index e7c1c1aa624..5b6d9e0aa6a 100644 > > --- a/gcc/tree-ssa-alias.cc > > +++ b/gcc/tree-ssa-alias.cc > > @@ -479,9 +479,25 @@ ptrs_compare_unequal (tree ptr1, tree ptr2) > > } > > return !pt_solution_includes (&pi->pt, obj1); > > } > > - > > - /* ??? We'd like to handle ptr1 != NULL and ptr1 != ptr2 > > - but those require pt.null to be conservatively correct. */ > > + else if (TREE_CODE (ptr1) == SSA_NAME) > > + { > > + struct ptr_info_def *pi1 = SSA_NAME_PTR_INFO (ptr1); > > + if (!pi1 > > + || pi1->pt.vars_contains_restrict > > + || pi1->pt.vars_contains_interposable) > > + return false; > > + if (integer_zerop (ptr2) && !pi1->pt.null) > > + return true; > > + if (TREE_CODE (ptr2) == SSA_NAME) > > + { > > + struct ptr_info_def *pi2 = SSA_NAME_PTR_INFO (ptr2); > > + if (!pi2 > > + || pi2->pt.vars_contains_restrict > > + || pi2->pt.vars_contains_interposable) > > + if (!pi1->pt.null || !pi2->pt.null) > > + return !pt_solutions_intersect (&pi1->pt, &pi2->pt); > > + } > > + } > > > > return false; > > } > > > > > > but the testcase shows the non-null-ness is only conditional which means > > we'd have to use a range query above which necessarily falls back to > > the global ranges given we don't have any context available here. The > > old EVRP adjusted global ranges during the walk but this is no longer done. > > > You mean it lied? because x_1 is not NULL until after _8 = *x_1(D); > executes. It can still be NULL on that stmt can it not? Did it reset the > global value afterwards? Yes and yes, old EVRP turned global ranges into "ranges at the point of stmt evaluation/folding" (and restored them to the global values later). Note that EVRP didn't do any sort of iteration for loop handling and it folded stmts as it analyzed them. Using the global ranges as "lattice" had the advantage that all folding utilities picked up "local" ranges for free. IMO it was quite elegant and fast what EVRP did (with it's obvious limitations of course). > Contextually ranger knows both are non-null at EVRP time: > a.0_27 : [irange] int[0:D.xxxx] * [1, +INF] > 2->3 (T) x_1(D) : [irange] int * [1, +INF] > 2->3 (T) a.0_27 : [irange] int[0:D.xxxx] * [1, +INF] > 2->4 (F) x_1(D) : [irange] int * [1, +INF] > 2->4 (F) a.0_27 : [irange] int[0:D.xxxx] * [1, +INF] > > So we know x_1 is non-NULL after the de-reference for the rest of the block > (and function). It also sets a.0_27 globally to be [1, +INF]. > > > Note it's enough that one pointer is nonnull, so for your idea the > > API could be extended with a bool one_ptr_nonnull parameter. > > ranger currently sets a.0 globally to be non-null in EVRP. After EVRP I see # PT = nonlocal null unsigned int * x_8(D) = x; ... <bb 2> : _1 = *x_8(D); # RANGE [irange] long unsigned int [0, 4294967295] MASK 0xffffffff VALUE 0x0 _2 = (long unsigned int) _1; # PT = null { D.2781 } # ALIGN = 8, MISALIGN = 0 # USE = nonlocal escaped # CLB = nonlocal escaped a_10 = malloc (_2); if (a_10 == 0B) ... <bb 4> : if (x_8(D) != a_10) the last test is the one we want to eliminate. The proposed change (with a missing 'return false' fixed) isn't enough since it just looks at global ranges where both x_8 and a_10 can be null. In the ptrs_compare_unequal function I could at most use range_of_expr without a stmt context (as I don't have that) which wouldn't help. EVRP with the trick to adjust global ranges effectively had a "global context" it would use. I suppose that one could have something like that for ranger as well, add ranger::set_context (gimple *) which EVRP could set when folding a stmt (set it to right before 'stmt' execution) and which would be the context to fall back to when a folding dependent utility didn't specify one? Adding a global (not ranger specific) "folding context stack" might do the trick as well. Any utility that could take advantage of a context could look at the stack top (which might be NULL). That would be less churn than wrapping each and every folding function inside a "folder" class containing a context. Of course one has to be careful with such a thing, like with recursively invoking number of iteration or SCEV analysis which work on a more fuzzy context than a specific stmt contained in a loop.
Handling pointer-vs-pointer in ptrs_compare_unequal isn't enough since we have # PT = nonlocal null unsigned int * x_7(D) = x; ... # PT = null { D.2785 } a_9 = malloc (_2); if (a_9 == 0B) goto <bb 7>; [0.04%] else goto <bb 3>; [99.96%] <bb 7> [local count: 429496]: goto <bb 6>; [100.00%] <bb 3> [local count: 1073312328]: if (x_7(D) != a_9) so querying points-to only has to consider both pointers being NULL. Now, I'm not sure how prange handles the above and whether or how it integrates knowledge from flow-insensitive points-to analysis. Aldy might know. I'm testing a patch enhancing ptrs_compare_unequal right now, also filling in a missing bit in points-to info.
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>: https://gcc.gnu.org/g:f3e5f4c58591f5dacdd14a65ec47bbe310df02a0 commit r15-580-gf3e5f4c58591f5dacdd14a65ec47bbe310df02a0 Author: Richard Biener <rguenther@suse.de> Date: Mon Mar 11 11:17:32 2024 +0100 tree-optimization/13962 - handle ptr-ptr compares in ptrs_compare_unequal Now that we handle pt.null conservatively we can implement the missing tracking of constant pool entries (aka STRING_CST) and handle ptr-ptr compares using points-to info in ptrs_compare_unequal. PR tree-optimization/13962 PR tree-optimization/96564 * tree-ssa-alias.h (pt_solution::const_pool): New flag. * tree-ssa-alias.cc (ptrs_compare_unequal): Handle pointer-pointer compares. (dump_points_to_solution): Dump the const_pool flag, fix guard of flag dumping. * gimple-pretty-print.cc (pp_points_to_solution): Likewise. * tree-ssa-structalias.cc (find_what_var_points_to): Set the const_pool flag for STRING. (pt_solution_ior_into): Handle the const_pool flag. (ipa_escaped_pt): Initialize it. * gcc.dg/tree-ssa/alias-39.c: New testcase. * g++.dg/vect/pr68145.cc: Use -fno-tree-pta to avoid UB to manifest in transforms no longer vectorizing this testcase for an ICE.
(In reply to Richard Biener from comment #17) > Handling pointer-vs-pointer in ptrs_compare_unequal isn't enough since we > have > > # PT = nonlocal null > unsigned int * x_7(D) = x; > ... > # PT = null { D.2785 } > a_9 = malloc (_2); > if (a_9 == 0B) > goto <bb 7>; [0.04%] > else > goto <bb 3>; [99.96%] > > <bb 7> [local count: 429496]: > goto <bb 6>; [100.00%] > > <bb 3> [local count: 1073312328]: > if (x_7(D) != a_9) > > so querying points-to only has to consider both pointers being NULL. Now, > I'm not sure how prange handles the above and whether or how it integrates > knowledge from flow-insensitive points-to analysis. prange currently does nothing different than what irange did. It's a first step in providing points-to and propagating the information through the CFG like we do for integer ranges. Or more precisely, prange is meant to be nothing more than a pair of integer end points (not unlimited like int_range_max), plus a value/mask pair. > > Aldy might know. > > I'm testing a patch enhancing ptrs_compare_unequal right now, also filling > in a missing bit in points-to info.