[Bug tree-optimization/61502] == comparison on "one-past" pointer gives wrong result

Fri Jan 10 14:01:00 GMT 2020

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502

--- Comment #41 from Alexander Cherepanov <ch3root at openwall dot com> ---
(In reply to Richard Biener from comment #38)
> (In reply to Alexander Cherepanov from comment #37)
> > On 30/12/2019 10.51, rguenther at suse dot de wrote:
> > >> Obviously, it could be used to fold `a + i == b` to `0` if `a` and `b`
> > >> are two different known arrays and `i` is unknown
> > > 
> > > That's indeed the main thing. Basically it allows points-to analysis work at
> > > all in the presence of non-constant offsets.
> > 
> > But what is PTA used for? Examples that I've seen all deal with
> > dereferenceable pointers. And current gcc behaviour is just fine in that
> > case. The problem is with non-dereferenceable pointers. So is PTA important
> > for cases where dereferenceability is unknown (or known to be false) or it's
> > just too complicated to take dereferenceability into account?
> 
> points-to analysis doesn't care about whether a pointer is dereferenced or
> not when computing its points-to set.  You can very well add a dereference
> to your testcase and that shouldn't affect its outcome, no?

No, I mean dereferences of both pointers, these cannot be added to my testcase.
To exclude any ambiguity, I'm talking about my last testcase (comment 35) in
this bug report. (Or the original testcase from Peter Sewell in comment 0.) One
pointer is `&x`, fine, dereferenceable. The other one is `&y + 1`, just past
the end of the object `y`, non-dereferenceable (C11, 6.5.6p8).

So the rough idea is to do it like this: if both pointers are known to be
dereferenceable at the point of check (e.g., we want to move `a[i] = 1;` over
`b[j] = 2;`) then the results of the PTA could be used, otherwise (e.g., we
want to fold `a + i == b + j` only knowing that `a` and `b` are different
arrays) pass the comparison to run time.

A nice thing about this approach is that it treats pointers and integers in the
same way. In particular, it will also solve bug 93010.
And it is applicable to dynamic memory which is handled somewhat differently
now.
It even almost works for `restrict`! (It should be possible to optimize `int
f(int *restrict p, int *q) { *p = 1; *q; return p == q; }` but not with `*p =
1;` replaced with just `*p;`.)

In some sense this approach delegates a part of the work to the programmer. If
they put a dereference somewhere they effectively assert that the pointer (even
if it's a casted from an integer) is good at this particular moment -- that the
result of malloc was not null, that the memory was not free'd, delete'd or
reused via placement new since, or that the local variable is still in scope,
or that the pointer is not past the end, or that the storage is not of zero
size. This of course depends on the programmer respecting the provenances but
that's not a news:-)

What is a dereference in this context is a somewhat tricky question.
Dereferencing a pointer to an empty struct should not count. But calling a
static method in non-empty class probably should (need to check the C++ std).
Then, while analyzing `p == q` it's not necessary to require dereferenceability
of `p` and `q` themselves, dereferenceability of `p + k` and `q + l` is enough
if `k` and `l` are both nonnegative or strictly negative.

Many more improvements are possible too. E.g., `a + i == b + j` could be folded
to `0` if the addresses are not exposed or only exposed in ways that don't
allow one to reason about the arrangement of the objects. IIUC llvm does some
of this, e.g., https://reviews.llvm.org/rL249490.

> And yes, GCC uses points-to analysis results to optimize pointer equality
> compares like p == q to false if the points-to sets do not intersect (for
> a set of cases, but that's current implementation detail).  That helps
> surprisingly often for abstraction coming from the C++ standard library
> container iterators.

Isn't it mainly dynamic memory? Then it's already handled a bit differently,
even `new int == new int` is not optimized right now (unlike in clang).
AIUI this bug report is relevant to non-dynamic memory only (but the fix could
improve the case of dynamic memory too).

> I do agree that we have bugs in GCC but AFAICS those come from conditional
> equivalences being propagated and from the very old RTL alias analysis issue
> involving base_alias_check.  Once we dealt with the latter I'm happily
> exploring fixes for the former - but the latter will happily nullify fixes
> of the former.

IMHO those two are quite different problems -- this bug is about wrong results
at compile-time and conditional equivalence propagation depends on the results
not being available at the optimization time. I could be wrong but it seems to
me that it's better to deal with them separately.