[PATCH] Add pattern for pointer-diff on addresses with same base/offset (PR 94234)

Feng Xue OS fxue@os.amperecomputing.com
Wed Jun 3 09:12:54 GMT 2020


>>       * match.pd ((PTR + A) - (PTR + B)) -> (ptrdiff_t)(A - B): New
>>       simplification.

> Not new, modified.
OK.

>>       * ((PTR_A + O) - (PTR_B + O)) -> (PTR_A - PTR_B): New simplification.

> O might not be the best choice because of how close it looks to 0.
OK.
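
For readers following along, the two source-level shapes named in the ChangeLog entries above can be sketched in C as follows. This is only an illustration: the function names same_base and same_offset are hypothetical and do not come from the patch or the PR testcase.

```c
#include <stddef.h>

/* (PTR + A) - (PTR + B): same base pointer, different offsets.
   The expected result is A - B, converted to ptrdiff_t. */
static ptrdiff_t same_base(char *p, size_t a, size_t b)
{
    return (p + a) - (p + b);
}

/* (PTR_A + N) - (PTR_B + N): different bases, same offset.
   The expected result is PTR_A - PTR_B. */
static ptrdiff_t same_offset(char *pa, char *pb, size_t n)
{
    return (pa + n) - (pb + n);
}
```

Both functions are only well defined when every intermediate pointer stays inside (or one past the end of) the same array object.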

> What don't you like about the existing transformation? You are replacing a
> transformation that always folds by one that folds only in some cases, and
> loses the information that some overflows cannot happen. That looks like
> it is making things worse from an optimization point of view. Do you
> consider the transformation as unsafe with -fsanitize=pointer-overflow
> (does that correspond to the case where TYPE_OVERFLOW_UNDEFINED is true
> for a pointer type?)?
Yes. We should use !TYPE_OVERFLOW_SANITIZED, not TYPE_OVERFLOW_UNDEFINED.
But even with !TYPE_OVERFLOW_SANITIZED, some ptr_diff rules have the check and
some do not. Could we also remove it here?

> Ah, looking at the PR, you decided to perform the operation as unsigned
> because that has fewer NOP conversions, which, in that particular testcase
> where the offsets are originally unsigned, means we simplify better. But I
> would expect it to regress other testcases (in particular if the offsets
> were originally signed). Also, changing the second argument of
> pointer_plus to be signed, as is supposed to eventually happen, would
> break your testcase again.
The old rule might produce an overflowed result (for example, with
offset_a = (signed_int_max)UL and offset_b = 1UL).
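
As a rough sketch of the difference between the two shapes, the C code below compares "convert each offset to the signed type, then subtract" with "subtract as unsigned, then convert". The helper names are hypothetical, and the operand values used to exercise them are chosen only for illustration; they are not the ones quoted above. It assumes GCC or Clang (for __builtin_sub_overflow) and that size_t and ptrdiff_t have the same width.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Old shape: (stype)A - (stype)B.
   Returns true when the signed subtraction would overflow; the builtin
   reports this without invoking undefined behaviour. */
static bool diff_signed_first(size_t a, size_t b, ptrdiff_t *out)
{
    return __builtin_sub_overflow((ptrdiff_t)a, (ptrdiff_t)b, out);
}

/* New shape: (stype)(A - B).
   The unsigned subtraction wraps and is always well defined; the final
   conversion is implementation-defined in ISO C (GCC wraps modulo 2^N). */
static ptrdiff_t diff_unsigned_first(size_t a, size_t b)
{
    return (ptrdiff_t)(a - b);
}
```

With a = PTRDIFF_MAX and b = PTRDIFF_MAX + 1 (as unsigned values), the first helper reports an overflow, while the second one simply wraps.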

Additionally, (stype)(offset_a - offset_b) is more compact, and there might be
further simplification opportunities on offset_a - offset_b even if it is not
of the form (A * C - B * C), for example (~A - 1 -> -A). With the old rule, we
would have to introduce another rule, (T)A - (T)(B) -> (T)(A - B), which seems
too generic to benefit performance in all situations.

If the 2nd argument is signed, we can add a specific rule, as you suggested:
(T)(A * C) - (T)(B * C) -> (T) (A - B) * C.

> At the very least we want to keep a comment next to the transformation
> explaining the situation.

> If there are platforms where the second argument of pointer_plus is a
> smaller type than the result of pointer_diff (can this happen? I keep
> forgetting all the weird things some platforms do), this version may do an
> unsafe zero-extension.
If the 2nd argument is a smaller type, this might make the semantics of the
pointer_plus operator confusing. Suppose the type is (unsigned) char: does the
expression "ptr + ((char) -1)" represent ptr + 255 or ptr - 1?

Regards,
Feng

