[PATCH] Add pattern for pointer-diff on addresses with same base/offset (PR 94234)

Thu Jun 4 08:30:44 GMT 2020

On Wed, Jun 3, 2020 at 4:33 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>
> On Wed, 3 Jun 2020, Feng Xue OS via Gcc-patches wrote:
>
> >> Ah, looking at the PR, you decided to perform the operation as unsigned
> >> because that has fewer NOP conversions, which, in that particular testcase
> >> where the offsets are originally unsigned, means we simplify better. But I
> >> would expect it to regress other testcases (in particular if the offsets
> >> were originally signed). Also, changing the second argument of
> >> pointer_plus to be signed, as is supposed to eventually happen, would
> >> break your testcase again.
> > The old rule might produce overflow result (offset_a = (signed_int_max)UL,
> > offset_b = 1UL).
>
> signed_int_max-1 does not overflow. But the point is that pointer_plus /
> pointer_diff are defined in a way that if that subtraction would overflow,
> then one of the pointer_plus or pointed_diff would have been undefined
> already. In particular, you cannot have objects larger than half the
> address space, and pointer_plus/pointer_diff have to remain inside an
> object. Doing the subtraction in a signed type keeps (part of) that
> information.
>
> > Additionally, (stype)(offset_a - offset_b) is more compact,
>
> Not if offset_a comes from (utype)a and offset_b from (utype)b with a and
> b signed. Using size_t indices as in the bugzilla testcase is not
> recommended practice. Change it to ssize_t, and we do optimize the
> testcase in CCP1 already.
>
> > there might be
> > further simplification opportunities on offset_a - offset_b, even it is not
> > in form of (A * C - B * C), for example (~A - 1 -> -A). But for old rule, we have
> > to introduce another rule as (T)A - (T)(B) -> (T)(A - B), which seems to
> > be too generic to benefit performance in all situations.
>
> Sadly, conversions complicate optimizations and are all over the place, we
> need to handle them in more places. I sometimes dream of getting rid of
> NOP conversions, and having a single PLUS_EXPR with some kind of flag
> saying if it can wrap/saturate/trap when seen as a signed/unsigned
> operation, i.e. push the information on the operations instead of objects.
>
> > If the 2nd argument is signed, we can add a specific rule as your suggestion
> > (T)(A * C) - (T)(B * C) -> (T) (A - B) * C.
> >
> >> At the very least we want to keep a comment next to the transformation
> >> explaining the situation.
> >
> >> If there are platforms where the second argument of pointer_plus is a
> >> smaller type than the result of pointer_diff (can this happen? I keep
> >> forgetting all the weird things some platforms do), this version may do an
> >> unsafe zero-extension.
> > If the 2nd argument is a smaller type, this might bring confuse semantic to
> > pointer_plus operator. Suppose the type is a (unsigned) char, the expression
> > "ptr + ((char) -1)" represents ptr + 255 or ptr - 1?
>
> (pointer_plus ptr 255) would mean ptr - 1 on a platform where the second
> argument of pointer_plus has size 1 byte.

Yes.  Note this is the reason for using a signed type for the minus as comment

    /* The second argument of pointer_plus must be interpreted as signed, and
       thus sign-extended if necessary.  */

explains.  That the type of the offset operand in a pointer-plus is always
unsigned is semantically incorrect.  Whenever taken out of context you
have to re-interpret the offset value as a signed entity.

And yes, there do exist targets where pointers are larger than offsets.
The size of pointers (ptr_mode) is specified by POINTER_SIZE
while the pointer-plus offset type is SIZETYPE (not SIZE_TYPE,
the C ptrdiff_t type is derived from SIZE_TYPE).

As Marc said for this particular reason pointer-plus offsets should
be instead forced to have signed type [or not forced a particular
sign and precision so the operands sign specifies how to extend it].
But it's far from a trivial task to rectify this IL mistake.

Richard.

>
>
> Do note that I am not a reviewer, what I say isn't final.
>
> --
> Marc Glisse