This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFTesting] New POINTER_DIFF_EXPR


On Fri, 17 Nov 2017, Richard Biener wrote:

On Sat, Nov 11, 2017 at 12:44 AM, Marc Glisse <marc.glisse@inria.fr> wrote:

The exact undefined-behavior status should probably be clarified more.
First, I'd like to say that POINTER_DIFF_EXPR may only take 2 pointers into
the same "object" (like in C/C++), so they differ by at most half the size
of the address space, and a-b is not allowed to be the minimum negative
value (so I can safely use b-a), and if we compute a-b and b-c, then a and c
are in the same object so I can safely compute a-c, etc. Second, we probably
need modes without undefined behavior, wrapping with either -fwrapv or a new
-fwrapp, or sanitized. For the sanitized version, we could keep using
POINTER_DIFF_EXPR and check TYPE_OVERFLOW_SANITIZED on the pointer type as
we currently do for integers. For wrapping, either we say that
POINTER_DIFF_EXPR has wrapping semantics when that option is in effect, or
we do not use POINTER_DIFF_EXPR and instead cast to unsigned before applying
a MINUS_EXPR (my preference).

CCing C/C++ FE folks as well.

+      /* It is better if the caller provides the return type.  */
+      if (code == POINTER_DIFF_EXPR)
+       {
+         offset_int res = wi::sub (wi::to_offset (arg1),
+                                   wi::to_offset (arg2));
+         return force_fit_type (signed_type_for (TREE_TYPE (arg1)), res, 1,
+                                TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2));
+       }
+

there's an overload that provides the type... (we should probably go
over all callers
and use that).  There is already a folding that is only done when the type is
available.  So I'd simply remove the above from the non-type overload handling.
What breaks then?

I haven't checked yet. On the other hand, now that I require the same precision for pointers and their difference, using signed_type_for should be safe enough (only issue is if a front-end is unhappy that we changed the exact type for another 'equivalent' one). But I'll try removing the version with signed_type_for.

With the patch it's of course somewhat ugly in that we need to deal with both
MINUS_EXPR on pointers and POINTER_DIFF_EXPR - at least that's what

+    case POINTER_DIFF_EXPR:
    case MINUS_EXPR:
+      /* Fold &a[i] - &a[j] to i-j.  */
+      if (TREE_CODE (arg0) == ADDR_EXPR
+         && TREE_CODE (TREE_OPERAND (arg0, 0)) == ARRAY_REF
+         && TREE_CODE (arg1) == ADDR_EXPR
+         && TREE_CODE (TREE_OPERAND (arg1, 0)) == ARRAY_REF)
+        {
+         tree tem = fold_addr_of_array_ref_difference (loc, type,
+                                                       TREE_OPERAND (arg0, 0),
+                                                       TREE_OPERAND (arg1, 0),
+                                                       code
+                                                       == POINTER_DIFF_EXPR);

suggests.  But is that really so?  The FEs today should never build
a MINUS_EXPR with pointer operands and your patch also doesn't.
I suppose we only arrive above because STRIP_NOPS strips the
pointer to integer conversion?

I think so. And since I only patched 2 front-ends, others probably still generate casts to integers and MINUS_EXPR and can benefit from this transformation. Also, if we implement -fwrapp by casting to unsigned and doing MINUS_EXPR, that will remain useful.

+    case POINTER_DIFF_EXPR:
+      if (!POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (t, 0)))
+         || !POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (t, 1))))
+       {
+         error ("invalid operand to pointer diff, operand is not a pointer");
+         return t;
+       }
+      CHECK_OP (0, "invalid operand to pointer diff");
+      CHECK_OP (1, "invalid operand to pointer diff");
+      break;

can you add
 || TREE_CODE (TREE_TYPE (t)) != INTEGER_TYPE
 || TYPE_UNSIGNED (TREE_TYPE (t))

I wasn't sure how much to duplicate between here and below. Added.

?  Note that if the precision of the pointer type arguments are not equal to the
precision of the return value then foldings like

  (simplify
+   (plus:c (pointer_diff @0 @1) (pointer_diff @2 @0))
+   (if (TYPE_OVERFLOW_UNDEFINED (type)
+       && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@0)))
+    (pointer_diff @2 @1)))

might not be correct as we drop a possible truncation here.  Shall we

I simplified the transformations after deciding to require equal precision.

require (and document) that the result of the pointer difference in
a POINTER_DIFF_EXPR has to be representable in the type of the
expression, otherwise
the behavior is undefined?  Shall we allow an unsigned return type for
differences known to be representable?  Probably not...  Does the behavior
depend on TYPE_OVERFLOW_UNDEFINED?  It would be nice to amend
the generic.texi docs somewhat here.

I was more precise in tree.def, copying details over to generic.texi now.
But the overflow behavior is something I was asking about above (see
citation at the beginning of this email). I am tempted to say that overflow
is always undefined for POINTER_DIFF_EXPR and we should use MINUS_EXPR on
unsigned integers when we want wrapping, but if people prefer to use
POINTER_DIFF_EXPR in all cases and change its behavior depending on
TYPE_OVERFLOW_UNDEFINED that would be ok with me. I probably wasn't
consistent in match.pd and will need to tweak a few details once this is
decided.

Ah, the gimple checking adds some more bits here that clarify things:

+    case POINTER_DIFF_EXPR:
+      {
+       if (!POINTER_TYPE_P (rhs1_type)
+           || !POINTER_TYPE_P (rhs2_type)
+           || !types_compatible_p (rhs1_type, rhs2_type)
+           || TREE_CODE (lhs_type) != INTEGER_TYPE
+           || TYPE_UNSIGNED (lhs_type)
+           || TYPE_PRECISION (lhs_type) != TYPE_PRECISION (rhs1_type))
+         {
+           error ("type mismatch in pointer diff expression");
+           debug_generic_stmt (lhs_type);
+           debug_generic_stmt (rhs1_type);
+           debug_generic_stmt (rhs2_type);
+           return true;
+         }


I think the patch is ok for trunk, the FE bits should get approval
from their maintainers.
Joseph may have an idea about the address-space issue.

With this in place we should be able to fix some of the issues Jakub ran into
with pointer sanitizing?

Some of them should be fixed without doing anything: pointer differences
are not using MINUS_EXPR anymore in C/C++, so they are not sanitized and
don't produce spurious errors when crossing the middle of the address
space. On the other hand, we may want to sanitize POINTER_DIFF_EXPR
specifically.

Also with this in place it would be nice to do the corresponding adjustment to
POINTER_PLUS_EXPR, that is, require a _signed_ offset type that matches
the precision of the other argument.

Maybe next stage1... Too late for this year.

--
Marc Glisse


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]