This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))


On October 28, 2017 2:53:56 PM GMT+02:00, Marc Glisse <marc.glisse@inria.fr> wrote:
>
>I am sending the new version of the patch in a separate email, to make
>it 
>more visible, and only replying to a few points here.
>
>On Thu, 19 Oct 2017, Richard Biener wrote:
>
>> On Mon, Oct 9, 2017 at 12:55 PM, Marc Glisse <marc.glisse@inria.fr>
>wrote:
>>> On Mon, 3 Jul 2017, Richard Biener wrote:
>>>
>>>> On Sat, 1 Jul 2017, Marc Glisse wrote:
>>>>
>>>>> I wrote a quick prototype to see what the fallout would look like.
>>>>> Surprisingly, it actually passes bootstrap+testsuite on ppc64el
>with all
>>>>> languages with no regression. Sure, it is probably not a complete
>>>>> migration, there are likely a few places still converting to
>ptrdiff_t
>>>>> to perform a regular subtraction, but this seems to indicate that
>the
>>>>> work isn't as bad as using a signed type in pointer_plus_expr for
>>>>> instance.
>>>>
>>>>
>>>> The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR
>>>> from POINTER_MINUS_EXPR in some cases I guess).
>>>>
>>>> The tree code needs documenting in tree.def and generic.texi.
>>>>
>>>> Otherwise ok(*).
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>> (*) ok, just kidding -- or maybe not
>>>
>>>
>>> I updated the prototype a bit. Some things I noticed:
>>>
>>> - the C front-end has support for address spaces that seems to imply
>that
>>> pointer subtraction (and division by the size) may be done in a type
>larger
>>> than ptrdiff_t. Currently, it generates
>>> (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is
>some type
>>> potentially larger than ptrdiff_t.
>>
>> It uses a ptrdiff_t corresponding to the respective address space,
>yes.
>> That we use sizetype elsewhere unconditionally is a bug :/
>>
>>> I am thus generating a POINTER_DIFF_EXPR
>>> with that type, while I was originally hoping its type would always
>be
>>> ptrdiff_t. It complicates code and I am not sure I didn't break
>address
>>> spaces anyway... (expansion likely needs to do the inttype dance)
>>
>> I think that's fine.  Ideally targets would provide a type to use for
>each
>> respective address space given we have targets that have sizetype
>smaller
>> than ptr_mode.
>>
>>> Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t
>(what
>>> POINTER_PLUS_EXPR takes as second argument) always the same type
>>> signed/unsigned?
>>
>> POINTER_PLUS_EXPR takes 'sizetype', not size_t.
>
>Ah, yes, that's the one I meant...
>
>> So the answer is clearly no.  And yes, that's ugly and broken and
>should be fixed.
>>
>>> Counter-examples: m32c (when !TARGET_A16), visium, darwin,
>>> powerpcspe, s390, vms... and it isn't even always the same that is
>bigger
>>> than the other. That's quite messy.
>>
>> Eh.  Yeah, targets are free to choose 'sizetype' and they do so for
>> efficiency.  IMHO we should get rid of this "feature".
>>
>>> - I had to use @@ in match.pd, not for constants, but because in
>GENERIC we
>>> sometimes ended up with pointers where operand_equal_p said yes but
>>> types_match disagreed.
>>>
>>> - It currently regresses 2 go tests: net/http runtime/debug
>>
>> Those are flaky for me and fail sometimes and sometimes not.
>
>Yes, I noticed that (there are 1 or 2 others in the go testsuite).
>
>> +@item POINTER_DIFF_EXPR
>> +This node represents pointer subtraction.  The two operands always
>> +have pointer/reference type.  The second operand is always an
>unsigned
>> +integer type compatible with sizetype.  It returns a signed integer.
>>
>> the 2nd sentence looks bogusly copied.
>
>Oups, removed.
>
>>
>> +      /* FIXME.  */
>> +      if (code == POINTER_DIFF_EXPR)
>> +       return int_const_binop (MINUS_EXPR,
>> +                               fold_convert (ptrdiff_type_node,
>arg1),
>> +                               fold_convert (ptrdiff_type_node,
>arg2));
>>
>>  wide_int_to_tree (type, wi::to_widest (arg1) - wi::to_widest
>(arg2));
>
>Before your reply, I wrote something similar using offset_int instead
>of 
>widest_int (and handling overflow, mostly for documentation purposes).
>I 
>wasn't sure which one to pick, I can change to widest_int if you think
>it 
>is better...

Offset_int is better. 

>> +    case POINTER_DIFF_EXPR:
>> +      {
>> +       if (!POINTER_TYPE_P (rhs1_type)
>> +           || !POINTER_TYPE_P (rhs2_type)
>> +           // || !useless_type_conversion_p (rhs2_type, rhs1_type)
>>
>> types_compatible_p (rhs1_type, rhs2_type)?
>
>Makes sense.
>
>> +           // || !useless_type_conversion_p (ptrdiff_type_node,
>lhs_type))
>> +           || TREE_CODE (lhs_type) != INTEGER_TYPE
>> +           || TYPE_UNSIGNED (lhs_type))
>> +         {
>> +           error ("type mismatch in pointer diff expression");
>> +           debug_generic_stmt (lhs_type);
>> +           debug_generic_stmt (rhs1_type);
>> +           debug_generic_stmt (rhs2_type);
>> +           return true;
>>
>> there's also verify_expr which could want adjustment for newly
>created
>> tree kinds.
>
>ok.
>
>> So if the precision of the result is larger than that of the pointer 
>> operands we sign-extend the result, right?  So the subtraction is 
>> performed in precision of the pointer operands and then 
>> sign-extended/truncated to the result type? Which means it is _not_ a
>
>> widening subtraction to get around the half-address-space issue.  The
>
>> tree.def documentation should reflect this semantic detail.  Not sure
>if 
>> the RTL expansion follows it.
>
>I actually have no idea what expansion does if the size is different
>(that 
>was one of my comments). Crashing is not unlikely.
>
>I have changed things a bit. Now POINTER_DIFF_EXPR always has the same 
>precision as input and output (so expansion should be fine). And there
>is 
>code in the C/C++ front-ends that uses casts and MINUS_EXPR if we want
>a 
>result wider than pointers, although it might very well be dead code...
>
>My goal was not to help with objects larger than half the address
>space, 
>but to fix things for small objects unlucky enough to straddle the
>middle 
>of the address space. By properly modeling what overflow means for this
>
>operation, I expect we can add more simplifications in match.pd.

Yeah. 

Thanks, 
Richard. 

>> I think that we'd ideally have a single legal INTEGER_TYPE precision
>> per pointer type precision and that those precisions should match...
>> we don't have to follow the mistakes of POINTER_PLUS_EXPR.
>>
>> So ... above verify TYPE_PRECISION (rhs1_type) == TYPE_PRECISION
>(lhs_type)?
>> Some targets have 24bit ptr_mode but no 24bit integer type which
>means the
>> FE likely chooses 32bit int for the difference computation.  But the
>middle-end
>> should be free to create a 24bit precision type with SImode.
>>
>> Otherwise as said before - go ahead, I think this would be great to
>> have for GCC 8.  I'd say
>> ask the maintainers of the above list of targets to do some testing.
>>
>> "Fixing" POINTER_PLUS_EXPR would be very nice as well.  Again,
>matching
>> precision - I'm not sure if we need to force a signed operand, having
>either
>> might be nice given all sizes are usually unsigned.
>>
>> Thanks and sorry for the delay,
>> Richard.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]