This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Marc Glisse <marc dot glisse at inria dot fr>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Sat, 28 Oct 2017 18:45:41 +0200
- Subject: Re: [RFC PATCH] Fix pointer diff (was: -fsanitize=pointer-overflow support (PR sanitizer/80998))
- Authentication-results: sourceware.org; auth=none
- References: <20170619182515.GA2123@tucnak> <alpine.LSU.2.20.1706200928120.22867@zhemvz.fhfr.qr> <20170620081348.GE2123@tucnak> <alpine.LSU.2.20.1706201015531.22867@zhemvz.fhfr.qr> <20170621144001.GR2123@tucnak> <alpine.DEB.2.20.1706211657570.2094@stedding.saclay.inria.fr> <alpine.LSU.2.20.1706221022200.23185@zhemvz.fhfr.qr> <alpine.DEB.2.20.1706221105340.2158@stedding.saclay.inria.fr> <alpine.LSU.2.20.1706221142080.23185@zhemvz.fhfr.qr> <alpine.DEB.2.20.1707011757000.1896@stedding.saclay.inria.fr> <alpine.LSU.2.20.1707031436280.23185@zhemvz.fhfr.qr> <alpine.DEB.2.20.1710082250230.3077@stedding.saclay.inria.fr> <CAFiYyc0m9RsAeWGA=uPMwqE8CvKpjO8yHykg-Ng+Eq-XC=6LRg@mail.gmail.com> <alpine.DEB.2.20.1710192143090.2973@stedding.saclay.inria.fr>
On October 28, 2017 2:53:56 PM GMT+02:00, Marc Glisse <marc.glisse@inria.fr> wrote:
>
>I am sending the new version of the patch in a separate email, to make
>it
>more visible, and only replying to a few points here.
>
>On Thu, 19 Oct 2017, Richard Biener wrote:
>
>> On Mon, Oct 9, 2017 at 12:55 PM, Marc Glisse <marc.glisse@inria.fr>
>wrote:
>>> On Mon, 3 Jul 2017, Richard Biener wrote:
>>>
>>>> On Sat, 1 Jul 2017, Marc Glisse wrote:
>>>>
>>>>> I wrote a quick prototype to see what the fallout would look like.
>>>>> Surprisingly, it actually passes bootstrap+testsuite on ppc64el
>with all
>>>>> languages with no regression. Sure, it is probably not a complete
>>>>> migration, there are likely a few places still converting to
>ptrdiff_t
>>>>> to perform a regular subtraction, but this seems to indicate that
>the
>>>>> work isn't as bad as using a signed type in pointer_plus_expr for
>>>>> instance.
>>>>
>>>>
>>>> The fold_binary_loc hunk looks dangerous (it'll generate MINUS_EXPR
>>>> from POINTER_MINUS_EXPR in some cases I guess).
>>>>
>>>> The tree code needs documenting in tree.def and generic.texi.
>>>>
>>>> Otherwise ok(*).
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>> (*) ok, just kidding -- or maybe not
>>>
>>>
>>> I updated the prototype a bit. Some things I noticed:
>>>
>>> - the C front-end has support for address spaces that seems to imply
>that
>>> pointer subtraction (and division by the size) may be done in a type
>larger
>>> than ptrdiff_t. Currently, it generates
>>> (ptrdiff_t)(((inttype)q-(inttype)p)/size) for q-p where inttype is
>some type
>>> potentially larger than ptrdiff_t.
>>
>> It uses a ptrdiff_t corresponding to the respective address space,
>yes.
>> That we use sizetype elsewhere unconditionally is a bug :/
>>
>>> I am thus generating a POINTER_DIFF_EXPR
>>> with that type, while I was originally hoping its type would always
>be
>>> ptrdiff_t. It complicates code and I am not sure I didn't break
>address
>>> spaces anyway... (expansion likely needs to do the inttype dance)
>>
>> I think that's fine. Ideally targets would provide a type to use for
>each
>> respective address space given we have targets that have sizetype
>smaller
>> than ptr_mode.
>>
>>> Are ptrdiff_t (what POINTER_DIFF_EXPR should return) and size_t
>(what
>>> POINTER_PLUS_EXPR takes as second argument) always the same type
>>> signed/unsigned?
>>
>> POINTER_PLUS_EXPR takes 'sizetype', not size_t.
>
>Ah, yes, that's the one I meant...
>
>> So the answer is clearly no. And yes, that's ugly and broken and
>should be fixed.
>>
>>> Counter-examples: m32c (when !TARGET_A16), visium, darwin,
>>> powerpcspe, s390, vms... and it isn't even always the same that is
>bigger
>>> than the other. That's quite messy.
>>
>> Eh. Yeah, targets are free to choose 'sizetype' and they do so for
>> efficiency. IMHO we should get rid of this "feature".
>>
>>> - I had to use @@ in match.pd, not for constants, but because in
>GENERIC we
>>> sometimes ended up with pointers where operand_equal_p said yes but
>>> types_match disagreed.
>>>
>>> - It currently regresses 2 go tests: net/http runtime/debug
>>
>> Those are flaky for me and fail sometimes and sometimes not.
>
>Yes, I noticed that (there are 1 or 2 others in the go testsuite).
>
>> +@item POINTER_DIFF_EXPR
>> +This node represents pointer subtraction. The two operands always
>> +have pointer/reference type. The second operand is always an
>unsigned
>> +integer type compatible with sizetype. It returns a signed integer.
>>
>> the 2nd sentence looks bogusly copied.
>
>Oups, removed.
>
>>
>> + /* FIXME. */
>> + if (code == POINTER_DIFF_EXPR)
>> + return int_const_binop (MINUS_EXPR,
>> + fold_convert (ptrdiff_type_node,
>arg1),
>> + fold_convert (ptrdiff_type_node,
>arg2));
>>
>> wide_int_to_tree (type, wi::to_widest (arg1) - wi::to_widest
>(arg2));
>
>Before your reply, I wrote something similar using offset_int instead
>of
>widest_int (and handling overflow, mostly for documentation purposes).
>I
>wasn't sure which one to pick, I can change to widest_int if you think
>it
>is better...
Offset_int is better.
>> + case POINTER_DIFF_EXPR:
>> + {
>> + if (!POINTER_TYPE_P (rhs1_type)
>> + || !POINTER_TYPE_P (rhs2_type)
>> + // || !useless_type_conversion_p (rhs2_type, rhs1_type)
>>
>> types_compatible_p (rhs1_type, rhs2_type)?
>
>Makes sense.
>
>> + // || !useless_type_conversion_p (ptrdiff_type_node,
>lhs_type))
>> + || TREE_CODE (lhs_type) != INTEGER_TYPE
>> + || TYPE_UNSIGNED (lhs_type))
>> + {
>> + error ("type mismatch in pointer diff expression");
>> + debug_generic_stmt (lhs_type);
>> + debug_generic_stmt (rhs1_type);
>> + debug_generic_stmt (rhs2_type);
>> + return true;
>>
>> there's also verify_expr which could want adjustment for newly
>created
>> tree kinds.
>
>ok.
>
>> So if the precision of the result is larger than that of the pointer
>> operands we sign-extend the result, right? So the subtraction is
>> performed in precision of the pointer operands and then
>> sign-extended/truncated to the result type? Which means it is _not_ a
>
>> widening subtraction to get around the half-address-space issue. The
>
>> tree.def documentation should reflect this semantic detail. Not sure
>if
>> the RTL expansion follows it.
>
>I actually have no idea what expansion does if the size is different
>(that
>was one of my comments). Crashing is not unlikely.
>
>I have changed things a bit. Now POINTER_DIFF_EXPR always has the same
>precision as input and output (so expansion should be fine). And there
>is
>code in the C/C++ front-ends that uses casts and MINUS_EXPR if we want
>a
>result wider than pointers, although it might very well be dead code...
>
>My goal was not to help with objects larger than half the address
>space,
>but to fix things for small objects unlucky enough to straddle the
>middle
>of the address space. By properly modeling what overflow means for this
>
>operation, I expect we can add more simplifications in match.pd.
Yeah.
Thanks,
Richard.
>> I think that we'd ideally have a single legal INTEGER_TYPE precision
>> per pointer type precision and that those precisions should match...
>> we don't have to follow the mistakes of POINTER_PLUS_EXPR.
>>
>> So ... above verify TYPE_PRECISION (rhs1_type) == TYPE_PRECISION
>(lhs_type)?
>> Some targets have 24bit ptr_mode but no 24bit integer type which
>means the
>> FE likely chooses 32bit int for the difference computation. But the
>middle-end
>> should be free to create a 24bit precision type with SImode.
>>
>> Otherwise as said before - go ahead, I think this would be great to
>> have for GCC 8. I'd say
>> ask the maintainers of the above list of targets to do some testing.
>>
>> "Fixing" POINTER_PLUS_EXPR would be very nice as well. Again,
>matching
>> precision - I'm not sure if we need to force a signed operand, having
>either
>> might be nice given all sizes are usually unsigned.
>>
>> Thanks and sorry for the delay,
>> Richard.