This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] have __builtin_object_size handle POINTER_PLUS with non-const offset (pr 77608)
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Martin Sebor <msebor at gmail dot com>
- Cc: Gcc Patch List <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 2 Dec 2016 10:28:58 +0100
- Subject: Re: [PATCH] have __builtin_object_size handle POINTER_PLUS with non-const offset (pr 77608)
- Authentication-results: sourceware.org; auth=none
- References: <b60bf590-8063-5986-5cec-227f6a273017@gmail.com> <CAFiYyc0Qtb8gp16PLTSB1qXfTDAFEGT3ST6huskRNDfhqfPDkA@mail.gmail.com> <efbe2c67-d56b-7d6d-fa62-e4755647c8a6@gmail.com> <67844d94-a09d-5c51-f45a-f102b150c4a1@gmail.com> <CAFiYyc2NAdegpc01n=e_yJYr2Tige8oe87KF0fFOrS9KjWkq1w@mail.gmail.com> <3abfef62-610d-fd07-1d4a-ac353855f6d3@gmail.com> <CAFiYyc210-eUf888FLvRswqnw-5zAgsh6FGWTpfKizpDejKmUw@mail.gmail.com> <65c4c24d-aca7-7ac7-5e63-606d7146fd68@gmail.com>
On Thu, Dec 1, 2016 at 6:58 PM, Martin Sebor <msebor@gmail.com> wrote:
>> Sure - but then you maybe instead want to check for op being in
>> range [0, max-of-signed-type-of-op] instead? So similar to
>> expr_not_equal_to add a expr_in_range helper?
>>
>> Your function returns true for sizetype vars even if it might be
>> effectively signed, like for
>>
>> sizetype i_1 = -4;
>> i_2 = i_1 + 1;
>>
>> operand_unsigned_p (i) returns true. I suppose you may have
(*)
>> meant
>>
>> +static bool
>> +operand_unsigned_p (tree op)
>> +{
>> + if (TREE_CODE (op) == SSA_NAME)
>> + {
>> + gimple *def = SSA_NAME_DEF_STMT (op);
>> + if (is_gimple_assign (def))
>> + {
>> + tree_code code = gimple_assign_rhs_code (def);
>> + if (code == NOP_EXPR
>> && TYPE_PRECISION (TREE_TYPE (op)) > TYPE_PRECISION
>> (TREE_TYPE (gimple_assign_rhs1 (def))))
>> return tree_expr_nonnegative_p (gimple_assign_rhs1 (def)));
>> + }
>> + }
>> +
>> + return false;
>> +}
>>
>> ? because only if you do see a cast and that cast is widening from an
>> nonnegative number
>> the final one will be unsigned (if interpreted as signed number).
>
>
> I don't think this is what I want. Here's a test case that works
> with my function but not with the suggested modification:
>
> char d[4];
> long f (unsigned long i)
> {
> return __builtin_object_size (d + i + 1, 0);
> }
>
> Here, the size I'm looking for is (at most) 3 no matter what value
> i has. Am I missing a case where my function will do the wrong
> thing?
See above (*)
I know what you are trying to do (retro-actively make value-ranges have
a split variable / constant part). But I don't think that doing it the way you
do it is a reasonable maintainable way. Others may disagree.
>> You might want to use offset_ints here (see mem_ref_offset for example)
>
>
> Okay, I'll see if I can switch to offset_int.
>
>>
>>>>
>>>> + gimple *def = SSA_NAME_DEF_STMT (off);
>>>> + if (is_gimple_assign (def))
>>>> + {
>>>> + tree_code code = gimple_assign_rhs_code (def);
>>>> + if (code == PLUS_EXPR)
>>>> + {
>>>> + /* Handle offset in the form VAR + CST where VAR's
>>>> type
>>>> + is unsigned so the offset must be the greater of
>>>> + OFFRANGE[0] and CST. This assumes the PLUS_EXPR
>>>> + is in a canonical form with CST second. */
>>>> + tree rhs2 = gimple_assign_rhs2 (def);
>>>>
>>>> err, what? What about overflow? Aren't you just trying to decompose
>>>> 'off' into a variable and a constant part here and somehow extracting a
>>>> range for the variable part? So why not just do that?
>>>
>>>
>>>
>>> Sorry, what about overflow?
>>>
>>> The purpose of this code is to handle cases of the form
>>>
>>> & PTR [range (MIN, MAX)] + CST
>>
>>
>> what if MAX + CST overflows?
>
>
> The code doesn't look at MAX, only MIN is considered. It extracts
> both but only actually uses MAX to see if it's dealing with a range
> or a constant. Does that resolve your concern?
But if MAX overflows then it might be smaller than MIN and thus you
cannot conclude that [min, max] + CST is [min+CST, UNKNWON]
but rather it's [UNKNOWN, UNKNOWN] (if you disregard the actual
value of MAX).
>>> char d[7];
>>>
>>> #define bos(p, t) __builtin_object_size (p, t)
>>>
>>> long f (unsigned i)
>>> {
>>> if (2 < i) i = 2;
>>>
>>> char *p = &d[i] + 3;
>>>
>>> return bos (p, 0);
>>> }
>>
>>
>> I'm sure that doesn't work as you match for PLUS_EXPR.
>
>
> Sorry, I'm not sure what you mean.
I mean that p = &d[i] + 3; is not represented as a PLUS_EXPR
but as a POINTER_PLUS_EXPR. All pointer arithmetic is using
POINTER_PLUS_EXPR.
> The above evaluates to 4 with
> the patch because i cannot be less than zero (otherwise &d[i] would
> be invalid/undefined) so the type-0 size we want (the maximum) is
> &d[7] - (&d[0] + 3) or 4.
>
>> Maybe simply ignore VR_ANTI_RANGEs for now then?
>
>
> Yes, that makes sense.
>
>>> The code above is based on the observation that an anti-range is
>>> often used to represent the full subrange of a narrower signed type
>>> like signed char (as ~[128, -129]). I haven't been able to create
>>> an anti-range like ~[5, 9]. When/how would a range like that come
>>> about (so I can test it and implement the above correctly)?
>>
>>
>> if (a < 4)
>> if (a > 8)
>> b = a;
>>
>> then b should have ~[5, 9]
>
>
> Right :) I have figured out by know by know how to create an anti
> range in general. What I meant is that I haven't had luck creating
> them in a way that the tree-object-size pass could see (I'm guessing
> because EVRP doesn't understand relational expressions). So given
> this modified example from above:
>
> char d[9];
>
> #define bos(p, t) __builtin_object_size (p, t)
>
> long f (unsigned a)
> {
> unsigned b = 0;
>
> if (a < 4)
> if (a > 8)
> b = a;
>
> char *p = &d[b];
> return bos (p, 0);
> }
>
> The value ranges after Early VRP are:
>
> _1: VARYING
> b_2: VARYING
> a_4(D): VARYING
> p_6: ~[0B, 0B]
> _8: VARYING
Of course - that's because a) I did a mistake, it shoud be if (a < 4 || a> 8),
b) EVRP does not look at DEFs (all the magic in register_edge_assert_for
is not factored out to be usable by EVRP), c) there's no SSA name for
b having the anti-range but we only have the PHI merging that with 0.
A better testcase would be
long f (unsigned a)
{
unsigned b = 0;
if (a < 4)
goto x;
if (a > 8)
goto x;
goto y;
x:
b = a;
y:;
char *p = &d[b];
return bos (p, 0);
}
but even that fails to create the anti-range. I suppose we have work
to do for EVRP ;)
> But with the removal of the anti-range code this will be a moot
> point.
>
>>> Maybe the poor range info i a consequence of the pass only benefiting
>>> from EVRP and not VRP?
>>
>>
>> The range of 'p' is indeed not known (we only represent integer bound
>> ranges).
>> You seem to want the range of p - &d[0] here, something that is not
>> present
>> in the IL.
>
>
> Yes, that's effectively what this patch does. Approximate pointer
> ranges.
>
>>> It's just something I haven't had time to work on yet and with the
>>> close of stage 1 approaching I wanted to put out this version for
>>> review. Do you view this enhancement as prerequisite for approving
>>> the patch or is it something that you'd be fine with adding later?
>>
>>
>> I find the patch adds quite some ad-hoc ugliness to a pass that is
>> already complex and nearly impossible to understand.
>
>
> I'm sorry it looks ugly to you. I'm afraid I'm not yet familiar
> enough with the code to see the distinction. I'm just happy when
> I get things to work the way I want them to :) On this project
> I feel like I've been working around limitations both in the pass
> itself and those caused by when it runs. I tried to make it run
> later, after VRP, but run into a failure in builtin-object-size-14.c.
> From the history of the test (mostly your comments on bug 59125)
> it seems that there's some delicate interplay between the pass and
> some others that might make changing when it runs tricky business.
> I haven't tried to figure out how to deal with the problems.
Well, me neither -- I still think b_o_s is fundamentally flawed when
facing an optimizing compiler. Esp. in its modes where it tries
to handle sub-objects.
>> I'm leaving it to others if we have to have this improvement for GCC 7
>> (bos has its own share of know issues besides missing features).
>
>
> Okay, thanks for your feedback. I agree that the pass is hard to
> read (and harder still to modify). I also agree that it could be
> improved even beyond this modest enhancement and made more useful.
> Given its complexity it could also do with more documentation.
>
> Unfortunately I'm out of time for GCC 7 to make substantial
> changes to it. But it seems to me that while this patch may not
> be elegant it does work and makes _FORTIFY_SOURCE a little more
> effective.
>
> Martin