[PATCH] Add sanopt support for UBSAN_PTR.
Martin Liška
mliska@suse.cz
Thu Oct 5 13:52:00 GMT 2017
On 10/04/2017 08:06 PM, Jakub Jelinek wrote:
> On Wed, Oct 04, 2017 at 11:05:23AM +0200, Martin Liška wrote:
>> Following patch adds support for optimizing out unnecessary UBSAN_PTR checks.
>> It handles separately positive and negative offsets, zero offset is ignored.
>> Apart from that, we utilize get_inner_reference for local and global variables,
>> that also helps to reduce some.
>
> Thanks for working on it.
>
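[For readers following along: the redundancy the patch targets can be modeled outside of GCC roughly like this. This is a hypothetical sketch of the bookkeeping idea only, not the pass's actual data structures — checks are keyed by (pointer, direction), and a later check is redundant when an earlier one in the same direction covered at least the same magnitude; a zero offset is always redundant.]

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Illustrative model (names hypothetical, not GCC code): record the
// largest offset magnitude already checked per (pointer, direction).
struct check_map
{
  std::map<std::pair<std::string, bool>, uint64_t> seen;

  // Return true if a UBSAN_PTR (ptr, offset) check is redundant;
  // otherwise record it for later queries.
  bool redundant (const std::string &ptr, int64_t offset)
  {
    if (offset == 0)
      return true;  // ptr + 0 can never overflow
    bool positive = offset > 0;
    uint64_t mag = positive ? (uint64_t) offset : 0 - (uint64_t) offset;
    uint64_t &best = seen[std::make_pair (ptr, positive)];
    if (best >= mag)
      return true;  // dominated by an earlier same-direction check
    best = mag;
    return false;
  }
};
```

Positive and negative offsets are tracked separately because a check that `p + 16` does not wrap says nothing about `p - 8`.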
>> @@ -148,6 +149,61 @@ struct sanopt_tree_triplet_hash : typed_noop_remove <sanopt_tree_triplet>
>> }
>> };
>>
>> +/* Tree couple for ptr_check_map. */
>> +struct sanopt_tree_couple
>> +{
>> + tree ptr;
>> + bool positive;
>
> Maybe pos_p instead, so that it isn't that long?
Hi.
Yes.
>
>> + static inline bool
>> + equal (const sanopt_tree_couple &ref1, const sanopt_tree_couple &ref2)
>> + {
>> + return operand_equal_p (ref1.ptr, ref2.ptr, 0)
>> + && ref1.positive == ref2.positive;
>
> Or this would need to use ()s around the return expression for emacs
> users.
Done.
>
>> +/* Return true when pointer PTR for a given OFFSET is already sanitized
>> + in a given sanitization context CTX. */
>> +
>> +static bool
>> +has_dominating_ubsan_ptr_check (sanopt_ctx *ctx, tree ptr, tree cur_offset)
>
> Any reason why you pass cur_offset as a tree rather than wide_int & or
> offset_int &? If the caller makes sure it is sign extended from
> pointer/sizetype's precision, then:
>
>> +{
>> + int r = get_range_pos_neg (cur_offset);
>
> here you can just wi::neg_p or similar.
>
>> + /* We already have recorded a UBSAN_PTR check for this pointer. Perhaps we
>> + can drop this one. But only if this check doesn't specify larger offset.
>> + */
>> + tree offset = gimple_call_arg (g, 1);
>> + gcc_assert (TREE_CODE (offset) == INTEGER_CST);
>> +
>> + if (positive && tree_int_cst_le (cur_offset, offset))
>> + return true;
>> + else if (!positive && tree_int_cst_le (offset, cur_offset))
>> + return true;
>
> And the comparisons here in wide_int would be cheaper too.
>
>> +
>> +static void
>> +record_ubsan_ptr_check_stmt (sanopt_ctx *ctx, gimple *stmt, tree ptr,
>> + tree offset)
>> +{
>> + int r = get_range_pos_neg (offset);
>
> See above.
Good point! I will use offset_int in all places.
>
>> + gcc_assert (r != 3);
>> + bool positive = r == 1;
>> +
>> + sanopt_tree_couple couple;
>> + couple.ptr = ptr;
>> + couple.positive = positive;
>> +
>> + auto_vec<gimple *> &v = ctx->ptr_check_map.get_or_insert (couple);
>> + v.safe_push (stmt);
>> +}
>> +
>> +/* Optimize away redundant UBSAN_PTR calls. */
>> +
>> +static bool
>> +maybe_optimize_ubsan_ptr_ifn (sanopt_ctx *ctx, gimple *stmt)
>> +{
>> + gcc_assert (gimple_call_num_args (stmt) == 2);
>> + tree ptr = gimple_call_arg (stmt, 0);
>> + tree cur_offset = gimple_call_arg (stmt, 1);
>> +
>> + if (TREE_CODE (cur_offset) != INTEGER_CST)
>> + return false;
>> +
>> + if (integer_zerop (cur_offset))
>> + return true;
>> +
>> + if (has_dominating_ubsan_ptr_check (ctx, ptr, cur_offset))
>> + return true;
>> +
>> + tree base = ptr;
>> + if (TREE_CODE (base) == ADDR_EXPR)
>> + {
>> + base = TREE_OPERAND (base, 0);
>> +
>> + HOST_WIDE_INT bitsize, bitpos;
>> + machine_mode mode;
>> + int volatilep = 0, reversep, unsignedp = 0;
>> + tree offset;
>> + base = get_inner_reference (base, &bitsize, &bitpos, &offset, &mode,
>> + &unsignedp, &reversep, &volatilep);
>> + if (DECL_P (base))
>> + {
>> + gcc_assert (!DECL_REGISTER (base));
>> + /* If BASE is a fixed size automatic variable or
>> + global variable defined in the current TU and bitpos
>> + fits, don't instrument anything. */
>> + if ((offset == NULL_TREE || TREE_CODE (offset) == INTEGER_CST)
>
> Do you really need to handle offset != NULL_TREE?
> If the bit offset is representable in shwi, then it will just be
> in bitpos and offset will be NULL.
For this:
UBSAN_PTR (&MEM[(void *)&b + 9223372036854775807B], 1);
I have offset:
<integer_cst 0x155553ec19f0 type <integer_type 0x155553d81000 sizetype> constant 9223372036854775807>
which is a valid offset.
>
>> + && (VAR_P (base)
>> + || TREE_CODE (base) == PARM_DECL
>> + || TREE_CODE (base) == RESULT_DECL)
>> + && DECL_SIZE (base)
>> + && TREE_CODE (DECL_SIZE (base)) == INTEGER_CST
>
> Why are you testing here DECL_SIZE when you actually then use
> DECL_SIZE_UNIT?
Yep, will fix that.
>
>> + && (!is_global_var (base) || decl_binds_to_current_def_p (base)))
>> + {
>> + HOST_WIDE_INT bytepos = bitpos / BITS_PER_UNIT;
>> + offset_int total_offset = bytepos;
>> + if (offset != NULL_TREE)
>> + total_offset += wi::sext (wi::to_offset (offset), POINTER_SIZE);
>> + total_offset += wi::sext (wi::to_offset (cur_offset),
>> + POINTER_SIZE);
>
> Why are you sign extending it each time? Can't it be sign extended after
> all the additions?
I had problem with this:
UBSAN_PTR (&MEM[(void *)&b + -9223372036854775807B], 9223372036854775805);
when not doing the sign extension, fits_shwi_p (to_offset (-9223372036854775807) + to_offset (9223372036854775805)) came out false.
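[To illustrate the failure mode: sizetype constants are unsigned, so a "negative" offset like -9223372036854775807B is stored as the bit pattern 0x8000000000000001. Widened into a wider accumulator without sign extension, it is read as a huge positive number and the sum no longer fits a signed HOST_WIDE_INT. A stand-in model using __int128 (a GCC extension) in place of offset_int, with assumed helper names:]

```cpp
#include <cassert>
#include <cstdint>

// 'wide' stands in for offset_int; the helpers model zero- vs
// sign-extending a 64-bit sizetype bit pattern into the wider type.
using wide = __int128;

static bool fits_shwi (wide v)
{
  return v >= INT64_MIN && v <= INT64_MAX;
}

static wide widen_zext (uint64_t bits) { return (wide) bits; }          // no sext
static wide widen_sext (uint64_t bits) { return (wide) (int64_t) bits; } // wi::sext
```

With sign extension, -9223372036854775807 + 9223372036854775805 is simply -2 and fits; with zero extension the sum is near 2^64 and does not.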
> On the other side, is it safe to add the (bytepos + offset) part with
> the cur_offset part unconditionally?
> If the former is "positive" and cur_offset is "negative", they cancel each
> other and we IMHO shouldn't try to optimize it away. bytepos could be some
> very large integer, outside of bounds of the var, and cur_offset some
> negative constant, where both var + bytepos and (var + bytepos) + cur_offset
> overflow. Or bytepos could be negative, outside of the bounds of the
> variable. I think you need to check separately that bytepos is >= 0 and
> <= DECL_SIZE_UNIT and that bytepos + cur_offset is within that range too.
>
>> + /* New total offset can fit in pointer type. */
>> + if (wi::fits_shwi_p (total_offset))
>
> Why do you need to convert a wide_int to signed HOST_WIDE_INT and back?
> Or do you want to check for some overflows above?
Do I understand it correctly that:
p = b - 9223372036854775807LL; /* pointer overflow check is needed */
p2 = p + 9223372036854775805LL;
should not be handled, because &b + -9223372036854775807B is outside the bounds of b?
Am I right?
I will rewrite the usage of offset, to support just offset == NULL_TREE.
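[Jakub's suggested condition above, sketched as plain C++ (illustrative, names hypothetical): both the base offset (bytepos) and the combined offset must land inside [0, DECL_SIZE_UNIT] before the check may be dropped — it is not enough for the sum alone to be in range.]

```cpp
#include <cassert>
#include <cstdint>

// Model of the bounds test: drop the UBSAN_PTR check only if bytepos
// is within the object and bytepos + cur_offset stays within it too.
// Assumes the model values do not themselves wrap in int64_t.
static bool offsets_in_bounds (int64_t bytepos, int64_t cur_offset,
                               int64_t decl_size_unit)
{
  if (bytepos < 0 || bytepos > decl_size_unit)
    return false;
  int64_t total = bytepos + cur_offset;
  return total >= 0 && total <= decl_size_unit;
}
```

This rejects the dangerous cases from the review: a bytepos outside the variable cannot be rescued by a cur_offset of the opposite sign, even if the two cancel out.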
>
>> + {
>> + tree toffset = build_int_cst (TREE_TYPE (cur_offset),
>> + total_offset.to_shwi ());
>> + tree ptr2 = build1 (ADDR_EXPR,
>> + build_pointer_type (TREE_TYPE (base)),
>> + base);
>
> Why do you need to create these trees here when you actually don't use them
> (or shouldn't need)
Is it really impossible for has_dominating_ubsan_ptr_check (ctx, ptr2, total_offset) to catch something here?
>
>> +
>> + /* If total offset is non-negative and smaller or equal
>> + to size of declaration, then no check is needed. */
>> + if (get_range_pos_neg (toffset) == 1
>> + && tree_int_cst_le (toffset, DECL_SIZE_UNIT (base)))
>> + return true;
>
> above.
>
>> + /* Maybe we already have a dominating check for the BASE. */
>> + else if (has_dominating_ubsan_ptr_check (ctx, ptr2, toffset))
>> + return true;
>
> And, isn't this something that really doesn't depend on whether base is a
> decl (so move it outside of the if (DECL_P ()) if), but rather just on
> whether bytepos and cur_offset have the same "sign"?
> If ptr is say &MEM_REF[whatever + 1024] and 2048 (both positive), then
> it will not overflow if we've checked that UBSAN(whatever, 3072) (or more).
> Similarly for both negative values. For &MEM_REF[whatever + 3072] and -2048,
> previous UBSAN(whatever, 3072) check would be sufficient too, or for
> &MEM_REF[whatever + 1024] and -3072 a previous UBSAN(whatever, -2048) check
> (or more negative). But in any case, you need to compare the signs and if
> they aren't the same, reason what is appropriate and what is not.
Good point, I will work on that!
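[A sketch of the sign reasoning from the paragraph above (illustrative only, not GCC code): for a pointer formed as &base + a with a further UBSAN_PTR offset b, compute the single offset X such that an earlier UBSAN_PTR (base, X) check, or one farther in the same direction, would dominate. When a and a + b share a sign, the farthest excursion in that direction is binding; when the sign flips, the final offset is.]

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper modeling which dominating check would suffice.
static int64_t required_check_offset (int64_t a, int64_t b)
{
  int64_t total = a + b;
  if (a >= 0 && total >= 0)
    return a > total ? a : total;  // farthest positive excursion
  if (a <= 0 && total <= 0)
    return a < total ? a : total;  // farthest negative excursion
  return total;  // sign flipped: the final offset is the binding one
}
```

The three assertions below are exactly Jakub's examples: 1024 then 2048 needs a 3072 check; 3072 then -2048 is covered by a 3072 check; 1024 then -3072 is covered by a -2048 check.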
>> else
>> - gsi_next (&gsi);
>> + {
>> + if (dump_file && (dump_flags & TDF_DETAILS))
>> + {
>> + fprintf (dump_file, "Leaving: ");
>> + print_gimple_stmt (dump_file, stmt, 0, dump_flags);
>> + }
>
> Why this? Printing all other stmts? That doesn't make sense to me.
Ok, it's more of a debugging dump and I'll remove it.
Thanks for feedback,
Martin
>
> Jakub
>