[PATCH] Tree-level fix for PR 69526
Richard Biener
richard.guenther@gmail.com
Fri Oct 14 11:49:00 GMT 2016
On Tue, Sep 20, 2016 at 2:31 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote:
> Hi,
>
>> I meant to do sth like
>>
>> Index: tree-ssa-propagate.c
>> ===================================================================
>> --- tree-ssa-propagate.c (revision 240133)
>> +++ tree-ssa-propagate.c (working copy)
>> @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d
>> /* Replace real uses in the statement. */
>> did_replace |= replace_uses_in (stmt, get_value_fn);
>>
>> - /* If we made a replacement, fold the statement. */
>> - if (did_replace)
>> + /* Fold the statement. */
>> + if (fold_stmt (&i, follow_single_use_edges))
>> {
>> - fold_stmt (&i, follow_single_use_edges);
>> + did_replace = true;
>> stmt = gsi_stmt (i);
>> }
>>
>> this would need compile-time cost evaluation (and avoid the tree-vrp.c
>> folding part
>> of your patch).
>
> Using this causes the simplifications to be performed in ccp1 instead of
> fwprop1. I noticed two tests failing that rely on propagation being
> performed in fwprop. Should these be adapted or rather the patch be changed?
>
>> Heh, but I think duplicating code is even worse.
>
> Ok, I changed extract_range_... after all, it's not so bad as I had
> thought. Imho it should be simplified nevertheless, perhaps I'll do it
> in the future if I find some time.
>
>> + tree cst_inner = wide_int_to_tree (inner_type, cst);
>>
>> What prevents CST being truncated here? It looks like
>> (long)intvar + 0xffff00000000L will be mishandled that way.
>>
>
> Right, it may be truncated here, but the following TYPE_PRECISION ()
> guard prevents the value from being used in that case. I pulled the
> line into the condition for clarity.
>
>> OTOH given that you use this to decide whether you can use a combined constant
>> that doesn't look necessary (if the constant is correct, that is) --
>> what you need
>> to make sure is that the final operation, (T)(A) +- CST, does not overflow
>> if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the
>> original operation. I think that always holds, thus the combine_ovf check looks
>> superfluous to me.
>>
>
> I included this to account for cases like
> return (long)(a + 2) + LONG_MAX
> which should not simplify as opposed to
> return (unsigned long)(a + 2) + LONG_MAX
> where the constant is usable despite the overflow. Do you think it
> should be handled differently?
Well, (long)(a + 2) + LONG_MAX -> (long)a + (LONG_MAX + 2) is
indeed invalid because the resulting expression may overflow. But
that's "easily fixable" by computing it in unsigned arithmetic, that is
doing
(long)(a + 2) + LONG_MAX -> (long)((unsigned long)a + (LONG_MAX + 2))
which I believe is still desired? Anyway, since the desired transform in the
end is (long)a + CST -> (long)(a + CST) this constant will not fit
here anyway...
Few comments on patch details:
+ if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type))
please put the precision tests as a pattern if, thus
(for outer_op (plus minus)
+ (for inner_op (plus minus)
+ (simplify
+ (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2)
(if (TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@3)))
(with
{
...
that makes it more obvious you are matching a widening conversion.
+ if (@0 == NULL_TREE || @1 == NULL_TREE
+ || inner_type == NULL_TREE
captures cannot be NULL_TREE nor can their type be NULL_TREE. If you run
into such cases sth is broken elsewhere.
You have tests against INTEGER_TYPE - please put those alongsize (even before)
the precision test. Note that you probably want to use INTEGRAL_TYPE_P there
to also catch ENUM_TYPE and BOOLEAN_TYPE.
+ (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2)
you can use @0 to get at the type of inner_op.
For my taste there is too many temporaries, like 'check_inner_ovf'
where at the single
point of use a !TYPE_OVERFLOW_UNDEFINED (inner_type); would be much more
descriptive.
+ /* Convert to TYPE keeping possible signedness even when
+ dealing with unsigned types. */
+ wide_int v1;
+ v1 = v1.from (w1, w2.get_precision(), tree_int_cst_sgn (@1) ?
+ SIGNED : UNSIGNED);
better to comment it as /* Extend @1 to TYPE according to its sign. */
I also think you want to use TYPE_SIGN (inner_type) instead of
tree_int_cst_sgn (@1) just to simplify the code. You also want to
use TYPE_PRECISION (inner_type) instead of w2.get_precision ().
+ if (inner_op == MINUS_EXPR)
+ w1 = wi::neg (w1);
I think you want to negate v1 here? (better not introduce v1 but use w1
for the extension?)
+ /* Combine in outer, larger type. */
+ combined_cst = wi::add (v1, w2
+ TYPE_SIGN (type), &combine_ovf);
So now we know that for (T)(X + CST1) + CST2, (T)CST1 + CST2
does not overflow. But we do not really care for that, we want to know
whether (T)X + CST' might invoke undefined behavior when the original
expression did not. This involves range information on X. I don't
see how we ensure this here.
Richard.
>
> Revised version attached.
>
> Regards
> Robin
>
> --
>
> gcc/ChangeLog:
>
> 2016-09-20 Robin Dapp <rdapp@linux.vnet.ibm.com>
>
> PR middle-end/69526
> This enables combining of wrapped binary operations and fixes
> the tree level part of the PR.
>
> * match.pd: Introduce patterns to simplify binary operations
> wrapped inside a cast.
> * tree-vrp.h: Export extract_range_from_binary_expr ().
> * tree-ssa-propagate.c
> (substitute_and_fold_dom_walker::before_dom_children):
> Try to fold regardless of prior use propagations.
> * tree-vrp.c (extract_range_from_binary_expr_1):
> Add overflow check arguments and wrapper function without them.
> (extract_range_from_binary_expr):
> Add wrapper function.
>
>
> gcc/testsuite/ChangeLog:
>
> 2016-09-20 Robin Dapp <rdapp@linux.vnet.ibm.com>
>
> * gcc.dg/wrapped-binop-simplify-signed-1.c: New test.
> * gcc.dg/wrapped-binop-simplify-signed-2.c: New test.
> * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test.
> * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test.
More information about the Gcc-patches
mailing list