[RFC][IPA-VRP] Early VRP Implementation
kugan
kugan.vivekanandarajah@linaro.org
Tue Jul 26 12:27:00 GMT 2016
Hi Riachard,
Thanks for the review. Here is an updated patch with comments below.
> +/* Restore/Pop all the old VRs maintained in the cond_stack. */
> +
> +void evrp_dom_walker::finalize_dom_walker ()
> +{
> + while (!cond_stack.is_empty ())
> + {
> + tree var = cond_stack.last ().second;
> + pop_value_range (var);
> + cond_stack.pop ();
> + }
>
> why is this necessary at all? Looks like a waste of time (and the
> stack should be empty here
> anyway). I think you leak cond_stack as well, I suppose using
> auto_vec might work here,
> you have to check.
Done.
>
>>>
>>> + /* Discover VR when condition is true. */
>>> + if (te == e
>>> + && !TREE_OVERFLOW_P (op0)
>>> + && !TREE_OVERFLOW_P (op1))
>>
>>
>> This is mainly to handle gcc.dg/pr38934.c. This would result in ICE from
>> set_value_range otherwise:
>>
>> gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min))
>> && (!TREE_OVERFLOW_P (max) || is_overflow_infinity
>> (max)));
>
> Ok, instead make sure to remove the overflow flag via
>
> if (TREE_OVERFLOW_P (op0))
> op0 = drop_tree_overflow (op0);
...
Done.
>
> it looks like you want to merge the if ( & EDGE_TRUE_VALUE) and FALSE_VALUE
> cases and only invert the tree comparison for the false edge.
Done.
>
> + tree cond = build2 (code, boolean_type_node, op0, op1);
> + extract_range_for_var_from_comparison_expr (&vr, op0, cond);
>
> no wasteful tree building here please. Instead of cond pass in code,
> op0 and op1
> as separate arguments.
Done.
>
> + /* If we found any usable VR, set the VR to ssa_name and create a
> + PUSH old value in the cond_stack with the old VR. */
> + if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
> + {
> + value_range *new_vr = vrp_value_range_pool.allocate ();
> + memset (new_vr, 0, sizeof (*new_vr));
> + *new_vr = vr;
> + cond_stack.safe_push (std::make_pair (bb, op0));
>
> the memset looks redundant given you copy-assing from vr anyway.
Done.
>
> You seem to miss the chance to register ranges for x_2 in i_1 < x_2 where
> both i_1 and x_2 might have ranges that are useful.
I will address this once the basic structure is in place if that is OK
with you.
>
> push and pop_value_range should be methods in the dom walker class
> and vrp_stack and cond_stack should be a single stack. You can omit
> basic_block if you use a "magic" NULL_TREE var as marker (simply
> push a NULL_TREE, NULL pair at the end of before_dom_children).
>
Done.
>>>
>>> you can use e->flags & EDGE_TRUE_VALUE/EDGE_FALSE_VALUE
>>>
>>> why do you need those TREE_OVERFLOW checks?
>>>
>>> + tree cond = build2 (code, boolean_type_node, op0, op1);
>>> + tree a = build2 (ASSERT_EXPR, TREE_TYPE (op0), op0, cond);
>>> + extract_range_from_assert (&vr, a);
>>>
>>> so I was hoping that the "refactoring" patch in the series would expose a
>>> more
>>> useful interface than extract_range_from_assert ... namely one that can
>>> extract a range from the comparison directly and does not require building
>>> a scratch ASSERT_EXPR.
>>
>>
>> I have split extract_range_from_assert into
>> extract_range_for_var_from_comparison_expr and using it now. Is that better?
>
> See above.
>
>>>
>>>
>>> + /* If we found any usable VR, set the VR to ssa_name and create
>>> a
>>> + restore point in the cond_stack with the old VR. */
>>> + if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
>>> + {
>>> + value_range *new_vr = XCNEW (value_range);
>>> + *new_vr = vr;
>>> + cond_stack.safe_push (std::make_pair (bb,
>>> + std::make_pair (op0,
>>> +
>>> old_vr)));
>>> + change_value_range (op0, new_vr);
>>>
>>> I don't like 'change_value_range' as interface, please integrate that into
>>> a push/pop_value_range style interface instead.
>>
>>
>> Tried using push/pop interface.
>>>
>>>
>>> + vrp_visit_stmt (stmt, &taken_edge_p, &output_p);
>>> + }
>>> +
>>> + return NULL;
>>>
>>> you should return taken_edge_p (misnamed as it isn't a pointer) and take
>>> advantage of EDGE_EXECUTABLE. Again see DOM/SCCVN (might want to
>>> defer this as a followup improvement).
>>
>>
>> I have added a TODO to this effect and will comeback to it.
>>>
>>>
>>> Note that the advantage of a DOM-based VRP is that backtracking is easy
>>> to implement (you don't do that yet). That is, after DEF got a (better)
>>> value-range you can simply re-visit all the defs of its uses (and
>>> recursively).
>>> I think you have to be careful with stmts that might prematurely leave a
>>> BB
>>> though (like via EH). So sth for a followup as well.
>>
>>
>> Is this OK now. Bootstrapped and regression tested on x86_64-linux with no
>> new regressions.
>
> Better, still not perfect.
>
> I'm also arguing on the pass placement now - you put it where it may make sense
> for IPA VRP (but then IPA VRP could simply do this in its analysis phase) but
> not so much where it makes sense during the early pipeline. I think it makes
> sense after FRE.
>
> Looking at how you finalize I see several issues with simply re-using
> vrp_finalize.
>
> First of all the loop doing the set_range_info will turn up with
> nothing as you've
> popped off all value-ranges from the lattice. Same for substitute-and-fold (or
> jump-threading).
I am not sure understanding you. I am poping the value ranges for scope
when we leave them. Other value ranges which lives in all the regions
will remain.
>
> What you instead need to do with the DOM scheme is at the point you
> call vrp_visit_stmt do sth like
>
> vrp_visit_stmt (....);
> if (fold_stmt (&gsi, follow_single_use_edges))
> update_stmt ();
I have added this. I think this will be good as we would be able to
optimize with value ranges that is valid within the scope.
> tree def = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);
> if (def
> && INTEGRAL_TYPE_P (TREE_TYPE (def))
> && (TREE_CODE (vr_value[i]->min) == INTEGER_CST)
> && (TREE_CODE (vr_value[i]->max) == INTEGER_CST)
> && (vr_value[i]->type == VR_RANGE
> || vr_value[i]->type == VR_ANTI_RANGE))
> set_range_info (name, vr_value[i]->type, vr_value[i]->min,
> vr_value[i]->max);
>
I am not sure if this is needed. I dont know if my push/pop value range
implementation is not what you wanted. If you still prefer, I am happy
to add this.
> thus please split vrp_finalize into two parts, one of it with the memory
> freeing which is the one you'd call.
>
> Note that EVRP is not enabled by default at any optimization level
> it seems so bootstrap / test of it would be useless (did you only
> test with the full series? I never looked at the IPA VRP part)
>
I have:
+ftree-evrp
+Common Report Var(flag_tree_early_vrp) Init(1) Optimization
+Perform Early Value Range Propagation on trees.
I would like to run this only for -O2 and above but for now I am using
this to test.
I have tested the last set of patch separately.
I will do more testing on this patch based on your feedback. Does this
look better?
Thanks,
Kugan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-Add-early-vrp.patch
Type: text/x-patch
Size: 27024 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20160726/ff8104b9/attachment.bin>
More information about the Gcc-patches
mailing list