[RFC][IPA-VRP] Early VRP Implementation

kugan kugan.vivekanandarajah@linaro.org
Tue Jul 26 12:27:00 GMT 2016



Hi Riachard,

Thanks for the review. Here is an updated patch with comments below.

> +/* Restore/Pop all the old VRs maintained in the cond_stack.  */
> +
> +void evrp_dom_walker::finalize_dom_walker ()
> +{
> +  while (!cond_stack.is_empty ())
> +    {
> +      tree var = cond_stack.last ().second;
> +      pop_value_range (var);
> +      cond_stack.pop ();
> +    }
>
> why is this necessary at all?  Looks like a waste of time (and the
> stack should be empty here
> anyway).  I think you leak cond_stack as well, I suppose using
> auto_vec might work here,
> you have to check.
Done.

>
>>>
>>> +         /* Discover VR when condition is true.  */
>>> +         if (te == e
>>> +             && !TREE_OVERFLOW_P (op0)
>>> +             && !TREE_OVERFLOW_P (op1))
>>
>>
>> This is mainly to handle gcc.dg/pr38934.c. This would result in ICE from
>> set_value_range otherwise:
>>
>> gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min))
>>                   && (!TREE_OVERFLOW_P (max) || is_overflow_infinity
>> (max)));
>
> Ok, instead make sure to remove the overflow flag via
>
>   if (TREE_OVERFLOW_P (op0))
>     op0 = drop_tree_overflow (op0);
  ...
Done.

>
> it looks like you want to merge the if ( & EDGE_TRUE_VALUE) and FALSE_VALUE
> cases and only invert the tree comparison for the false edge.
Done.

>
> +             tree cond = build2 (code, boolean_type_node, op0, op1);
> +             extract_range_for_var_from_comparison_expr (&vr, op0, cond);
>
> no wasteful tree building here please.  Instead of cond pass in code,
> op0 and op1
> as separate arguments.
Done.

>
> +         /* If we found any usable VR, set the VR to ssa_name and create a
> +            PUSH old value in the cond_stack with the old VR.  */
> +         if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
> +           {
> +             value_range *new_vr = vrp_value_range_pool.allocate ();
> +             memset (new_vr, 0, sizeof (*new_vr));
> +             *new_vr = vr;
> +             cond_stack.safe_push (std::make_pair (bb, op0));
>
> the memset looks redundant given you copy-assing from vr anyway.
Done.

>
> You seem to miss the chance to register ranges for x_2 in i_1 < x_2 where
> both i_1 and x_2 might have ranges that are useful.
I will address this once the basic structure is in place if that is OK 
with you.

>
> push and pop_value_range should be methods in the dom walker class
> and vrp_stack and cond_stack should be a single stack.  You can omit
> basic_block if you use a "magic" NULL_TREE var as marker (simply
> push a NULL_TREE, NULL pair at the end of before_dom_children).
>
Done.

>>>
>>> you can use e->flags & EDGE_TRUE_VALUE/EDGE_FALSE_VALUE
>>>
>>> why do you need those TREE_OVERFLOW checks?
>>>
>>> +             tree cond = build2 (code, boolean_type_node, op0, op1);
>>> +             tree a = build2 (ASSERT_EXPR, TREE_TYPE (op0), op0, cond);
>>> +             extract_range_from_assert (&vr, a);
>>>
>>> so I was hoping that the "refactoring" patch in the series would expose a
>>> more
>>> useful interface than extract_range_from_assert ... namely one that can
>>> extract a range from the comparison directly and does not require building
>>> a scratch ASSERT_EXPR.
>>
>>
>> I have split extract_range_from_assert into
>> extract_range_for_var_from_comparison_expr and using it now. Is that better?
>
> See above.
>
>>>
>>>
>>> +         /* If we found any usable VR, set the VR to ssa_name and create
>>> a
>>> +            restore point in the cond_stack with the  old VR. */
>>> +         if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
>>> +           {
>>> +             value_range *new_vr = XCNEW (value_range);
>>> +             *new_vr = vr;
>>> +             cond_stack.safe_push (std::make_pair (bb,
>>> +                                                   std::make_pair (op0,
>>> +
>>> old_vr)));
>>> +             change_value_range (op0, new_vr);
>>>
>>> I don't like 'change_value_range' as interface, please integrate that into
>>> a push/pop_value_range style interface instead.
>>
>>
>> Tried using push/pop interface.
>>>
>>>
>>> +       vrp_visit_stmt (stmt, &taken_edge_p, &output_p);
>>> +    }
>>> +
>>> +  return NULL;
>>>
>>> you should return taken_edge_p (misnamed as it isn't a pointer) and take
>>> advantage of EDGE_EXECUTABLE.  Again see DOM/SCCVN (might want to
>>> defer this as a followup improvement).
>>
>>
>> I have added a TODO to this effect and will comeback to it.
>>>
>>>
>>> Note that the advantage of a DOM-based VRP is that backtracking is easy
>>> to implement (you don't do that yet).  That is, after DEF got a (better)
>>> value-range you can simply re-visit all the defs of its uses (and
>>> recursively).
>>> I think you have to be careful with stmts that might prematurely leave a
>>> BB
>>> though (like via EH).  So sth for a followup as well.
>>
>>
>> Is this OK now. Bootstrapped and regression tested on x86_64-linux  with no
>> new regressions.
>
> Better, still not perfect.
>
> I'm also arguing on the pass placement now - you put it where it may make sense
> for IPA VRP (but then IPA VRP could simply do this in its analysis phase) but
> not so much where it makes sense during the early pipeline.  I think it makes
> sense after FRE.
>
> Looking at how you finalize I see several issues with simply re-using
> vrp_finalize.
>
> First of all the loop doing the set_range_info will turn up with
> nothing as you've
> popped off all value-ranges from the lattice.  Same for substitute-and-fold (or
> jump-threading).
I am not sure understanding you. I am poping the value ranges for scope 
when we leave them. Other value ranges which lives in all the regions 
will remain.

>
> What you instead need to do with the DOM scheme is at the point you
> call vrp_visit_stmt do sth like
>
>      vrp_visit_stmt (....);
>      if (fold_stmt (&gsi, follow_single_use_edges))
>        update_stmt ();
I have added this. I think this will be good as we would be able to 
optimize with value ranges that is valid within the scope.

>      tree def = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);
>      if (def
>          && INTEGRAL_TYPE_P (TREE_TYPE (def))
>          && (TREE_CODE (vr_value[i]->min) == INTEGER_CST)
>           && (TREE_CODE (vr_value[i]->max) == INTEGER_CST)
>           && (vr_value[i]->type == VR_RANGE
>               || vr_value[i]->type == VR_ANTI_RANGE))
>         set_range_info (name, vr_value[i]->type, vr_value[i]->min,
>                         vr_value[i]->max);
>
I am not sure if this is needed. I dont know if my push/pop value range 
implementation is not what you wanted. If you still prefer, I am happy 
to add this.

> thus please split vrp_finalize into two parts, one of it with the memory
> freeing which is the one you'd call.
>
> Note that EVRP is not enabled by default at any optimization level
> it seems so bootstrap / test of it would be useless (did you only
> test with the full series?  I never looked at the IPA VRP part)
>
I have:

+ftree-evrp
+Common Report Var(flag_tree_early_vrp) Init(1) Optimization
+Perform Early Value Range Propagation on trees.

I would like to run this only for -O2 and above but for now I am using 
this to test.

I  have tested the last set of patch separately.

I will do more testing on this patch based on your feedback. Does this 
look better?

Thanks,
Kugan


-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-Add-early-vrp.patch
Type: text/x-patch
Size: 27024 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20160726/ff8104b9/attachment.bin>


More information about the Gcc-patches mailing list