This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC][IPA-VRP] Early VRP Implementation


On Fri, Jul 22, 2016 at 2:10 PM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
> Thanks for the review.
>
> On 18/07/16 21:51, Richard Biener wrote:
>>
>> On Fri, Jul 15, 2016 at 9:33 AM, kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> Hi Andrew,
>>>
>>> On 15/07/16 17:28, Andrew Pinski wrote:
>>>>
>>>>
>>>> On Fri, Jul 15, 2016 at 12:08 AM, kugan
>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>>
>>>>> Hi Andrew,
>>>>>
>>>>>> Why separate out early VRP from tree-vrp?  Just a little curious.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> It is based on the discussion in
>>>>> https://gcc.gnu.org/ml/gcc/2016-01/msg00069.html.
>>>>> In summary, conclusion (based on my understanding) was to implement a
>>>>> simplified VRP algorithm that doesn't require ASSERT_EXPR insertion.
>>>>
>>>>
>>>>
>>>> But I don't see why you are moving it from tree-vrp.c .  That was my
>>>> question, you pointing to that discussion does not say to split it
>>>> into a new file and expose these interfaces.
>>>>
>>>
>>> Are you saying that I should keep this part of tree-vrp.c. I am happy to
>>> do
>>> that if this is considered the best approach.
>>
>>
>> Yes, I think that's the best approach.
>>
> Thanks. Moved it into tree-vrp.c now.
>
>> Can you, as a refactoring before your patch, please change VRP to use
>> an alloc-pool
>> for allocating value_range?  The DOM-based VRP will add a lot of
>> malloc/free churn
>> otherwise.
>>
>> Generally watch coding-style such as  missed function comments.
>>
>> As you do a non-iterating VRP and thus do not visit back-edges you don't
>> need
>> to initialize loops or SCEV nor do you need loop-closed SSA.
>>
>> As you do a DOM-based VRP using SSA propagator stuff like ssa_prop_result
>> doesn't make any sense.
>
>
> I have removed it.
>>
>>
>> +edge evrp_dom_walker::before_dom_children (basic_block bb)
>> +{
>> +  /* If we are going out of scope, restore the old VR.  */
>> +  while (!cond_stack.is_empty ()
>> +        && !dominated_by_p (CDI_DOMINATORS, bb, cond_stack.last
>> ().first))
>> +    {
>> +      tree var = cond_stack.last ().second.first;
>> +      value_range *vr = cond_stack.last ().second.second;
>> +      value_range *vr_to_del = get_value_range (var);
>> +      XDELETE (vr_to_del);
>> +      change_value_range (var, vr);
>> +      cond_stack.pop ();
>> +    }
>>
>> that should be in after_dom_children I think and use a marker instead
>> of a DOM query.
>> See other examples like DOM itself or SCCVN.
>>
>
> Done.

+/* Restore/Pop all the old VRs maintained in the cond_stack.  */
+
+void evrp_dom_walker::finalize_dom_walker ()
+{
+  while (!cond_stack.is_empty ())
+    {
+      tree var = cond_stack.last ().second;
+      pop_value_range (var);
+      cond_stack.pop ();
+    }

why is this necessary at all?  Looks like a waste of time (and the
stack should be empty here
anyway).  I think you leak cond_stack as well, I suppose using
auto_vec might work here,
you have to check.

>>
>> +         /* Discover VR when condition is true.  */
>> +         if (te == e
>> +             && !TREE_OVERFLOW_P (op0)
>> +             && !TREE_OVERFLOW_P (op1))
>
>
> This is mainly to handle gcc.dg/pr38934.c. This would result in ICE from
> set_value_range otherwise:
>
> gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min))
>                   && (!TREE_OVERFLOW_P (max) || is_overflow_infinity
> (max)));

Ok, instead make sure to remove the overflow flag via

  if (TREE_OVERFLOW_P (op0))
    op0 = drop_tree_overflow (op0);
...

it looks like you want to merge the if ( & EDGE_TRUE_VALUE) and FALSE_VALUE
cases and only invert the tree comparison for the false edge.

+             tree cond = build2 (code, boolean_type_node, op0, op1);
+             extract_range_for_var_from_comparison_expr (&vr, op0, cond);

no wasteful tree building here please.  Instead of cond pass in code,
op0 and op1
as separate arguments.

+         /* If we found any usable VR, set the VR to ssa_name and create a
+            PUSH old value in the cond_stack with the old VR.  */
+         if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
+           {
+             value_range *new_vr = vrp_value_range_pool.allocate ();
+             memset (new_vr, 0, sizeof (*new_vr));
+             *new_vr = vr;
+             cond_stack.safe_push (std::make_pair (bb, op0));

the memset looks redundant given you copy-assing from vr anyway.

You seem to miss the chance to register ranges for x_2 in i_1 < x_2 where
both i_1 and x_2 might have ranges that are useful.

push and pop_value_range should be methods in the dom walker class
and vrp_stack and cond_stack should be a single stack.  You can omit
basic_block if you use a "magic" NULL_TREE var as marker (simply
push a NULL_TREE, NULL pair at the end of before_dom_children).

>>
>> you can use e->flags & EDGE_TRUE_VALUE/EDGE_FALSE_VALUE
>>
>> why do you need those TREE_OVERFLOW checks?
>>
>> +             tree cond = build2 (code, boolean_type_node, op0, op1);
>> +             tree a = build2 (ASSERT_EXPR, TREE_TYPE (op0), op0, cond);
>> +             extract_range_from_assert (&vr, a);
>>
>> so I was hoping that the "refactoring" patch in the series would expose a
>> more
>> useful interface than extract_range_from_assert ... namely one that can
>> extract a range from the comparison directly and does not require building
>> a scratch ASSERT_EXPR.
>
>
> I have split extract_range_from_assert into
> extract_range_for_var_from_comparison_expr and using it now. Is that better?

See above.

>>
>>
>> +         /* If we found any usable VR, set the VR to ssa_name and create
>> a
>> +            restore point in the cond_stack with the  old VR. */
>> +         if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
>> +           {
>> +             value_range *new_vr = XCNEW (value_range);
>> +             *new_vr = vr;
>> +             cond_stack.safe_push (std::make_pair (bb,
>> +                                                   std::make_pair (op0,
>> +
>> old_vr)));
>> +             change_value_range (op0, new_vr);
>>
>> I don't like 'change_value_range' as interface, please integrate that into
>> a push/pop_value_range style interface instead.
>
>
> Tried using push/pop interface.
>>
>>
>> +       vrp_visit_stmt (stmt, &taken_edge_p, &output_p);
>> +    }
>> +
>> +  return NULL;
>>
>> you should return taken_edge_p (misnamed as it isn't a pointer) and take
>> advantage of EDGE_EXECUTABLE.  Again see DOM/SCCVN (might want to
>> defer this as a followup improvement).
>
>
> I have added a TODO to this effect and will comeback to it.
>>
>>
>> Note that the advantage of a DOM-based VRP is that backtracking is easy
>> to implement (you don't do that yet).  That is, after DEF got a (better)
>> value-range you can simply re-visit all the defs of its uses (and
>> recursively).
>> I think you have to be careful with stmts that might prematurely leave a
>> BB
>> though (like via EH).  So sth for a followup as well.
>
>
> Is this OK now. Bootstrapped and regression tested on x86_64-linux  with no
> new regressions.

Better, still not perfect.

I'm also arguing on the pass placement now - you put it where it may make sense
for IPA VRP (but then IPA VRP could simply do this in its analysis phase) but
not so much where it makes sense during the early pipeline.  I think it makes
sense after FRE.

Looking at how you finalize I see several issues with simply re-using
vrp_finalize.

First of all the loop doing the set_range_info will turn up with
nothing as you've
popped off all value-ranges from the lattice.  Same for substitute-and-fold (or
jump-threading).

What you instead need to do with the DOM scheme is at the point you
call vrp_visit_stmt do sth like

     vrp_visit_stmt (....);
     if (fold_stmt (&gsi, follow_single_use_edges))
       update_stmt ();
     tree def = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);
     if (def
         && INTEGRAL_TYPE_P (TREE_TYPE (def))
         && (TREE_CODE (vr_value[i]->min) == INTEGER_CST)
          && (TREE_CODE (vr_value[i]->max) == INTEGER_CST)
          && (vr_value[i]->type == VR_RANGE
              || vr_value[i]->type == VR_ANTI_RANGE))
        set_range_info (name, vr_value[i]->type, vr_value[i]->min,
                        vr_value[i]->max);

thus please split vrp_finalize into two parts, one of it with the memory
freeing which is the one you'd call.

Note that EVRP is not enabled by default at any optimization level
it seems so bootstrap / test of it would be useless (did you only
test with the full series?  I never looked at the IPA VRP part)

Thanks,
Richard.


> Thanks,
> Kugan
>
> gcc/ChangeLog:
>
> 2016-07-22  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * common.opt: New option -ftree-evrp.
>         * doc/invoke.texi: Document -ftree-evrp.
>         * passes.def: Define new pass_early_vrp.
>         * timevar.def: Define new TV_TREE_EARLY_VRP.
>         * tree-pass.h (make_pass_early_vrp): New.
>         * tree-vrp.c (extract_range_for_var_from_comparison_expr): New.
>         (extract_range_from_assert): Call
>         extract_range_for_var_from_comparison_expr.
>         (push_value_range): New.
>         (pop_value_range): Likewise.
>         (evrp_visit_phi_node_local): Likewise.
>         (edge evrp_dom_walker::before_dom_children): Likewise.
>         (void evrp_dom_walker::after_dom_children): Likewise.
>         (void evrp_dom_walker::finalize_dom_walker): Likewise.
>         (execute_early_vrp): Likewise.
>         (make_pass_early_vrp): Likewise.
>
>
> gcc/testsuite/ChangeLog:
>
> 2016-07-22  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * gcc.dg/tree-ssa/evrp1.c: New test.
>         * gcc.dg/tree-ssa/evrp2.c: New test.
>         * gcc.dg/tree-ssa/evrp3.c: New test.
>         * g++.dg/tree-ssa/pr31146-2.C: Run with -fno-tree-evrp as evrp also
>         does the same transformation.
>         * gcc.dg/tree-ssa/pr20318.c: Likewise.
>         * gcc.dg/tree-ssa/pr25382.c: Likewise.
>         * gcc.dg/tree-ssa/pr68431.c: Likewise.
>         * gcc.dg/tree-ssa/vrp19.c: Likewise.
>         * gcc.dg/tree-ssa/vrp23.c: Likewise.
>         * gcc.dg/tree-ssa/vrp24.c: Likewise.
>         * gcc.dg/tree-ssa/vrp58.c: Likewise.
>         * gcc.dg/tree-ssa/vrp67.c: Likewise.
>         * gcc.dg/tree-ssa/vrp98.c: Likewise.
>         * gcc.dg/tree-ssa/vrp19.c: Likewise.
>         * gcc.dg/tree-ssa/vrp23.c: Likewise.
>         * gcc.dg/tree-ssa/vrp24.c: Likewise.
>         * gcc.dg/tree-ssa/vrp58.c: Likewise.
>         * gcc.dg/tree-ssa/vrp67.c: Likewise.
>         * gcc.dg/tree-ssa/vrp98.c: Likewise.
>         * gcc.dg/tree-ssa/pr22117.c: Run with -fno-tree-evrp as predicate is
>         optimized away before vrp1.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]