[RFC][IPA-VRP] Early VRP Implementation

Richard Biener richard.guenther@gmail.com
Mon Sep 19 13:30:00 GMT 2016


On Sun, Sep 18, 2016 at 10:50 PM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
>
>
> On 16/09/16 20:21, Richard Biener wrote:
>>
>> On Fri, Sep 16, 2016 at 7:59 AM, kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> Hi Richard,
>>>
>>> Thanks for the review.
>>>
>>> On 14/09/16 22:04, Richard Biener wrote:
>>>>
>>>>
>>>> On Tue, Aug 23, 2016 at 4:11 AM, Kugan Vivekanandarajah
>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 19 August 2016 at 21:41, Richard Biener <richard.guenther@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 16, 2016 at 9:45 AM, kugan
>>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Richard,
>>>
>>>
>>>
>>>>>>> I now have -ftree-evrp, which is enabled all the time. But this will
>>>>>>> only be used for disabling early-vrp. That is, early-vrp will be run
>>>>>>> when -ftree-vrp is enabled and -ftree-evrp is not explicitly disabled.
>>>>>>> Is this OK?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Why would one want to disable early-vrp?  I see you do this in the
>>>>>> testsuite
>>>>>> for non-early VRP unit-tests but using -fdisable-tree-evrp1 there
>>>>>> would be ok as well.
>>>>>
>>>>>
>>>>>
>>>>> Removed it altogether. I thought that you wanted a way to disable
>>>>> early-vrp for testing purposes.
>>>>
>>>>
>>>>
>>>> But there is via the generic -fdisable-tree-DUMPFILE way.
>>>
>>>
>>>
>>> OK. I didn't know about that.
>>>
>>>
>>>>>> Note that you want to have a custom valueize function instead of just
>>>>>> follow_single_use_edges as you want to valueize all SSA names
>>>>>> according
>>>>>> to their lattice value (if it has a single value).  You can use
>>>>>> vrp_valueize
>>>>>> for this though that gets you non-single-use edge following as well.
>>>>>> Eventually it's going to be cleaner to do what the SSA propagator does
>>>>>> and
>>>>>> before folding do
>>>>>>
>>>>>>    did_replace = replace_uses_in (stmt, vrp_valueize);
>>>>>>    if (fold_stmt (&gsi, follow_single_use_edges)
>>>>>>        || did_replace)
>>>>>>      update_stmt (gsi_stmt (gsi));
>>>>>>
>>>>>> exporting replace_uses_in for this is ok.  I guess I prefer this for
>>>>>> now.
>>>>>
>>>>>
>>>>>
>>>>> I also added the above.  I noticed that I need
>>>>> recompute_tree_invariant_for_addr_expr as in ssa_propagate. My initial
>>>>> implementation also had gimple_purge_all_dead_eh_edges and
>>>>> fixup_noreturn_call as in ssa_propagate, but I think that is not needed
>>>>> as it would be done at the end of the pass.
>>>>
>>>>
>>>>
>>>> I don't see this being done at the end of the pass.  So please
>>>> reinstate those parts.
>>>
>>>
>>>
>>> I have copied these parts as well.
>>>
>>>>> With this I noticed more stmts are folded before vrp1. This required
>>>>> me to adjust some testcases.
>>>>>
>>>>>>
>>>>>> Overall this now looks good apart from the folding and the
>>>>>> VR_INITIALIZER thing.
>>>>>>
>>>>>> You can commit the already approved refactoring changes and combine
>>>>>> this
>>>>>> patch with the struct value_range move, this way I can more easily
>>>>>> look
>>>>>> into
>>>>>> issues with the UNDEFINED thing if you can come up with a testcase
>>>>>> that
>>>>>> doesn't work.
>>>>>>
>>>>>
>>>>> I have now committed all the dependent patches.
>>>>>
>>>>> Attached patch passes regression and bootstrap except pr33738.C. This
>>>>> is an unrelated issue as discussed in
>>>>> https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01386.html
>>>>>
>>>>> Is this OK?
>>>>
>>>>
>>>>
>>>> +/* Initialize local data structures for VRP.  If DOM_P is true,
>>>> +   we will be calling this from early_vrp where value range propagation
>>>> +   is done by visiting stmts in dominator tree.  ssa_propagate engine
>>>> +   is not used in this case and that part of the initialization will
>>>> +   be skipped.  */
>>>> +
>>>> +static void
>>>> +vrp_initialize ()
>>>>
>>>> comment needs updating now.
>>>>
>>> Done.
>>>
>>>>
>>>>  static void
>>>> -extract_range_from_phi_node (gphi *phi, value_range *vr_result)
>>>> +extract_range_from_phi_node (gphi *phi, value_range *vr_result,
>>>> +                            bool early_vrp_p)
>>>>  {
>>>>
>>>>
>>>> I don't think you need these changes now that you have
>>>> stmt_visit_phi_node_in_dom_p
>>>> guarding its call.
>>>
>>>
>>>
>>> OK, removed it. That also means I had to put scev_* in the early_vrp.
>>>
>>>
>>>
>>>> +static bool
>>>> +stmt_visit_phi_node_in_dom_p (gphi *phi)
>>>> +{
>>>> +  ssa_op_iter iter;
>>>> +  use_operand_p oprnd;
>>>> +  tree op;
>>>> +  value_range *vr;
>>>> +  FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
>>>> +    {
>>>> +      op = USE_FROM_PTR (oprnd);
>>>> +      if (TREE_CODE (op) == SSA_NAME)
>>>> +       {
>>>> +         vr = get_value_range (op);
>>>> +         if (vr->type == VR_UNDEFINED)
>>>> +           return false;
>>>> +       }
>>>> +    }
>>>>
>>>> I think this is overly conservative in never allowing UNDEFINED on PHI
>>>> node args (even if the def was actually visited).  I think that the
>>>> easiest way to improve this bit would be to actually track visited blocks.
>>>> You already set the EDGE_EXECUTABLE flag on edges so you could
>>>> clear BB_VISITED on all blocks and set it in the before_dom_children
>>>> hook (at the end).  Then the above can be folded into the PHI visiting:
>>>>
>>>>     bool has_unvisited_preds = false;
>>>>     FOR_EACH_EDGE (e, ei, bb->preds)
>>>>        if (!(e->src->flags & BB_VISITED))
>>>>          {
>>>>             has_unvisited_preds = true;
>>>>             break;
>>>>          }
>>>>
>>> OK done.
>>>
>>> I also had to check for uninitialized variables that will have
>>> VR_UNDEFINED
>>> as range. We do not visit GIMPLE_NOP.
>>
>>
>> But VR_UNDEFINED of uninitialized variables is fine to use.
>
>
> Indeed. I was really trying to fix another problem with this.
>
> The real problem I am facing is:
>
> When we have a PHI stmt with one argument as symbolic VR_RANGE and another
> as VR_UNDEFINED, we will copy VR_RANGE to the PHI result.
>
> When we fold the uses of the PHI result with vrp_valueize, we will assign
> the symbol from VR_RANGE if that is of the form [a, a].
>
> However, in replace_uses_in, we don't check whether the SSA definition
> dominates the statement that uses it. This sometimes results in an ICE.

Indeed -- ccp_lattice_meet avoids this by not merging UNDEF and SSA to SSA.

VRP uses op_with_constant_singleton_value_range for the valueization at
substitute_and_fold time which only substitutes constants.
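
To illustrate the difference, here is a toy model of such a meet. It is only
a simplified sketch, not the actual ccp_lattice_meet code: the ToyValue type
and all helper names are made up for illustration, and SSA copies get their
own lattice kind here purely to keep the example short. UNDEFINED meet a
constant gives the constant, but UNDEFINED meet an SSA name copy drops to
VARYING, because the name's definition may not dominate the PHI:

```cpp
#include <cassert>
#include <string>

// Toy lattice value (hypothetical; GCC's real lattice type differs).
enum class Kind { Undefined, Constant, SsaCopy, Varying };

struct ToyValue {
  Kind kind;
  long constant;         // meaningful only for Kind::Constant
  std::string ssa_name;  // meaningful only for Kind::SsaCopy
};

ToyValue undefined () { return {Kind::Undefined, 0, ""}; }
ToyValue varying () { return {Kind::Varying, 0, ""}; }
ToyValue constant (long c) { return {Kind::Constant, c, ""}; }
ToyValue ssa_copy (const std::string &name)
{
  assert (!name.empty ());
  return {Kind::SsaCopy, 0, name};
}

// Meet of two PHI argument values.
ToyValue toy_meet (const ToyValue &a, const ToyValue &b)
{
  // UNDEFINED meet x = x, except when x is an SSA copy: the copied
  // name's definition might not dominate the PHI result's uses.
  if (a.kind == Kind::Undefined && b.kind != Kind::SsaCopy)
    return b;
  if (b.kind == Kind::Undefined && a.kind != Kind::SsaCopy)
    return a;
  // Anything meet VARYING = VARYING.
  if (a.kind == Kind::Varying || b.kind == Kind::Varying)
    return varying ();
  // Equal constants or equal copies stay as they are.
  if (a.kind == Kind::Constant && b.kind == Kind::Constant
      && a.constant == b.constant)
    return a;
  if (a.kind == Kind::SsaCopy && b.kind == Kind::SsaCopy
      && a.ssa_name == b.ssa_name)
    return a;
  // Everything else, including UNDEFINED meet SSA copy, is VARYING.
  return varying ();
}
```

In this model toy_meet (undefined (), ssa_copy ("a_1")) is VARYING, so a later
valueization never hands back a_1 at a place its definition does not reach.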

> For now, in the vrp_valueize, I have commented out the SSA_NAME part (as
> shown in the attached patch). With that I can bootstrap and regression test
> the patch.
>
> The fix is to either:
>
> 1. Remove SSA_NAME from vrp_valueize. Currently we use vrp_valueize with
> gimple_fold_stmt_to_constant_1 and accept only is_gimple_min_invariant
> results. Therefore maybe SSA_NAME is not needed?

Yeah, technically it is not needed.

> 2. Or, in replace_uses_in, check whether the operand's definition dominates
> the stmt before replacing.

We need to treat the valueization as a "value number" and separately
keep track of available DEFs we can substitute.  See for example how
FRE/PRE handle this in eliminate_dom_walker.  The substitute-and-fold
DOM walker would need to be adjusted similarly (and your copy of it
in DOM VRP).
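
A toy sketch of that idea follows. The types and names are hypothetical and
this is not the eliminate_dom_walker code: the lattice gives each name a
"value number", and a use is only rewritten when a definition of that value
number is currently available, i.e. was made in a dominating block already
entered by the DOM walk. For brevity the model processes all uses of a block
before its defs; a real walker would go statement by statement.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy CFG block for a dominator-tree walk (hypothetical layout).
struct ToyBlock {
  std::vector<std::string> uses;  // names used by stmts in this block
  std::vector<std::string> defs;  // names defined by stmts in this block
  std::vector<int> dom_children;  // children in the dominator tree
};

typedef std::map<std::string, std::string> ValueMap;  // name -> value number
typedef std::map<std::pair<int, std::string>, std::string> Result;

static void
walk (int bb, const std::vector<ToyBlock> &blocks, const ValueMap &value_of,
      std::set<std::string> &available, Result &result)
{
  assert (bb >= 0 && bb < (int) blocks.size ());
  const ToyBlock &b = blocks[bb];

  // "before_dom_children": rewrite uses only if the value is available.
  for (const std::string &use : b.uses)
    {
      ValueMap::const_iterator it = value_of.find (use);
      bool ok = it != value_of.end () && available.count (it->second) != 0;
      result[std::make_pair (bb, use)] = ok ? it->second : use;
    }

  // Make this block's defs available to dominated blocks.
  std::vector<std::string> pushed;
  for (const std::string &def : b.defs)
    if (available.insert (def).second)
      pushed.push_back (def);

  for (int child : b.dom_children)
    walk (child, blocks, value_of, available, result);

  // "after_dom_children": these defs are no longer available.
  for (const std::string &def : pushed)
    available.erase (def);
}

// Walk from block 0 (the entry) and return the chosen replacement per use.
Result
substitute_available (const std::vector<ToyBlock> &blocks,
                      const ValueMap &value_of)
{
  std::set<std::string> available;
  Result result;
  walk (0, blocks, value_of, available, result);
  return result;
}
```

For a diamond where a_1 is defined in one arm and x_2 in the merge block has
value number a_1, the use of x_2 is left alone, since a_1 is not available
there; a value number defined in the entry block would be substituted.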

Short-cutting this at vrp_valueize will remove some valid foldings to
constants but I think you can simply use op_with_constant_singleton_value_range
for the replace_uses_in call.
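
Roughly, that constant-only valueization behaves like the following toy
model. The ToyBound and ToyRange types are invented for illustration and only
mimic the idea, not the real value_range: a name is replaced only when its
range is a singleton [c, c] with c a constant, so a symbolic [a_1, a_1] is
never substituted.

```cpp
#include <cassert>
#include <string>

// Toy range bound: either an integer constant or a symbolic SSA name.
struct ToyBound {
  bool is_constant;
  long value;            // valid when is_constant
  std::string ssa_name;  // valid when !is_constant
};

struct ToyRange { ToyBound min, max; };

ToyBound cst (long v) { return {true, v, ""}; }
ToyBound sym (const std::string &name) { return {false, 0, name}; }

// Return true and store the constant in *OUT if VR is [c, c] for a
// constant c; symbolic singletons and wider ranges are rejected.
bool
constant_singleton (const ToyRange &vr, long *out)
{
  assert (out != nullptr);
  if (vr.min.is_constant && vr.max.is_constant
      && vr.min.value == vr.max.value)
    {
      *out = vr.min.value;
      return true;
    }
  return false;
}
```

With this, [7, 7] valueizes to 7, while [a_1, a_1] and [1, 5] leave the use
untouched.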

The patch is ok with that change.

Thanks,
Richard.

>>>
>>>> +  /* Visit PHI stmts and discover any new VRs possible.  */
>>>> +  gimple_stmt_iterator gsi;
>>>> +  for (gphi_iterator gpi = gsi_start_phis (bb);
>>>> +       !gsi_end_p (gpi); gsi_next (&gpi))
>>>> +    {
>>>> +      gphi *phi = gpi.phi ();
>>>> +      tree lhs = PHI_RESULT (phi);
>>>> +      value_range vr_result = VR_INITIALIZER;
>>>> +      if (! has_unvisited_preds
>>>> +         && stmt_interesting_for_vrp (phi)
>>>> +         && stmt_visit_phi_node_in_dom_p (phi))
>>
>>
>> failed to remove this call to stmt_visit_phi_node_in_dom_p -- whether we
>> need to
>> drop to varying is a property that is the same for all PHI nodes in a
>> block.
>>
> Done.
>
>>>> +       extract_range_from_phi_node (phi, &vr_result, true);
>>>> +      else
>>>> +       set_value_range_to_varying (&vr_result);
>>>> +      update_value_range (lhs, &vr_result);
>>>> +    }
>>>>
>>>> due to a bug in IRA you need to make sure to un-set BB_VISITED after
>>>> early-vrp is finished again.
>>>
>>>
>>> OK. Done.
>>
>>
>> You set BB_VISITED in after_dom_children -- that is too late, please
>> set it at the end
>> of before_dom_children.  Otherwise it pessimizes handling of the PHIs
>> in the merge
>> block of a diamond in case the PHI args are defined in the immediate
>> dominator.
>>
>> As said you need to clear BB_VISITED at the start of evrp as well
>> (clearing at the end
>> is just a workaround for an IRA bug).
>
> Done.
>
>>>>
>>>> +         /* Try folding stmts with the VR discovered.  */
>>>> +         bool did_replace = replace_uses_in (stmt, evrp_valueize);
>>>> +         if (fold_stmt (&gsi, follow_single_use_edges)
>>>> +             || did_replace)
>>>> +           update_stmt (gsi_stmt (gsi));
>>>>
>>>> you should be able to re-use vrp_valueize here.
>>>
>>>
>>> The issue is that vrp_valueize accepts ranges such as
>>> [VAR + CST, VAR + CST], which we cannot set.
>>
>>
>> Oh - that looks like something we need to fix anyway then.  May I suggest
>> changing vrp_valueize to do
>>
>>          && (TREE_CODE (vr->min) == SSA_NAME
>>                 || is_gimple_min_invariant (vr->min))
>>
>> which also allows [&a, &a] like constants.
>
>
> Please see the error above.
>
>
>>>>
>>>> +         def_operand_p def_p = SINGLE_SSA_DEF_OPERAND (stmt,
>>>> SSA_OP_DEF);
>>>> +         /* Set the SSA with the value range.  */
>>>> +         if (def_p
>>>> +             && TREE_CODE (DEF_FROM_PTR (def_p)) == SSA_NAME
>>>> +             && INTEGRAL_TYPE_P (TREE_TYPE (DEF_FROM_PTR (def_p))))
>>>> +           {
>>>> +             tree def = DEF_FROM_PTR (def_p);
>>>> +             unsigned ver = SSA_NAME_VERSION (def);
>>>> +             if ((vr_value[ver]->type == VR_RANGE
>>>>
>>>> Use get_value_range () please, not direct access to vr_value.
>>>>
>>> Done.
>>>
>>>> +                  || vr_value[ver]->type == VR_ANTI_RANGE)
>>>> +                 && (TREE_CODE (vr_value[ver]->min) == INTEGER_CST)
>>>> +                 && (TREE_CODE (vr_value[ver]->max) == INTEGER_CST))
>>>> +               set_range_info (def, vr_value[ver]->type,
>>>> vr_value[ver]->min,
>>>> +                               vr_value[ver]->max);
>>>> +           }
>>>>
>>>> Otherwise the patch looks good now (with a lot of improvement
>>>> possibilities of course).
>>>
>>>
>>> I will work on the improvement after this goes in.
>>>
>>> Bootstrapped and regression tested on x86_64-linux-gnu. Does this look
>>> OK?
>>
>>
>> Please remove no-op changes like
>
> Done.
>
>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
>> index 7efdd63..3a433d6 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
>> @@ -3,7 +3,7 @@
>>     known to be zero after entering the first two "if" statements.  */
>>
>>  /* { dg-do compile } */
>> -/* { dg-options "-O2 -fdump-tree-vrp1" } */
>> +/* { dg-options "-O2  -fdump-tree-vrp1" } */
>>
>>  void link_error (void);
>>
>> @@ -21,4 +21,4 @@ foo (int *p, int q)
>>      }
>>  }
>>
>> -/* { dg-final { scan-tree-dump-times "Folding predicate r_.* != 0B to
>> 0" 1 "vrp1" } } */
>> +/* { dg-final { scan-tree-dump-times "link_error" 0 "vrp1" } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
>> index dcf9148..c4fda8b 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
>> @@ -3,7 +3,7 @@
>>     Check that VRP now gets ranges from BIT_AND_EXPRs.  */
>>
>>  /* { dg-do compile } */
>> -/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
>> +/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp" } */
>>
>>  int
>>  foo (int a)
>>
>>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
>> index d3c9ed1..5b279a1 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
>> @@ -27,6 +27,5 @@ func_18 ( int t )
>>      }
>>  }
>>
>> -/* There should be a single if left.  */
>>
>> -/* { dg-final { scan-tree-dump-times "if" 1 "vrp1" } } */
>> +/* { dg-final { scan-tree-dump-times "if" 0 "vrp1" } } */
>>
>> I'm curious -- this is not a dg-run testcase but did you check that this
>> isn't generating wrong code now?  At least I can't see how
>> the if (1 & (t % rhs)) test could vanish.
>
> Indeed. This was a mistake and I have fixed it.
>
>
>> I hope we'll get GIMPLE unit testing finished for GCC 7 so we can add
>> separate
>> unit-tests for VRP and EVRP.
>
>
> I will have a look at it.
>
> Thanks,
> Kugan
>
>
>> Thanks,
>> Richard.
>>
>>
>>> Thanks,
>>> Kugan
>>>
>>>
>>>
>>>>
>>>> Thanks and sorry for the delay,
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> Kugan
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>> I also noticed that the g++.dg/warn/pr33738.C testcase is now
>>>>>>> failing. This is because, with early-vrp setting the value range,
>>>>>>> ccp2 is optimizing without issuing a warning. I will look into it.
>>>>>>>
>>>>>>> Bootstrap and regression testing are in progress.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Kugan


