[RFC][IPA-VRP] Early VRP Implementation
Richard Biener
richard.guenther@gmail.com
Mon Sep 19 13:30:00 GMT 2016
On Sun, Sep 18, 2016 at 10:50 PM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
>
>
> On 16/09/16 20:21, Richard Biener wrote:
>>
>> On Fri, Sep 16, 2016 at 7:59 AM, kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> Hi Richard,
>>>
>>> Thanks for the review.
>>>
>>> On 14/09/16 22:04, Richard Biener wrote:
>>>>
>>>>
>>>> On Tue, Aug 23, 2016 at 4:11 AM, Kugan Vivekanandarajah
>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 19 August 2016 at 21:41, Richard Biener <richard.guenther@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 16, 2016 at 9:45 AM, kugan
>>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi Richard,
>>>
>>>
>>>
>>>>>>> I have now added -ftree-evrp, which is enabled all the time. But
>>>>>>> this will only be used for disabling early-vrp. That is, early-vrp
>>>>>>> will be run when -ftree-vrp is enabled and -ftree-evrp is not
>>>>>>> explicitly disabled. Is this OK?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Why would one want to disable early-vrp? I see you do this in the
>>>>>> testsuite for non-early VRP unit-tests, but using
>>>>>> -fdisable-tree-evrp1 there would be ok as well.
>>>>>
>>>>>
>>>>>
>>>>> Removed it altogether. I thought that you wanted a way to disable
>>>>> early-vrp for testing purposes.
>>>>
>>>>
>>>>
>>>> But there is one, via the generic -fdisable-tree-DUMPFILE way.
>>>
>>>
>>>
>>> OK. I didn't know about that.
>>>
>>>
>>>>>> Note that you want to have a custom valueize function instead of just
>>>>>> follow_single_use_edges as you want to valueize all SSA names
>>>>>> according to their lattice value (if it has a single value). You can
>>>>>> use vrp_valueize for this, though that gets you non-single-use edge
>>>>>> following as well. Eventually it's going to be cleaner to do what the
>>>>>> SSA propagator does and before folding do
>>>>>>
>>>>>> did_replace = replace_uses_in (stmt, vrp_valueize);
>>>>>> if (fold_stmt (&gsi, follow_single_use_edges)
>>>>>> || did_replace)
>>>>>> update_stmt (gsi_stmt (gsi));
>>>>>>
>>>>>> exporting replace_uses_in for this is ok. I guess I prefer this for
>>>>>> now.
>>>>>
>>>>>
>>>>>
>>>>> I also added the above. I noticed that I need
>>>>> recompute_tree_invariant_for_addr_expr as in ssa_propagate. My initial
>>>>> implementation also had gimple_purge_all_dead_eh_edges and
>>>>> fixup_noreturn_call as in ssa_propagate, but I think that is not
>>>>> needed as it would be done at the end of the pass.
>>>>
>>>>
>>>>
>>>> I don't see this being done at the end of the pass. So please
>>>> reinstate those parts.
>>>
>>>
>>>
>>> I have copied these parts as well.
>>>
>>>>> With this I noticed more stmts are folded before vrp1. This required
>>>>> me to adjust some testcases.
>>>>>
>>>>>>
>>>>>> Overall this now looks good apart from the folding and the
>>>>>> VR_INITIALIZER thing.
>>>>>>
>>>>>> You can commit the already approved refactoring changes and combine
>>>>>> this patch with the struct value_range move; this way I can more
>>>>>> easily look into issues with the UNDEFINED thing if you can come up
>>>>>> with a testcase that doesn't work.
>>>>>>
>>>>>
>>>>> I have now committed all the dependent patches.
>>>>>
>>>>> Attached patch passes regression and bootstrap except pr33738.C. This
>>>>> is an unrelated issue as discussed in
>>>>> https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01386.html
>>>>>
>>>>> Is this OK?
>>>>
>>>>
>>>>
>>>> +/* Initialize local data structures for VRP. If DOM_P is true,
>>>> + we will be calling this from early_vrp where value range propagation
>>>> + is done by visiting stmts in dominator tree. ssa_propagate engine
>>>> + is not used in this case and that part of the ininitialization will
>>>> + be skipped. */
>>>> +
>>>> +static void
>>>> +vrp_initialize ()
>>>>
>>>> comment needs updating now.
>>>>
>>> Done.
>>>
>>>>
>>>> static void
>>>> -extract_range_from_phi_node (gphi *phi, value_range *vr_result)
>>>> +extract_range_from_phi_node (gphi *phi, value_range *vr_result,
>>>> + bool early_vrp_p)
>>>> {
>>>>
>>>>
>>>> I don't think you need these changes now that you have
>>>> stmt_visit_phi_node_in_dom_p guarding its call.
>>>
>>>
>>>
>>> OK, removed it. That also means I had to put scev_* in the early_vrp.
>>>
>>>
>>>
>>>> +static bool
>>>> +stmt_visit_phi_node_in_dom_p (gphi *phi)
>>>> +{
>>>> + ssa_op_iter iter;
>>>> + use_operand_p oprnd;
>>>> + tree op;
>>>> + value_range *vr;
>>>> + FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE)
>>>> + {
>>>> + op = USE_FROM_PTR (oprnd);
>>>> + if (TREE_CODE (op) == SSA_NAME)
>>>> + {
>>>> + vr = get_value_range (op);
>>>> + if (vr->type == VR_UNDEFINED)
>>>> + return false;
>>>> + }
>>>> + }
>>>>
>>>> I think this is overly conservative in never allowing UNDEFINED on PHI
>>>> node args (even if the def was actually visited). I think that the
>>>> easiest way to improve this bit would be to actually track visited blocks.
>>>> You already set the EDGE_EXECUTABLE flag on edges so you could
>>>> clear BB_VISITED on all blocks and set it in the before_dom_children
>>>> hook (at the end). Then the above can be folded into the PHI visiting:
>>>>
>>>> bool has_unvisited_preds = false;
>>>> FOR_EACH_EDGE (e, ei, bb->preds)
>>>> if (!(e->src->flags & BB_VISITED))
>>>> {
>>>> has_unvisited_preds = true;
>>>> break;
>>>> }
>>>>
>>> OK done.
>>>
>>> I also had to check for uninitialized variables that will have
>>> VR_UNDEFINED as range. We do not visit GIMPLE_NOP.
>>
>>
>> But VR_UNDEFINED of uninitialized variables is fine to use.
>
>
> Indeed. I was really trying to fix another problem with this.
>
> The real problem I am facing is:
>
> When we have a PHI stmt with one argument as symbolic VR_RANGE and another
> as VR_UNDEFINED, we will copy VR_RANGE to the PHI result.
>
> When we fold the uses of the PHI result with vrp_valueize, we will assign
> the symbol from VR_RANGE if that is of the form [a, a].
>
> However, in replace_uses_in, we don't check whether the SSA definition
> dominates the gimple that uses it. This sometimes results in an ICE.
Indeed -- ccp_lattice_meet avoids this by not merging UNDEF and SSA to SSA.
VRP uses op_with_constant_singleton_value_range for the valueization at
substitute_and_fold time which only substitutes constants.
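
A rough stand-alone sketch of that meet rule, outside of GCC. The Kind
and Value types and meet() below are made up for illustration; this is
not the actual ccp_lattice_meet code, only the one property discussed
here: UNDEFINED met with an SSA copy must not produce that copy.

```cpp
#include <string>

// Hypothetical toy lattice, not GCC's real data structures.
enum class Kind { Undefined, Constant, Copy, Varying };

struct Value {
  Kind kind;
  long constant;     // meaningful only when kind == Kind::Constant
  std::string name;  // meaningful only when kind == Kind::Copy (an SSA name)
};

inline Value undefined() { return Value{Kind::Undefined, 0, ""}; }
inline Value varying() { return Value{Kind::Varying, 0, ""}; }
inline Value constant(long c) { return Value{Kind::Constant, c, ""}; }
inline Value copy_of(const std::string &n) { return Value{Kind::Copy, 0, n}; }

inline bool same(const Value &a, const Value &b) {
  return a.kind == b.kind && a.constant == b.constant && a.name == b.name;
}

// Meet of two PHI arguments.
inline Value meet(const Value &a, const Value &b) {
  const bool a_undef = a.kind == Kind::Undefined;
  const bool b_undef = b.kind == Kind::Undefined;
  if (a_undef && b_undef)
    return a;
  if (a_undef || b_undef) {
    const Value &other = a_undef ? b : a;
    // UNDEFINED meet constant = constant is fine, but UNDEFINED meet SSA
    // name must not yield the SSA name: its definition may not dominate
    // the PHI result's uses.  Give up instead.
    if (other.kind == Kind::Copy)
      return varying();
    return other;
  }
  // Two defined values agree only if they are identical.
  return same(a, b) ? a : varying();
}
```

With this rule a PHI <a_1, undefined> never valueizes to a_1, so no use
of the PHI result can be rewritten to a name that is not defined on
every path.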
> For now, in the vrp_valueize, I have commented out the SSA_NAME part (as
> shown in the attached patch). With that I can bootstrap and regression test
> the patch.
>
> The fix is to either:
>
> 1. Remove SSA_NAME from vrp_valueize. Currently we use vrp_valueize with
> gimple_fold_stmt_to_constant_1, which accepts only is_gimple_min_invariant.
> Therefore maybe SSA_NAME is not needed?
Yeah, technically it is not needed.
> 2. Or, in replace_uses_in, check whether the operand's definition dominates
> the stmt before replacing.
We need to treat the valueization as a "value number" and separately
keep track of available DEFs we can substitute. See for example how
FRE/PRE handle this in eliminate_dom_walker. The substitute-and-fold
DOM walker would need to be adjusted similarly (and your copy of it
in DOM VRP).
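
To illustrate the idea only (this is not the eliminate_dom_walker code;
AvailScope, ValueNumbers and substitute() are hypothetical names), a
dominator walk can keep a scoped set of available definitions and
substitute a value number only while its leader is in scope:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical sketch: definitions become available when visited and
// stop being available when the walk leaves the defining block's
// dominator subtree.
class AvailScope {
 public:
  // Call when the walk enters a block (cf. before_dom_children).
  void enter_block() { marks_.push_back(order_.size()); }

  // Call when the walk leaves a block (cf. after_dom_children): every
  // definition recorded since the matching enter_block() goes away.
  void leave_block() {
    assert(!marks_.empty());
    const std::size_t mark = marks_.back();
    marks_.pop_back();
    while (order_.size() > mark) {
      avail_.erase(order_.back());
      order_.pop_back();
    }
  }

  // Record that NAME has been defined in the current block.
  void define(const std::string &name) {
    if (avail_.insert(name).second)
      order_.push_back(name);
  }

  bool available(const std::string &name) const {
    return avail_.count(name) != 0;
  }

 private:
  std::unordered_set<std::string> avail_;
  std::vector<std::string> order_;
  std::vector<std::size_t> marks_;
};

// Maps an SSA name to its value number (here: the name of its leader).
using ValueNumbers = std::unordered_map<std::string, std::string>;

// Replace USE by its leader only if the leader is available here;
// otherwise keep USE as it is.
inline std::string substitute(const std::string &use,
                              const ValueNumbers &vn,
                              const AvailScope &avail) {
  auto it = vn.find(use);
  if (it == vn.end())
    return use;
  return avail.available(it->second) ? it->second : use;
}
```

The value number stays valid everywhere; only the decision whether to
rewrite a use depends on where the walk currently is.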
Short-cutting this at vrp_valueize will remove some valid foldings to
constants but I think you can simply use op_with_constant_singleton_value_range
for the replace_uses_in call.
The patch is ok with that change.
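
As a simplified model of what constants-only valueization means (C++17;
the Range and Bound types and constant_singleton() are invented here
and only cover integer bounds, unlike the real
op_with_constant_singleton_value_range):

```cpp
#include <optional>
#include <string>
#include <variant>

// Hypothetical model: a bound is either an integer constant or a
// symbolic SSA name.
using Bound = std::variant<long, std::string>;

enum class RangeKind { Undefined, Range, AntiRange, Varying };

struct Range {
  RangeKind kind;
  Bound min;
  Bound max;
};

// Return the constant C if the range is exactly [C, C]; return nothing
// for symbolic singletons such as [a_1, a_1] and for everything else.
inline std::optional<long> constant_singleton(const Range &r) {
  if (r.kind != RangeKind::Range)
    return std::nullopt;
  const long *lo = std::get_if<long>(&r.min);
  const long *hi = std::get_if<long>(&r.max);
  if (lo && hi && *lo == *hi)
    return *lo;
  return std::nullopt;
}
```

A symbolic [a_1, a_1] range is simply left alone, so the dominance
question never comes up for the replace_uses_in call.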
Thanks,
Richard.
>>>
>>>> + /* Visit PHI stmts and discover any new VRs possible. */
>>>> + gimple_stmt_iterator gsi;
>>>> + for (gphi_iterator gpi = gsi_start_phis (bb);
>>>> + !gsi_end_p (gpi); gsi_next (&gpi))
>>>> + {
>>>> + gphi *phi = gpi.phi ();
>>>> + tree lhs = PHI_RESULT (phi);
>>>> + value_range vr_result = VR_INITIALIZER;
>>>> + if (! has_unvisited_preds
>>>> + && stmt_interesting_for_vrp (phi)
>>>> + && stmt_visit_phi_node_in_dom_p (phi))
>>
>>
>> You failed to remove this call to stmt_visit_phi_node_in_dom_p -- whether
>> we need to drop to varying is a property that is the same for all PHI
>> nodes in a block.
>>
> Done.
>
>>>> + extract_range_from_phi_node (phi, &vr_result, true);
>>>> + else
>>>> + set_value_range_to_varying (&vr_result);
>>>> + update_value_range (lhs, &vr_result);
>>>> + }
>>>>
>>>> due to a bug in IRA you need to make sure to un-set BB_VISITED after
>>>> early-vrp is finished again.
>>>
>>>
>>> OK. Done.
>>
>>
>> You set BB_VISITED in after_dom_children -- that is too late, please
>> set it at the end of before_dom_children. Otherwise it pessimizes
>> handling of the PHIs in the merge block of a diamond in case the PHI
>> args are defined in the immediate dominator.
>>
>> As said, you need to clear BB_VISITED at the start of evrp as well
>> (clearing at the end is just a workaround for an IRA bug).
>
> Done.
>
>>>>
>>>> + /* Try folding stmts with the VR discovered. */
>>>> + bool did_replace = replace_uses_in (stmt, evrp_valueize);
>>>> + if (fold_stmt (&gsi, follow_single_use_edges)
>>>> + || did_replace)
>>>> + update_stmt (gsi_stmt (gsi));
>>>>
>>>> you should be able to re-use vrp_valueize here.
>>>
>>>
>>> The issue is that vrp_valueize accepts ranges such as
>>> [VAR + CST, VAR + CST], which we cannot set.
>>
>>
>> Oh - that looks like something we need to fix anyway then. May I
>> suggest changing vrp_valueize to do
>>
>> && (TREE_CODE (vr->min) == SSA_NAME
>> || is_gimple_min_invariant (vr->min))
>>
>> which also allows [&a, &a] like constants.
>
>
> Please see the error above.
>
>
>>>>
>>>> + def_operand_p def_p = SINGLE_SSA_DEF_OPERAND (stmt,
>>>> SSA_OP_DEF);
>>>> + /* Set the SSA with the value range. */
>>>> + if (def_p
>>>> + && TREE_CODE (DEF_FROM_PTR (def_p)) == SSA_NAME
>>>> + && INTEGRAL_TYPE_P (TREE_TYPE (DEF_FROM_PTR (def_p))))
>>>> + {
>>>> + tree def = DEF_FROM_PTR (def_p);
>>>> + unsigned ver = SSA_NAME_VERSION (def);
>>>> + if ((vr_value[ver]->type == VR_RANGE
>>>>
>>>> Use get_value_range () please, not direct access to vr_value.
>>>>
>>> Done.
>>>
>>>> + || vr_value[ver]->type == VR_ANTI_RANGE)
>>>> + && (TREE_CODE (vr_value[ver]->min) == INTEGER_CST)
>>>> + && (TREE_CODE (vr_value[ver]->max) == INTEGER_CST))
>>>> + set_range_info (def, vr_value[ver]->type,
>>>> vr_value[ver]->min,
>>>> + vr_value[ver]->max);
>>>> + }
>>>>
>>>> Otherwise the patch looks good now (with a lot of improvement
>>>> possibilities of course).
>>>
>>>
>>> I will work on the improvement after this goes in.
>>>
>>> Bootstrapped and regression tested on x86_64-linux-gnu. Does this look
>>> OK?
>>
>>
>> Please remove no-op changes like
>
> Done.
>
>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
>> index 7efdd63..3a433d6 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr22117.c
>> @@ -3,7 +3,7 @@
>> known to be zero after entering the first two "if" statements. */
>>
>> /* { dg-do compile } */
>> -/* { dg-options "-O2 -fdump-tree-vrp1" } */
>> +/* { dg-options "-O2 -fdump-tree-vrp1" } */
>>
>> void link_error (void);
>>
>> @@ -21,4 +21,4 @@ foo (int *p, int q)
>> }
>> }
>>
>> -/* { dg-final { scan-tree-dump-times "Folding predicate r_.* != 0B to
>> 0" 1 "vrp1" } } */
>> +/* { dg-final { scan-tree-dump-times "link_error" 0 "vrp1" } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
>> index dcf9148..c4fda8b 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr25382.c
>> @@ -3,7 +3,7 @@
>> Check that VRP now gets ranges from BIT_AND_EXPRs. */
>>
>> /* { dg-do compile } */
>> -/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp1" } */
>> +/* { dg-options "-O2 -fno-tree-ccp -fdump-tree-vrp" } */
>>
>> int
>> foo (int a)
>>
>>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
>> index d3c9ed1..5b279a1 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c
>> @@ -27,6 +27,5 @@ func_18 ( int t )
>> }
>> }
>>
>> -/* There should be a single if left. */
>>
>> -/* { dg-final { scan-tree-dump-times "if" 1 "vrp1" } } */
>> +/* { dg-final { scan-tree-dump-times "if" 0 "vrp1" } } */
>>
>> I'm curious -- this is not a dg-run testcase but did you investigate this
>> isn't generating wrong code now? At least I can't see how
>> the if (1 & (t % rhs)) test could vanish.
>
> Indeed. This was a mistake and I have fixed it.
>
>
>> I hope we'll get GIMPLE unit testing finished for GCC 7 so we can add
>> separate unit-tests for VRP and EVRP.
>
>
> I will have a look at it.
>
> Thanks,
> Kugan
>
>
>> Thanks,
>> Richard.
>>
>>
>>> Thanks,
>>> Kugan
>>>
>>>
>>>
>>>>
>>>> Thanks and sorry for the delay,
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> Kugan
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>> I also noticed that the g++.dg/warn/pr33738.C testcase is now
>>>>>>> failing. This is because, with early-vrp setting the value range,
>>>>>>> ccp2 is optimizing without issuing a warning. I will look into it.
>>>>>>>
>>>>>>> bootstrap and regression testing is in progress.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Kugan