[PATCH][RFC] tree-optimization/100801 - perform final value replacement from VRP
Richard Biener
rguenther@suse.de
Tue Jun 1 10:01:13 GMT 2021
On Mon, 31 May 2021, Andrew MacLeod wrote:
> On 5/28/21 11:25 AM, Richard Biener wrote:
> > This makes sure to perform final value replacement of constants
> > when we also are sure to propagate those, like in VRP. This avoids
> > spurious diagnostics when doing so only from within SCCP which can
> > leave unreachable loops in the IL triggering bogus diagnostics.
> >
> > The choice is VRP because it already defers to SCEV for PHI nodes
> > so the following makes it analyze final values for loop exit PHIs.
> > To make that work it arranges for loop-closed SSA to be built only
> > after inserting ASSERT_EXPRs.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > It does
> >
> > FAIL: gcc.dg/ubsan/pr94423.c -O2 (internal compiler error)
> >
> > where VRP somehow propagates abnormals in a bogus way.
OK, so I analyzed this some more and it results from the hunk moving
loop-closed SSA construction after ASSERT_EXPR insertion in
execute_vrp. The motivation for this is that we end up splitting
the loop exit edge when inserting the ASSERT_EXPR, creating
non-loop-closed SSA and thus fail to pick up the final value.
Now with swapping insert and loop-closed SSA build we get LC
SSA PHIs on an abnormal loop exit in the above testcase which
messes up assert expr removal which does
/* Propagate the RHS into every use of the LHS. For SSA names
also propagate abnormals as it merely restores the original
IL in this case (an replace_uses_by would assert). */
in remove_range_assertions, explicitely ignoring constraints around
abnormals. But since LC SSA PHIs remain we fail IL verification.
Note the LC SSA PHI is only required because we insert a SSA def
via the ASSERT_EXPR in the loop body. We can fix up the IL detail
by marking the ASSERT_EXPR source appropriately via
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 450926d5f9b..705e2489eb1 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -3809,6 +3809,8 @@ vrp_asserts::remove_range_assertions ()
FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs)
FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
SET_USE (use_p, var);
+ if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
+ SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var) = 1;
}
else
replace_uses_by (lhs, var);
but we also never get rid of a SSA_NAME_OCCURS_IN_ABNORMAL_PHI marking.
One option would be to keep the order as-is but fixup assert expr
insertion to update/honor loop-closed SSA. But then - how far are you
with removing all this ASSERT_EXPR stuff?
Thanks,
Richard.
> > It also
> > needs adjustment for a couple of fails like
> >
> > FAIL: gcc.dg/vect/no-scevccp-outer-11.c scan-tree-dump-times vect "OUTER
> > LOOP VECTORIZED." 1
> >
> > where the testcases do -fno-tree-scev-cprop but that no longer has
> > the desired effect then (might just guard the final value replacement
> > in VRP with this).
> >
> > Any comments? I probably can plug final_value_at_exit elsewhere
> > than VRP (to avoid the issues with asserts) but it somewhat felt
> > naturally because that already uses SCEV.
>
> l think its OK. I'll keep track this change as we may need to incorporate the
> changes into fold_using_range::range_of_phi() when we move towards being on
> par with VRP, but the testsuite will show a difference if we miss it then
> anyway.
>
>
> > Thanks,
> > Richard.
> >
> > 2021-05-28 Richard Biener <rguenther@suse.de>
> >
> > PR tree-optimization/100801
> > * tree-scalar-evolution.h (final_value_at_exit): Declare.
> > * tree-scalar-evolution.c (final_value_at_exit): New API,
> > split out from ...
> > (final_value_replacement_loop): ... here.
> > * tree-vrp.c (execute_vrp): Build loop-closed SSA only
> > after inserting assert expressions.
> > (vrp_asserts::process_assert_insertions_for): Avoid inserting
> > asserts for RESULT_DECLs, when building loop-closed SSA
> > after assert insertion we're not able to propagate out all
> > PHIs and thus trigger IL validation.
> > * vr-values.c (vr_values::extract_range_from_phi_node):
> > For loop exit PHIs also perform final value analysis using
> > SCEV.
> >
> > * gcc.target/i386/pr100801.c: New testcase.
> > ---
> > gcc/testsuite/gcc.target/i386/pr100801.c | 29 ++++++++++++++++++++++++
> > gcc/tree-scalar-evolution.c | 28 +++++++++++++++--------
> > gcc/tree-scalar-evolution.h | 1 +
> > gcc/tree-vrp.c | 12 ++++++++--
> > gcc/vr-values.c | 23 +++++++++++++++----
> > 5 files changed, 77 insertions(+), 16 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.target/i386/pr100801.c
> >
> > diff --git a/gcc/testsuite/gcc.target/i386/pr100801.c
> > b/gcc/testsuite/gcc.target/i386/pr100801.c
> > new file mode 100644
> > index 00000000000..c58f9be6898
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr100801.c
> > @@ -0,0 +1,29 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mavx" } */
> > +
> > +#include <x86intrin.h>
> > +
> > +void copy_32_unaligned(unsigned int* dest, const unsigned int* src, size_t
> > count)
> > +{
> > + // invariant/nop
> > + __m128i shufmask = _mm_set_epi8(15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5,
> > 4, 3, 2, 1, 0);
> > +
> > + size_t i;
> > + for (i = 0; i + 4 <= count; i += 4) {
> > + __m128i input = _mm_loadu_si128((const __m128i*)(&src[i]));
> > + __m128i output = input;
> > + // no warning without the shuffle:
> > + output = _mm_shuffle_epi8(input, shufmask);
> > + _mm_storeu_si128((__m128i*)(&dest[i]), output);
> > + }
> > + for (; i < count; ++i) { // handle residual elements
> > + dest[i] = src[i]; /* { dg-bogus "invokes undefined" } */
> > + }
> > +}
> > +
> > +void test(unsigned int* buf1, unsigned int* buf2)
> > +{
> > + copy_32_unaligned(buf2, buf1,
> > + // multiples of 4 and greater or equal then 12 trigger it:
> > + 12);
> > +}
> > diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
> > index b22d49a0ab6..6917b8c9dab 100644
> > --- a/gcc/tree-scalar-evolution.c
> > +++ b/gcc/tree-scalar-evolution.c
> > @@ -3491,6 +3491,23 @@ expression_expensive_p (tree expr)
> > || expanded_size > cache.elements ());
> > }
> >
> > +/* Compute the final value of DEF, a DEF inside the loop E exits from. */
> > +
> > +tree
> > +final_value_at_exit (tree def, edge e, bool *folded_casts)
> > +{
> > + class loop *loop = e->src->loop_father;
> > + gcc_assert (loop_exit_edge_p (loop, e));
> > + class loop *ex_loop
> > + = superloop_at_depth (loop, loop_depth (e->dest->loop_father) + 1);
> > + tree ev = analyze_scalar_evolution_in_loop (ex_loop, loop, def,
> > folded_casts);
> > + ev = compute_overall_effect_of_inner_loop (ex_loop, ev);
> > + if (!tree_does_not_contain_chrecs (def)
> > + || chrec_contains_symbols_defined_in_loop (ev, ex_loop->num))
> > + ev = chrec_dont_know;
> > + return ev;
> > +}
> > +
> > /* Do final value replacement for LOOP, return true if we did anything.
> > */
> >
> > bool
> > @@ -3513,10 +3530,6 @@ final_value_replacement_loop (class loop *loop)
> > /* Set stmt insertion pointer. All stmts are inserted before this
> > point. */
> > gimple_stmt_iterator gsi = gsi_after_labels (exit->dest);
> > - class loop *ex_loop
> > - = superloop_at_depth (loop,
> > - loop_depth (exit->dest->loop_father) + 1);
> > -
> > bool any = false;
> > gphi_iterator psi;
> > for (psi = gsi_start_phis (exit->dest); !gsi_end_p (psi); )
> > @@ -3538,11 +3551,8 @@ final_value_replacement_loop (class loop *loop)
> > }
> >
> > bool folded_casts;
> > - def = analyze_scalar_evolution_in_loop (ex_loop, loop, def,
> > - &folded_casts);
> > - def = compute_overall_effect_of_inner_loop (ex_loop, def);
> > - if (!tree_does_not_contain_chrecs (def)
> > - || chrec_contains_symbols_defined_in_loop (def, ex_loop->num)
> > + def = final_value_at_exit (def, exit, &folded_casts);
> > + if (def == chrec_dont_know
> > /* Moving the computation from the loop may prolong life range
> > of some ssa names, which may cause problems if they appear
> > on abnormal edges. */
> > diff --git a/gcc/tree-scalar-evolution.h b/gcc/tree-scalar-evolution.h
> > index d679f7285b3..18186c32d3d 100644
> > --- a/gcc/tree-scalar-evolution.h
> > +++ b/gcc/tree-scalar-evolution.h
> > @@ -33,6 +33,7 @@ extern tree analyze_scalar_evolution (class loop *, tree);
> > extern tree instantiate_scev (edge, class loop *, tree);
> > extern tree resolve_mixers (class loop *, tree, bool *);
> > extern void gather_stats_on_scev_database (void);
> > +extern tree final_value_at_exit (tree, edge, bool *);
> > extern bool final_value_replacement_loop (class loop *);
> > extern unsigned int scev_const_prop (void);
> > extern bool expression_expensive_p (tree);
> > diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> > index 450926d5f9b..b3d937cb27f 100644
> > --- a/gcc/tree-vrp.c
> > +++ b/gcc/tree-vrp.c
> > @@ -3416,6 +3416,13 @@ vrp_asserts::process_assert_insertions_for (tree
> > name, assert_locus *loc)
> > if (loc->expr == loc->val)
> > return false;
> > + /* Do not insert assertions for RESULT_DECLs, those are asserted to be
> > + read-only in the IL and are always non-NULL. */
> > + if (SSA_NAME_VAR (name)
> > + && TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL
> > + && DECL_BY_REFERENCE (SSA_NAME_VAR (name)))
> > + return false;
> > +
> > cond = build2 (loc->comp_code, boolean_type_node, loc->expr, loc->val);
> > assert_stmt = build_assert_expr_for (cond, name);
> > if (loc->e)
> > @@ -4471,8 +4478,6 @@ static unsigned int
> > execute_vrp (struct function *fun, bool warn_array_bounds_p)
> > {
> > loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
> > - rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
> > - scev_initialize ();
> >
> > /* ??? This ends up using stale EDGE_DFS_BACK for liveness computation.
> > Inserting assertions may split edges which will invalidate
> > @@ -4480,6 +4485,9 @@ execute_vrp (struct function *fun, bool
> > warn_array_bounds_p)
> > vrp_asserts assert_engine (fun);
> > assert_engine.insert_range_assertions ();
> > + rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
> > + scev_initialize ();
> > +
> > /* For visiting PHI nodes we need EDGE_DFS_BACK computed. */
> > mark_dfs_back_edges ();
> > diff --git a/gcc/vr-values.c b/gcc/vr-values.c
> > index 3d0be8edb3b..1ce421e7bef 100644
> > --- a/gcc/vr-values.c
> > +++ b/gcc/vr-values.c
> > @@ -2882,7 +2882,7 @@ vr_values::extract_range_from_phi_node (gphi *phi,
> > goto infinite_check;
> > }
> > - goto update_range;
> > + goto scev_check;
> >
> > varying:
> > vr_result->set_varying (TREE_TYPE (lhs));
> > @@ -2892,10 +2892,23 @@ scev_check:
> > scev_check can be reached from two paths, one is a fall through from
> > above
> > "varying" label, the other is direct goto from code block which tries
> > to
> > avoid infinite simulation. */
> > - if (scev_initialized_p ()
> > - && (l = loop_containing_stmt (phi))
> > - && l->header == gimple_bb (phi))
> > - adjust_range_with_scev (vr_result, l, phi, lhs);
> > + if (scev_initialized_p ())
> > + {
> > + edge e;
> > + if ((l = loop_containing_stmt (phi))
> > + && l->header == gimple_bb (phi))
> > + adjust_range_with_scev (vr_result, l, phi, lhs);
> > + else if (single_pred_p (gimple_bb (phi))
> > + && ((e = single_pred_edge (gimple_bb (phi))), true)
> > + && loop_exit_edge_p (e->src->loop_father, e))
> > + {
> > + bool folded_casts;
> > + tree ev = final_value_at_exit (PHI_ARG_DEF_FROM_EDGE (phi, e), e,
> > + &folded_casts);
> > + if (is_gimple_min_invariant (ev) || TREE_CODE (ev) == SSA_NAME)
> > + vr_result->update (ev, ev, VR_RANGE);
> > + }
> > + }
> >
> > infinite_check:
> > /* If we will end up with a (-INF, +INF) range, set it to
>
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
More information about the Gcc-patches
mailing list