[Bug tree-optimization/79460] gcc fails to optimise out a trivial additive loop for seemingly arbitrary numbers of iterations
rguenther at suse dot de
gcc-bugzilla@gcc.gnu.org
Tue Feb 14 09:30:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79460
--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 13 Feb 2017, amker at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79460
>
> --- Comment #5 from amker at gcc dot gnu.org ---
> (In reply to Jakub Jelinek from comment #4)
> > (In reply to Richard Biener from comment #3)
> > > In this case it is complete unrolling that can estimate the non-vector code
> > > to constant fold but not the vectorized code. OTOH it's quite excessive
> > > work done by the unroller when doing this for large N...
> > >
> > > And yes, SCEV final value replacement doesn't know how to handle float
> > > reductions
> > > (we have a different PR for that).
> >
> > Doesn't handle float reductions nor vector (integer or float) reductions.
> > Even the vector ones would be useful, if e.g. to a vector every iteration
> > adds a VECTOR_CST or similar, then it could be still nicely optimized.
> The integer version should already be supported now.
>
> >
> > For the 202 case, it seems we are generating a scalar loop epilogue (not
> > needed for 200) and somehow it seems something in the vector is actually
> > able to figure out the floating point final value, because we get:
> > # p_2 = PHI <2.01e+2(5), p_12(7)>
> > # i_3 = PHI <200(5), i_13(7)>
> > on the scalar loop epilogue. So if something in the vectorizer is able to
> > figure it out, why can't it just use that even in the case where no epilogue
> > loop is needed?
> IIUC, scev-ccp should be made a query-based interface so that it can be called
> for each loop-closed PHI at different compilation stages. It also needs to be
> extended to cover basic floating-point cases like this. Effectively, it needs to
> do the same transformation as the vectorizer does now; I just thought it might be
> a better place to do that.
Yeah, the vectorizer does this in vect_update_ivs_after_vectorizer
by accident I think - it sees the float "IV" and replaces the epilogue
loop init by init + niter * step, which is on the border of invalid
(without -ffp-contract=on/fast). At least if the vectorizer can do this
then final value replacement can do so as well with
Index: gcc/tree-scalar-evolution.c
===================================================================
--- gcc/tree-scalar-evolution.c (revision 245417)
+++ gcc/tree-scalar-evolution.c (working copy)
@@ -3718,13 +3718,6 @@ final_value_replacement_loop (struct loo
           continue;
         }

-      if (!POINTER_TYPE_P (TREE_TYPE (def))
-          && !INTEGRAL_TYPE_P (TREE_TYPE (def)))
-        {
-          gsi_next (&psi);
-          continue;
-        }
-
       bool folded_casts;
       def = analyze_scalar_evolution_in_loop (ex_loop, loop, def,
                                               &folded_casts);
(rather than removing the condition, replace it with a validity check,
like FP contraction, etc.).
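
To make the concern concrete, here is a minimal C sketch (not from the
thread; the function names are made up for illustration) that compares
the value a float accumulator reaches by repeated addition with the
closed form init + niter * step. It assumes IEEE 754 single precision
with round-to-nearest and no -ffast-math.

```c
#include <assert.h>
#include <stdio.h>

/* Value of p after "for (i = 0; i < niter; i++) p += step;", computed
   the slow way.  The volatile accumulator forces every partial sum to
   be rounded to float, as the source-level loop would do.  */
float
final_value_by_loop (float init, float step, int niter)
{
  assert (niter >= 0);
  volatile float p = init;
  for (int i = 0; i < niter; i++)
    p = p + step;
  return p;
}

/* Closed form that a final value replacement could substitute:
   one multiplication and one addition, each rounded to float.  */
float
final_value_closed_form (float init, float step, int niter)
{
  assert (niter >= 0);
  volatile float scaled = (float) niter * step;
  volatile float result = init + scaled;
  return result;
}

/* Hypothetical helper: print both results for one set of inputs.  */
void
compare (float init, float step, int niter)
{
  printf ("init=%.9g step=%.9g niter=%d: loop=%.9g closed=%.9g\n",
          init, step, niter,
          final_value_by_loop (init, step, niter),
          final_value_closed_form (init, step, niter));
}
```

Calling compare (1.0f, 1.0f, 200) gives 201 from both functions, which
is consistent with the 2.01e+2 in the PHI quoted above. Calling
compare (16777216.0f, 1.0f, 4) does not: the loop stays at 16777216,
because 2^24 + 1 rounds back to 2^24 on every iteration, while the
closed form yields 16777220. An unguarded replacement can therefore
change results, which is why some validity check seems needed.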
But ideally SCEV itself would contain those checks (or compute exact results
with rounding effects).
Like maybe simply
Index: gcc/tree-scalar-evolution.c
===================================================================
--- gcc/tree-scalar-evolution.c (revision 245417)
+++ gcc/tree-scalar-evolution.c (working copy)
@@ -3718,8 +3718,10 @@ final_value_replacement_loop (struct loo
           continue;
         }

-      if (!POINTER_TYPE_P (TREE_TYPE (def))
-          && !INTEGRAL_TYPE_P (TREE_TYPE (def)))
+      if (! (POINTER_TYPE_P (TREE_TYPE (def))
+             || INTEGRAL_TYPE_P (TREE_TYPE (def))
+             || (FLOAT_TYPE_P (TREE_TYPE (def))
+                 && flag_fp_contract_mode == FP_CONTRACT_FAST)))
         {
           gsi_next (&psi);
           continue;
Richard.