This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch] Predictive commoning
- From: Dorit Nuzman <DORIT at il dot ibm dot com>
- To: Zdenek Dvorak <rakdver at atrey dot karlin dot mff dot cuni dot cz>
- Cc: gcc-patches at gcc dot gnu dot org, "Daniel Berlin" <dberlin at dberlin dot org>
- Date: Tue, 6 Feb 2007 22:25:54 +0200
- Subject: Re: [patch] Predictive commoning
> Hello,
>
> this patch implements the predictive commoning optimization (i.e.,
> reusing the values computed in previous iterations of the loop in the
> following iterations). The "canonical" example of this transformation
> is the following optimization of the computation of Fibonacci numbers:
>
...
> Index: passes.c
> ===================================================================
> *** passes.c (revision 120837)
> --- passes.c (working copy)
...
> *************** init_optimization_passes (void)
> *** 622,627 ****
> --- 624,630 ----
> NEXT_PASS (pass_tree_loop_init);
> NEXT_PASS (pass_copy_prop);
> NEXT_PASS (pass_lim);
> + NEXT_PASS (pass_predcom);
> NEXT_PASS (pass_tree_unswitch);
> NEXT_PASS (pass_scev_cprop);
> NEXT_PASS (pass_empty_loop);
(This whole discussion about scheduling the complete-unrolling pass relative
to vectorization got me thinking.) Have you considered scheduling the
predictive-commoning pass after vectorization? It seems to me that it can
transform loops that are otherwise easy to vectorize into loops with
cross-iteration dependencies, which are hard to vectorize. Take this
loop as an example:
  for (i = 0; i < n; i++)
    c[i] += a[i] * a[i+2];
This should be easily vectorizable, unless predictive commoning is applied
beforehand.
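To make the concern concrete, here is a rough hand-written sketch of what the loop above could look like once the loads of a[] are commoned across iterations. It is only an illustration, not output of the patch: the function names and the temporaries t0, t1, t2 are made up, and the element type is assumed to be int.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: each iteration loads a[i] and a[i+2] from memory,
   so the iterations are independent of one another. */
static void mul_original(int *c, const int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] += a[i] * a[i + 2];
}

/* Hypothetical commoned form: t0 and t1 carry a[i] and a[i+1] across
   iterations, so only a[i+2] is loaded inside the loop.  The rotation
   at the bottom is a cross-iteration scalar dependence. */
static void mul_predcom(int *c, const int *a, size_t n)
{
    if (n == 0)
        return;
    int t0 = a[0];
    int t1 = a[1];
    for (size_t i = 0; i < n; i++) {
        int t2 = a[i + 2];   /* the only load of a[] per iteration */
        c[i] += t0 * t2;
        t0 = t1;             /* value reused one iteration later */
        t1 = t2;             /* value reused two iterations later */
    }
}

/* Asserts that both forms compute the same result on a small input. */
static void check_forms_agree(void)
{
    int a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};   /* n + 2 elements */
    int c1[8] = {0}, c2[8] = {0};
    mul_original(c1, a, 8);
    mul_predcom(c2, a, 8);
    for (size_t i = 0; i < 8; i++)
        assert(c1[i] == c2[i]);
}
```

The second form does one load per iteration instead of two, but the values rotating through t0 and t1 are exactly the kind of cross-iteration dependence described above.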
Actually, predictive commoning would be useful after vectorization for
cases like misaligned accesses, where we need to load from two consecutive
aligned locations in each iteration to extract the relevant data. We
already do a form of predictive commoning in the vectorizer for this case
(if a target defines a realign_load).
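As a sketch of the misaligned-access case, the following plain C emulates the idea with small arrays standing in for vectors. Everything here is an assumption for illustration: the vector width VF, the vec type, and the helpers load_aligned and realign are hypothetical and are not GCC internals or the actual realign_load semantics.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VF 4                          /* hypothetical vector width, in elements */

typedef struct { int e[VF]; } vec;    /* stand-in for a hardware vector */

/* Emulated aligned load of VF elements starting at p. */
static vec load_aligned(const int *p)
{
    vec v;
    memcpy(v.e, p, sizeof v.e);
    return v;
}

/* Extract the VF elements that start `off` elements into `lo`
   (0 <= off < VF), taking the remainder from `hi`. */
static vec realign(vec lo, vec hi, size_t off)
{
    vec r;
    for (size_t k = 0; k < VF; k++)
        r.e[k] = (off + k < VF) ? lo.e[off + k] : hi.e[off + k - VF];
    return r;
}

/* Copy nvec vectors from base + off into dst.  base is treated as the
   aligned start and must hold at least (nvec + 1) * VF elements. */
static void copy_misaligned(int *dst, const int *base, size_t off, size_t nvec)
{
    if (nvec == 0)
        return;
    vec lo = load_aligned(base);                     /* loaded once, before the loop */
    for (size_t i = 0; i < nvec; i++) {
        vec hi = load_aligned(base + (i + 1) * VF);  /* one new load per iteration */
        vec r = realign(lo, hi, off);
        memcpy(dst + i * VF, r.e, sizeof r.e);
        lo = hi;                                     /* reused by the next iteration */
    }
}

/* Asserts the expected result for a source holding 0..11 and off = 1. */
static void check_copy_misaligned(void)
{
    int base[12], dst[8];
    for (int k = 0; k < 12; k++)
        base[k] = k;
    copy_misaligned(dst, base, 1, 2);
    for (int k = 0; k < 8; k++)
        assert(dst[k] == k + 1);
}
```

Without carrying `hi` over as the next `lo`, each iteration would need two aligned loads; with it, only one new load is issued per iteration.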
By the way, I tried out a similar testcase:
  for (i = 0; i < n; i++)
    c[i] += a[i] * a[i+1];
and this one currently doesn't get vectorized. It looks like PRE is already
doing predictive commoning for this case before vectorization. Is there any
way we could defer this particular PRE transformation until after
vectorization?
thanks,
dorit