This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Fixing PR59006 and PR58921 by delaying loop invariant hoisting in vectorizer.
- From: Richard Biener <rguenther at suse dot de>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Cong Hou <congh at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 27 Nov 2013 13:23:58 +0100 (CET)
- Subject: Re: [PATCH] Fixing PR59006 and PR58921 by delaying loop invariant hoisting in vectorizer.
- Authentication-results: sourceware.org; auth=none
- References: <CAK=A3=3m-dK_HHR8K5CF0n+uTnu5k4=swXHf3SzdH3KHoLzuTw at mail dot gmail dot com> <alpine dot LNX dot 2 dot 00 dot 1311271020320 dot 8615 at zhemvz dot fhfr dot qr> <20131127113730 dot GZ892 at tucnak dot redhat dot com> <alpine dot LNX dot 2 dot 00 dot 1311271247470 dot 8615 at zhemvz dot fhfr dot qr> <20131127121512 dot GA892 at tucnak dot redhat dot com>
On Wed, 27 Nov 2013, Jakub Jelinek wrote:
> On Wed, Nov 27, 2013 at 12:54:14PM +0100, Richard Biener wrote:
> > On Wed, 27 Nov 2013, Jakub Jelinek wrote:
> > > On Wed, Nov 27, 2013 at 10:53:56AM +0100, Richard Biener wrote:
> > > > Hmm. I'm still thinking that we should handle this during the regular
> > > > transform step.
> > >
> > > I wonder if it can't be done instead just in vectorizable_load,
> > > if LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo) and the load is
> > > invariant, just emit the (broadcasted) load not inside of the loop, but on
> > > the loop preheader edge.
> > It is safe even for !LOOP_REQUIRES_VERSIONING_FOR_ALIAS. It's just
> > a missed optimization I even noted when originally implementing
> > support for invariant loads ...
> True, but only for non-simd loops, or if we proved it by looking at all
> relevant LOOP_VINFO_DDRSs. But, if it is not a simd loop, and
> not !LOOP_REQUIRES_VERSIONING_FOR_ALIAS, wouldn't previous optimizations
> hoist the load before the loop already?
Well - there is the case of
*p = g;
where we know in the _vectorized_ loop body that they cannot alias
(because we have at least two vector elements). But it seems
we lost that optimization at some point as
void foo (int *p, int n)
for (i = 0; i < n; ++i)
*p = g;
uses alias versioning. Bah ;) Maybe I'm just dreaming that
I implemented that as well.
> > Ick. I hate this behind-the-back stuff - so safelen doesn't mean
> > that a[i] and a do not alias.
> My initial understanding of the SIMD loops was also that it allows the
> the up to safelen consecutive iterations to be randomly reordered or
> intermixed without affecting valid programs, but further mails from Tobias
> and others on this topic plus testcases changed my understanding of it.
> Note that we don't purge LOOP_VINFO_DDRSs in any way for loop->safelen,
> just don't add versioning for aliasor punt if there is some possible (or
> proven) aliasing. Perhaps we could add a bool flag to loop_vinfo which
> would tell us whether the loop has no data dependencies at all (i.e.
> either for non-safelen is !LOOP_REQUIRES_VERSIONING_FOR_ALIAS, or
> with safelen non-zero would be !LOOP_REQUIRES_VERSIONING_FOR_ALIAS).
> Then we could hoist if that flag is set or
> LOOP_REQUIRES_VERSIONING_FOR_ALIAS (because then the runtime test
> checks the dependency).
> > Note that this will break with
> > SLP stuff at least as that will re-order reads/writes. Not sure
> > how safelen applies to SLP though. That is
> > a[i] = i;
> > b[i] = a;
> > a[i+1] = i+1;
> > b[i+1] = a;
> > will eventually end up re-ordering reads/writes in non-obvious
> > ways.
> You mean SLP inside of loop vectorization, right? Because for normal SLP
> outside of loop vectorizer simdlen is ignored and normal data ref is
> performed without any bypassing.
Yes, SLP inside of a loop.