This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Scheduling an early complete loop unrolling pass?
- From: Zdenek Dvorak <rakdver at atrey dot karlin dot mff dot cuni dot cz>
- To: Dorit Nuzman <DORIT at il dot ibm dot com>
- Cc: Ira Rosen <IRAR at il dot ibm dot com>, gcc at gcc dot gnu dot org, Paolo Bonzini <paolo dot bonzini at lu dot unisi dot ch>, Richard Guenther <rguenther at suse dot de>
- Date: Tue, 6 Feb 2007 20:34:47 +0100
- Subject: Re: Scheduling an early complete loop unrolling pass?
- References: <OF0EF73797.E1AE2BE2-ONC225727A.0035B561-C225727A.0035F3AC@LocalDomain> <OF82FB61A0.EC87E773-ONC225727A.003638E1-C225727A.003682D3@il.ibm.com>
Hello,
> Ira Rosen/Haifa/IBM wrote on 06/02/2007 11:49:17:
>
> > Dorit Nuzman/Haifa/IBM wrote on 05/02/2007 21:13:40:
> >
> > > Richard Guenther <rguenther@suse.de> wrote on 05/02/2007 17:59:00:
> > >
> ...
> > >
> > > That's going to change once this project goes in: "(3.2) Straight-line
> > > code vectorization" from
> > > http://gcc.gnu.org/wiki/AutovectBranchOptimizations. In fact, I think
> > > in autovect-branch, if you unroll the above loop it should get
> > > vectorized already. Ira - is that really the case?
> >
> > The completely unrolled loop will not get vectorized because the
> > code will not be inside any loop (and our SLP implementation will
> > focus, at least as a first step, on loops).
>
> Ah, right... I wonder if we can keep the loop structure in place, even
> after completely unrolling the loop - I mean the 'struct loop' in
> 'current_loops' (not the actual CFG), so that the "SLP in loops" would have
> a chance to at least consider vectorizing this "loop". Zdenek - what do you
> say?
I do not think this is a good idea -- making the structures inconsistent
just to "fix" a pass that can easily be fixed in another way.
Zdenek
> thanks,
> dorit
>
> > The following will get vectorized (without permutation on autovect
> > branch, and with redundant permutation on mainline):
> >
> > for (i = 0; i < n; i++)
> >   {
> >     v[4*i] = 0.0;
> >     v[4*i + 1] = 0.0;
> >     v[4*i + 2] = 0.0;
> >     v[4*i + 3] = 0.0;
> >   }
> >
> > The original completely unrolled loop will get vectorized if it is
> > encapsulated in an outer-loop, like so:
> >
> > for (j = 0; j < n; j++)
> >   {
> >     for (i = 0; i < 4; i++)
> >       v[i] = 0.0;
> >     v += 4;
> >   }
> >
> > Ira
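
For reference, the loop forms discussed in the quoted text can be written
out as a small self-contained sketch. The element type (float), the
function names, and the check_forms_agree helper are assumptions made for
illustration only; they do not come from the thread. The sketch only shows
that the forms store the same values. Whether a given compiler vectorizes
any of them depends on its version and options.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Form quoted first: one loop, four adjacent stores per iteration. */
void zero_grouped(float *v, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        v[4 * i]     = 0.0f;
        v[4 * i + 1] = 0.0f;
        v[4 * i + 2] = 0.0f;
        v[4 * i + 3] = 0.0f;
    }
}

/* Form quoted second: the short 4-iteration loop wrapped in an outer loop. */
void zero_nested(float *v, size_t n)
{
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < 4; i++)
            v[i] = 0.0f;
        v += 4;
    }
}

/* What complete unrolling of the 4-iteration loop leaves behind when there
   is no outer loop: straight-line stores that are not inside any loop. */
void zero_unrolled4(float *v)
{
    v[0] = 0.0f;
    v[1] = 0.0f;
    v[2] = 0.0f;
    v[3] = 0.0f;
}

/* Hypothetical helper: checks that the two loop forms write the same data. */
void check_forms_agree(void)
{
    enum { N = 8 };
    float a[4 * N], b[4 * N];

    for (size_t k = 0; k < 4 * N; k++)
        a[k] = b[k] = 1.0f;

    zero_grouped(a, N);
    zero_nested(b, N);
    assert(memcmp(a, b, sizeof a) == 0);
}
```

Calling check_forms_agree() from a test program exercises both loop forms on
the same input; zero_unrolled4 corresponds to the case described above where
the code is no longer inside any loop.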