This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: gcc's autovectorizer tests
- From: Dorit Naishlos <DORIT at il dot ibm dot com>
- To: Uros Bizjak <uros at kss-loka dot si>
- Cc: gcc at gcc dot gnu dot org
- Date: Wed, 9 Feb 2005 20:50:16 +0200
- Subject: Re: gcc's autovectorizer tests
The problem in tests 2 and 3 is that the potential for simdization there is
within an iteration rather than across iterations, because each memory
reference reads/writes non-consecutive data elements in consecutive loop
iterations:
    for (i=0; i<1000; i++) {
      a[i].x = b[i].x + c[i].x;
      a[i].y = b[i].y + c[i].y;
      a[i].z = b[i].z + c[i].z;
      a[i].w = b[i].w + c[i].w;
    }
Right now, we only look for opportunities across loop iterations (which is
what a classic loop vectorizer does) - i.e. for each reference, e.g.
a[i].x, we try to see if the data it reads/writes in consecutive loop
iterations {a[i].x,a[i+1].x,a[i+2].x,a[i+3].x} can be replaced with a
vector read/write. We're totally blind to the fact that
{a[i].x,a[i].y,a[i].z,a[i].w} within the same iteration can be replaced
with a vector access. SLP (superword-level parallelism) can do this, and I
am thinking of doing something like SLP within loops (to catch cases like
this, as well as unrolled loops).
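
To make the distinction concrete, here is a minimal sketch in C. It is not the vectorizer's implementation and not the original test source: the struct vec4 type and the function names add_scalar and add_grouped are hypothetical, and the sketch assumes the struct has no padding between its four float members. The first function is the loop as written above; the second groups the four fields of a single iteration into a 4-lane temporary, which is the shape an SLP-style transformation would look for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical element type; the layout of the real test sources is assumed. */
struct vec4 { float x, y, z, w; };

/* The loop as written in the message. Across iterations, a[i].x and
   a[i+1].x are sizeof(struct vec4) bytes apart, so no single reference
   touches consecutive floats from one iteration to the next. */
void add_scalar(struct vec4 *a, const struct vec4 *b,
                const struct vec4 *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i].x = b[i].x + c[i].x;
        a[i].y = b[i].y + c[i].y;
        a[i].z = b[i].z + c[i].z;
        a[i].w = b[i].w + c[i].w;
    }
}

/* Same computation, with the four fields of ONE iteration treated as the
   lanes of a 4-float group. Within an iteration x, y, z, w are adjacent
   in memory, so each group corresponds to one vector load, add, store. */
void add_grouped(struct vec4 *a, const struct vec4 *b,
                 const struct vec4 *c, size_t n)
{
    /* Assumption: no padding, so the struct is exactly four floats. */
    assert(sizeof(struct vec4) == 4 * sizeof(float));

    for (size_t i = 0; i < n; i++) {
        float va[4], vb[4], vc[4];

        memcpy(vb, &b[i], sizeof vb);          /* one 4-lane load  */
        memcpy(vc, &c[i], sizeof vc);          /* one 4-lane load  */
        for (int lane = 0; lane < 4; lane++)   /* one 4-lane add   */
            va[lane] = vb[lane] + vc[lane];
        memcpy(&a[i], va, sizeof va);          /* one 4-lane store */
    }
}
```

Both functions produce the same result. The second only restates the work so that the contiguous group within each iteration is explicit.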
dorit
gcc-owner@gcc.gnu.org wrote on 09/02/2005 17:08:33:
> Hello!
>
> Here are some test sources to exercise gcc's autovectorization feature.
> These sources are in fact demonstration code to show the abilities of
> VectorC:
>
> http://www.codeplay.com/vectorc/feat-vec2.html
>
> I have tried to compile test1.c - 'simple source', test2.c - 'medium
> source' and test3.c - 'complex source' with current mainline gcc:
> gcc version 4.0.0 20050209 (experimental)
> with compile flags:
> -O2 -msse2 -mfpmath=sse -ftree-vectorize -fdump-tree-vect-stats
> -funroll-all-loops
>
> gcc was able to vectorize only test1.c, 'simple source' to:
>
> .L2:
>         leal    0(,%edx,4), %eax
>         addl    $4, %edx
>         movaps  c(%eax), %xmm0
>         cmpl    $1000, %edx
>         addps   b(%eax), %xmm0
>         movaps  %xmm0, a(%eax)
>         jne     .L2
>
> Other tests were not vectorized:
>
> test2.c:9: note: not vectorized: unhandled data ref: D.1452_8 = b[i_5].x
> test2.c:6: note: vectorized 0 loops in function.
>
> and
>
> test3.c:8: note: not vectorized: unhandled data ref: D.1455_10 = b[i_35].R
> test3.c:6: note: vectorized 0 loops in function.
>
> Here are some other benchmarks to play with:
> http://www.codeplay.com/vectorc/bench.html
>
> Uros.