This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: auto vectorization in gcc
- From: "Ayal Zaks" <ZAKS at il dot ibm dot com>
- To: law at redhat dot com
- Cc: "Dorit Naishlos" <DORIT at il dot ibm dot com>, gcc at gcc dot gnu dot org, dnovillo at redhat dot com, david Edelsohn <dje at watson dot ibm dot com>
- Date: Thu, 17 Jul 2003 23:23:43 +0200
- Subject: Re: auto vectorization in gcc
> >> like "subtract and saturate" if they are supported, and unrolling
> >> or blocking the iterations according to the vector length, etc)?
> >>
> >Something along those lines, yes.
>I haven't thought much about saturation in years :-) I thought when I
>left the embedded world behind saturation wouldn't be a big issue :-)
Well, judging from the following quotation, if-converting "subtract and
saturate" is crucial for vectorizing not only in the embedded world ...
The question is whether we should if-convert and vectorize such a loop at
the tree (SSA) level when we are not sure the target has an appropriate
`psubusw', at the risk of having to fall back and regenerate the original
if structure when it does not.
Ayal.
"
In the integer benchmark 164.gzip, about 8% of the execution time is spent
in code that has the following form for an unsigned short array 'head'.

    for (n = 0; n < HASH_SIZE; n++) {
      m = head[n];
      head[n] = (Pos)(m >= 32768 ? m-32768 : 0);
    }

The Intel® C++/Fortran compiler recognizes such saturation idioms and
implements the loop-body with a single SIMD instruction 'psubusw', which
boosts the performance of the loop substantially, contributing to an extra
improvement of the benchmark as a whole.
"
Quotation taken from [http://www.liacs.nl/home/ajcbik/pub.html]:
A.J.C. Bik, M. Girkar, P.M. Grey, and X. Tian. Experiments with Automatic
Vectorization for the Pentium® 4 Processor, 9th Workshop on Compilers for
Parallel Computers, June 27-29, Edinburgh, Scotland, UK, 2001 (download at
http://www.icsa.informatics.ed.ac.uk/cpc2001/Speakers/bik.html).
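
To make the if-conversion question concrete, here is a rough scalar sketch of the two forms of the quoted loop: the original conditional form, and a branch-free form in which each element goes through an unsigned saturating subtract, which is the per-lane operation `psubusw' performs on 16-bit elements. The names slide_branchy, slide_saturating, subs_u16 and WSIZE are made up for illustration, and Pos is assumed to be a 16-bit unsigned type; none of this is taken from gzip or from any compiler's actual transformation.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: Pos is a 16-bit unsigned type ("unsigned short array"). */
typedef uint16_t Pos;

/* Hypothetical name for the constant 32768 used in the quoted loop. */
#define WSIZE 32768u

/* Original form: one conditional per element. */
void slide_branchy(Pos *head, size_t n_elems)
{
    for (size_t n = 0; n < n_elems; n++) {
        unsigned m = head[n];
        head[n] = (Pos)(m >= WSIZE ? m - WSIZE : 0);
    }
}

/* Unsigned saturating subtract on one 16-bit value: a - b, clamped at 0. */
static inline uint16_t subs_u16(uint16_t a, uint16_t b)
{
    uint16_t diff = (uint16_t)(a - b);                     /* wraps modulo 2^16 */
    uint16_t keep = (uint16_t)(0u - (unsigned)(a >= b));   /* 0xFFFF or 0 */
    return (uint16_t)(diff & keep);
}

/* If-converted form: the loop body is straight-line code, so each
   iteration maps onto one lane of a saturating vector subtract. */
void slide_saturating(Pos *head, size_t n_elems)
{
    for (size_t n = 0; n < n_elems; n++)
        head[n] = subs_u16(head[n], (uint16_t)WSIZE);
}
```

Both functions should produce identical results for every 16-bit input. The second form is only a win if the target can do the saturating subtract cheaply; otherwise the mask-and-select sequence may well be slower than the original branch, which is exactly the fallback concern raised above.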