This is the mail archive of the mailing list for the GCC project.

Re: auto vectorization in gcc

> >> like "subtract and saturate" if they are supported, and unrolling
> >> or blocking the iterations according to the vector length, etc)?
> >>
> >Something along those lines, yes.
>I haven't thought much about saturation in years :-)  I thought when I
>left the embedded world behind saturation wouldn't be a big issue :-)

Well, judging from the following quotation, if-converting "subtract and
saturate" is crucial for vectorizing not only in the embedded world ...

The question is whether we should if-convert and vectorize such a loop at
the tree (SSA) level when we're not sure the target has an appropriate
`psubusw', at the risk of having to fall back and regenerate the original
if structure when it does not.


In the integer benchmark 164.gzip, about 8% of the execution time is spent
in code that has the following form for an unsigned short array 'head'.
  for (n = 0; n < HASH_SIZE; n++) {
    m = head[n];
    head[n] = (Pos)(m >= 32768 ? m-32768 : 0);
  }
The Intel® C++/Fortran compiler recognizes such saturation idioms and
implements the loop-body with a single SIMD instruction 'psubusw', which
boosts the performance of the loop substantially, contributing to an extra
improvement of the benchmark as a whole.

Quotation taken from:

A.J.C. Bik, M. Girkar, P.M. Grey, and X. Tian. Experiments with Automatic
Vectorization for the Pentium® 4 Processor, 9th Workshop on Compilers for
Parallel Computers, June 27-29, Edinburgh, Scotland, UK, 2001 (download at
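To make the idiom concrete, here is a minimal sketch of the loop quoted
above in two forms: the original if structure, and an unsigned saturating
subtract. It is only an illustration, not what GCC or the Intel compiler
actually emits. The function names (slide_branchy, slide_saturating,
slide_self_check) are hypothetical, 'Pos' is assumed to be a 16-bit
unsigned type, and the SSE2 intrinsic _mm_subs_epu16 stands in for
`psubusw'. Where SSE2 is not available, the sketch falls back to the
scalar if structure, which is the fallback case discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __SSE2__
#include <emmintrin.h>   /* SSE2 intrinsics, including _mm_subs_epu16 */
#endif

typedef uint16_t Pos;    /* assumption: 'head' holds 16-bit unsigned values */

/* Original "if structure": compare, then subtract or clamp to 0. */
static void slide_branchy(Pos *head, size_t n_elems)
{
    for (size_t n = 0; n < n_elems; n++) {
        unsigned m = head[n];
        head[n] = (Pos)(m >= 32768 ? m - 32768 : 0);
    }
}

/* Same computation as an unsigned saturating subtract of 32768. */
static void slide_saturating(Pos *head, size_t n_elems)
{
    size_t n = 0;
#ifdef __SSE2__
    /* Eight lanes, each holding the bit pattern 0x8000 (32768 unsigned). */
    const __m128i bias = _mm_set1_epi16(-32768);
    for (; n + 8 <= n_elems; n += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(head + n));
        v = _mm_subs_epu16(v, bias);              /* psubusw */
        _mm_storeu_si128((__m128i *)(head + n), v);
    }
#endif
    /* Scalar tail, and the whole loop when SSE2 is not available. */
    for (; n < n_elems; n++) {
        unsigned m = head[n];
        head[n] = (Pos)(m >= 32768 ? m - 32768 : 0);
    }
}

/* Checks that both forms agree on every possible 16-bit input. */
static void slide_self_check(void)
{
    static Pos a[65536], b[65536];
    for (size_t i = 0; i < 65536; i++)
        a[i] = b[i] = (Pos)i;
    slide_branchy(a, 65536);
    slide_saturating(b, 65536);
    for (size_t i = 0; i < 65536; i++)
        assert(a[i] == b[i]);
}
```

The two forms are interchangeable only because the subtrahend and the
comparison threshold are the same constant and the fallback value is 0;
an if-converter would have to check exactly that before treating the
conditional as a saturating subtract.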
