This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: SSE (Pentium 3) - Is this correct?
- From: Revital1 Eres <ERES at il dot ibm dot com>
- To: "mal content" <artifact dot one at googlemail dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Mon, 8 Jan 2007 09:00:13 +0200
- Subject: Re: SSE (Pentium 3) - Is this correct?
> The C code:
>
> float *vector_add4f(float va[4], const float vb[4])
> {
>   va[0] += vb[0];
>   va[1] += vb[1];
>   va[2] += vb[2];
>   va[3] += vb[3];
>   return va;
> }
>
> Now, uh, isn't that four additions? Do I need to do something gcc-specific
> to get it to use the 'add-packed-single' instruction to turn those four
> additions into one?
The -ftree-vectorize flag is missing (see
http://gcc.gnu.org/projects/tree-ssa/vectorization.html for more information
about the flags you should use).
Also, the vectorizer is currently applied only to loops (please see the
Auto-vectorization page for examples).
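As a minimal sketch of what that means for the function quoted above, the
four additions can be written as a counted loop, which is the form the
vectorizer looks at. The compile command in the comment is an assumption
(-O2 as a baseline, -msse to enable SSE on a Pentium 3) and is not taken from
the original message. Whether GCC actually emits a single packed add depends
on the GCC version and on what it can determine about the alignment and
aliasing of va and vb.

```c
/* Loop form of vector_add4f, so the tree vectorizer has a loop to work on.
 * Assumed compile command (not from the original message):
 *   gcc -O2 -ftree-vectorize -msse -c vector_add4f.c
 */
float *vector_add4f(float va[4], const float vb[4])
{
    int i;

    /* Same four additions as the unrolled version, expressed as a loop. */
    for (i = 0; i < 4; i++)
        va[i] += vb[i];

    return va;
}
```

The result is the same as the hand-unrolled version; only the shape of the
code changes, so inspecting the generated assembly (for example with -S) is
the way to confirm what the compiler did in a given setup.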
Revital