This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.
On Tue, Aug 28, 2012 at 8:33 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Actually, it looks to me like most of it can be rewritten using gcc's vector extensions. _mm_mul_pd(a,b) is just a*b, a[0] gives the first element,
It does? How?
If I use
    typedef double v2df __attribute__ ((vector_size (16)));
    v2df a;
then I cannot use a[0].
gcc has an arch-specific builtin __builtin_ia32_vec_ext_v2df(). There is nothing else I have found which can take its place.
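For illustration only, here is a minimal sketch of one arch-independent way to read a single element without subscripting: copy the vector into a plain array. The helper name v2df_get is hypothetical, and nothing here claims the generated code is as good as what the builtin produces.

```c
#include <string.h>

typedef double v2df __attribute__ ((vector_size (16)));

/* Hypothetical helper: return lane i (0 or 1) of v.
   The vector is copied into a plain array, so no subscripting
   of the vector type itself is needed. */
static double v2df_get(v2df v, int i)
{
    double lanes[2];
    memcpy(lanes, &v, sizeof lanes);
    return lanes[i];
}
```

Whether the compiler reduces the copy to a single register move depends on the version and optimization level.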
The horizontal add you already noticed. What is also missing is control over memory accesses. I.e., loads and stores from scalar pointers that are possibly unaligned.
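To make the gap concrete, the following sketch emulates unaligned loads and stores and a horizontal add on top of the vector extensions by going through memcpy. The helper names (v2df_loadu, v2df_storeu, v2df_hadd) are made up for this example; it only shows what has to be spelled out by hand and makes no claim about the quality of the generated code.

```c
#include <string.h>

typedef double v2df __attribute__ ((vector_size (16)));

/* Hypothetical helper: load two doubles from a pointer that is
   not necessarily 16-byte aligned. */
static v2df v2df_loadu(const double *p)
{
    v2df v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: store two doubles to a pointer that is
   not necessarily 16-byte aligned. */
static void v2df_storeu(double *p, v2df v)
{
    memcpy(p, &v, sizeof v);
}

/* Hypothetical helper: horizontal add, i.e. the sum of both lanes. */
static double v2df_hadd(v2df v)
{
    double lanes[2];
    memcpy(lanes, &v, sizeof lanes);
    return lanes[0] + lanes[1];
}
```

With the intrinsics, the load and store are each a single explicit call, and the programmer decides which form of access is used.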
There are big gaps in the vector extensions. Other functions need sqrt, masking, etc. None of this works as well as the Intel intrinsics, or at least not with the same efficiency. Waiting for the compiler to catch up isn't a good option, IMO.
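As a rough sketch of what masking looks like when written with the vector extensions, the code below assumes a compiler that accepts comparison operators on vector types, which not every release or language front end does. The helper name v2df_min and the v2di typedef are introduced only for this example, and the efficiency of the result is not claimed to match the intrinsics.

```c
typedef double    v2df __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Hypothetical helper: per-lane minimum built from a compare mask.
   Assumes the compiler supports `<` on vector operands; the result
   has all bits set in each lane where a < b and zero elsewhere. */
static v2df v2df_min(v2df a, v2df b)
{
    v2di mask = (v2di) (a < b);
    /* Bitwise select: take a where the mask is set, b elsewhere.
       The casts reinterpret the bits, they do not convert values. */
    return (v2df) (((v2di) a & mask) | ((v2di) b & ~mask));
}
```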
As you have seen in the code, it should be easy enough for the SSE code to be converted to use the gcc vector extensions if and when they catch up.
-- Marc Glisse