This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/29756] SSE intrinsics hard to use without redundant temporaries appearing
- From: "timday at bottlenose dot demon dot co dot uk" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 8 Nov 2006 10:01:12 -0000
- Subject: [Bug middle-end/29756] SSE intrinsics hard to use without redundant temporaries appearing
- References: <bug-29756-13527@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #3 from timday at bottlenose dot demon dot co dot uk 2006-11-08 10:01 -------
I've just tried an alternative version (will upload later) replacing the union
with a single
__v4sf _rep,
and implementing the [] operators using e.g.
(reinterpret_cast<const float*>(&_rep))[i];
However the code generated by the two transform implementations remains the
same (20 and 32 instructions anyway; haven't checked the details yet).
Maybe not surprising as it's just moving the problem around.
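
For readers without the attachment, the two layouts being compared can be
sketched roughly as below. This is only an illustration, not the code from the
bug report: the type names vec_union and vec_cast and the helper layouts_agree
are hypothetical, and v4sf is a local stand-in for __v4sf declared with GCC's
vector_size extension so the sketch does not depend on xmmintrin.h. Both
element accessors rely on GCC behaviour (union type punning, reading a vector
through a float pointer) rather than on guarantees of standard C++.

```cpp
#include <cstddef>

// Stand-in for __v4sf: four packed floats via GCC's vector_size extension.
typedef float v4sf __attribute__((vector_size(16)));

// Layout 1 (hypothetical name): a union of the vector and a float array.
// Reading the member that was not last written is a GCC extension.
struct vec_union {
    union {
        v4sf  rep;
        float f[4];
    };
    float operator[](std::size_t i) const { return f[i]; }
};

// Layout 2 (hypothetical name): a single vector member, indexed via a cast,
// as described in the comment above.
struct vec_cast {
    v4sf rep;
    float operator[](std::size_t i) const {
        return reinterpret_cast<const float*>(&rep)[i];
    }
};

// Returns true when both layouts expose the same four elements for v.
bool layouts_agree(v4sf v) {
    vec_union u;
    u.rep = v;
    vec_cast c;
    c.rep = v;
    for (std::size_t i = 0; i < 4; ++i)
        if (u[i] != c[i]) return false;
    return true;
}
```

Either way the element access goes through memory typed as float, which fits
the observation that switching layouts only moves the problem around.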
The big difference between the two methods is perhaps primarily that the bad
one involves a __v4sf->float->__v4sf conversion, while the good one uses __v4sf
throughout by using the mul_compN methods. I'll try and prepare a more concise
test case based on the premise that bad handling of __v4sf <-> float is the
real issue.
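
A reduced test along those lines might look something like the sketch below.
It is an assumption about what such a test could contain, not the reporter's
actual code: mul_via_floats, mul_direct and lane are hypothetical names, the
mul_compN methods are not reproduced, and v4sf is again a local stand-in for
__v4sf. The first function forces every lane through a scalar float before
repacking; the second stays in the vector type throughout.

```cpp
// Stand-in for __v4sf: four packed floats via GCC's vector_size extension.
typedef float v4sf __attribute__((vector_size(16)));

// Reads one lane through a float pointer (relies on GCC's handling of
// vector types; hypothetical helper).
float lane(const v4sf& v, int i) {
    return reinterpret_cast<const float*>(&v)[i];
}

// Round trip: vector -> four floats -> vector.
v4sf mul_via_floats(v4sf a, v4sf b) {
    v4sf r = { lane(a, 0) * lane(b, 0),
               lane(a, 1) * lane(b, 1),
               lane(a, 2) * lane(b, 2),
               lane(a, 3) * lane(b, 3) };
    return r;
}

// Vector throughout: elementwise multiply on the vector type itself.
v4sf mul_direct(v4sf a, v4sf b) {
    return a * b;
}
```

Both functions should compute the same result, so comparing the assembly
generated for each (for example with g++ -O2 -msse -S) would show whether the
scalar round trip is what introduces the extra temporaries.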
--
timday at bottlenose dot demon dot co dot uk changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timday at bottlenose dot
                   |                            |demon dot co dot uk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29756