This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug middle-end/29756] SSE intrinsics hard to use without redundant temporaries appearing



------- Comment #3 from timday at bottlenose dot demon dot co dot uk  2006-11-08 10:01 -------
I've just tried an alternative version (will upload later) replacing the union
with a single
  __v4sf _rep,
and implementing the [] operators using e.g
  (reinterpret_cast<const float*>(&_rep))[i];
However the code generated by the two transform implementations remains the
same (20 and 32 instructions anyway; haven't checked the details yet).
Maybe not surprising as it's just moving the problem around.

The big difference between the two methods is perhaps primarily that the bad
one involves a __v4sf->float->__vfs4 conversion, while the good one uses __v4sf
throughout by using the mul_compN methods.  I'll try and prepare a more concise
test case based on the premise that bad handling of __v4sf <-> float is the
real issue.


-- 

timday at bottlenose dot demon dot co dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timday at bottlenose dot
                   |                            |demon dot co dot uk


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29756


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]