[Bug target/68655] SSE2 cannot vec_perm of low and high part

jakub at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Dec 3 09:30:00 GMT 2015


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68655

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-12-03
     Ever confirmed|0                           |1

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Well, doing something like that at the optabs.c level wouldn't really be
helpful, as i?86 has tons of different permutation instructions, and the
length of the sequence needed differs from one permutation to another.

So, the question is whether any supported CPU has some extra reinterpretation
cost if we use a different integral vector mode (I believe there is some cost
on some CPUs when reinterpreting an integral vector as a float vector or vice
versa, or perhaps even a float vector as a double vector or vice versa).
If not, then the easiest fix is IMHO to change either
ix86_expand_vec_perm_const_1
or both
ix86_expand_vec_perm_const and ix86_vectorize_vec_perm_const_ok
to detect the case when a V*{QI,HI,SI} permutation is doable in a mode with a
wider unit but the same whole vector size, and just transform it to that case
unconditionally.
If there is some cost, then we should perhaps do that at the end of
expand_vec_perm_1 (if everything else failed for a single instruction), but
then the question is what to do with the 2-5 instruction long sequences; we'd
need to repeat that at all the other spots.
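
To illustrate the detection step, here is a minimal standalone sketch. It is
not GCC code: the name widen_perm and its interface are hypothetical. It
assumes the selector is a plain array of element indices, and that a
permutation is "doable in a wider unit mode" exactly when every aligned pair
of narrow indices selects one whole aligned wider element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC.  PERM holds NELT element indices
   of a permutation in some narrow unit mode.  If every aligned pair
   (PERM[i], PERM[i + 1]) selects both halves of one aligned wider
   element, store the NELT / 2 indices of the equivalent permutation in
   the twice as wide unit mode into WIDE and return true.  Otherwise
   return false; WIDE may then be partially written.  */
bool
widen_perm (const unsigned char *perm, size_t nelt, unsigned char *wide)
{
  assert (nelt % 2 == 0);
  for (size_t i = 0; i < nelt; i += 2)
    {
      /* The pair must start at an even index and be consecutive.  */
      if ((perm[i] & 1) != 0 || perm[i + 1] != perm[i] + 1)
        return false;
      wide[i / 2] = perm[i] / 2;
    }
  return true;
}
```

Applied repeatedly, a two-operand V16QI selector such as
{ 0, ..., 7, 24, ..., 31 } (chosen here only as an example) widens to the
V8HI selector { 0, 1, 2, 3, 12, 13, 14, 15 }, then to the V4SI selector
{ 0, 1, 6, 7 }, then to the V2DI selector { 0, 3 }, where widening stops.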

