[PATCH] rs6000: Fix _mm_movemask_pi8 emulation for 32 bit
Segher Boessenkool
segher@kernel.crashing.org
Sat Mar 23 17:46:00 GMT 2019
David noticed this failing on AIX. It fails on every 32-bit BE target,
but that is not easily noticed: BE Linux does not run these tests (many
of them require a Power8 even though the test itself does not need it,
and there aren't many BE Linux Power8 installations).
It turns out the 32-bit implementation of this function works for LE
only. This patch adds a BE variant, with the permute mask byte-reversed
and the two halves of the argument swapped. Tested on Linux, and by
David on AIX. Installing.
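
For illustration only, here is a rough software model of why the two
32-bit paths need different constants. It is a sketch, not part of the
patch: model_bpermd, ref_movemask, load_le, load_be, movemask32_le,
movemask32_be and self_check are hypothetical helper names, the upper
32 bits of each register are assumed to be zero, and element 0 of the
__m64 is assumed to be the byte at the lowest address.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of bpermd on 64-bit registers.  Index byte i of rs
   (byte 0 is the most significant) selects a bit of rb in IBM numbering,
   where bit 0 is the most significant bit.  The selected bit lands in
   result bit 7 - i; an index of 64 or more yields 0.  */
static uint64_t
model_bpermd (uint64_t rs, uint64_t rb)
{
  uint64_t result = 0;
  for (int i = 0; i < 8; i++)
    {
      unsigned int index = (rs >> (56 - 8 * i)) & 0xff;
      uint64_t bit = index < 64 ? (rb >> (63 - index)) & 1 : 0;
      result |= bit << (7 - i);
    }
  return result;
}

/* Reference semantics (assumed): result bit i is the sign bit of the
   byte at memory offset i.  */
static unsigned int
ref_movemask (const unsigned char bytes[8])
{
  unsigned int r = 0;
  for (int i = 0; i < 8; i++)
    r |= (unsigned int) (bytes[i] >> 7) << i;
  return r;
}

/* Register value after loading bytes[] on a little-endian target.  */
static uint64_t
load_le (const unsigned char bytes[8])
{
  uint64_t v = 0;
  for (int i = 0; i < 8; i++)
    v |= (uint64_t) bytes[i] << (8 * i);
  return v;
}

/* Register value after loading bytes[] on a big-endian target.  */
static uint64_t
load_be (const unsigned char bytes[8])
{
  uint64_t v = 0;
  for (int i = 0; i < 8; i++)
    v = (v << 8) | bytes[i];
  return v;
}

/* The LE 32-bit path, with the model standing in for the builtin.  */
static unsigned int
movemask32_le (uint64_t a)
{
  uint32_t mask = 0x20283038UL;
  unsigned int r1 = model_bpermd (mask, (uint32_t) a) & 0xf;
  unsigned int r2 = model_bpermd (mask, (uint32_t) (a >> 32)) & 0xf;
  return (r2 << 4) | r1;
}

/* The BE 32-bit path added by the patch, modelled the same way.  */
static unsigned int
movemask32_be (uint64_t a)
{
  uint32_t mask = 0x38302820UL;
  unsigned int r1 = model_bpermd (mask, (uint32_t) (a >> 32)) & 0xf;
  unsigned int r2 = model_bpermd (mask, (uint32_t) a) & 0xf;
  return (r2 << 4) | r1;
}

/* Check every combination of sign bits against the reference.  */
static void
self_check (void)
{
  for (unsigned int pattern = 0; pattern < 256; pattern++)
    {
      unsigned char bytes[8];
      for (int i = 0; i < 8; i++)
        bytes[i] = ((pattern >> i) & 1) ? 0x80 : 0x7f;
      assert (ref_movemask (bytes) == pattern);
      assert (movemask32_le (load_le (bytes)) == pattern);
      assert (movemask32_be (load_be (bytes)) == pattern);
    }
}
```

In this model, feeding a BE-loaded value to the LE path reverses the
bits within each nibble and swaps the nibbles, which would be
consistent with the failure seen on AIX. The "& 0xf" discards the four
result bits controlled by the upper index bytes, so only the low half
of the mask matters.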
Segher
2019-03-23 Segher Boessenkool <segher@kernel.crashing.org>
* config/rs6000/xmmintrin.h (_mm_movemask_pi8): Implement for 32-bit
big endian.
---
gcc/config/rs6000/xmmintrin.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/gcc/config/rs6000/xmmintrin.h b/gcc/config/rs6000/xmmintrin.h
index 71e4bd4..f9474b6 100644
--- a/gcc/config/rs6000/xmmintrin.h
+++ b/gcc/config/rs6000/xmmintrin.h
@@ -1586,9 +1586,15 @@ _mm_movemask_pi8 (__m64 __A)
 #endif
   return __builtin_bpermd (p, __A);
 #else
+#ifdef __LITTLE_ENDIAN__
   unsigned int mask = 0x20283038UL;
   unsigned int r1 = __builtin_bpermd (mask, __A) & 0xf;
   unsigned int r2 = __builtin_bpermd (mask, __A >> 32) & 0xf;
+#else
+  unsigned int mask = 0x38302820UL;
+  unsigned int r1 = __builtin_bpermd (mask, __A >> 32) & 0xf;
+  unsigned int r2 = __builtin_bpermd (mask, __A) & 0xf;
+#endif
   return (r2 << 4) | r1;
 #endif
 }
--
1.8.3.1