[PATCH 1/3][rs6000] x86-compat vector intrinsics fixes for BE, 32bit

Tue Dec 4 20:34:00 GMT 2018

On 12/04/2018 02:16 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Dec 04, 2018 at 08:59:03AM -0600, Paul Clarke wrote:
>> Fix general endian and 32-bit mode issues found in the
>> compatibility implementations of the x86 vector intrinsics when running the
>> associated test suite tests.  (The tests had been inadvertently made to PASS
>> without actually running the test code.  A later patch fixes this issue.)
>>
>> In a few cases, the opportunity was taken to update the vector API used in
>> the implementations to the preferred functions from the
>> OpenPOWER 64-Bit ELF V2 ABI Specification.
>>
>> [gcc]
>>
>> 2018-12-03  Paul A. Clarke  <pc@us.ibm.com>
>>
>> 	PR target/88316
>> 	* config/rs6000/mmintrin.h (_mm_unpackhi_pi8): Fix for big-endian.
>> 	(_mm_unpacklo_pi8): Likewise.
>> 	(_mm_mulhi_pi16): Likewise.
>> 	(_mm_packs_pi16): Fix for big-endian. Use preferred API.
>> 	(_mm_packs_pi32): Likewise.
>> 	(_mm_packs_pu16): Likewise.
>> 	* config/rs6000/xmmintrin.h (_mm_cvtss_si32): Fix for big-endian.
>> 	(_mm_cvtss_si64): Likewise.
>> 	(_mm_cvtpi32x2_ps): Likewise.
>> 	(_mm_shuffle_ps): Likewise.
>> 	(_mm_movemask_pi8): Likewise.
>> 	(_mm_mulhi_pu16): Likewise.
>> 	(_mm_sad_pu8): Likewise.
>> 	(_mm_cvtpu16_ps): Fix for big-endian. Use preferred API.
>> 	(_mm_cvtpu8_ps): Likewise.
>> 	* config/rs6000/emmintrin.h (_mm_movemask_pd): Fix for big-endian.
>> 	(_mm_mul_epu32): Likewise.
>> 	(_mm_bsrli_si128): Likewise.
>> 	(_mm_movemask_epi8): Likewise.
>> 	(_mm_shufflehi_epi16): Likewise.
>> 	(_mm_shufflelo_epi16): Likewise.
>> 	(_mm_shuffle_epi32): Likewise.
>> 	* config/rs6000/pmmintrin.h (_mm_hadd_ps): Fix for big-endian.
>> 	(_mm_sub_ps): Likewise.
>> 	* config/rs6000/mmintrin.h (_mm_cmpeq_pi8): Fix for 32-bit mode.
> 
> 
>> @@ -1612,7 +1608,8 @@ _mm_bsrli_si128 (__m128i __A, const int __N)
>>    const __v16qu zeros = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
>>  
>>    if (__N < 16)
>> -    if (__builtin_constant_p(__N))
>> +    if (__builtin_constant_p(__N) &&
>> +        __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
> 
> Please just use __LITTLE_ENDIAN__, as the rest of these files already does.
> (More times in this patch; also BIG).

OK.  I used the __BYTE_ORDER__ / __ORDER_*_ENDIAN__ macros because the GCC documentation at https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html describes them and does not mention the shorter __LITTLE_ENDIAN__ / __BIG_ENDIAN__ macros.
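
For reference, here is a minimal sketch of the two spellings side by side, plus a runtime probe to compare them against.  The helper names are hypothetical and appear nowhere in the patch.  Note that __LITTLE_ENDIAN__ is target-specific, so the short form may report 0 on a little-endian target that does not define it; the sketch only asserts that the documented ORDER form agrees with the runtime probe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: compile-time test using the documented
   __BYTE_ORDER__ / __ORDER_LITTLE_ENDIAN__ macros.  */
static inline int
little_endian_by_order_macros (void)
{
#if defined (__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: compile-time test using the short form.
   Assumption: the target defines __LITTLE_ENDIAN__ when it is
   little-endian, as the rs6000 headers rely on; other targets may
   leave it undefined, in which case this returns 0.  */
static inline int
little_endian_by_short_macro (void)
{
#ifdef __LITTLE_ENDIAN__
  return 1;
#else
  return 0;
#endif
}

/* Runtime probe: on a little-endian machine the lowest-addressed
   byte of the 32-bit value 1 is 1.  */
static inline int
little_endian_at_runtime (void)
{
  const uint32_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 1;
}

/* The documented ORDER form should match what the hardware does.  */
static inline void
check_order_macros_match_runtime (void)
{
  assert (little_endian_by_order_macros () == little_endian_at_runtime ());
}
```

Calling check_order_macros_match_runtime () from a test should pass on either endianness.  On a powerpc64le build I would expect little_endian_by_short_macro () to return 1 as well, but I have not asserted that here since it depends on the target.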

> Okay for trunk with that fixed.  Thanks!

Thanks for the review!

> Do you have new testcases, too?  Or is all this caught by existing
> testcases?

Same testcases.  They catch a lot more bugs when they actually run.

PC