The following test case:

  void testcase(signed short a, signed short *x, signed short *y)
  {
    unsigned long i;
    for (i = 0; i < 1024; i++)
      y[i] = a * x[i] + y[i];
  }

has redundant swaps on the way in and out:

        lxvd2x 32,8,9
        lxvd2x 33,10,9
        xxpermdi 32,32,32,2
        xxpermdi 33,33,33,2
        vmladduhm 0,13,0,1
        xxpermdi 0,32,32,2
        stxvd2x 0,10,9
Mine.
And confirmed.
This is simple enough. We have code to allow splats in pure-SIMD ranges, but we are missing a pattern that performs a splat and a truncate in the same operation. Should have a patch to submit today.
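To illustrate why the swaps are removable here, the following is a small scalar model in C, not the GCC implementation. It assumes a 128-bit vector of eight 16-bit lanes; the names v8hi, swap_doublewords, splat_truncate, mul_add and swaps_are_redundant are hypothetical and exist only for this sketch. Because the multiply-add works lane by lane and the truncated splat of a holds the same value in every lane, permuting the inputs and then undoing the permutation on the result gives the same answer as doing no permutation at all.

```c
#include <stdint.h>
#include <string.h>

#define LANES 8  /* eight 16-bit lanes in one 128-bit vector */

/* Hypothetical scalar model of a vector register. */
typedef struct { uint16_t lane[LANES]; } v8hi;

/* Models xxpermdi vt,va,va,2: exchange the two 64-bit halves. */
static v8hi swap_doublewords(v8hi v)
{
    v8hi r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = v.lane[(i + LANES / 2) % LANES];
    return r;
}

/* Truncate a to 16 bits and replicate it in every lane. */
static v8hi splat_truncate(int a)
{
    v8hi r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = (uint16_t)a;
    return r;
}

/* Lane-wise a * x + y, keeping the low 16 bits (as vmladduhm does). */
static v8hi mul_add(v8hi a, v8hi x, v8hi y)
{
    v8hi r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = (uint16_t)((uint32_t)a.lane[i] * x.lane[i] + y.lane[i]);
    return r;
}

/* Returns 1 if computing with swaps after the loads and before the
   store gives the same stored value as computing with no swaps. */
static int swaps_are_redundant(int a, v8hi x, v8hi y)
{
    v8hi s = splat_truncate(a);  /* identical in every lane */

    v8hi with_swaps = swap_doublewords(
        mul_add(s, swap_doublewords(x), swap_doublewords(y)));
    v8hi without_swaps = mul_add(s, x, y);

    return memcmp(&with_swaps, &without_swaps, sizeof with_swaps) == 0;
}
```

In this model swaps_are_redundant returns 1 for any a, x and y, which is the property the swap optimization relies on once it accepts the combined splat and truncate as a swappable operation.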
Author: wschmidt
Date: Fri Sep 16 21:28:52 2016
New Revision: 240199

URL: https://gcc.gnu.org/viewcvs?rev=240199&root=gcc&view=rev
Log:
[gcc]

2016-09-16  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/77613
	* config/rs6000/rs6000.c (rtx_is_swappable_p): Add support for
	splat with truncate.

[gcc/testsuite]

2016-09-16  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/77613
	* gcc.target/powerpc/swaps-p8-25.c: New.

Added:
    trunk/gcc/testsuite/gcc.target/powerpc/swaps-p8-25.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.c
    trunk/gcc/testsuite/ChangeLog
Fixed.