In looking at bug 80697, I noticed that on power8 there were loads to a GPR followed by direct moves to vector registers. I tracked this down to the load-with-splat instruction only accepting indirect or indexed addresses, while the original address is offsettable. So the register allocator decides to load the value into a GPR and transfer it to the vector register to do the vec_duplicate operation. For example:

    vector double
    foo (double *p)
    {
      return (vector double) { p[4], p[4] };
    }

generates:

    foo:
            ld 9,32(3)
            mtvsrd 34,9
            xxpermdi 34,34,34,0
            blr

I tested adding a combiner pattern to support offsettable loads, and it generates:

    foo:
            li 9,32
            lxvdsx 34,3,9
            blr
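The test case above uses the Power-specific `vector double` type, so it only builds for a PowerPC target. As a rough sketch for checking the splat semantics on any GCC or Clang host, the same function can be written with the generic `vector_size` extension. The names `v2df`, `splat_p4`, and `splat_self_check` are made up for this illustration and do not come from the bug report or the GCC testsuite.

```c
#include <assert.h>

/* Generic GCC/Clang vector type: two doubles in 16 bytes.
   Assumption: this stands in for the Power-specific "vector double".  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Same shape as the test case in the report: the address p[4] is an
   offsettable load (base register plus constant offset 32), and the
   loaded value is duplicated into both lanes.  */
v2df
splat_p4 (const double *p)
{
  return (v2df) { p[4], p[4] };
}

/* Hypothetical self-check: both lanes must hold the element at index 4.  */
void
splat_self_check (void)
{
  double data[8] = { 0.0, 1.0, 2.0, 3.0, 4.5, 5.0, 6.0, 7.0 };
  v2df v = splat_p4 (data);

  assert (v[0] == 4.5);
  assert (v[1] == 4.5);
}
```

On a powerpc64 target, compiling this with something like `gcc -O2 -mcpu=power8 -S` should make it possible to see whether the output uses the `ld`/`mtvsrd`/`xxpermdi` sequence or the `li`/`lxvdsx` sequence; the exact output will depend on the compiler version.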
Author: meissner
Date: Fri May 12 19:48:54 2017
New Revision: 247994

URL: https://gcc.gnu.org/viewcvs?rev=247994&root=gcc&view=rev
Log:
Rework pr 80718

Modified:
    branches/ibm/meissner-work/gcc/ChangeLog.meissner
    branches/ibm/meissner-work/gcc/config/rs6000/vsx.md
Author: meissner
Date: Fri May 12 19:54:03 2017
New Revision: 247995

URL: https://gcc.gnu.org/viewcvs?rev=247995&root=gcc&view=rev
Log:
Rework pr 80718

Modified:
    branches/ibm/meissner-work/gcc/config/rs6000/vsx.md
Author: meissner
Date: Mon May 22 22:44:45 2017
New Revision: 248352

URL: https://gcc.gnu.org/viewcvs?rev=248352&root=gcc&view=rev
Log:
[gcc]
2017-05-22  Michael Meissner  <meissner@linux.vnet.ibm.com>

        PR target/80718
        * config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Split
        V2DF/V2DI splat into two separate patterns, one that handles
        registers, and the other that only handles memory.  Drop support
        for splatting from a GPR on ISA 2.07 and then splitting the
        splat into direct move and splat.
        (vsx_splat_<mode>_reg): Likewise.
        (vsx_splat_<mode>_mem): Likewise.

[gcc/testsuite]
2017-05-22  Michael Meissner  <meissner@linux.vnet.ibm.com>

        PR target/80718
        * gcc.target/powerpc/pr80718.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/powerpc/pr80718.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/vsx.md
    trunk/gcc/testsuite/ChangeLog
Author: meissner
Date: Tue Jun 6 22:27:13 2017
New Revision: 248936

URL: https://gcc.gnu.org/viewcvs?rev=248936&root=gcc&view=rev
Log:
Back port from mainline

[gcc]
2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

        PR target/80718
        * config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Prefer
        VSX registers over GPRs, particularly on ISA 2.07 which does not
        have the MTVSRDD instruction.

[gcc/testsuite]
2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

        PR target/80718
        * gcc.target/powerpc/pr80718.c: New test.

Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.target/powerpc/pr80718.c
      - copied unchanged from r248902, trunk/gcc/testsuite/gcc.target/powerpc/pr80718.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/rs6000/vsx.md
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
Author: meissner
Date: Wed Jun 21 18:02:37 2017
New Revision: 249466

URL: https://gcc.gnu.org/viewcvs?rev=249466&root=gcc&view=rev
Log:
[gcc]
2017-06-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

        Back port from mainline
        2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

        PR target/80718
        * config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Prefer
        VSX registers over GPRs, particularly on ISA 2.07 which does not
        have the MTVSRDD instruction.

        Back port from mainline
        2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

        PR target/80510
        * config/rs6000/predicates.md (simple_offsettable_mem_operand):
        New predicate.
        * config/rs6000/rs6000.md (ALTIVEC_DFORM): New iterator.
        (define_peephole2 for Altivec d-form load): Add peepholes to
        catch cases where the register allocator uses a move and an
        offsettable memory operation to/from a FPR register on ISA
        2.06/2.07.
        (define_peephole2 for Altivec d-form store): Likewise.

        Back port from mainline
        2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

        PR target/68163
        * config/rs6000/rs6000.md (f32_lr): Delete mode attributes that
        are now unused after splitting mov{sf,sd}_hardfloat.
        (f32_lr2): Likewise.
        (f32_lm): Likewise.
        (f32_lm2): Likewise.
        (f32_li): Likewise.
        (f32_li2): Likewise.
        (f32_lv): Likewise.
        (f32_sr): Likewise.
        (f32_sr2): Likewise.
        (f32_sm): Likewise.
        (f32_sm2): Likewise.
        (f32_si): Likewise.
        (f32_si2): Likewise.
        (f32_sv): Likewise.
        (f32_dm): Likewise.
        (f32_vsx): Likewise.
        (f32_av): Likewise.
        (mov<mode>_hardfloat): Split into separate movsf and movsd
        pieces.  For movsf, order stores so the VSX stores occur before
        the GPR store which encourages the register allocator to use a
        traditional FPR instead of a GPR.  For movsd, order the stores
        so that the GPR store comes before the VSX stores to allow the
        power6 to work.  This is due to the power6 not having a 32-bit
        integer store instruction from a FPR.
        (movsf_hardfloat): Likewise.
        (movsd_hardfloat): Likewise.

[gcc/testsuite]
2017-06-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

        Back port from mainline
        2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

        PR target/80718
        * gcc.target/powerpc/pr80718.c: New test.

        Back port from mainline
        2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

        PR target/80510
        * gcc.target/powerpc/pr80510-1.c: New test.
        * gcc.target/powerpc/pr80510-2.c: Likewise.

        Back port from mainline
        2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

        PR target/68163
        * gcc.target/powerpc/pr68163.c: New test.

Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr68163.c
      - copied unchanged from r249041, trunk/gcc/testsuite/gcc.target/powerpc/pr68163.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80510-1.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80510-2.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80718.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/config/rs6000/predicates.md
    branches/gcc-6-branch/gcc/config/rs6000/rs6000.md
    branches/gcc-6-branch/gcc/config/rs6000/vsx.md
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
Fix back ported to gcc 7/6 branches.