Created attachment 36624
Sample program to show the problem.

If you select -mcpu=power8 and compile a program that has more than 32 live single-precision values, the compiler will not use the stxsspx instruction to store values that are held in the Altivec registers. Instead it emits an xscvdpspn instruction to convert the value from the internal scalar format to vector form, then an mfvsrd instruction to move the value into a GPR, and finally an stw instruction to store the 32-bit word. If you change the type from float to double, the compiler generates the stxsdx instruction as expected.
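The attachment is not reproduced in this thread. As a rough illustration only, a function of the following shape keeps more than 32 single-precision values live at once. Everything here (the name accumulate, the macro names, and the count of 40) is an assumption made for the sketch and is not taken from the attachment or from pr68163.c. Whether a particular compiler version really keeps all of the values in scalar registers depends on its optimization decisions, so the generated assembly should be inspected, for example with gcc -O2 -mcpu=power8 -S.

```c
#include <stddef.h>

/* Hypothetical reproducer sketch; NOT the attachment from this report.
   40 scalars are used so that the count exceeds the 32 traditional FPRs. */
#define VALS(X) \
    X(0)  X(1)  X(2)  X(3)  X(4)  X(5)  X(6)  X(7)  X(8)  X(9)  \
    X(10) X(11) X(12) X(13) X(14) X(15) X(16) X(17) X(18) X(19) \
    X(20) X(21) X(22) X(23) X(24) X(25) X(26) X(27) X(28) X(29) \
    X(30) X(31) X(32) X(33) X(34) X(35) X(36) X(37) X(38) X(39)

#define DECL(n)  float v##n = in[n];   /* one scalar per input element */
#define STEP(n)  v##n += in[n];        /* keeps every scalar live in the loop */
#define STORE(n) out[n] = v##n;        /* the single-precision stores of interest */

void accumulate(float *restrict out, const float *restrict in, size_t rounds)
{
    VALS(DECL)

    for (size_t r = 0; r < rounds; r++) {
        VALS(STEP)
    }

    VALS(STORE)
}
```

In the assembly output, the stores generated for STORE are the place to look: stxsspx is the desired form, while the xscvdpspn, mfvsrd, stw sequence is the behavior described above. Changing float to double in DECL gives the double-precision comparison case.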
Created attachment 40691
Proposed patch to fix the problem.

I believe this patch fixes the problem. Note that I am going on vacation and won't return until the end of February, so I won't be submitting the patch until I get back (unless somebody else wants to verify that it works and submit it).
Author: meissner
Date: Tue Apr 18 17:08:16 2017
New Revision: 246974

URL: https://gcc.gnu.org/viewcvs?rev=246974&root=gcc&view=rev
Log:
Add initial patch for pr 68163

Added:
    branches/ibm/meissner-gcc8/gcc/testsuite/gcc.target/powerpc/pr68163.c
      - copied unchanged from r246956, branches/ibm/meissner-work/gcc/testsuite/gcc.target/powerpc/pr68163.c
Modified:
    branches/ibm/meissner-gcc8/gcc/ChangeLog.meissner
    branches/ibm/meissner-gcc8/gcc/config/rs6000/rs6000.md
    branches/ibm/meissner-gcc8/gcc/testsuite/ChangeLog.meissner
Author: meissner
Date: Tue May 9 21:25:23 2017
New Revision: 247819

URL: https://gcc.gnu.org/viewcvs?rev=247819&root=gcc&view=rev
Log:
[gcc]
2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/68163
    * config/rs6000/rs6000.md (f32_lr): Delete mode attributes that
    are now unused after splitting mov{sf,sd}_hardfloat.
    (f32_lr2): Likewise.
    (f32_lm): Likewise.
    (f32_lm2): Likewise.
    (f32_li): Likewise.
    (f32_li2): Likewise.
    (f32_lv): Likewise.
    (f32_sr): Likewise.
    (f32_sr2): Likewise.
    (f32_sm): Likewise.
    (f32_sm2): Likewise.
    (f32_si): Likewise.
    (f32_si2): Likewise.
    (f32_sv): Likewise.
    (f32_dm): Likewise.
    (f32_vsx): Likewise.
    (f32_av): Likewise.
    (mov<mode>_hardfloat): Split into separate movsf and movsd pieces.
    For movsf, order stores so the VSX stores occur before the GPR
    store which encourages the register allocator to use a traditional
    FPR instead of a GPR.  For movsd, order the stores so that the GPR
    store comes before the VSX stores to allow the power6 to work.
    This is due to the power6 not having a 32-bit integer store
    instruction from a FPR.
    (movsf_hardfloat): Likewise.
    (movsd_hardfloat): Likewise.

[gcc/testsuite]
2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/68163
    * gcc.target/powerpc/pr68163.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/powerpc/pr68163.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.md
    trunk/gcc/testsuite/ChangeLog
Author: meissner
Date: Fri May 26 01:52:24 2017
New Revision: 248480

URL: https://gcc.gnu.org/viewcvs?rev=248480&root=gcc&view=rev
Log:
[gcc]
2017-05-25  Michael Meissner  <meissner@linux.vnet.ibm.com>

    Backport from trunk
    2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/80510
    * config/rs6000/predicates.md (simple_offsettable_mem_operand):
    New predicate.
    * config/rs6000/rs6000.md (ALTIVEC_DFORM): New iterator.
    (define_peephole2 for Altivec d-form load): Add peepholes to catch
    cases where the register allocator uses a move and an offsettable
    memory operation to/from a FPR register on ISA 2.06/2.07.
    (define_peephole2 for Altivec d-form store): Likewise.

    Backport from trunk
    2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/68163
    * config/rs6000/rs6000.md (f32_lr): Delete mode attributes that
    are now unused after splitting mov{sf,sd}_hardfloat.
    (f32_lr2): Likewise.
    (f32_lm): Likewise.
    (f32_lm2): Likewise.
    (f32_li): Likewise.
    (f32_li2): Likewise.
    (f32_lv): Likewise.
    (f32_sr): Likewise.
    (f32_sr2): Likewise.
    (f32_sm): Likewise.
    (f32_sm2): Likewise.
    (f32_si): Likewise.
    (f32_si2): Likewise.
    (f32_sv): Likewise.
    (f32_dm): Likewise.
    (f32_vsx): Likewise.
    (f32_av): Likewise.
    (mov<mode>_hardfloat): Split into separate movsf and movsd pieces.
    For movsf, order stores so the VSX stores occur before the GPR
    store which encourages the register allocator to use a traditional
    FPR instead of a GPR.  For movsd, order the stores so that the GPR
    store comes before the VSX stores to allow the power6 to work.
    This is due to the power6 not having a 32-bit integer store
    instruction from a FPR.
    (movsf_hardfloat): Likewise.
    (movsd_hardfloat): Likewise.

[gcc/testsuite]
2017-05-25  Michael Meissner  <meissner@linux.vnet.ibm.com>

    Backport from trunk
    2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/80510
    * gcc.target/powerpc/pr80510-1.c: New test.
    * gcc.target/powerpc/pr80510-2.c: Likewise.

    Backport from trunk
    2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/68163
    * gcc.target/powerpc/pr68163.c: New test.

Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.target/powerpc/pr68163.c
      - copied unchanged from r248471, trunk/gcc/testsuite/gcc.target/powerpc/pr68163.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/powerpc/pr80510-1.c
      - copied unchanged from r248471, trunk/gcc/testsuite/gcc.target/powerpc/pr80510-1.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/powerpc/pr80510-2.c
      - copied unchanged from r248471, trunk/gcc/testsuite/gcc.target/powerpc/pr80510-2.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/rs6000/predicates.md
    branches/gcc-7-branch/gcc/config/rs6000/rs6000.md
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
Author: meissner
Date: Wed Jun 21 18:02:37 2017
New Revision: 249466

URL: https://gcc.gnu.org/viewcvs?rev=249466&root=gcc&view=rev
Log:
[gcc]
2017-06-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

    Back port from mainline
    2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/80718
    * config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Prefer
    VSX registers over GPRs, particularly on ISA 2.07 which does not
    have the MTVSRDD instruction.

    Back port from mainline
    2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/80510
    * config/rs6000/predicates.md (simple_offsettable_mem_operand):
    New predicate.
    * config/rs6000/rs6000.md (ALTIVEC_DFORM): New iterator.
    (define_peephole2 for Altivec d-form load): Add peepholes to catch
    cases where the register allocator uses a move and an offsettable
    memory operation to/from a FPR register on ISA 2.06/2.07.
    (define_peephole2 for Altivec d-form store): Likewise.

    Back port from mainline
    2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/68163
    * config/rs6000/rs6000.md (f32_lr): Delete mode attributes that
    are now unused after splitting mov{sf,sd}_hardfloat.
    (f32_lr2): Likewise.
    (f32_lm): Likewise.
    (f32_lm2): Likewise.
    (f32_li): Likewise.
    (f32_li2): Likewise.
    (f32_lv): Likewise.
    (f32_sr): Likewise.
    (f32_sr2): Likewise.
    (f32_sm): Likewise.
    (f32_sm2): Likewise.
    (f32_si): Likewise.
    (f32_si2): Likewise.
    (f32_sv): Likewise.
    (f32_dm): Likewise.
    (f32_vsx): Likewise.
    (f32_av): Likewise.
    (mov<mode>_hardfloat): Split into separate movsf and movsd pieces.
    For movsf, order stores so the VSX stores occur before the GPR
    store which encourages the register allocator to use a traditional
    FPR instead of a GPR.  For movsd, order the stores so that the GPR
    store comes before the VSX stores to allow the power6 to work.
    This is due to the power6 not having a 32-bit integer store
    instruction from a FPR.
    (movsf_hardfloat): Likewise.
    (movsd_hardfloat): Likewise.

[gcc/testsuite]
2017-06-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

    Back port from mainline
    2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/80718
    * gcc.target/powerpc/pr80718.c: New test.

    Back port from mainline
    2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/80510
    * gcc.target/powerpc/pr80510-1.c: New test.
    * gcc.target/powerpc/pr80510-2.c: Likewise.

    Back port from mainline
    2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

    PR target/68163
    * gcc.target/powerpc/pr68163.c: New test.

Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr68163.c
      - copied unchanged from r249041, trunk/gcc/testsuite/gcc.target/powerpc/pr68163.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80510-1.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80510-2.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80718.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/config/rs6000/predicates.md
    branches/gcc-6-branch/gcc/config/rs6000/rs6000.md
    branches/gcc-6-branch/gcc/config/rs6000/vsx.md
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
Patch applied to trunk and to the gcc 7 and gcc 6 branches.