Bug 68163 - GCC on power8 does not issue the stxsspx instruction on power8
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 6.0
Importance: P3 normal
Target Milestone: ---
Assignee: Michael Meissner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-30 17:24 UTC by Michael Meissner
Modified: 2017-06-21 18:05 UTC
CC List: 3 users

See Also:
Host:
Target: powerpc64*-*-linux*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2015-10-30 00:00:00


Attachments
Sample program to show the problem. (970 bytes, text/plain)
2015-10-30 17:24 UTC, Michael Meissner
Proposed patch to fix the problem. (2.05 KB, patch)
2017-02-07 21:23 UTC, Michael Meissner

Description Michael Meissner 2015-10-30 17:24:51 UTC
Created attachment 36624
Sample program to show the problem.

If you select -mcpu=power8 and compile a program that has more than 32 live single-precision values, the compiler will not use the stxsspx instruction to store values held in the Altivec registers.  Instead it emits an xscvdpspn instruction to convert the value from the internal format to vector form, then a mfvsrd instruction to move the value into a GPR, and finally a stw instruction to store the 32-bit word.

If you change the type from float to double, the compiler generates the stxsdx instruction as expected.
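
A minimal sketch of the kind of code that creates this register pressure is shown below. It is not the attached sample program: the function name sum_columns, the macro names, and the choice of 40 accumulators are all assumptions made for illustration. The idea is only that 40 float accumulators stay live across the loop, which is more than the 32 traditional FPRs, so at -O2 some of them should end up in the Altivec half of the VSX register file and have to be stored from there.

```c
#include <stddef.h>

/* Hypothetical reproducer sketch (not attachment 36624).
   NVALUES exceeds the 32 traditional FPRs on purpose.  */
#define NVALUES 40

/* Apply macro X to the indices 0..39.  */
#define FOR_EACH(X)                                   \
    X(0)  X(1)  X(2)  X(3)  X(4)  X(5)  X(6)  X(7)    \
    X(8)  X(9)  X(10) X(11) X(12) X(13) X(14) X(15)   \
    X(16) X(17) X(18) X(19) X(20) X(21) X(22) X(23)   \
    X(24) X(25) X(26) X(27) X(28) X(29) X(30) X(31)   \
    X(32) X(33) X(34) X(35) X(36) X(37) X(38) X(39)

#define DECLARE(n) float v##n = 0.0f;   /* one scalar accumulator per column */
#define ACCUM(n)   v##n += row[n];      /* keeps every accumulator live */
#define STORE(n)   out[n] = v##n;       /* the single-precision stores of interest */

/* Sum each of the NVALUES columns of a rows x NVALUES matrix of floats.  */
void sum_columns(float *out, const float *in, size_t rows)
{
    FOR_EACH(DECLARE)

    for (size_t r = 0; r < rows; r++) {
        const float *row = in + r * NVALUES;
        FOR_EACH(ACCUM)
    }

    FOR_EACH(STORE)
}
```

Compiling something like this with -O2 -mcpu=power8 -S and searching the assembly for mfvsrd followed by stw, versus stxsspx, is one way to check which store sequence was chosen. The exact output depends on the compiler version and on the register allocator's decisions.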
Comment 1 Michael Meissner 2017-02-07 21:23:23 UTC
Created attachment 40691
Proposed patch to fix the problem.

I believe this patch fixes the problem.

Note, I am going on vacation and won't return until the end of February, so I won't be submitting the patch until I get back (unless somebody else wants to verify that it works and submit it).
Comment 2 Michael Meissner 2017-04-18 17:08:48 UTC
Author: meissner
Date: Tue Apr 18 17:08:16 2017
New Revision: 246974

URL: https://gcc.gnu.org/viewcvs?rev=246974&root=gcc&view=rev
Log:
Add initial patch for pr 68163

Added:
    branches/ibm/meissner-gcc8/gcc/testsuite/gcc.target/powerpc/pr68163.c
      - copied unchanged from r246956, branches/ibm/meissner-work/gcc/testsuite/gcc.target/powerpc/pr68163.c
Modified:
    branches/ibm/meissner-gcc8/gcc/ChangeLog.meissner
    branches/ibm/meissner-gcc8/gcc/config/rs6000/rs6000.md
    branches/ibm/meissner-gcc8/gcc/testsuite/ChangeLog.meissner
Comment 3 Michael Meissner 2017-05-09 21:25:56 UTC
Author: meissner
Date: Tue May  9 21:25:23 2017
New Revision: 247819

URL: https://gcc.gnu.org/viewcvs?rev=247819&root=gcc&view=rev
Log:
[gcc]
2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68163
	* config/rs6000/rs6000.md (f32_lr): Delete mode attributes that
	are now unused after splitting mov{sf,sd}_hardfloat.
	(f32_lr2): Likewise.
	(f32_lm): Likewise.
	(f32_lm2): Likewise.
	(f32_li): Likewise.
	(f32_li2): Likewise.
	(f32_lv): Likewise.
	(f32_sr): Likewise.
	(f32_sr2): Likewise.
	(f32_sm): Likewise.
	(f32_sm2): Likewise.
	(f32_si): Likewise.
	(f32_si2): Likewise.
	(f32_sv): Likewise.
	(f32_dm): Likewise.
	(f32_vsx): Likewise.
	(f32_av): Likewise.
	(mov<mode>_hardfloat): Split into separate movsf and movsd pieces.
	For movsf, order stores so the VSX stores occur before the GPR
	store which encourages the register allocator to use a traditional
	FPR instead of a GPR.  For movsd, order the stores so that the GPR
	store comes before the VSX stores to allow the power6 to work.
	This is due to the power6 not having a 32-bit integer store
	instruction from a FPR.
	(movsf_hardfloat): Likewise.
	(movsd_hardfloat): Likewise.

[gcc/testsuite]
2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68163
	* gcc.target/powerpc/pr68163.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/powerpc/pr68163.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.md
    trunk/gcc/testsuite/ChangeLog
Comment 4 Michael Meissner 2017-05-26 01:52:57 UTC
Author: meissner
Date: Fri May 26 01:52:24 2017
New Revision: 248480

URL: https://gcc.gnu.org/viewcvs?rev=248480&root=gcc&view=rev
Log:
[gcc]
2017-05-25  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Backport from trunk
	2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80510
	* config/rs6000/predicates.md (simple_offsettable_mem_operand):
	New predicate.

	* config/rs6000/rs6000.md (ALTIVEC_DFORM): New iterator.
	(define_peephole2 for Altivec d-form load): Add peepholes to catch
	cases where the register allocator uses a move and an offsettable
	memory operation to/from a FPR register on ISA 2.06/2.07.
	(define_peephole2 for Altivec d-form store): Likewise.

	Backport from trunk
	2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68163
	* config/rs6000/rs6000.md (f32_lr): Delete mode attributes that
	are now unused after splitting mov{sf,sd}_hardfloat.
	(f32_lr2): Likewise.
	(f32_lm): Likewise.
	(f32_lm2): Likewise.
	(f32_li): Likewise.
	(f32_li2): Likewise.
	(f32_lv): Likewise.
	(f32_sr): Likewise.
	(f32_sr2): Likewise.
	(f32_sm): Likewise.
	(f32_sm2): Likewise.
	(f32_si): Likewise.
	(f32_si2): Likewise.
	(f32_sv): Likewise.
	(f32_dm): Likewise.
	(f32_vsx): Likewise.
	(f32_av): Likewise.
	(mov<mode>_hardfloat): Split into separate movsf and movsd pieces.
	For movsf, order stores so the VSX stores occur before the GPR
	store which encourages the register allocator to use a traditional
	FPR instead of a GPR.  For movsd, order the stores so that the GPR
	store comes before the VSX stores to allow the power6 to work.
	This is due to the power6 not having a 32-bit integer store
	instruction from a FPR.
	(movsf_hardfloat): Likewise.
	(movsd_hardfloat): Likewise.

[gcc/testsuite]
2017-05-25  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Backport from trunk
	2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80510
	* gcc.target/powerpc/pr80510-1.c: New test.
	* gcc.target/powerpc/pr80510-2.c: Likewise.

	Backport from trunk
	2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68163
	* gcc.target/powerpc/pr68163.c: New test.


Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.target/powerpc/pr68163.c
      - copied unchanged from r248471, trunk/gcc/testsuite/gcc.target/powerpc/pr68163.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/powerpc/pr80510-1.c
      - copied unchanged from r248471, trunk/gcc/testsuite/gcc.target/powerpc/pr80510-1.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/powerpc/pr80510-2.c
      - copied unchanged from r248471, trunk/gcc/testsuite/gcc.target/powerpc/pr80510-2.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/rs6000/predicates.md
    branches/gcc-7-branch/gcc/config/rs6000/rs6000.md
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
Comment 5 Michael Meissner 2017-06-21 18:03:10 UTC
Author: meissner
Date: Wed Jun 21 18:02:37 2017
New Revision: 249466

URL: https://gcc.gnu.org/viewcvs?rev=249466&root=gcc&view=rev
Log:
[gcc]
2017-06-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Back port from mainline
	2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80718
	* config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Prefer
	VSX registers over GPRs, particularly on ISA 2.07 which does not
	have the MTVSRDD instruction.

	Back port from mainline
	2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80510
	* config/rs6000/predicates.md (simple_offsettable_mem_operand):
	New predicate.

	* config/rs6000/rs6000.md (ALTIVEC_DFORM): New iterator.
	(define_peephole2 for Altivec d-form load): Add peepholes to catch
	cases where the register allocator uses a move and an offsettable
	memory operation to/from a FPR register on ISA 2.06/2.07.
	(define_peephole2 for Altivec d-form store): Likewise.

	Back port from mainline
	2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68163
	* config/rs6000/rs6000.md (f32_lr): Delete mode attributes that
	are now unused after splitting mov{sf,sd}_hardfloat.
	(f32_lr2): Likewise.
	(f32_lm): Likewise.
	(f32_lm2): Likewise.
	(f32_li): Likewise.
	(f32_li2): Likewise.
	(f32_lv): Likewise.
	(f32_sr): Likewise.
	(f32_sr2): Likewise.
	(f32_sm): Likewise.
	(f32_sm2): Likewise.
	(f32_si): Likewise.
	(f32_si2): Likewise.
	(f32_sv): Likewise.
	(f32_dm): Likewise.
	(f32_vsx): Likewise.
	(f32_av): Likewise.
	(mov<mode>_hardfloat): Split into separate movsf and movsd pieces.
	For movsf, order stores so the VSX stores occur before the GPR
	store which encourages the register allocator to use a traditional
	FPR instead of a GPR.  For movsd, order the stores so that the GPR
	store comes before the VSX stores to allow the power6 to work.
	This is due to the power6 not having a 32-bit integer store
	instruction from a FPR.
	(movsf_hardfloat): Likewise.
	(movsd_hardfloat): Likewise.

[gcc/testsuite]
2017-06-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Back port from mainline
	2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80718
	* gcc.target/powerpc/pr80718.c: New test.

	Back port from mainline
	2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80510
	* gcc.target/powerpc/pr80510-1.c: New test.
	* gcc.target/powerpc/pr80510-2.c: Likewise.

	Back port from mainline
	2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68163
	* gcc.target/powerpc/pr68163.c: New test.


Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr68163.c
      - copied unchanged from r249041, trunk/gcc/testsuite/gcc.target/powerpc/pr68163.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80510-1.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80510-2.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80718.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/config/rs6000/predicates.md
    branches/gcc-6-branch/gcc/config/rs6000/rs6000.md
    branches/gcc-6-branch/gcc/config/rs6000/vsx.md
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
Comment 6 Michael Meissner 2017-06-21 18:05:35 UTC
Patch applied to trunk, gcc 7, and gcc 6 branches.