Bug 80718 - GCC generates slow code for offsettable vec_duplicate
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 8.0
Importance: P3 normal
Target Milestone: ---
Assignee: Michael Meissner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-05-12 02:44 UTC by Michael Meissner
Modified: 2017-06-21 18:04 UTC

See Also:
Host:
Target: powerpc64le-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Michael Meissner 2017-05-12 02:44:25 UTC
While looking at bug 80697, I noticed that on power8 the generated code loaded values into a GPR and then used direct moves to transfer them to vector registers.

I tracked this down to the load-with-splat instruction (lxvdsx) accepting only indirect or indexed addresses, while the original address is offsettable (base register plus constant displacement).  As a result, the register allocator decides to load the value into a GPR and transfer it to the vector register to do the vec_duplicate operation.

For example:
vector double foo (double *p) { return (vector double) { p[4], p[4] }; }

generates:
foo:
        ld 9,32(3)
        mtvsrd 34,9
        xxpermdi 34,34,34,0
        blr

I tested adding a combiner pattern that supports offsettable loads, and with it GCC generates:
foo:
        li 9,32
        lxvdsx 34,3,9
        blr
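
For anyone who wants to reproduce this away from a PowerPC box, here is a minimal sketch of the same splat. It assumes GCC's generic vector extension (`vector_size`) rather than the AltiVec `vector double` keyword used above, and the names `v2df`, `splat_p4`, and `check_splat` are illustrative only; they do not come from the GCC sources or from the pr80718.c test.

```c
#include <assert.h>

/* Two-lane double vector via GCC's generic vector extension.  This is
   an assumption made for portability; the report itself uses the
   AltiVec "vector double" keyword.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Splat p[4] into both lanes.  The element lives at p + 32 bytes,
   i.e. base register plus constant: an offsettable address.  */
v2df
splat_p4 (const double *p)
{
  v2df v = { p[4], p[4] };
  return v;
}

/* Hypothetical self-check: both lanes must equal the source element.  */
void
check_splat (void)
{
  double data[8] = { 0.0, 1.0, 2.0, 3.0, 4.5, 5.0, 6.0, 7.0 };
  v2df v = splat_p4 (data);
  assert (v[0] == data[4]);
  assert (v[1] == data[4]);
}
```

This only checks the semantics of the duplicate. To see which instruction sequence is chosen, compile `splat_p4` on powerpc64le with something like `-O2 -mcpu=power8 -S` and inspect the assembly.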
Comment 1 Michael Meissner 2017-05-12 19:49:26 UTC
Author: meissner
Date: Fri May 12 19:48:54 2017
New Revision: 247994

URL: https://gcc.gnu.org/viewcvs?rev=247994&root=gcc&view=rev
Log:
Rework pr 80718

Modified:
    branches/ibm/meissner-work/gcc/ChangeLog.meissner
    branches/ibm/meissner-work/gcc/config/rs6000/vsx.md
Comment 2 Michael Meissner 2017-05-12 19:54:34 UTC
Author: meissner
Date: Fri May 12 19:54:03 2017
New Revision: 247995

URL: https://gcc.gnu.org/viewcvs?rev=247995&root=gcc&view=rev
Log:
Rework pr 80718

Modified:
    branches/ibm/meissner-work/gcc/config/rs6000/vsx.md
Comment 3 Michael Meissner 2017-05-22 22:45:18 UTC
Author: meissner
Date: Mon May 22 22:44:45 2017
New Revision: 248352

URL: https://gcc.gnu.org/viewcvs?rev=248352&root=gcc&view=rev
Log:
[gcc]
2017-05-22  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80718
	* config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Split
	V2DF/V2DI splat into two separate patterns, one that handles
	registers, and the other that only handles memory.  Drop support
	for splatting from a GPR on ISA 2.07 and then splitting the
	splat into direct move and splat.
	(vsx_splat_<mode>_reg): Likewise.
	(vsx_splat_<mode>_mem): Likewise.

[gcc/testsuite]
2017-05-22  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80718
	* gcc.target/powerpc/pr80718.c: New test.


Added:
    trunk/gcc/testsuite/gcc.target/powerpc/pr80718.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/vsx.md
    trunk/gcc/testsuite/ChangeLog
Comment 4 Michael Meissner 2017-06-06 22:27:45 UTC
Author: meissner
Date: Tue Jun  6 22:27:13 2017
New Revision: 248936

URL: https://gcc.gnu.org/viewcvs?rev=248936&root=gcc&view=rev
Log:
Back port from mainline

[gcc]
2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80718
	* config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Prefer
	VSX registers over GPRs, particularly on ISA 2.07 which does not
	have the MTVSRDD instruction.

[gcc/testsuite]
2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80718
	* gcc.target/powerpc/pr80718.c: New test.



Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.target/powerpc/pr80718.c
      - copied unchanged from r248902, trunk/gcc/testsuite/gcc.target/powerpc/pr80718.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/rs6000/vsx.md
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
Comment 5 Michael Meissner 2017-06-21 18:03:09 UTC
Author: meissner
Date: Wed Jun 21 18:02:37 2017
New Revision: 249466

URL: https://gcc.gnu.org/viewcvs?rev=249466&root=gcc&view=rev
Log:
[gcc]
2017-06-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Back port from mainline
	2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80718
	* config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Prefer
	VSX registers over GPRs, particularly on ISA 2.07 which does not
	have the MTVSRDD instruction.

	Back port from mainline
	2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80510
	* config/rs6000/predicates.md (simple_offsettable_mem_operand):
	New predicate.

	* config/rs6000/rs6000.md (ALTIVEC_DFORM): New iterator.
	(define_peephole2 for Altivec d-form load): Add peepholes to catch
	cases where the register allocator uses a move and an offsettable
	memory operation to/from a FPR register on ISA 2.06/2.07.
	(define_peephole2 for Altivec d-form store): Likewise.

	Back port from mainline
	2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68163
	* config/rs6000/rs6000.md (f32_lr): Delete mode attributes that
	are now unused after splitting mov{sf,sd}_hardfloat.
	(f32_lr2): Likewise.
	(f32_lm): Likewise.
	(f32_lm2): Likewise.
	(f32_li): Likewise.
	(f32_li2): Likewise.
	(f32_lv): Likewise.
	(f32_sr): Likewise.
	(f32_sr2): Likewise.
	(f32_sm): Likewise.
	(f32_sm2): Likewise.
	(f32_si): Likewise.
	(f32_si2): Likewise.
	(f32_sv): Likewise.
	(f32_dm): Likewise.
	(f32_vsx): Likewise.
	(f32_av): Likewise.
	(mov<mode>_hardfloat): Split into separate movsf and movsd pieces.
	For movsf, order stores so the VSX stores occur before the GPR
	store which encourages the register allocator to use a traditional
	FPR instead of a GPR.  For movsd, order the stores so that the GPR
	store comes before the VSX stores to allow the power6 to work.
	This is due to the power6 not having a 32-bit integer store
	instruction from a FPR.
	(movsf_hardfloat): Likewise.
	(movsd_hardfloat): Likewise.

[gcc/testsuite]
2017-06-21  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Back port from mainline
	2017-05-19  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80718
	* gcc.target/powerpc/pr80718.c: New test.

	Back port from mainline
	2017-05-18  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/80510
	* gcc.target/powerpc/pr80510-1.c: New test.
	* gcc.target/powerpc/pr80510-2.c: Likewise.

	Back port from mainline
	2017-05-09  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68163
	* gcc.target/powerpc/pr68163.c: New test.


Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr68163.c
      - copied unchanged from r249041, trunk/gcc/testsuite/gcc.target/powerpc/pr68163.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80510-1.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80510-2.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/powerpc/pr80718.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/config/rs6000/predicates.md
    branches/gcc-6-branch/gcc/config/rs6000/rs6000.md
    branches/gcc-6-branch/gcc/config/rs6000/vsx.md
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
Comment 6 Michael Meissner 2017-06-21 18:04:07 UTC
Fix backported to the GCC 7 and GCC 6 branches.