This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH], Fix PR 68163, PowerPC power8 sometimes generating move direct to GPR to store 32-bit float


This patch fixes PR 68163, which occurs on systems that have direct move but
lack the ISA 3.0 Altivec reg+offset scalar load/store instructions (i.e.
power8).  If the compiler has a 32-bit floating point value in a traditional
Altivec register and wants to do a reg+offset store, it decides to move the
value to a GPR to do the store.  Unfortunately, on the PowerPC architecture it
takes 3 instructions to do the direct move.
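
For illustration only, here is a minimal sketch of the kind of source that
leads to a reg+offset store of a 32-bit float.  It is not the pr68163.c test
from this patch; the struct and function names are hypothetical.  Whether the
value actually ends up in an Altivec register depends on register pressure, so
a function this small may not reproduce the problem by itself.

```c
/* Hypothetical layout: the padding puts 'value' at a non-zero offset,
   so storing to it needs a reg+offset (D-form) address.  */
struct sample
{
  int pad[4];
  float value;
};

/* Compute a 32-bit float and store it at base register + offset.  */
void
store_scaled (struct sample *p, float a, float b)
{
  p->value = a * b;
}
```

Compiling something like this with -O2 -mcpu=power8 and looking at the
assembly for the store is one way to see which register class was chosen.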

I tracked it down to the fact that, in mov<mode>_hardfloat, the store from a
GPR occurs before the store from a traditional FPR.  So the register allocator
does a move, and picks the GPR because it is first.  I reordered the
arguments, but I discovered that ISA 2.05 (power6) does not have a 32-bit
integer store instruction from an FPR, which is needed by movsd.  I solved
this by specifying movsf and movsd as separate moves.

I bootstrapped the compiler and there were no regressions.  I ran Spec 2006,
and there were 3 benchmarks (gromacs, namd, and soplex) with very slight
gains.

This code stores values held in Altivec registers by moving them to an FPR and
using the traditional STFS instruction.  However, in looking at the code, I
came to the conclusion that we could do better (PR 80510) by using a peephole2
to load the offset value into a GPR and do an indexed store.  I have code for
PR 80510 that I will submit after this patch.  That patch needs this patch to
prevent using direct move to do a store.

Is this patch ok for GCC 8?  How about GCC 7.2?

[gcc]
2017-05-05  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68163
	* config/rs6000/rs6000.md (f32_lr): Delete mode attributes that
	are now unused after splitting mov{sf,sd}_hardfloat.
	(f32_lr2): Likewise.
	(f32_lm): Likewise.
	(f32_lm2): Likewise.
	(f32_li): Likewise.
	(f32_li2): Likewise.
	(f32_lv): Likewise.
	(f32_sr): Likewise.
	(f32_sr2): Likewise.
	(f32_sm): Likewise.
	(f32_sm2): Likewise.
	(f32_si): Likewise.
	(f32_si2): Likewise.
	(f32_sv): Likewise.
	(f32_dm): Likewise.
	(f32_vsx): Likewise.
	(f32_av): Likewise.
	(mov<mode>_hardfloat): Split into separate movsf and movsd pieces.
	For movsf, order stores so the VSX stores occur before the GPR
	store which encourages the register allocator to use a traditional
	FPR instead of a GPR.  For movsd, order the stores so that the GPR
	store comes before the VSX stores to allow the power6 to work.
	This is due to the power6 not having a 32-bit integer store
	instruction from a FPR.
	(movsf_hardfloat): Likewise.
	(movsd_hardfloat): Likewise.

[gcc/testsuite]
2017-05-05  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68163
	* gcc.target/powerpc/pr68163.c: New test.



-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Attachment: gcc8.patch03b
Description: Text document

