This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH, powerpc] Rework#2 VSX scalar floating point support, patch #4


This patch moves most of the VSX DFmode operations from vsx.md to rs6000.md to
use the traditional floating point instructions (f*) instead of the VSX scalar
instructions (xs*) if all of the registers come from the traditional floating
point register set.  The add, subtract, multiply, divide, reciprocal estimate,
square root, absolute value, negate, round functions, and multiply/add
instructions were changed.  Some of the converts have not been changed with
these patches.  If the -mupper-regs-df switch is used, it will attempt to use
the upper registers (those that overlay on the traditional Altivec register
set).

This patch also combines the scalar SFmode/DFmode support on non-SPE systems.
It adds in ISA 2.07 (power8) single precision floating point support if the
-mupper-regs-sf switch is used.

At present, neither -mupper-regs-df nor -mupper-regs-sf is usable if reload has
to do anything.  A future patch will address this.

I did need to adjust a few tests that were specifically testing VSX scalar code
generation.  In addition, I put in a simple test to make sure the initial
-mupper-regs-df and -mupper-regs-sf works correctly.

I tested this an except for power7, power8 I could not find any changes in code
generated for power4, power5, power6, power6x, G4, G5, cell, e5500, e6500,
xilinx (sp_full, sp_lite, dp_full, dp_lite, none), 8548/8540 (spe), 750cl
(paired floating point).

For VSX systems there is code generation differences:

    1)	The traditional fp instruction is generated instead of VSX;

    2)	Because of #1, the code generator favors the 4 operand of multiply/add
	instructions, where the target register does not overlap with any of
	the inputs over the VSX version that that requires overlap.

    3)	A few of the vectorized tests on power8 now generate more direct move
	instructions, instead of moving values through the stack than
	previously.  These tests are integer tests, where you are doing an
	operation between an integer vector and a scalar value.  Previously in
	some cases, the register allocator would store the value from a GPR and
	reload it to the vector registers.

    4)	There is a slight scheduling difference in doing long double abs, that
	causes a different register to be used.  The code for long double abs
	needs to be improved in any case (the early splitting is causing spills
	to the stack).

I had no differences in doing bootstrap and make check (with the testsuite
fixes applied).

In addition, I am running Spec 2006 floating point tests on a power7 box to
compare the effects of going back to the traditional floating point tests.  For
most tests, there is less than 2% difference between the runs.  One benchmark
(482.sphinx3) is slightly faster with these changes, and it is dominated by
floating point multiply/add operations.

Can I apply these patches?

[gcc]
2013-09-30  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-builtin.def (XSRDPIM): Use floatdf2,
	ceildf2, btruncdf2, instead of vsx_* name.

	* config/rs6000/vsx.md (vsx_add<mode>3): Change arithmetic
	iterators to only do V2DF and V4SF here.  Move the DF code to
	rs6000.md where it is combined with SF mode.  Replace <VSv> with
	just 'v' since only vector operations are handled with these insns
	after moving the DF support to rs6000.md.
	(vsx_sub<mode>3): Likewise.
	(vsx_mul<mode>3): Likewise.
	(vsx_div<mode>3): Likewise.
	(vsx_fre<mode>2): Likewise.
	(vsx_neg<mode>2): Likewise.
	(vsx_abs<mode>2): Likewise.
	(vsx_nabs<mode>2): Likewise.
	(vsx_smax<mode>3): Likewise.
	(vsx_smin<mode>3): Likewise.
	(vsx_sqrt<mode>2): Likewise.
	(vsx_rsqrte<mode>2): Likewise.
	(vsx_fms<mode>4): Likewise.
	(vsx_nfma<mode>4): Likewise.
	(vsx_copysign<mode>3): Likewise.
	(vsx_btrunc<mode>2): Likewise.
	(vsx_floor<mode>2): Likewise.
	(vsx_ceil<mode>2): Likewise.
	(vsx_smaxsf3): Delete scalar ops that were moved to rs6000.md.
	(vsx_sminsf3): Likewise.
	(vsx_fmadf4): Likewise.
	(vsx_fmsdf4): Likewise.
	(vsx_nfmadf4): Likewise.
	(vsx_nfmsdf4): Likewise.
	(vsx_cmpdf_internal1): Likewise.

	* config/rs6000/rs6000.h (TARGET_SF_SPE): Define macros to make it
	simpler to select whether a target has SPE or traditional floating
	point support in iterators.
	(TARGET_DF_SPE): Likewise.
	(TARGET_SF_FPR): Likewise.
	(TARGET_DF_FPR): Likewise.
	(TARGET_SF_INSN): Macros to say whether floating point support
	exists for a given operation for expanders.
	(TARGET_DF_INSN): Likewise.

	* config/rs6000/rs6000.c (Ftrad): New mode attributes to allow
	combining of SF/DF mode operations, using both traditional and VSX
	registers.
	(Fvsx): Likewise.
	(Ff): Likewise.
	(Fv): Likewise.
	(Fs): Likewise.
	(Ffre): Likewise.
	(FFRE): Likewise.
	(abs<mode>2): Combine SF/DF modes using traditional floating point
	instructions.  Add support for using the upper DF registers with
	VSX support, and SF registers with power8-vector support.  Update
	expanders for operations supported by both the SPE and traditional
	floating point units.
	(abs<mode>2_fpr): Likewise.
	(nabs<mode>2): Likewise.
	(nabs<mode>2_fpr): Likewise.
	(neg<mode>2): Likewise.
	(neg<mode>2_fpr): Likewise.
	(add<mode>3): Likewise.
	(add<mode>3_fpr): Likewise.
	(sub<mode>3): Likewise.
	(sub<mode>3_fpr): Likewise.
	(mul<mode>3): Likewise.
	(mul<mode>3_fpr): Likewise.
	(div<mode>3): Likewise.
	(div<mode>3_fpr): Likewise.
	(sqrt<mode>3): Likewise.
	(sqrt<mode>3_fpr): Likewise.
	(fre<Fs>): Likewise.
	(rsqrt<mode>2): Likewise.
	(cmp<mode>_fpr): Likewise.
	(smax<mode>3): Likewise.
	(smin<mode>3): Likewise.
	(smax<mode>3_vsx): Likewise.
	(smin<mode>3_vsx): Likewise.
	(negsf2): Delete SF operations that are merged with DF.
	(abssf2): Likewise.
	(addsf3): Likewise.
	(subsf3): Likewise.
	(mulsf3): Likewise.
	(divsf3): Likewise.
	(fres): Likewise.
	(fmasf4_fpr): Likewise.
	(fmssf4_fpr): Likewise.
	(nfmasf4_fpr): Likewise.
	(nfmssf4_fpr): Likewise.
	(sqrtsf2): Likewise.
	(rsqrtsf_internal1): Likewise.
	(smaxsf3): Likewise.
	(sminsf3): Likewise.
	(cmpsf_internal1): Likewise.
	(copysign<mode>3_fcpsgn): Add VSX/power8-vector support.
	(negdf2): Delete DF operations that are merged with SF.
	(absdf2): Likewise.
	(nabsdf2): Likewise.
	(adddf3): Likewise.
	(subdf3): Likewise.
	(muldf3): Likewise.
	(divdf3): Likewise.
	(fred): Likewise.
	(rsqrtdf_internal1): Likewise.
	(fmadf4_fpr): Likewise.
	(fmsdf4_fpr): Likewise.
	(nfmadf4_fpr): Likewise.
	(nfmsdf4_fpr): Likewise.
	(sqrtdf2): Likewise.
	(smaxdf3): Likewise.
	(smindf3): Likewise.
	(cmpdf_internal1): Likewise.
	(lrint<mode>di2): Use TARGET_<MODE>_FPR macro.
	(btrunc<mode>2): Delete separate expander, and combine with the
	insn and add VSX instruction support.  Use TARGET_<MODE>_FPR.
	(btrunc<mode>2_fpr): Likewise.
	(ceil<mode>2): Likewise.
	(ceil<mode>2_fpr): Likewise.
	(floor<mode>2): Likewise.
	(floor<mode>2_fpr): Likewise.
	(fma<mode>4_fpr): Combine SF and DF fused multiply/add support.
	Add support for using the upper registers with VSX and
	power8-vector.  Move insns to be closer to the define_expands. On
	VSX systems, prefer the traditional form of FMA over the VSX
	version, since the traditional form allows the target not to
	overlap with the inputs.
	(fms<mode>4_fpr): Likewise.
	(nfma<mode>4_fpr): Likewise.
	(nfms<mode>4_fpr): Likewise.

[gcc/testsuite]
2013-09-30  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p8vector-fp.c: New test for floating point
	scalar operations when using -mupper-regs-sf and -mupper-regs-df.
	* gcc.target/powerpc/ppc-target-1.c: Update tests to allow either
	VSX scalar operations or the traditional floating point form of
	the instruction.
	* gcc.target/powerpc/ppc-target-2.c: Likewise.
	* gcc.target/powerpc/recip-3.c: Likewise.
	* gcc.target/powerpc/recip-5.c: Likewise.
	* gcc.target/powerpc/pr72747.c: Likewise.
	* gcc.target/powerpc/vsx-builtin-3.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Attachment: gcc-power8.patch062b
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]