This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: ppc64 floating point usage [was Re: PPC64 Compiler bug !!]
On Fri, Jun 13, 2003 at 11:31:33AM +0930, Alan Modra wrote:
> To expand on this: Under linux lazy fp save/restore we take an
> exception the first time (after a context switch) we use a fp
> temporary in user code. This could significantly increase the cost of
> using a fp reg for moves. IMO this is reason enough to change
> REG_ALLOC_ORDER for powerpc64.
Anton Blanchard ran some timing tests: the exception costs around
1000 cycles on a Power4 processor running Linux. This patch simply
changes REG_ALLOC_ORDER for powerpc64-linux so that gprs are allocated
before fprs. I'd like something better, such as a working version of
Zack's patch to use fprs only for user fp operations, but this is
better than nothing.
	* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Use
	RS6000_ALT_REG_ALLOC_ORDER.
	* config/rs6000/rs6000.h: Formatting fixes.
	(REG_ALLOC_ORDER): Correct comment.
	(RS6000_ALT_REG_ALLOC_ORDER): Define.
Regression tested on powerpc64-linux. OK for mainline? 3.3 branch?
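For anyone unfamiliar with the mechanism, here is a minimal standalone sketch of what the linux64.h hunk does: a static const table holding the alternate order is copied over the allocator's order array when the 64-bit target is selected. Everything prefixed toy_ or TOY_ is hypothetical and not part of GCC. The 8-register file (0-3 standing in for gprs, 4-7 for fprs) and the fprs-first default are assumptions made only to keep the example short. The permutation check is an extra sanity test of my own, not something the patch performs.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy register file: regs 0-3 stand in for gprs,
   regs 4-7 stand in for fprs.  Not the real rs6000 numbering.  */
#define TOY_NUM_REGS 8

/* Assumed default order for this sketch: fprs before gprs.  */
#define TOY_REG_ALLOC_ORDER     { 4, 5, 6, 7, 0, 1, 2, 3 }

/* Alternate order: gprs before fprs, so integer moves prefer a gpr.  */
#define TOY_ALT_REG_ALLOC_ORDER { 0, 1, 2, 3, 4, 5, 6, 7 }

/* Stand-in for the allocator's mutable order table.  */
static int toy_reg_alloc_order[TOY_NUM_REGS] = TOY_REG_ALLOC_ORDER;

/* Return 1 if ORDER lists each register 0..n-1 exactly once.  */
static int
toy_is_permutation (const int *order, int n)
{
  int seen[TOY_NUM_REGS] = { 0 };
  int i;

  if (n > TOY_NUM_REGS)
    return 0;
  for (i = 0; i < n; i++)
    {
      if (order[i] < 0 || order[i] >= n || seen[order[i]])
        return 0;
      seen[order[i]] = 1;
    }
  return 1;
}

/* Stand-in for the option-override hook: when the 64-bit target is
   selected, overwrite the order table with the alternate order.  */
static void
toy_override_options (int target_64bit)
{
  if (target_64bit)
    {
      static const int order[TOY_NUM_REGS] = TOY_ALT_REG_ALLOC_ORDER;

      assert (toy_is_permutation (order, TOY_NUM_REGS));
      memcpy (toy_reg_alloc_order, order, sizeof (order));
    }
}
```

Calling toy_override_options (1) leaves the table gprs-first, while toy_override_options (0) keeps the default. Doing the copy at option-override time keeps a single compiled-in default and lets one target configuration swap the order without touching the others.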
Index: gcc/config/rs6000/linux64.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/linux64.h,v
retrieving revision 1.44
diff -u -p -r1.44 linux64.h
--- gcc/config/rs6000/linux64.h 7 Jun 2003 17:11:47 -0000 1.44
+++ gcc/config/rs6000/linux64.h 13 Jun 2003 14:33:30 -0000
@@ -67,6 +67,9 @@
{ \
if (TARGET_64BIT) \
{ \
+ static const int order[FIRST_PSEUDO_REGISTER] \
+ = RS6000_ALT_REG_ALLOC_ORDER; \
+ memcpy (reg_alloc_order, order, sizeof (order)); \
if (DEFAULT_ABI != ABI_AIX) \
{ \
DEFAULT_ABI = ABI_AIX; \
Index: gcc/config/rs6000/rs6000.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.h,v
retrieving revision 1.278
diff -u -p -r1.278 rs6000.h
--- gcc/config/rs6000/rs6000.h 4 Jun 2003 17:50:43 -0000 1.278
+++ gcc/config/rs6000/rs6000.h 13 Jun 2003 14:33:33 -0000
@@ -781,8 +781,7 @@ extern int rs6000_alignment_flags;
/* AltiVec registers. */ \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
- 1, 1 \
- , 1, 1 \
+ 1, 1, 1, 1 \
}
/* 1 for registers not available across function calls.
@@ -801,8 +800,7 @@ extern int rs6000_alignment_flags;
/* AltiVec registers. */ \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
- 1, 1 \
- , 1, 1 \
+ 1, 1, 1, 1 \
}
/* Like `CALL_USED_REGISTERS' except this macro doesn't require that
@@ -820,8 +818,7 @@ extern int rs6000_alignment_flags;
/* AltiVec registers. */ \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \
- 0, 0 \
- , 0, 0 \
+ 0, 0, 0, 0 \
}
#define MQ_REGNO 64
@@ -861,8 +858,7 @@ extern int rs6000_alignment_flags;
mq (not saved; best to use it if we can)
ctr (not saved; when we have the choice ctr is better)
lr (saved)
- cr5, r1, r2, ap, xer, vrsave, vscr (fixed)
- spe_acc, spefscr (fixed)
+ cr5, r1, r2, ap, xer (fixed, but note r2 exception on some ABIs)
AltiVec registers:
v0 - v1 (not saved or used for anything)
@@ -870,8 +866,10 @@ extern int rs6000_alignment_flags;
v2 (not saved; incoming vector arg reg; return value)
v19 - v14 (not saved or used for anything)
v31 - v20 (saved; order given to save least number)
+
+ vrsave, vscr, spe_acc, spefscr (fixed)
*/
-
+
#if FIXED_R2 == 1
#define MAYBE_R2_AVAILABLE
#define MAYBE_R2_FIXED 2,
@@ -900,8 +898,36 @@ extern int rs6000_alignment_flags;
79, \
96, 95, 94, 93, 92, 91, \
108, 107, 106, 105, 104, 103, 102, 101, 100, 99, 98, \
- 97, 109, 110 \
- , 111, 112 \
+ 97, \
+ 109, 110, 111, 112 \
+}
+
+/* Used by powerpc64-linux. Places fp regs after gp regs, so that
+ DImode moves tend to use a gp reg rather than a fp reg. Usage of fp
+ regs under Linux' lazy fp save/restore means an exception is taken on
+ first use of a fp reg. */
+#define RS6000_ALT_REG_ALLOC_ORDER \
+ {75, 74, 69, 68, 72, 71, 70, \
+ 0, MAYBE_R2_AVAILABLE \
+ 9, 11, 10, 8, 7, 6, 5, 4, \
+ 3, \
+ 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, \
+ 18, 17, 16, 15, 14, 13, 12, \
+ 64, 66, 65, \
+ 73, 1, MAYBE_R2_FIXED 67, 76, \
+ 32, \
+ 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, \
+ 33, \
+ 63, 62, 61, 60, 59, 58, 57, 56, 55, 54, 53, 52, 51, \
+ 50, 49, 48, 47, 46, \
+ /* AltiVec registers. */ \
+ 77, 78, \
+ 90, 89, 88, 87, 86, 85, 84, 83, 82, 81, 80, \
+ 79, \
+ 96, 95, 94, 93, 92, 91, \
+ 108, 107, 106, 105, 104, 103, 102, 101, 100, 99, 98, \
+ 97, \
+ 109, 110, 111, 112 \
}
/* True if register is floating-point. */
--
Alan Modra
IBM OzLabs - Linux Technology Centre