This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[ARM] ARM NEON support part 1/7: VFPv3 support

From: Julian Brown <julian at codesourcery dot com>
To: GCC Patches <gcc-patches at gcc dot gnu dot org>
Cc: Paul Brook <paul at codesourcery dot com>
Date: Sat, 02 Jun 2007 22:16:42 +0100
Subject: [ARM] ARM NEON support part 1/7: VFPv3 support

This series of patches adds support for ARM's "Advanced SIMD Extension" NEON, as well as version 3 of the VFP architecture and scheduling support for ARM's Cortex-A8 core. The first three patches form the bulk of the implementation, and the remaining four patches provide incremental improvements.

The first patch adds support for the VFPv3 instruction set. It adds two main features: an extended double-precision register set (32 registers, up from 16), and the new immediate-constant load instructions "fconsts" and "fconstd". The special handling of registers D0-D7 isn't actually required for VFPv3, but is needed for the follow-up NEON patch.
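
To make the fconsts/fconstd immediates concrete, here is a minimal standalone C sketch of the 8-bit index computation described in the arm.c comment of this patch: a value is encodable when it equals (-1)^s * n * 2^-r with 16 <= n <= 31 and 0 <= r <= 7. The function name vfp3_imm8_index_sketch and the use of a host double are assumptions made for illustration only; the patch itself operates on GCC's REAL_VALUE_TYPE.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper (not part of the patch): return the 8-bit index
   ABCDEFGH for VALUE, or -1 if VALUE is not of the form
   (-1)^s * n * 2^-r with 16 <= n <= 31 and 0 <= r <= 7.  */
static int
vfp3_imm8_index_sketch (double value)
{
  int sign, r, n;
  double a;

  /* Infinities, NaNs and both zeros are not representable.  */
  if (isnan (value) || isinf (value) || value == 0.0)
    return -1;

  sign = signbit (value) ? 1 : 0;
  a = sign ? -value : value;

  /* Representable magnitudes lie in [16 * 2^-7, 31 * 2^0] = [0.125, 31].  */
  if (a < 0.125 || a > 31.0)
    return -1;

  /* Double the magnitude (exact in binary floating point) until it lies
     in [16, 32).  R counts the doublings, so a == |value| * 2^r.  */
  r = 0;
  while (a < 16.0)
    {
      a *= 2.0;
      r++;
    }

  /* Anything left after the binary point needs more than four fraction
     bits, so the value cannot be encoded.  */
  n = (int) a;
  if ((double) n != a)
    return -1;

  assert (n >= 16 && n <= 31 && r >= 0 && r <= 7);

  /* A = sign, BCD = r XOR 3, EFGH = n - 16.  */
  return (sign << 7) | ((r ^ 3) << 4) | (n - 16);
}
```

Under this sketch, 1.0 (n = 16, r = 4) maps to index 112, 2.0 maps to 0 and 31.0 maps to 63, while 0.0 and 0.1 are rejected with -1.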

(The patch series has been tested together with no regressions, targeting arm-none-eabi. See the final part for further test information.)

OK?

Julian

ChangeLog (vfpv3-support)

Julian Brown <julian@codesourcery.com>

    gcc/
    * config.gcc (with_fpu): Allow --with-fpu=vfp3.
    * config/arm/aout.h (REGISTER_NAMES): Add D16-D31.
    * config/arm/aof.h (REGISTER_NAMES): Add D16-D31.
    * config/arm/arm.c (FL_VFPV3): New flag for VFPv3 processor
    capability.
    (all_fpus): Add FPUTYPE_VFP3.
    (fp_model_for_fpu): Add VFPv3 field.
    (arm_rtx_costs_1): Give cost to VFPv3 constants.
    (vfp3_const_double_index): New function. Return integer index of
    VFPv3 constant suitable for fconst[sd] insns, or -1 if constant
    isn't suitable.
    (vfp3_const_double_rtx): New function. True if VFPv3 is enabled
    and argument represents a valid RTX for a VFPv3 constant.
    (vfp_output_fldmd): Split fldmd with > 16 registers in the list into
    two instructions.
    (vfp_emit_fstmd): Similar, for fstmd.
    (arm_print_operand): Implement new code 'G' for VFPv3 floating-point
    constants, represented as integer indices.
    (arm_hard_regno_mode_ok): Use VFP_REGNO_OK_FOR_SINGLE,
    VFP_REGNO_OK_FOR_DOUBLE macros.
    (arm_regno_class): Handle VFPv3 d0-d7, low, high register split.
    (arm_file_start): Set float-abi attribute for VFPv3, and output
    correct ".fpu" assembler directive.
    (arm_dbx_register_number): Add FIXME.
    * config/arm/arm.h (TARGET_VFP3): New macro. Target supports VFPv3.
    (fputype): Add FPUTYPE_VFP3.
    (FIXED_REGISTERS): Add 32 registers for D16-D31.
    (CALL_USED_REGISTERS): Likewise.
    (CONDITIONAL_REGISTER_USAGE): Add note about conditional definition
    of LAST_VFP_REGNUM. Make D16-D31 caller-saved, if present.
    (LAST_VFP_REGNUM): Extend available VFP registers for VFPv3.
    (D7_VFP_REGNUM): New.
    (LAST_LO_VFP_REGNUM, FIRST_HI_VFP_REGNUM, VFP_REGNO_OK_FOR_SINGLE)
    (VFP_REGNO_OK_FOR_DOUBLE): Define new macros.
    (FIRST_PSEUDO_REGISTER): Shift up to 128 to accommodate VFPv3.
    (REG_ALLOC_ORDER): Adjust for VFPv3.
    (reg_class): Add VFP_D0_D7_REGS, VFP_LO_REGS, VFP_HI_REGS.
    (REG_CLASS_NAMES): Add entries corresponding to VFP_D0_D7_REGS,
    VFP_LO_REGS, VFP_HI_REGS.
    (REG_CLASS_CONTENTS): Likewise. Extend contents for VFP_REGS.
    (IS_VFP_CLASS): Define macro.
    (SECONDARY_OUTPUT_RELOAD_CLASS, SECONDARY_INPUT_RELOAD_CLASS): Use
    IS_VFP_CLASS.
    (REGISTER_MOVE_COST): Likewise.
    * config/arm/arm-protos.h (vfp3_const_double_rtx): Add prototype.
    * config/arm/vfp.md (VFPCC_REGNUM): Redefine as 127.
    (*arm_movsi_vfp, *thumb2_movsi_vfp, *movsfcc_vfp)
    (*thumb2_movsfcc_vfp, *abssf2_vfp, *negsf2_vfp, *addsf3_vfp)
    (*subsf3_vfp, *divsf_vfp, *mulsf_vfp, *mulsf3negsf_vfp)
    (*mulsf3addsf_vfp, *mulsf3subsf_vfp, *mulsf3negsfaddsf_vfp)
    (*extendsfdf2_vfp, *truncdfsf2_vfp, *truncsisf2_vfp)
    (*truncsidf2_vfp, fixuns_truncsfsi2, fixuns_truncdfsi2)
    (*floatsisf2_vfp, *floatsidf2_vfp, floatunssisf2)
    (floatunssidf2, *sqrtsf2_vfp, *cmpsf_split_vfp)
    (*cmpsf_trap_split_vfp, *cmpsf_vfp, *cmpsf_trap_vfp): Use 't'
    where appropriate for single-word registers.
    (*movsf_vfp, *thumb2_movsf_vfp, *movdf_vfp, *thumb2_movdf_vfp):
    As above. Fix type attributes.
    * config/arm/constraints.md (register_constraint "t"): Define.
    (register_constraint "w"): Change to D0-D15, or D0-D31 for
    VFPv3/NEON.
    (register_constraint "x"): Define.
    (constraint "Dv"): Define.

--- .pc/vfpv3-support/gcc/config.gcc	2007-06-02 13:45:47.000000000 -0700
+++ gcc/config.gcc	2007-06-02 13:46:00.000000000 -0700
@@ -2822,7 +2822,7 @@ case "${target}" in
 
 		case "$with_fpu" in
 		"" \
-		| fpa | fpe2 | fpe3 | maverick | vfp )
+		| fpa | fpe2 | fpe3 | maverick | vfp | vfp3 )
 			# OK
 			;;
 		*)
--- .pc/vfpv3-support/gcc/config/arm/aout.h	2007-06-02 13:45:47.000000000 -0700
+++ gcc/config/arm/aout.h	2007-06-02 13:46:00.000000000 -0700
@@ -68,6 +68,10 @@
   "s8",  "s9",  "s10", "s11", "s12", "s13", "s14", "s15", \
   "s16", "s17", "s18", "s19", "s20", "s21", "s22", "s23", \
   "s24", "s25", "s26", "s27", "s28", "s29", "s30", "s31", \
+  "d16", "?16", "d17", "?17", "d18", "?18", "d19", "?19", \
+  "d20", "?20", "d21", "?21", "d22", "?22", "d23", "?23", \
+  "d24", "?24", "d25", "?25", "d26", "?26", "d27", "?27", \
+  "d28", "?28", "d29", "?29", "d30", "?30", "d31", "?31", \
   "vfpcc"					   \
 }
 #endif
--- .pc/vfpv3-support/gcc/config/arm/arm.c	2007-06-02 13:45:47.000000000 -0700
+++ gcc/config/arm/arm.c	2007-06-02 13:46:00.000000000 -0700
@@ -451,6 +451,7 @@ static int thumb_call_reg_needed;
 #define FL_NOTM	      (1 << 17)	      /* Instructions not present in the 'M'
 					 profile.  */
 #define FL_DIV	      (1 << 18)	      /* Hardware divide.  */
+#define FL_VFPV3      (1 << 19)       /* Vector Floating Point V3.  */
 
 #define FL_IWMMXT     (1 << 29)	      /* XScale v2 or "Intel Wireless MMX technology".  */
 
@@ -694,7 +695,8 @@ static const struct fpu_desc all_fpus[] 
   {"fpe2",	FPUTYPE_FPA_EMU2},
   {"fpe3",	FPUTYPE_FPA_EMU2},
   {"maverick",	FPUTYPE_MAVERICK},
-  {"vfp",	FPUTYPE_VFP}
+  {"vfp",	FPUTYPE_VFP},
+  {"vfp3",	FPUTYPE_VFP3},
 };
 
 
@@ -709,7 +711,8 @@ static const enum fputype fp_model_for_f
   ARM_FP_MODEL_FPA,		/* FPUTYPE_FPA_EMU2  */
   ARM_FP_MODEL_FPA,		/* FPUTYPE_FPA_EMU3  */
   ARM_FP_MODEL_MAVERICK,	/* FPUTYPE_MAVERICK  */
-  ARM_FP_MODEL_VFP		/* FPUTYPE_VFP  */
+  ARM_FP_MODEL_VFP,		/* FPUTYPE_VFP  */
+  ARM_FP_MODEL_VFP		/* FPUTYPE_VFP3  */
 };
 
 
@@ -4949,7 +4952,7 @@ arm_rtx_costs_1 (rtx x, enum rtx_code co
       return 6;
 
     case CONST_DOUBLE:
-      if (arm_const_double_rtx (x))
+      if (arm_const_double_rtx (x) || vfp3_const_double_rtx (x))
 	return outer == SET ? 2 : -1;
       else if ((outer == COMPARE || outer == PLUS)
 	       && neg_const_double_rtx_ok_for_fpa (x))
@@ -5648,6 +5651,108 @@ neg_const_double_rtx_ok_for_fpa (rtx x)
 
   return 0;
 }
+
+
+/* VFPv3 has a fairly wide range of representable immediates, formed from
+   "quarter-precision" floating-point values. These can be evaluated using this
+   formula (with ^ for exponentiation):
+
+     (-1)^s * n * 2^-r
+
+   Where 's' is a sign bit (0/1), 'n' and 'r' are integers such that
+   16 <= n <= 31 and 0 <= r <= 7.
+
+   These values are mapped onto an 8-bit integer ABCDEFGH s.t.
+
+     - A (most-significant) is the sign bit.
+     - BCD are the exponent (encoded as r XOR 3).
+     - EFGH are the mantissa (encoded as n - 16).
+*/
+
+/* Return an integer index for a VFPv3 immediate operand X suitable for the
+   fconst[sd] instruction, or -1 if X isn't suitable.  */
+static int
+vfp3_const_double_index (rtx x)
+{
+  REAL_VALUE_TYPE r, m;
+  int sign, exponent;
+  unsigned HOST_WIDE_INT mantissa, mant_hi;
+  unsigned HOST_WIDE_INT mask;
+  int point_pos = 2 * HOST_BITS_PER_WIDE_INT - 1;
+
+  if (!TARGET_VFP3 || GET_CODE (x) != CONST_DOUBLE)
+    return -1;
+
+  REAL_VALUE_FROM_CONST_DOUBLE (r, x);
+
+  /* We can't represent these things, so detect them first.  */
+  if (REAL_VALUE_ISINF (r) || REAL_VALUE_ISNAN (r) || REAL_VALUE_MINUS_ZERO (r))
+    return -1;
+
+  /* Extract sign, exponent and mantissa.  */
+  sign = REAL_VALUE_NEGATIVE (r) ? 1 : 0;
+  r = REAL_VALUE_ABS (r);
+  exponent = REAL_EXP (&r);
+  /* For the mantissa, we expand into two HOST_WIDE_INTS, apart from the
+     highest (sign) bit, with a fixed binary point at bit point_pos.
+     WARNING: If there's ever a VFP version which uses more than 2 * H_W_I - 1
+     bits for the mantissa, this may fail (low bits would be lost).  */
+  real_ldexp (&m, &r, point_pos - exponent);
+  REAL_VALUE_TO_INT (&mantissa, &mant_hi, m);
+
+  /* If there are bits set in the low part of the mantissa, we can't
+     represent this value.  */
+  if (mantissa != 0)
+    return -1;
+
+  /* Now make it so that mantissa contains the most-significant bits, and move
+     the point_pos to indicate that the least-significant bits have been
+     discarded.  */
+  point_pos -= HOST_BITS_PER_WIDE_INT;
+  mantissa = mant_hi;
+
+  /* We can permit four significant bits of mantissa only, plus a high bit
+     which is always 1.  */
+  mask = ((unsigned HOST_WIDE_INT)1 << (point_pos - 5)) - 1;
+  if ((mantissa & mask) != 0)
+    return -1;
+
+  /* Now we know the mantissa is in range, chop off the unneeded bits.  */
+  mantissa >>= point_pos - 5;
+
+  /* The mantissa may be zero. Disallow that case. (It's possible to load the
+     floating-point immediate zero with Neon using an integer-zero load, but
+     that case is handled elsewhere.)  */
+  if (mantissa == 0)
+    return -1;
+
+  gcc_assert (mantissa >= 16 && mantissa <= 31);
+
+  /* The value of 5 here would be 4 if GCC used IEEE754-like encoding (where
+     normalised significands are in the range [1, 2)). (Our mantissa is shifted
+     left 4 places at this point relative to normalised IEEE754 values).  GCC
+     internally uses [0.5, 1) (see real.c), so the exponent returned from
+     REAL_EXP must be altered.  */
+  exponent = 5 - exponent;
+
+  if (exponent < 0 || exponent > 7)
+    return -1;
+
+  /* Sign, mantissa and exponent are now in the correct form to plug into the
+     formulae described in the comment above.  */
+  return (sign << 7) | ((exponent ^ 3) << 4) | (mantissa - 16);
+}
+
+/* Return TRUE if rtx X is a valid immediate VFPv3 constant.  */
+int
+vfp3_const_double_rtx (rtx x)
+{
+  if (!TARGET_VFP3)
+    return 0;
+
+  return vfp3_const_double_index (x) != -1;
+}
+
 
 /* Predicates for `match_operand' and `match_operator'.  */
 
@@ -8808,6 +8913,17 @@ vfp_output_fldmd (FILE * stream, unsigne
       count++;
     }
 
+  /* FLDMD may not load more than 16 doubleword registers at a time. Split the
+     load into multiple parts if we have to handle more than 16 registers.
+     FIXME: This will increase the maximum size of the epilogue, which will
+     need altering elsewhere.  */
+  if (count > 16)
+    {
+      vfp_output_fldmd (stream, base, reg, 16);
+      vfp_output_fldmd (stream, base, reg + 16, count - 16);
+      return;
+    }
+
   fputc ('\t', stream);
   asm_fprintf (stream, "fldmfdd\t%r!, {", base);
 
@@ -8870,6 +8986,19 @@ vfp_emit_fstmd (int base_reg, int count)
       count++;
     }
 
+  /* FSTMD may not store more than 16 doubleword registers at once.  Split
+     larger stores into multiple parts (up to a maximum of two, in
+     practice).  */
+  if (count > 16)
+    {
+      int saved;
+      /* NOTE: base_reg is an internal register number, so each D register
+         counts as 2.  */
+      saved = vfp_emit_fstmd (base_reg + 32, count - 16);
+      saved += vfp_emit_fstmd (base_reg, 16);
+      return saved;
+    }
+
   par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (count));
   dwarf = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (count + 1));
 
@@ -11957,6 +12086,16 @@ arm_print_operand (FILE *stream, rtx x, 
       }
       return;
 
+    /* Print a VFPv3 floating-point constant, represented as an integer
+       index.  */
+    case 'G':
+      {
+        int index = vfp3_const_double_index (x);
+	gcc_assert (index != -1);
+	fprintf (stream, "%d", index);
+      }
+      return;
+
     default:
       if (x == 0)
 	{
@@ -12737,11 +12876,10 @@ arm_hard_regno_mode_ok (unsigned int reg
       && IS_VFP_REGNUM (regno))
     {
       if (mode == SFmode || mode == SImode)
-	return TRUE;
+	return VFP_REGNO_OK_FOR_SINGLE (regno);
 
-      /* DFmode values are only valid in even register pairs.  */
       if (mode == DFmode)
-	return ((regno - FIRST_VFP_REGNUM) & 1) == 0;
+	return VFP_REGNO_OK_FOR_DOUBLE (regno);
       return FALSE;
     }
 
@@ -12804,7 +12942,14 @@ arm_regno_class (int regno)
     return CIRRUS_REGS;
 
   if (IS_VFP_REGNUM (regno))
-    return VFP_REGS;
+    {
+      if (regno <= D7_VFP_REGNUM)
+	return VFP_D0_D7_REGS;
+      else if (regno <= LAST_LO_VFP_REGNUM)
+        return VFP_LO_REGS;
+      else
+        return VFP_HI_REGS;
+    }
 
   if (IS_IWMMXT_REGNUM (regno))
     return IWMMXT_REGS;
@@ -15250,6 +15395,7 @@ arm_file_start (void)
 	}
       else
 	{
+	  int set_float_abi_attributes = 0;
 	  switch (arm_fpu_arch)
 	    {
 	    case FPUTYPE_FPA:
@@ -15265,15 +15411,23 @@ arm_file_start (void)
 	      fpu_name = "maverick";
 	      break;
 	    case FPUTYPE_VFP:
-	      if (TARGET_HARD_FLOAT)
-		asm_fprintf (asm_out_file, "\t.eabi_attribute 27, 3\n");
-	      if (TARGET_HARD_FLOAT_ABI)
-		asm_fprintf (asm_out_file, "\t.eabi_attribute 28, 1\n");
 	      fpu_name = "vfp";
+	      set_float_abi_attributes = 1;
+	      break;
+	    case FPUTYPE_VFP3:
+	      fpu_name = "vfp3";
+	      set_float_abi_attributes = 1;
 	      break;
 	    default:
 	      abort();
 	    }
+	  if (set_float_abi_attributes)
+	    {
+	      if (TARGET_HARD_FLOAT)
+		asm_fprintf (asm_out_file, "\t.eabi_attribute 27, 3\n");
+	      if (TARGET_HARD_FLOAT_ABI)
+		asm_fprintf (asm_out_file, "\t.eabi_attribute 28, 1\n");
+	    }
 	}
       asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_name);
 
@@ -16163,6 +16317,7 @@ arm_dbx_register_number (unsigned int re
   if (IS_FPA_REGNUM (regno))
     return (TARGET_AAPCS_BASED ? 96 : 16) + regno - FIRST_FPA_REGNUM;
 
+  /* FIXME: VFPv3 register numbering.  */
   if (IS_VFP_REGNUM (regno))
     return 64 + regno - FIRST_VFP_REGNUM;
 
--- .pc/vfpv3-support/gcc/config/arm/arm.h	2007-06-02 13:45:47.000000000 -0700
+++ gcc/config/arm/arm.h	2007-06-02 13:46:00.000000000 -0700
@@ -204,6 +204,12 @@ extern GTY(()) rtx aof_pic_label;
 /* 32-bit Thumb-2 code.  */
 #define TARGET_THUMB2			(TARGET_THUMB && arm_arch_thumb2)
 
+/* FPU is VFPv3 (with twice the number of D registers).  Setting the FPU to
+   Neon automatically enables VFPv3 too.  */
+#define TARGET_VFP3 (arm_fp_model == ARM_FP_MODEL_VFP \
+		     && (arm_fpu_arch == FPUTYPE_VFP3 \
+			 || arm_fpu_arch == FPUTYPE_NEON))
+
 /* "DSP" multiply instructions, eg. SMULxy.  */
 #define TARGET_DSP_MULTIPLY \
   (TARGET_32BIT && arm_arch5e && arm_arch_notm)
@@ -273,7 +279,9 @@ enum fputype
   /* Cirrus Maverick floating point co-processor.  */
   FPUTYPE_MAVERICK,
   /* VFP.  */
-  FPUTYPE_VFP
+  FPUTYPE_VFP,
+  /* VFPv3.  */
+  FPUTYPE_VFP3
 };
 
 /* Recast the floating point class to be the floating point attribute.  */
@@ -641,6 +649,10 @@ extern int arm_structure_size_boundary;
   1,1,1,1,1,1,1,1,	\
   1,1,1,1,1,1,1,1,	\
   1,1,1,1,1,1,1,1,	\
+  1,1,1,1,1,1,1,1,	\
+  1,1,1,1,1,1,1,1,	\
+  1,1,1,1,1,1,1,1,	\
+  1,1,1,1,1,1,1,1,	\
   1			\
 }
 
@@ -667,6 +679,10 @@ extern int arm_structure_size_boundary;
   1,1,1,1,1,1,1,1,	     \
   1,1,1,1,1,1,1,1,	     \
   1,1,1,1,1,1,1,1,	     \
+  1,1,1,1,1,1,1,1,	     \
+  1,1,1,1,1,1,1,1,	     \
+  1,1,1,1,1,1,1,1,	     \
+  1,1,1,1,1,1,1,1,	     \
   1			     \
 }
 
@@ -718,11 +734,15 @@ extern int arm_structure_size_boundary;
 	}							\
       if (TARGET_VFP)						\
 	{							\
+	  /* VFPv3 registers are disabled when earlier VFP	\
+	     versions are selected due to the definition of	\
+	     LAST_VFP_REGNUM.  */				\
 	  for (regno = FIRST_VFP_REGNUM;			\
 	       regno <= LAST_VFP_REGNUM; ++ regno)		\
 	    {							\
 	      fixed_regs[regno] = 0;				\
-	      call_used_regs[regno] = regno < FIRST_VFP_REGNUM + 16; \
+	      call_used_regs[regno] = regno < FIRST_VFP_REGNUM + 16 \
+	      	|| regno >= FIRST_VFP_REGNUM + 32;		\
 	    }							\
 	}							\
     }								\
@@ -896,15 +916,32 @@ extern int arm_structure_size_boundary;
   (((REGNUM) >= FIRST_CIRRUS_FP_REGNUM) && ((REGNUM) <= LAST_CIRRUS_FP_REGNUM))
 
 #define FIRST_VFP_REGNUM	63
-#define LAST_VFP_REGNUM		94
+#define D7_VFP_REGNUM		78  /* Registers 77 and 78 == VFP reg D7.  */
+#define LAST_VFP_REGNUM		(TARGET_VFP3 ? 126 : 94)
 #define IS_VFP_REGNUM(REGNUM) \
   (((REGNUM) >= FIRST_VFP_REGNUM) && ((REGNUM) <= LAST_VFP_REGNUM))
 
+/* VFP registers are split into two types: those defined by VFP versions < 3
+   have D registers overlaid on consecutive pairs of S registers. VFP version 3
+   defines 16 new D registers (d16-d31) which, for simplicity and correctness
+   in various parts of the backend, we implement as "fake" single-precision
+   registers (which would be S32-S63, but cannot be used in that way).  The
+   following macros define these ranges of registers.  */
+#define LAST_LO_VFP_REGNUM	94
+#define FIRST_HI_VFP_REGNUM	95
+
+#define VFP_REGNO_OK_FOR_SINGLE(REGNUM) \
+  ((REGNUM) <= LAST_LO_VFP_REGNUM)
+
+/* DFmode values are only valid in even register pairs.  */
+#define VFP_REGNO_OK_FOR_DOUBLE(REGNUM) \
+  ((((REGNUM) - FIRST_VFP_REGNUM) & 1) == 0)
+
 /* The number of hard registers is 16 ARM + 8 FPA + 1 CC + 1 SFP + 1 AFP.  */
 /* + 16 Cirrus registers take us up to 43.  */
 /* Intel Wireless MMX Technology registers add 16 + 4 more.  */
-/* VFP adds 32 + 1 more.  */
-#define FIRST_PSEUDO_REGISTER   96
+/* VFP (VFP3) adds 32 (64) + 1 more.  */
+#define FIRST_PSEUDO_REGISTER   128
 
 #define DBX_REGISTER_NUMBER(REGNO) arm_dbx_register_number (REGNO)
 
@@ -958,24 +995,33 @@ extern int arm_structure_size_boundary;
    function parameters.  It is quite good to use lr since other calls may
    clobber it anyway.  Allocate r0 through r3 in reverse order since r3 is
    least likely to contain a function parameter; in addition results are
-   returned in r0.  */
+   returned in r0.
+   For VFP/VFPv3, allocate caller-saved registers first (D0-D7), then D16-D31,
+   then D8-D15.  The reason for doing this is to attempt to reduce register
+   pressure when both single- and double-precision registers are used in a
+   function, but hopefully not force double-precision registers to be
+   callee-saved when it's not necessary. */
 
-#define REG_ALLOC_ORDER  	    \
-{                                   \
-     3,  2,  1,  0, 12, 14,  4,  5, \
-     6,  7,  8, 10,  9, 11, 13, 15, \
-    16, 17, 18, 19, 20, 21, 22, 23, \
-    27, 28, 29, 30, 31, 32, 33, 34, \
-    35, 36, 37, 38, 39, 40, 41, 42, \
-    43, 44, 45, 46, 47, 48, 49, 50, \
-    51, 52, 53, 54, 55, 56, 57, 58, \
-    59, 60, 61, 62,		    \
-    24, 25, 26,			    \
-    78, 77, 76, 75, 74, 73, 72, 71, \
-    70, 69, 68, 67, 66, 65, 64, 63, \
-    79, 80, 81, 82, 83, 84, 85, 86, \
-    87, 88, 89, 90, 91, 92, 93, 94, \
-    95				    \
+#define REG_ALLOC_ORDER				\
+{						\
+     3,  2,  1,  0, 12, 14,  4,  5,		\
+     6,  7,  8, 10,  9, 11, 13, 15,		\
+    16, 17, 18, 19, 20, 21, 22, 23,		\
+    27, 28, 29, 30, 31, 32, 33, 34,		\
+    35, 36, 37, 38, 39, 40, 41, 42,		\
+    43, 44, 45, 46, 47, 48, 49, 50,		\
+    51, 52, 53, 54, 55, 56, 57, 58,		\
+    59, 60, 61, 62,				\
+    24, 25, 26,					\
+    78,  77,  76,  75,  74,  73,  72,  71,	\
+    70,  69,  68,  67,  66,  65,  64,  63,	\
+    95,  96,  97,  98,  99, 100, 101, 102,	\
+   103, 104, 105, 106, 107, 108, 109, 110,	\
+   111, 112, 113, 114, 115, 116, 117, 118,	\
+   119, 120, 121, 122, 123, 124, 125, 126,	\
+    79,  80,  81,  82,  83,  84,  85,  86,	\
+    87,  88,  89,  90,  91,  92,  93,  94,	\
+   127						\
 }
 
 /* Interrupt functions can only use registers that have already been
@@ -994,6 +1040,9 @@ enum reg_class
   NO_REGS,
   FPA_REGS,
   CIRRUS_REGS,
+  VFP_D0_D7_REGS,
+  VFP_LO_REGS,
+  VFP_HI_REGS,
   VFP_REGS,
   IWMMXT_GR_REGS,
   IWMMXT_REGS,
@@ -1016,6 +1065,9 @@ enum reg_class
   "NO_REGS",		\
   "FPA_REGS",		\
   "CIRRUS_REGS",	\
+  "VFP_D0_D7_REGS",	\
+  "VFP_LO_REGS",	\
+  "VFP_HI_REGS",	\
   "VFP_REGS",		\
   "IWMMXT_GR_REGS",	\
   "IWMMXT_REGS",	\
@@ -1032,24 +1084,32 @@ enum reg_class
 /* Define which registers fit in which classes.
    This is an initializer for a vector of HARD_REG_SET
    of length N_REG_CLASSES.  */
-#define REG_CLASS_CONTENTS					\
-{								\
-  { 0x00000000, 0x00000000, 0x00000000 }, /* NO_REGS  */	\
-  { 0x00FF0000, 0x00000000, 0x00000000 }, /* FPA_REGS */	\
-  { 0xF8000000, 0x000007FF, 0x00000000 }, /* CIRRUS_REGS */	\
-  { 0x00000000, 0x80000000, 0x7FFFFFFF }, /* VFP_REGS  */	\
-  { 0x00000000, 0x00007800, 0x00000000 }, /* IWMMXT_GR_REGS */	\
-  { 0x00000000, 0x7FFF8000, 0x00000000 }, /* IWMMXT_REGS */	\
-  { 0x000000FF, 0x00000000, 0x00000000 }, /* LO_REGS */		\
-  { 0x00002000, 0x00000000, 0x00000000 }, /* STACK_REG */	\
-  { 0x000020FF, 0x00000000, 0x00000000 }, /* BASE_REGS */	\
-  { 0x0000FF00, 0x00000000, 0x00000000 }, /* HI_REGS */		\
-  { 0x01000000, 0x00000000, 0x00000000 }, /* CC_REG */		\
-  { 0x00000000, 0x00000000, 0x80000000 }, /* VFPCC_REG */	\
-  { 0x0200FFFF, 0x00000000, 0x00000000 }, /* GENERAL_REGS */	\
-  { 0xFAFFFFFF, 0xFFFFFFFF, 0x7FFFFFFF }  /* ALL_REGS */	\
+#define REG_CLASS_CONTENTS						\
+{									\
+  { 0x00000000, 0x00000000, 0x00000000, 0x00000000 }, /* NO_REGS  */	\
+  { 0x00FF0000, 0x00000000, 0x00000000, 0x00000000 }, /* FPA_REGS */	\
+  { 0xF8000000, 0x000007FF, 0x00000000, 0x00000000 }, /* CIRRUS_REGS */	\
+  { 0x00000000, 0x80000000, 0x00007FFF, 0x00000000 }, /* VFP_D0_D7_REGS  */ \
+  { 0x00000000, 0x80000000, 0x7FFFFFFF, 0x00000000 }, /* VFP_LO_REGS  */ \
+  { 0x00000000, 0x00000000, 0x80000000, 0x7FFFFFFF }, /* VFP_HI_REGS  */ \
+  { 0x00000000, 0x80000000, 0xFFFFFFFF, 0x7FFFFFFF }, /* VFP_REGS  */	\
+  { 0x00000000, 0x00007800, 0x00000000, 0x00000000 }, /* IWMMXT_GR_REGS */ \
+  { 0x00000000, 0x7FFF8000, 0x00000000, 0x00000000 }, /* IWMMXT_REGS */	\
+  { 0x000000FF, 0x00000000, 0x00000000, 0x00000000 }, /* LO_REGS */	\
+  { 0x00002000, 0x00000000, 0x00000000, 0x00000000 }, /* STACK_REG */	\
+  { 0x000020FF, 0x00000000, 0x00000000, 0x00000000 }, /* BASE_REGS */	\
+  { 0x0000FF00, 0x00000000, 0x00000000, 0x00000000 }, /* HI_REGS */	\
+  { 0x01000000, 0x00000000, 0x00000000, 0x00000000 }, /* CC_REG */	\
+  { 0x00000000, 0x00000000, 0x00000000, 0x80000000 }, /* VFPCC_REG */	\
+  { 0x0200FFFF, 0x00000000, 0x00000000, 0x00000000 }, /* GENERAL_REGS */ \
+  { 0xFAFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x7FFFFFFF }  /* ALL_REGS */	\
 }
 
+/* Any of the VFP register classes.  */
+#define IS_VFP_CLASS(X) \
+  ((X) == VFP_D0_D7_REGS || (X) == VFP_LO_REGS \
+   || (X) == VFP_HI_REGS || (X) == VFP_REGS)
+
 /* The same information, inverted:
    Return the class number of the smallest class containing
    reg number REGNO.  This could be a conditional expression
@@ -1123,7 +1183,7 @@ enum reg_class
 #define SECONDARY_OUTPUT_RELOAD_CLASS(CLASS, MODE, X)		\
   /* Restrict which direct reloads are allowed for VFP/iWMMXt regs.  */ \
   ((TARGET_VFP && TARGET_HARD_FLOAT				\
-    && (CLASS) == VFP_REGS)					\
+    && IS_VFP_CLASS (CLASS))					\
    ? coproc_secondary_reload_class (MODE, X, FALSE)		\
    : (TARGET_IWMMXT && (CLASS) == IWMMXT_REGS)			\
    ? coproc_secondary_reload_class (MODE, X, TRUE)		\
@@ -1136,7 +1196,7 @@ enum reg_class
 #define SECONDARY_INPUT_RELOAD_CLASS(CLASS, MODE, X)		\
   /* Restrict which direct reloads are allowed for VFP/iWMMXt regs.  */ \
   ((TARGET_VFP && TARGET_HARD_FLOAT				\
-    && (CLASS) == VFP_REGS)					\
+    && IS_VFP_CLASS (CLASS))					\
     ? coproc_secondary_reload_class (MODE, X, FALSE) :		\
     (TARGET_IWMMXT && (CLASS) == IWMMXT_REGS) ?			\
     coproc_secondary_reload_class (MODE, X, TRUE) :		\
@@ -1255,8 +1315,8 @@ do {									      \
   (TARGET_32BIT ?						\
    ((FROM) == FPA_REGS && (TO) != FPA_REGS ? 20 :	\
     (FROM) != FPA_REGS && (TO) == FPA_REGS ? 20 :	\
-    (FROM) == VFP_REGS && (TO) != VFP_REGS ? 10 :  \
-    (FROM) != VFP_REGS && (TO) == VFP_REGS ? 10 :  \
+    IS_VFP_CLASS (FROM) && !IS_VFP_CLASS (TO) ? 10 :	\
+    !IS_VFP_CLASS (FROM) && IS_VFP_CLASS (TO) ? 10 :	\
     (FROM) == IWMMXT_REGS && (TO) != IWMMXT_REGS ? 4 :  \
     (FROM) != IWMMXT_REGS && (TO) == IWMMXT_REGS ? 4 :  \
     (FROM) == IWMMXT_GR_REGS || (TO) == IWMMXT_GR_REGS ? 20 :  \
--- .pc/vfpv3-support/gcc/config/arm/arm-protos.h	2007-06-02 13:45:47.000000000 -0700
+++ gcc/config/arm/arm-protos.h	2007-06-02 13:46:00.000000000 -0700
@@ -68,6 +68,7 @@ extern rtx thumb_legitimize_reload_addre
 					    int);
 extern int arm_const_double_rtx (rtx);
 extern int neg_const_double_rtx_ok_for_fpa (rtx);
+extern int vfp3_const_double_rtx (rtx);
 extern enum reg_class coproc_secondary_reload_class (enum machine_mode, rtx,
 						     bool);
 extern bool arm_tls_referenced_p (rtx);
--- .pc/vfpv3-support/gcc/config/arm/vfp.md	2007-06-02 13:45:47.000000000 -0700
+++ gcc/config/arm/vfp.md	2007-06-02 13:46:00.000000000 -0700
@@ -21,7 +21,7 @@
 
 ;; Additional register numbers
 (define_constants
-  [(VFPCC_REGNUM 95)]
+  [(VFPCC_REGNUM 127)]
 )
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -121,8 +121,8 @@
 ;; ??? For now do not allow loading constants into vfp regs.  This causes
 ;; problems because small constants get converted into adds.
 (define_insn "*arm_movsi_vfp"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,r ,m,*w,r,*w,*w, *Uv")
-      (match_operand:SI 1 "general_operand"	   "rI,K,N,mi,r,r,*w,*w,*Uvi,*w"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,r ,m,*t,r,*t,*t, *Uv")
+      (match_operand:SI 1 "general_operand"	   "rI,K,N,mi,r,r,*t,*t,*Uvi,*t"))]
   "TARGET_ARM && TARGET_VFP && TARGET_HARD_FLOAT
    && (   s_register_operand (operands[0], SImode)
        || s_register_operand (operands[1], SImode))"
@@ -158,8 +158,8 @@
 )
 
 (define_insn "*thumb2_movsi_vfp"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,r,m,*w,r,*w,*w, *Uv")
-      (match_operand:SI 1 "general_operand"	   "rI,K,N,mi,r,r,*w,*w,*Uvi,*w"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,r,m,*t,r,*t,*t, *Uv")
+      (match_operand:SI 1 "general_operand"	   "rI,K,N,mi,r,r,*t,*t,*Uvi,*t"))]
   "TARGET_THUMB2 && TARGET_VFP && TARGET_HARD_FLOAT
    && (   s_register_operand (operands[0], SImode)
        || s_register_operand (operands[1], SImode))"
@@ -262,8 +262,8 @@
 ;; preferable to loading the value via integer registers.
 
 (define_insn "*movsf_vfp"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,?r,w  ,Uv,r ,m,w,r")
-	(match_operand:SF 1 "general_operand"	   " ?r,w,UvE,w, mE,r,w,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t ,t  ,Uv,r ,m,t,r")
+	(match_operand:SF 1 "general_operand"	   " ?r,t,Dv,UvE,t, mE,r,t,r"))]
   "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP
    && (   s_register_operand (operands[0], SFmode)
        || s_register_operand (operands[1], SFmode))"
@@ -274,29 +274,33 @@
       return \"fmsr%?\\t%0, %1\";
     case 1:
       return \"fmrs%?\\t%0, %1\";
-    case 2: case 3:
+    case 2:
+      return \"fconsts%?\\t%0, #%G1\";
+    case 3: case 4:
       return output_move_vfp (operands);
-    case 4:
-      return \"ldr%?\\t%0, %1\\t%@ float\";
     case 5:
-      return \"str%?\\t%1, %0\\t%@ float\";
+      return \"ldr%?\\t%0, %1\\t%@ float\";
     case 6:
-      return \"fcpys%?\\t%0, %1\";
+      return \"str%?\\t%1, %0\\t%@ float\";
     case 7:
+      return \"fcpys%?\\t%0, %1\";
+    case 8:
       return \"mov%?\\t%0, %1\\t%@ float\";
     default:
       gcc_unreachable ();
     }
   "
   [(set_attr "predicable" "yes")
-   (set_attr "type" "r_2_f,f_2_r,ffarith,*,f_loads,f_stores,load1,store1")
-   (set_attr "pool_range" "*,*,1020,*,4096,*,*,*")
-   (set_attr "neg_pool_range" "*,*,1008,*,4080,*,*,*")]
+   (set_attr "type"
+     "r_2_f,f_2_r,farith,f_loads,f_stores,load1,store1,ffarith,*")
+   (set_attr "insn" "*,*,*,*,*,*,*,*,mov")
+   (set_attr "pool_range" "*,*,*,1020,*,4096,*,*,*")
+   (set_attr "neg_pool_range" "*,*,*,1008,*,4080,*,*,*")]
 )
 
 (define_insn "*thumb2_movsf_vfp"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,?r,w  ,Uv,r ,m,w,r")
-	(match_operand:SF 1 "general_operand"	   " ?r,w,UvE,w, mE,r,w,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t  ,Uv,r ,m,t,r")
+	(match_operand:SF 1 "general_operand"	   " ?r,t,Dv,UvE,t, mE,r,t,r"))]
   "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP
    && (   s_register_operand (operands[0], SFmode)
        || s_register_operand (operands[1], SFmode))"
@@ -307,32 +311,35 @@
       return \"fmsr%?\\t%0, %1\";
     case 1:
       return \"fmrs%?\\t%0, %1\";
-    case 2: case 3:
+    case 2:
+      return \"fconsts%?\\t%0, #%G1\";
+    case 3: case 4:
       return output_move_vfp (operands);
-    case 4:
-      return \"ldr%?\\t%0, %1\\t%@ float\";
     case 5:
-      return \"str%?\\t%1, %0\\t%@ float\";
+      return \"ldr%?\\t%0, %1\\t%@ float\";
     case 6:
-      return \"fcpys%?\\t%0, %1\";
+      return \"str%?\\t%1, %0\\t%@ float\";
     case 7:
+      return \"fcpys%?\\t%0, %1\";
+    case 8:
       return \"mov%?\\t%0, %1\\t%@ float\";
     default:
       gcc_unreachable ();
     }
   "
   [(set_attr "predicable" "yes")
-   (set_attr "type" "r_2_f,f_2_r,ffarith,*,f_load,f_store,load1,store1")
-   (set_attr "pool_range" "*,*,1020,*,4092,*,*,*")
-   (set_attr "neg_pool_range" "*,*,1008,*,0,*,*,*")]
+   (set_attr "type"
+     "r_2_f,f_2_r,farith,f_load,f_store,load1,store1,ffarith,*")
+   (set_attr "pool_range" "*,*,*,1020,*,4092,*,*,*")
+   (set_attr "neg_pool_range" "*,*,*,1008,*,0,*,*,*")]
 )
 
 
 ;; DFmode moves
 
 (define_insn "*movdf_vfp"
-  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,r, m,w  ,Uv,w,r")
-	(match_operand:DF 1 "soft_df_operand"		   " ?r,w,mF,r,UvF,w, w,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,r, m,w  ,Uv,w,r")
+	(match_operand:DF 1 "soft_df_operand"		   " ?r,w,Dv,mF,r,UvF,w, w,r"))]
   "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP
    && (   register_operand (operands[0], DFmode)
        || register_operand (operands[1], DFmode))"
@@ -344,28 +351,31 @@
 	return \"fmdrr%?\\t%P0, %Q1, %R1\";
       case 1:
 	return \"fmrrd%?\\t%Q0, %R0, %P1\";
-      case 2: case 3:
+      case 2:
+        return \"fconstd%?\\t%P0, #%G1\";
+      case 3: case 4:
 	return output_move_double (operands);
-      case 4: case 5:
+      case 5: case 6:
 	return output_move_vfp (operands);
-      case 6:
-	return \"fcpyd%?\\t%P0, %P1\";
       case 7:
+	return \"fcpyd%?\\t%P0, %P1\";
+      case 8:
         return \"#\";
       default:
 	gcc_unreachable ();
       }
     }
   "
-  [(set_attr "type" "r_2_f,f_2_r,ffarith,*,load2,store2,f_loadd,f_stored")
-   (set_attr "length" "4,4,8,8,4,4,4,8")
-   (set_attr "pool_range" "*,*,1020,*,1020,*,*,*")
-   (set_attr "neg_pool_range" "*,*,1008,*,1008,*,*,*")]
+  [(set_attr "type"
+     "r_2_f,f_2_r,farith,f_loadd,f_stored,load2,store2,ffarith,*")
+   (set_attr "length" "4,4,4,8,8,4,4,4,8")
+   (set_attr "pool_range" "*,*,*,1020,*,1020,*,*,*")
+   (set_attr "neg_pool_range" "*,*,*,1008,*,1008,*,*,*")]
 )
 
 (define_insn "*thumb2_movdf_vfp"
-  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,r, m,w  ,Uv,w,r")
-	(match_operand:DF 1 "soft_df_operand"		   " ?r,w,mF,r,UvF,w, w,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,r, m,w  ,Uv,w,r")
+	(match_operand:DF 1 "soft_df_operand"		   " ?r,w,Dv,mF,r,UvF,w, w,r"))]
   "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP"
   "*
   {
@@ -375,33 +385,36 @@
 	return \"fmdrr%?\\t%P0, %Q1, %R1\";
       case 1:
 	return \"fmrrd%?\\t%Q0, %R0, %P1\";
-      case 2: case 3: case 7:
+      case 2:
+	return \"fconstd%?\\t%P0, #%G1\";
+      case 3: case 4: case 8:
 	return output_move_double (operands);
-      case 4: case 5:
+      case 5: case 6:
 	return output_move_vfp (operands);
-      case 6:
+      case 7:
 	return \"fcpyd%?\\t%P0, %P1\";
       default:
 	abort ();
       }
     }
   "
-  [(set_attr "type" "r_2_f,f_2_r,ffarith,*,load2,store2,f_load,f_store")
-   (set_attr "length" "4,4,8,8,4,4,4,8")
-   (set_attr "pool_range" "*,*,4096,*,1020,*,*,*")
-   (set_attr "neg_pool_range" "*,*,0,*,1008,*,*,*")]
+  [(set_attr "type"
+     "r_2_f,f_2_r,farith,load2,store2,f_load,f_store,ffarith,*")
+   (set_attr "length" "4,4,4,8,8,4,4,4,8")
+   (set_attr "pool_range" "*,*,*,4096,*,1020,*,*,*")
+   (set_attr "neg_pool_range" "*,*,*,0,*,1008,*,*,*")]
 )
 
 
 ;; Conditional move patterns
 
 (define_insn "*movsfcc_vfp"
-  [(set (match_operand:SF   0 "s_register_operand" "=w,w,w,w,w,w,?r,?r,?r")
+  [(set (match_operand:SF   0 "s_register_operand" "=t,t,t,t,t,t,?r,?r,?r")
 	(if_then_else:SF
 	  (match_operator   3 "arm_comparison_operator"
 	    [(match_operand 4 "cc_register" "") (const_int 0)])
-	  (match_operand:SF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
-	  (match_operand:SF 2 "s_register_operand" "w,0,w,?r,0,?r,w,0,w")))]
+	  (match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
+	  (match_operand:SF 2 "s_register_operand" "t,0,t,?r,0,?r,t,0,t")))]
   "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
   "@
    fcpys%D3\\t%0, %2
@@ -419,12 +432,12 @@
 )
 
 (define_insn "*thumb2_movsfcc_vfp"
-  [(set (match_operand:SF   0 "s_register_operand" "=w,w,w,w,w,w,?r,?r,?r")
+  [(set (match_operand:SF   0 "s_register_operand" "=t,t,t,t,t,t,?r,?r,?r")
 	(if_then_else:SF
 	  (match_operator   3 "arm_comparison_operator"
 	    [(match_operand 4 "cc_register" "") (const_int 0)])
-	  (match_operand:SF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
-	  (match_operand:SF 2 "s_register_operand" "w,0,w,?r,0,?r,w,0,w")))]
+	  (match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
+	  (match_operand:SF 2 "s_register_operand" "t,0,t,?r,0,?r,t,0,t")))]
   "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP"
   "@
    it\\t%D3\;fcpys%D3\\t%0, %2
@@ -491,8 +504,8 @@
 ;; Sign manipulation functions
 
 (define_insn "*abssf2_vfp"
-  [(set (match_operand:SF	  0 "s_register_operand" "=w")
-	(abs:SF (match_operand:SF 1 "s_register_operand" "w")))]
+  [(set (match_operand:SF	  0 "s_register_operand" "=t")
+	(abs:SF (match_operand:SF 1 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fabss%?\\t%0, %1"
   [(set_attr "predicable" "yes")
@@ -509,8 +522,8 @@
 )
 
 (define_insn "*negsf2_vfp"
-  [(set (match_operand:SF	  0 "s_register_operand" "=w,?r")
-	(neg:SF (match_operand:SF 1 "s_register_operand" "w,r")))]
+  [(set (match_operand:SF	  0 "s_register_operand" "=t,?r")
+	(neg:SF (match_operand:SF 1 "s_register_operand" "t,r")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "@
    fnegs%?\\t%0, %1
@@ -569,9 +582,9 @@
 ;; Arithmetic insns
 
 (define_insn "*addsf3_vfp"
-  [(set (match_operand:SF	   0 "s_register_operand" "=w")
-	(plus:SF (match_operand:SF 1 "s_register_operand" "w")
-		 (match_operand:SF 2 "s_register_operand" "w")))]
+  [(set (match_operand:SF	   0 "s_register_operand" "=t")
+	(plus:SF (match_operand:SF 1 "s_register_operand" "t")
+		 (match_operand:SF 2 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fadds%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
@@ -590,9 +603,9 @@
 
 
 (define_insn "*subsf3_vfp"
-  [(set (match_operand:SF	    0 "s_register_operand" "=w")
-	(minus:SF (match_operand:SF 1 "s_register_operand" "w")
-		  (match_operand:SF 2 "s_register_operand" "w")))]
+  [(set (match_operand:SF	    0 "s_register_operand" "=t")
+	(minus:SF (match_operand:SF 1 "s_register_operand" "t")
+		  (match_operand:SF 2 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fsubs%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
@@ -613,9 +626,9 @@
 ;; Division insns
 
 (define_insn "*divsf3_vfp"
-  [(set (match_operand:SF	  0 "s_register_operand" "+w")
-	(div:SF (match_operand:SF 1 "s_register_operand" "w")
-		(match_operand:SF 2 "s_register_operand" "w")))]
+  [(set (match_operand:SF	  0 "s_register_operand" "+t")
+	(div:SF (match_operand:SF 1 "s_register_operand" "t")
+		(match_operand:SF 2 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fdivs%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
@@ -636,9 +649,9 @@
 ;; Multiplication insns
 
 (define_insn "*mulsf3_vfp"
-  [(set (match_operand:SF	   0 "s_register_operand" "+w")
-	(mult:SF (match_operand:SF 1 "s_register_operand" "w")
-		 (match_operand:SF 2 "s_register_operand" "w")))]
+  [(set (match_operand:SF	   0 "s_register_operand" "+t")
+	(mult:SF (match_operand:SF 1 "s_register_operand" "t")
+		 (match_operand:SF 2 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fmuls%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
@@ -657,9 +670,9 @@
 
 
 (define_insn "*mulsf3negsf_vfp"
-  [(set (match_operand:SF		   0 "s_register_operand" "+w")
-	(mult:SF (neg:SF (match_operand:SF 1 "s_register_operand" "w"))
-		 (match_operand:SF	   2 "s_register_operand" "w")))]
+  [(set (match_operand:SF		   0 "s_register_operand" "+t")
+	(mult:SF (neg:SF (match_operand:SF 1 "s_register_operand" "t"))
+		 (match_operand:SF	   2 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fnmuls%?\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
@@ -681,9 +694,9 @@
 
 ;; 0 = 1 * 2 + 0
 (define_insn "*mulsf3addsf_vfp"
-  [(set (match_operand:SF		    0 "s_register_operand" "=w")
-	(plus:SF (mult:SF (match_operand:SF 2 "s_register_operand" "w")
-			  (match_operand:SF 3 "s_register_operand" "w"))
+  [(set (match_operand:SF		    0 "s_register_operand" "=t")
+	(plus:SF (mult:SF (match_operand:SF 2 "s_register_operand" "t")
+			  (match_operand:SF 3 "s_register_operand" "t"))
 		 (match_operand:SF	    1 "s_register_operand" "0")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fmacs%?\\t%0, %2, %3"
@@ -704,9 +717,9 @@
 
 ;; 0 = 1 * 2 - 0
 (define_insn "*mulsf3subsf_vfp"
-  [(set (match_operand:SF		     0 "s_register_operand" "=w")
-	(minus:SF (mult:SF (match_operand:SF 2 "s_register_operand" "w")
-			   (match_operand:SF 3 "s_register_operand" "w"))
+  [(set (match_operand:SF		     0 "s_register_operand" "=t")
+	(minus:SF (mult:SF (match_operand:SF 2 "s_register_operand" "t")
+			   (match_operand:SF 3 "s_register_operand" "t"))
 		  (match_operand:SF	     1 "s_register_operand" "0")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fmscs%?\\t%0, %2, %3"
@@ -727,10 +740,10 @@
 
 ;; 0 = -(1 * 2) + 0
 (define_insn "*mulsf3negsfaddsf_vfp"
-  [(set (match_operand:SF		     0 "s_register_operand" "=w")
+  [(set (match_operand:SF		     0 "s_register_operand" "=t")
 	(minus:SF (match_operand:SF	     1 "s_register_operand" "0")
-		  (mult:SF (match_operand:SF 2 "s_register_operand" "w")
-			   (match_operand:SF 3 "s_register_operand" "w"))))]
+		  (mult:SF (match_operand:SF 2 "s_register_operand" "t")
+			   (match_operand:SF 3 "s_register_operand" "t"))))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fnmacs%?\\t%0, %2, %3"
   [(set_attr "predicable" "yes")
@@ -751,10 +764,10 @@
 
 ;; 0 = -(1 * 2) - 0
 (define_insn "*mulsf3negsfsubsf_vfp"
-  [(set (match_operand:SF		      0 "s_register_operand" "=w")
+  [(set (match_operand:SF		      0 "s_register_operand" "=t")
 	(minus:SF (mult:SF
-		    (neg:SF (match_operand:SF 2 "s_register_operand" "w"))
-		    (match_operand:SF	      3 "s_register_operand" "w"))
+		    (neg:SF (match_operand:SF 2 "s_register_operand" "t"))
+		    (match_operand:SF	      3 "s_register_operand" "t"))
 		  (match_operand:SF	      1 "s_register_operand" "0")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fnmscs%?\\t%0, %2, %3"
@@ -779,7 +792,7 @@
 
 (define_insn "*extendsfdf2_vfp"
   [(set (match_operand:DF		   0 "s_register_operand" "=w")
-	(float_extend:DF (match_operand:SF 1 "s_register_operand" "w")))]
+	(float_extend:DF (match_operand:SF 1 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fcvtds%?\\t%P0, %1"
   [(set_attr "predicable" "yes")
@@ -787,7 +800,7 @@
 )
 
 (define_insn "*truncdfsf2_vfp"
-  [(set (match_operand:SF		   0 "s_register_operand" "=w")
+  [(set (match_operand:SF		   0 "s_register_operand" "=t")
 	(float_truncate:SF (match_operand:DF 1 "s_register_operand" "w")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fcvtsd%?\\t%0, %P1"
@@ -796,8 +809,8 @@
 )
 
 (define_insn "*truncsisf2_vfp"
-  [(set (match_operand:SI		  0 "s_register_operand" "=w")
-	(fix:SI (fix:SF (match_operand:SF 1 "s_register_operand" "w"))))]
+  [(set (match_operand:SI		  0 "s_register_operand" "=t")
+	(fix:SI (fix:SF (match_operand:SF 1 "s_register_operand" "t"))))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "ftosizs%?\\t%0, %1"
   [(set_attr "predicable" "yes")
@@ -805,7 +818,7 @@
 )
 
 (define_insn "*truncsidf2_vfp"
-  [(set (match_operand:SI		  0 "s_register_operand" "=w")
+  [(set (match_operand:SI		  0 "s_register_operand" "=t")
 	(fix:SI (fix:DF (match_operand:DF 1 "s_register_operand" "w"))))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "ftosizd%?\\t%0, %P1"
@@ -815,8 +828,8 @@
 
 
 (define_insn "fixuns_truncsfsi2"
-  [(set (match_operand:SI		  0 "s_register_operand" "=w")
-	(unsigned_fix:SI (fix:SF (match_operand:SF 1 "s_register_operand" "w"))))]
+  [(set (match_operand:SI		  0 "s_register_operand" "=t")
+	(unsigned_fix:SI (fix:SF (match_operand:SF 1 "s_register_operand" "t"))))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "ftouizs%?\\t%0, %1"
   [(set_attr "predicable" "yes")
@@ -824,8 +837,8 @@
 )
 
 (define_insn "fixuns_truncdfsi2"
-  [(set (match_operand:SI		  0 "s_register_operand" "=w")
-	(unsigned_fix:SI (fix:DF (match_operand:DF 1 "s_register_operand" "w"))))]
+  [(set (match_operand:SI		  0 "s_register_operand" "=t")
+	(unsigned_fix:SI (fix:DF (match_operand:DF 1 "s_register_operand" "t"))))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "ftouizd%?\\t%0, %P1"
   [(set_attr "predicable" "yes")
@@ -834,8 +847,8 @@
 
 
 (define_insn "*floatsisf2_vfp"
-  [(set (match_operand:SF	    0 "s_register_operand" "=w")
-	(float:SF (match_operand:SI 1 "s_register_operand" "w")))]
+  [(set (match_operand:SF	    0 "s_register_operand" "=t")
+	(float:SF (match_operand:SI 1 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fsitos%?\\t%0, %1"
   [(set_attr "predicable" "yes")
@@ -844,7 +857,7 @@
 
 (define_insn "*floatsidf2_vfp"
   [(set (match_operand:DF	    0 "s_register_operand" "=w")
-	(float:DF (match_operand:SI 1 "s_register_operand" "w")))]
+	(float:DF (match_operand:SI 1 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fsitod%?\\t%P0, %1"
   [(set_attr "predicable" "yes")
@@ -853,8 +866,8 @@
 
 
 (define_insn "floatunssisf2"
-  [(set (match_operand:SF	    0 "s_register_operand" "=w")
-	(unsigned_float:SF (match_operand:SI 1 "s_register_operand" "w")))]
+  [(set (match_operand:SF	    0 "s_register_operand" "=t")
+	(unsigned_float:SF (match_operand:SI 1 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fuitos%?\\t%0, %1"
   [(set_attr "predicable" "yes")
@@ -863,7 +876,7 @@
 
 (define_insn "floatunssidf2"
   [(set (match_operand:DF	    0 "s_register_operand" "=w")
-	(unsigned_float:DF (match_operand:SI 1 "s_register_operand" "w")))]
+	(unsigned_float:DF (match_operand:SI 1 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fuitod%?\\t%P0, %1"
   [(set_attr "predicable" "yes")
@@ -874,8 +887,8 @@
 ;; Sqrt insns.
 
 (define_insn "*sqrtsf2_vfp"
-  [(set (match_operand:SF	   0 "s_register_operand" "=w")
-	(sqrt:SF (match_operand:SF 1 "s_register_operand" "w")))]
+  [(set (match_operand:SF	   0 "s_register_operand" "=t")
+	(sqrt:SF (match_operand:SF 1 "s_register_operand" "t")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "fsqrts%?\\t%0, %1"
   [(set_attr "predicable" "yes")
@@ -905,8 +918,8 @@
 
 (define_insn_and_split "*cmpsf_split_vfp"
   [(set (reg:CCFP CC_REGNUM)
-	(compare:CCFP (match_operand:SF 0 "s_register_operand"  "w")
-		      (match_operand:SF 1 "vfp_compare_operand" "wG")))]
+	(compare:CCFP (match_operand:SF 0 "s_register_operand"  "t")
+		      (match_operand:SF 1 "vfp_compare_operand" "tG")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "#"
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
@@ -920,8 +933,8 @@
 
 (define_insn_and_split "*cmpsf_trap_split_vfp"
   [(set (reg:CCFPE CC_REGNUM)
-	(compare:CCFPE (match_operand:SF 0 "s_register_operand"  "w")
-		       (match_operand:SF 1 "vfp_compare_operand" "wG")))]
+	(compare:CCFPE (match_operand:SF 0 "s_register_operand"  "t")
+		       (match_operand:SF 1 "vfp_compare_operand" "tG")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "#"
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
@@ -968,8 +981,8 @@
 
 (define_insn "*cmpsf_vfp"
   [(set (reg:CCFP VFPCC_REGNUM)
-	(compare:CCFP (match_operand:SF 0 "s_register_operand"  "w,w")
-		      (match_operand:SF 1 "vfp_compare_operand" "w,G")))]
+	(compare:CCFP (match_operand:SF 0 "s_register_operand"  "t,t")
+		      (match_operand:SF 1 "vfp_compare_operand" "t,G")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "@
    fcmps%?\\t%0, %1
@@ -980,8 +993,8 @@
 
 (define_insn "*cmpsf_trap_vfp"
   [(set (reg:CCFPE VFPCC_REGNUM)
-	(compare:CCFPE (match_operand:SF 0 "s_register_operand"  "w,w")
-		       (match_operand:SF 1 "vfp_compare_operand" "w,G")))]
+	(compare:CCFPE (match_operand:SF 0 "s_register_operand"  "t,t")
+		       (match_operand:SF 1 "vfp_compare_operand" "t,G")))]
   "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
   "@
    fcmpes%?\\t%0, %1
--- .pc/vfpv3-support/gcc/config/arm/aof.h	2007-06-02 13:45:47.000000000 -0700
+++ gcc/config/arm/aof.h	2007-06-02 13:46:00.000000000 -0700
@@ -210,7 +210,11 @@ do {					\
   "s0",  "s1",  "s2",  "s3",  "s4",  "s5",  "s6",  "s7",  \
   "s8",  "s9",  "s10", "s11", "s12", "s13", "s14", "s15", \
   "s16", "s17", "s18", "s19", "s20", "s21", "s22", "s23", \
-  "s24", "s25", "s26", "s27", "s28", "s29", "s30", "s31",  \
+  "s24", "s25", "s26", "s27", "s28", "s29", "s30", "s31", \
+  "d16", "?16", "d17", "?17", "d18", "?18", "d19", "?19", \
+  "d20", "?20", "d21", "?21", "d22", "?22", "d23", "?23", \
+  "d24", "?24", "d25", "?25", "d26", "?26", "d27", "?27", \
+  "d28", "?28", "d29", "?29", "d30", "?30", "d31", "?31", \
   "vfpcc"					\
 }
 
--- .pc/vfpv3-support/gcc/config/arm/constraints.md	2007-06-02 13:45:47.000000000 -0700
+++ gcc/config/arm/constraints.md	2007-06-02 13:46:00.000000000 -0700
@@ -20,7 +20,7 @@
 ;; Boston, MA 02110-1301, USA.
 
 ;; The following register constraints have been used:
-;; - in ARM/Thumb-2 state: f, v, w, y, z
+;; - in ARM/Thumb-2 state: f, t, v, w, x, y, z
 ;; - in Thumb state: h, k, b
 ;; - in both states: l, c
 ;; In ARM state, 'l' is an alias for 'r'
@@ -30,7 +30,7 @@
 ;; in Thumb-1 state: I, J, K, L, M, N, O
 
 ;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dv
 
 ;; The following memory constraints have been used:
 ;; in ARM/Thumb-2 state: Q, Uv, Uy
@@ -40,11 +40,18 @@
 (define_register_constraint "f" "TARGET_ARM ? FPA_REGS : NO_REGS"
  "Legacy FPA registers @code{f0}-@code{f7}.")
 
+(define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
+ "The VFP registers @code{s0}-@code{s31}.")
+
 (define_register_constraint "v" "TARGET_ARM ? CIRRUS_REGS : NO_REGS"
  "The Cirrus Maverick co-processor registers.")
 
-(define_register_constraint "w" "TARGET_ARM ? VFP_REGS : NO_REGS"
- "The VFP registers @code{s0}-@code{s31}.")
+(define_register_constraint "w"
+  "TARGET_32BIT ? (TARGET_VFP3 ? VFP_REGS : VFP_LO_REGS) : NO_REGS"
+ "The VFP registers @code{d0}-@code{d15}, or @code{d0}-@code{d31} for VFPv3.")
+
+(define_register_constraint "x" "TARGET_32BIT ? VFP_D0_D7_REGS : NO_REGS"
+ "The VFP registers @code{d0}-@code{d7}.")
 
 (define_register_constraint "y" "TARGET_REALLY_IWMMXT ? IWMMXT_REGS : NO_REGS"
  "The Intel iWMMX co-processor registers.")
@@ -157,6 +164,13 @@
       (match_test "TARGET_32BIT && arm_const_double_inline_cost (op) == 4
 		   && !(optimize_size || arm_ld_sched)")))
 
+(define_constraint "Dv"
+ "@internal
+  In ARM/Thumb-2 state a const_double which can be used with a VFP fconsts
+  or fconstd instruction."
+ (and (match_code "const_double")
+      (match_test "TARGET_32BIT && vfp3_const_double_rtx (op)")))
+
 (define_memory_constraint "Uv"
  "@internal
   In ARM/Thumb-2 state a valid VFP load/store address."
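
For readers unfamiliar with what the new "Dv" constraint accepts: the fconsts/fconstd instructions take an 8-bit immediate, so only a small set of constants qualifies. The sketch below is illustrative only and is not the code in this patch (vfp3_const_double_index in arm.c is the real implementation and works differently). It assumes the architectural encoding in which a value is representable when it equals +/- n * 2^-r with 16 <= n <= 31 and 0 <= r <= 7, and it simply enumerates all 128 magnitudes. The function name vfp3_imm8_sketch is hypothetical.

```c
/* Illustrative sketch only; NOT the patch's vfp3_const_double_index.
   Assumption: a VFPv3 immediate is +/- n * 2^-r, with 16 <= n <= 31 and
   0 <= r <= 7, packed as sign (bit 7), exponent field (bits 6..4) and
   fraction n - 16 (bits 3..0).

   Returns the 8-bit immediate for X, or -1 if X has no such encoding.  */
int
vfp3_imm8_sketch (double x)
{
  int sign = x < 0.0;
  double mag = sign ? -x : x;
  int r, n;

  /* Brute-force search over every representable magnitude.  Division by
     a power of two is exact, so the equality test is safe.  Zero, NaN and
     infinities never compare equal and therefore fall through to -1.  */
  for (r = 0; r <= 7; r++)
    for (n = 16; n <= 31; n++)
      if (mag == (double) n / (double) (1 << r))
        return (sign << 7) | ((r ^ 3) << 4) | (n - 16);

  return -1;
}
```

Under these assumptions 1.0 maps to 0x70 and 0.125 (the smallest positive magnitude) to 0x40, while 0.0 and 0.1 are rejected and so would still have to come from the constant pool or a core register.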

Follow-Ups:
- Re: [ARM] ARM NEON support part 1/7: VFPv3 support
  - From: Richard Earnshaw
