ia64 long double support

Richard Henderson rth@cygnus.com
Mon Aug 14 14:23:00 GMT 2000


IA-64 uses the same 80-bit format that IA-32 uses.  The quirk is that
memory loads and stores absolutely must be 128-bit aligned.
We tried reusing the hacks that were added for the i960 (which has the
same issue), but found that somewhere in the last 5 years they had broken
pretty severely.

This patch takes a much simpler approach.  Namely, we consider long double
to be a 128-bit TFmode type.  This fixes all of the type alignment and size
issues.
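
To make the layout concrete, here is a minimal sketch (not code from the
patch) of an 80-bit value stored in a 16-byte slot, in the spirit of the
toe64 change in the diff, which clears the last 6 bytes of the 16-byte
image.  The names store_padded_extended and padding_is_clear are
hypothetical, made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sizes assumed for this sketch: 10 significant bytes (80 bits)
   stored in a 16-byte (128-bit) slot.  */
enum { EXT_VALUE_BYTES = 10, EXT_SLOT_BYTES = 16 };

/* Hypothetical helper: copy the 10 value bytes into the slot and
   zero the remaining 6 bytes of padding.  */
static void
store_padded_extended (unsigned char *slot, const unsigned char *value)
{
  assert (slot != NULL && value != NULL);
  memcpy (slot, value, EXT_VALUE_BYTES);
  memset (slot + EXT_VALUE_BYTES, 0, EXT_SLOT_BYTES - EXT_VALUE_BYTES);
}

/* Hypothetical helper: return nonzero if every padding byte is zero.  */
static int
padding_is_clear (const unsigned char *slot)
{
  size_t i;

  for (i = EXT_VALUE_BYTES; i < EXT_SLOT_BYTES; i++)
    if (slot[i] != 0)
      return 0;
  return 1;
}
```

Calling store_padded_extended on a slot previously filled with garbage
should leave the first 10 bytes equal to the value and the last 6 zeroed,
which is all that the size and alignment rules need from the type.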

The only thing left is real.c, which wants TFmode to be the IEEE Quad
precision format.  I hack around this by defining INTEL_EXTENDED_IEEE_FORMAT,
which tells real.c to treat a 128-bit long double as the 80-bit Intel
format in a padded image.  All in all, I think this is much cleaner than
the horror of tracking down one more missed invocation of ROUND_TYPE_SIZE.
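
The pattern used throughout real.c can be sketched as follows.  This is
an illustration only, not code from the patch: the enum names and
format_for_mode are hypothetical, and only the macro name
INTEL_EXTENDED_IEEE_FORMAT comes from the patch itself.  With the macro
defined, the TFmode case loses its body and falls through to the XFmode
handling.

```c
#include <assert.h>

/* Defined by the target, as ia64.h does in this patch.  */
#define INTEL_EXTENDED_IEEE_FORMAT

/* Hypothetical stand-ins for the machine modes and float formats.  */
enum sketch_mode { SKETCH_XFmode, SKETCH_TFmode };
enum sketch_format { FORMAT_EXTENDED_80, FORMAT_QUAD_113 };

/* Hypothetical helper: report which format a mode is converted with.  */
static enum sketch_format
format_for_mode (enum sketch_mode mode)
{
  switch (mode)
    {
    case SKETCH_TFmode:
#ifndef INTEL_EXTENDED_IEEE_FORMAT
      /* Ordinary 128-bit target: TFmode is IEEE Quad.  */
      return FORMAT_QUAD_113;
#endif
      /* FALLTHRU */

    case SKETCH_XFmode:
      /* 80-bit Intel extended format.  */
      return FORMAT_EXTENDED_80;

    default:
      assert (!"unknown mode in sketch");
    }
  return FORMAT_EXTENDED_80;
}
```

Removing the #define in the sketch switches TFmode back to the Quad
path, which is the behaviour other 128-bit targets keep.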

The other bit of quirkiness in here is that it is impossible, or at least
exceedingly difficult, to move TFmode data between FR_REGS and GR_REGS
at reload time.  My solution, which I'm not exactly happy about, is to
take proactive measures to prevent any situation like this from occurring,
despite the fact that varargs functions take their TFmode arguments in
GR_REGS.  I do this by breaking up any (subreg:TF (reg:TI)) constructs
I see, as well as dumping to memory any register I can't prove is going
to be a GR_REG to begin with.  I also define predicates such that these
things can't creep in again.

This, surprisingly, seems to work fairly well.  And the generated code
is as good as or better than you could hope for from reload, since CSE
and friends get a crack at optimizing the memory references.



r~

	* configure.in (ia64-*): Set float_format for i386 long double.

	* real.c (GET_REAL): Treat 128-bit INTEL_EXTENDED_IEEE_FORMAT
	as we would for i386 XFmode.
	(PUT_REAL): Likewise.
	(endian, ereal_atof, real_value_truncate): Likewise.
	(ereal_isneg, toe64, etens, make_nan): Likewise.
	* real.h (REAL_VALUE_TO_TARGET_LONG_DOUBLE): Likewise.

	* config/ia64/ia64-protos.h: Update.
	* config/ia64/ia64.c (general_tfmode_operand): New.
	(destination_tfmode_operand): New.
	(tfreg_or_fp01_operand): New.
	(ia64_split_timode): New.
	(spill_tfmode_operand): New.
	(ia64_expand_prologue): Use TFmode not XFmode.
	(ia64_expand_epilogue): Likewise.
	(ia64_function_arg): Likewise.
	(ia64_function_arg_advance): Likewise.
	(ia64_return_in_memory): Likewise.
	(ia64_function_value): Likewise.
	(ia64_print_operand): Likewise.
	(ia64_register_move_cost): Set GR<->FR to 5.
	(ia64_secondary_reload_class): Get GR for TImode memory op.
	* config/ia64/ia64.h (ROUND_TYPE_SIZE): Remove.
	(ROUND_TYPE_ALIGN): Remove.
	(LONG_DOUBLE_TYPE_SIZE): Set to 128.
	(INTEL_EXTENDED_IEEE_FORMAT): Define.
	(HARD_REGNO_NREGS): Use TFmode, not XFmode.
	(HARD_REGNO_MODE_OK): Likewise.  Disallow TImode in FRs.
	(MODES_TIEABLE_P): Use TFmode, not XFmode.
	(CLASS_MAX_NREGS): Likewise.
	(ASM_OUTPUT_LONG_DOUBLE): Output by 4 byte hunks.
	(PREDICATE_CODES): Update.
	* config/ia64/ia64.md (movti): New.
	(movti_internal): Use a clobber for memory alternatives.
	(reload_inti, reload_outti): New.
	(movsfcc_astep): Predicate properly.
	(movdfcc_astep): Likewise.
	(movxf): Remove.
	(movtf): New.
	(extendsftf2, extenddftf2): New.
	(trunctfsf2, trunctfdf2): New.
	(floatditf2, fix_trunctfdi2): New.
	(floatunsditf2, fixuns_trunctfdi2): New.
	(addtf3, subtf3, multf3, abstf2): New.
	(negtf2, nabstf2, mintf3, maxtf3): New.
	(maddtf3, msubtf3, nmultf3, nmaddtf3): New.
	(cmptf): New.
	(fr_spill): Use TFmode, not XFmode.
	(fr_restore): Likewise.
	* config/ia64/lib1funcs.asm (__divtf3): New.
	* config/ia64/t-ia64 (LIB1ASMFUNCS): Add it.

Index: configure.in
===================================================================
RCS file: /cvs/gcc/egcs/gcc/configure.in,v
retrieving revision 1.406
diff -c -p -d -r1.406 configure.in
*** configure.in	2000/08/14 18:08:45	1.406
--- configure.in	2000/08/14 21:00:05
*************** changequote([,])dnl
*** 2020,2025 ****
--- 2020,2026 ----
  		then
  			target_cpu_default="${target_cpu_default}|MASK_GNU_LD"
  		fi
+ 		float_format=i386
  		;;
  	ia64*-*-linux*)
  		tm_file=ia64/linux.h
*************** changequote([,])dnl
*** 2028,2033 ****
--- 2029,2035 ----
   		if test x$enable_threads = xyes; then
   			thread_file='posix'
   		fi
+ 		float_format=i386
  		;;
  	m32r-*-elf*)
  		extra_parts="crtinit.o crtfini.o"
Index: real.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/real.c,v
retrieving revision 1.41
diff -c -p -d -r1.41 real.c
*** real.c	2000/07/26 23:18:47	1.41
--- real.c	2000/08/14 21:00:05
*************** netlib.att.com: netlib/cephes.   */
*** 95,101 ****
  
     The case LONG_DOUBLE_TYPE_SIZE = 128 activates TFmode support
     and may deactivate XFmode since `long double' is used to refer
!    to both modes.
  
     The macros FLOAT_WORDS_BIG_ENDIAN, HOST_FLOAT_WORDS_BIG_ENDIAN,
     contributed by Richard Earnshaw <Richard.Earnshaw@cl.cam.ac.uk>,
--- 95,103 ----
  
     The case LONG_DOUBLE_TYPE_SIZE = 128 activates TFmode support
     and may deactivate XFmode since `long double' is used to refer
!    to both modes.  Defining INTEL_EXTENDED_IEEE_FORMAT at the same 
!    time enables 80387-style 80-bit floats in a 128-bit padded
!    image, as seen on IA-64.
  
     The macros FLOAT_WORDS_BIG_ENDIAN, HOST_FLOAT_WORDS_BIG_ENDIAN,
     contributed by Richard Earnshaw <Richard.Earnshaw@cl.cam.ac.uk>,
*************** unknown arithmetic type
*** 244,273 ****
     A REAL_VALUE_TYPE is guaranteed to occupy contiguous locations
     in memory, with no holes.  */
  
! #if MAX_LONG_DOUBLE_TYPE_SIZE == 96
  /* Number of 16 bit words in external e type format */
! #define NE 6
! #define MAXDECEXP 4932
! #define MINDECEXP -4956
! #define GET_REAL(r,e) bcopy ((char *) r, (char *) e, 2*NE)
! #define PUT_REAL(e,r)				\
! do {						\
!   if (2*NE < sizeof(*r))			\
!     bzero((char *)r, sizeof(*r));		\
!   bcopy ((char *) e, (char *) r, 2*NE);		\
! } while (0)
! #else /* no XFmode */
! #if MAX_LONG_DOUBLE_TYPE_SIZE == 128
! #define NE 10
! #define MAXDECEXP 4932
! #define MINDECEXP -4977
! #define GET_REAL(r,e) bcopy ((char *) r, (char *) e, 2*NE)
! #define PUT_REAL(e,r)				\
! do {						\
!   if (2*NE < sizeof(*r))			\
!     bzero((char *)r, sizeof(*r));		\
!   bcopy ((char *) e, (char *) r, 2*NE);		\
! } while (0)
  #else
  #define NE 6
  #define MAXDECEXP 4932
--- 246,276 ----
     A REAL_VALUE_TYPE is guaranteed to occupy contiguous locations
     in memory, with no holes.  */
  
! #if MAX_LONG_DOUBLE_TYPE_SIZE == 96 || \
!     (defined(INTEL_EXTENDED_IEEE_FORMAT) && MAX_LONG_DOUBLE_TYPE_SIZE == 128)
  /* Number of 16 bit words in external e type format */
! # define NE 6
! # define MAXDECEXP 4932
! # define MINDECEXP -4956
! # define GET_REAL(r,e)  memcpy ((char *)(e), (char *)(r), 2*NE)
! # define PUT_REAL(e,r)						\
! 	do {							\
! 	  memcpy ((char *)(r), (char *)(e), 2*NE);		\
! 	  if (2*NE < sizeof(*r))				\
! 	    memset ((char *)(r) + 2*NE, 0, sizeof(*r) - 2*NE);	\
! 	} while (0)
! # else /* no XFmode */
! #  if MAX_LONG_DOUBLE_TYPE_SIZE == 128
! #   define NE 10
! #   define MAXDECEXP 4932
! #   define MINDECEXP -4977
! #   define GET_REAL(r,e) memcpy ((char *)(e), (char *)(r), 2*NE)
! #   define PUT_REAL(e,r)					\
! 	do {							\
! 	  memcpy ((char *)(r), (char *)(e), 2*NE);		\
! 	  if (2*NE < sizeof(*r))				\
! 	    memset ((char *)(r) + 2*NE, 0, sizeof(*r) - 2*NE);	\
! 	} while (0)
  #else
  #define NE 6
  #define MAXDECEXP 4932
*************** endian (e, x, mode)
*** 497,507 ****
--- 500,512 ----
        switch (mode)
  	{
  	case TFmode:
+ #ifndef INTEL_EXTENDED_IEEE_FORMAT
  	  /* Swap halfwords in the fourth long.  */
  	  th = (unsigned long) e[6] & 0xffff;
  	  t = (unsigned long) e[7] & 0xffff;
  	  t |= th << 16;
  	  x[3] = (long) t;
+ #endif
  
  	case XFmode:
  	  /* Swap halfwords in the third long.  */
*************** endian (e, x, mode)
*** 539,549 ****
--- 544,556 ----
        switch (mode)
  	{
  	case TFmode:
+ #ifndef INTEL_EXTENDED_IEEE_FORMAT
  	  /* Pack the fourth long.  */
  	  th = (unsigned long) e[7] & 0xffff;
  	  t = (unsigned long) e[6] & 0xffff;
  	  t |= th << 16;
  	  x[3] = (long) t;
+ #endif
  
  	case XFmode:
  	  /* Pack the third long.
*************** ereal_atof (s, t)
*** 737,752 ****
        e53toe (tem, e);
        break;
  
-     case XFmode:
-       asctoe64 (s, tem);
-       e64toe (tem, e);
-       break;
- 
      case TFmode:
        asctoe113 (s, tem);
        e113toe (tem, e);
        break;
  
      default:
        asctoe (s, e);
      }
--- 744,762 ----
        e53toe (tem, e);
        break;
  
      case TFmode:
+ #ifndef INTEL_EXTENDED_IEEE_FORMAT
        asctoe113 (s, tem);
        e113toe (tem, e);
        break;
+ #endif
+       /* FALLTHRU */
  
+     case XFmode:
+       asctoe64 (s, tem);
+       e64toe (tem, e);
+       break;
+ 
      default:
        asctoe (s, e);
      }
*************** real_value_truncate (mode, arg)
*** 1070,1078 ****
--- 1080,1091 ----
    switch (mode)
      {
      case TFmode:
+ #ifndef INTEL_EXTENDED_IEEE_FORMAT
        etoe113 (e, t);
        e113toe (t, t);
        break;
+ #endif
+       /* FALLTHRU */
  
      case XFmode:
        etoe64 (e, t);
*************** ereal_isneg (x)
*** 1486,1492 ****
  
  /*  e type constants used by high precision check routines */
  
! #if MAX_LONG_DOUBLE_TYPE_SIZE == 128
  /* 0.0 */
  unsigned EMUSHORT ezero[NE] =
   {0x0000, 0x0000, 0x0000, 0x0000,
--- 1499,1505 ----
  
  /*  e type constants used by high precision check routines */
  
! #if MAX_LONG_DOUBLE_TYPE_SIZE == 128 && !defined(INTEL_EXTENDED_IEEE_FORMAT)
  /* 0.0 */
  unsigned EMUSHORT ezero[NE] =
   {0x0000, 0x0000, 0x0000, 0x0000,
*************** toe64 (a, b)
*** 3660,3665 ****
--- 3673,3687 ----
  	  /* Clear the last two bytes of 12-byte Intel format */
  	  *(q+1) = 0;
  	}
+ #ifdef INTEL_EXTENDED_IEEE_FORMAT
+       if (LONG_DOUBLE_TYPE_SIZE == 128)
+ 	{
+ 	  /* Clear the last 6 bytes of 16-byte Intel format.  */
+ 	  q[1] = 0;
+ 	  q[2] = 0;
+ 	  q[3] = 0;
+ 	}
+ #endif
      }
  #endif
  
*************** enormlz (x)
*** 4560,4566 ****
  #define NTEN 12
  #define MAXP 4096
  
! #if MAX_LONG_DOUBLE_TYPE_SIZE == 128
  static unsigned EMUSHORT etens[NTEN + 1][NE] =
  {
    {0x6576, 0x4a92, 0x804a, 0x153f,
--- 4582,4588 ----
  #define NTEN 12
  #define MAXP 4096
  
! #if MAX_LONG_DOUBLE_TYPE_SIZE == 128 && !defined(INTEL_EXTENDED_IEEE_FORMAT)
  static unsigned EMUSHORT etens[NTEN + 1][NE] =
  {
    {0x6576, 0x4a92, 0x804a, 0x153f,
*************** make_nan (nan, sign, mode)
*** 6276,6287 ****
--- 6298,6312 ----
     used like NaN's, but probably not in the same way as IEEE.  */
  #if !defined(DEC) && !defined(IBM) && !defined(C4X)
      case TFmode:
+ #ifndef INTEL_EXTENDED_IEEE_FORMAT
        n = 8;
        if (REAL_WORDS_BIG_ENDIAN)
  	p = TFbignan;
        else
  	p = TFlittlenan;
        break;
+ #endif
+       /* FALLTHRU */
  
      case XFmode:
        n = 6;
Index: real.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/real.h,v
retrieving revision 1.24
diff -c -p -d -r1.24 real.h
*** real.h	2000/08/05 00:50:00	1.24
--- real.h	2000/08/14 21:00:05
*************** extern REAL_VALUE_TYPE real_value_trunca
*** 207,217 ****
--- 207,221 ----
    ereal_from_uint (&d, lo, hi, mode)
  
  /* IN is a REAL_VALUE_TYPE.  OUT is an array of longs. */
+ #if defined(INTEL_EXTENDED_IEEE_FORMAT) && MAX_LONG_DOUBLE_TYPE_SIZE == 128
+ #define REAL_VALUE_TO_TARGET_LONG_DOUBLE(IN, OUT) (etarldouble ((IN), (OUT)))
+ #else
  #define REAL_VALUE_TO_TARGET_LONG_DOUBLE(IN, OUT) 		\
     (LONG_DOUBLE_TYPE_SIZE == 64 ? etardouble ((IN), (OUT))	\
      : LONG_DOUBLE_TYPE_SIZE == 96 ? etarldouble ((IN), (OUT))	\
      : LONG_DOUBLE_TYPE_SIZE == 128 ? etartdouble ((IN), (OUT))  \
      : abort())
+ #endif
  #define REAL_VALUE_TO_TARGET_DOUBLE(IN, OUT) (etardouble ((IN), (OUT)))
  
  /* IN is a REAL_VALUE_TYPE.  OUT is a long. */
Index: config/ia64/ia64-protos.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/ia64/ia64-protos.h,v
retrieving revision 1.16
diff -c -p -d -r1.16 ia64-protos.h
*** ia64-protos.h	2000/08/08 10:01:20	1.16
--- ia64-protos.h	2000/08/14 21:00:06
*************** extern int ia64_direct_return PARAMS((vo
*** 59,67 ****
--- 59,72 ----
  extern int predicate_operator PARAMS((rtx, enum machine_mode));
  extern int ar_lc_reg_operand PARAMS((rtx, enum machine_mode));
  extern int ar_ccv_reg_operand PARAMS((rtx, enum machine_mode));
+ extern int general_tfmode_operand PARAMS((rtx, enum machine_mode));
+ extern int destination_tfmode_operand PARAMS((rtx, enum machine_mode));
+ extern int tfreg_or_fp01_operand PARAMS((rtx, enum machine_mode));
  
  extern int ia64_move_ok PARAMS((rtx, rtx));
  extern rtx ia64_gp_save_reg PARAMS((int));
+ extern rtx ia64_split_timode PARAMS((rtx[], rtx, rtx));
+ extern rtx spill_tfmode_operand PARAMS((rtx, int));
  
  extern void ia64_expand_load_address PARAMS((rtx, rtx));
  extern void ia64_expand_fetch_and_op PARAMS ((enum fetchop_code,
*************** extern void ia64_output_end_prologue PAR
*** 112,117 ****
  extern void ia64_init_builtins PARAMS((void));
  extern void ia64_override_options PARAMS((void));
  extern int ia64_dbx_register_number PARAMS((int));
- 
- /* ??? Flag defined in toplev.c, for ia64.md -fssa hack.  */
- extern int flag_ssa;
--- 117,119 ----
Index: config/ia64/ia64.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/ia64/ia64.c,v
retrieving revision 1.41
diff -c -p -d -r1.41 ia64.c
*** ia64.c	2000/08/11 23:10:10	1.41
--- ia64.c	2000/08/14 21:00:06
*************** ar_ccv_reg_operand (op, mode)
*** 570,575 ****
--- 570,615 ----
  	  && GET_CODE (op) == REG
  	  && REGNO (op) == AR_CCV_REGNUM);
  }
+ 
+ /* Like general_operand, but don't allow (mem (addressof)).  */
+ 
+ int
+ general_tfmode_operand (op, mode)
+      rtx op;
+      enum machine_mode mode;
+ {
+   if (! general_operand (op, mode))
+     return 0;
+   if (GET_CODE (op) == MEM && GET_CODE (XEXP (op, 0)) == ADDRESSOF)
+     return 0;
+   return 1;
+ }
+ 
+ /* Similarly.  */
+ 
+ int
+ destination_tfmode_operand (op, mode)
+      rtx op;
+      enum machine_mode mode;
+ {
+   if (! destination_operand (op, mode))
+     return 0;
+   if (GET_CODE (op) == MEM && GET_CODE (XEXP (op, 0)) == ADDRESSOF)
+     return 0;
+   return 1;
+ }
+ 
+ /* Similarly.  */
+ 
+ int
+ tfreg_or_fp01_operand (op, mode)
+      rtx op;
+      enum machine_mode mode;
+ {
+   if (GET_CODE (op) == SUBREG)
+     return 0;
+   return reg_or_fp01_operand (op, mode);
+ }
  
  /* Return 1 if the operands of a move are ok.  */
  
*************** ia64_gp_save_reg (setjmp_p)
*** 681,686 ****
--- 721,826 ----
  
    return save;
  }
+ 
+ /* Split a post-reload TImode reference into two DImode components.  */
+ 
+ rtx
+ ia64_split_timode (out, in, scratch)
+      rtx out[2];
+      rtx in, scratch;
+ {
+   switch (GET_CODE (in))
+     {
+     case REG:
+       out[0] = gen_rtx_REG (DImode, REGNO (in));
+       out[1] = gen_rtx_REG (DImode, REGNO (in) + 1);
+       return NULL_RTX;
+ 
+     case MEM:
+       {
+ 	HOST_WIDE_INT offset;
+ 	rtx base = XEXP (in, 0);
+ 	rtx offset_rtx;
+ 
+ 	switch (GET_CODE (base))
+ 	  {
+ 	  case REG:
+ 	    out[0] = change_address (in, DImode, NULL_RTX);
+ 	    break;
+ 	  case POST_MODIFY:
+ 	    base = XEXP (base, 0);
+ 	    out[0] = change_address (in, DImode, NULL_RTX);
+ 	    break;
+ 
+ 	  /* Since we're changing the mode, we need to change to POST_MODIFY
+ 	     as well to preserve the size of the increment.  Either that or
+ 	     do the update in two steps, but we've already got this scratch
+ 	     register handy so let's use it.  */
+ 	  case POST_INC:
+ 	    base = XEXP (base, 0);
+ 	    out[0] = change_address (in, DImode,
+ 	      gen_rtx_POST_MODIFY (Pmode, base,plus_constant (base, 16)));
+ 	    break;
+ 	  case POST_DEC:
+ 	    base = XEXP (base, 0);
+ 	    out[0] = change_address (in, DImode,
+ 	      gen_rtx_POST_MODIFY (Pmode, base,plus_constant (base, -16)));
+ 	    break;
+ 	  default:
+ 	    abort ();
+ 	  }
+ 
+ 	if (scratch == NULL_RTX)
+ 	  abort ();
+ 	out[1] = change_address (in, DImode, scratch);
+ 	return gen_adddi3 (scratch, base, GEN_INT (8));
+       }
+ 
+     case CONST_INT:
+     case CONST_DOUBLE:
+       split_double (in, &out[0], &out[1]);
+       return NULL_RTX;
+ 
+     default:
+       abort ();
+     }
+ }
+ 
+ /* ??? Fixing GR->FR TFmode moves during reload is hard.  You need to go
+    through memory plus an extra GR scratch register.  Except that you can
+    either get the first from SECONDARY_MEMORY_NEEDED or the second from
+    SECONDARY_RELOAD_CLASS, but not both.
+ 
+    We got into problems in the first place by allowing a construct like
+    (subreg:TF (reg:TI)), which we got from a union containing a long double.  
+    This solution attempts to prevent this situation from occurring.  When
+    we see something like the above, we spill the inner register to memory.  */
+ 
+ rtx
+ spill_tfmode_operand (in, force)
+      rtx in;
+      int force;
+ {
+   if (GET_CODE (in) == SUBREG
+       && GET_MODE (SUBREG_REG (in)) == TImode
+       && GET_CODE (SUBREG_REG (in)) == REG)
+     {
+       rtx mem = gen_mem_addressof (SUBREG_REG (in), NULL_TREE);
+       return gen_rtx_MEM (TFmode, copy_to_reg (XEXP (mem, 0)));
+     }
+   else if (force && GET_CODE (in) == REG)
+     {
+       rtx mem = gen_mem_addressof (in, NULL_TREE);
+       return gen_rtx_MEM (TFmode, copy_to_reg (XEXP (mem, 0)));
+     }
+   else if (GET_CODE (in) == MEM
+ 	   && GET_CODE (XEXP (in, 0)) == ADDRESSOF)
+     {
+       return change_address (in, TFmode, copy_to_reg (XEXP (in, 0)));
+     }
+   else
+     return in;
+ }
  
  /* Begin the assembly file.  */
  
*************** ia64_expand_prologue ()
*** 1702,1708 ****
        {
          if (cfa_off & 15)
  	  abort ();
! 	reg = gen_rtx_REG (XFmode, regno);
  	do_spill (gen_fr_spill_x, reg, cfa_off, reg);
  	cfa_off -= 16;
        }
--- 1842,1848 ----
        {
          if (cfa_off & 15)
  	  abort ();
! 	reg = gen_rtx_REG (TFmode, regno);
  	do_spill (gen_fr_spill_x, reg, cfa_off, reg);
  	cfa_off -= 16;
        }
*************** ia64_expand_epilogue ()
*** 1867,1873 ****
        {
          if (cfa_off & 15)
  	  abort ();
! 	reg = gen_rtx_REG (XFmode, regno);
  	do_restore (gen_fr_restore_x, reg, cfa_off);
  	cfa_off -= 16;
        }
--- 2007,2013 ----
        {
          if (cfa_off & 15)
  	  abort ();
! 	reg = gen_rtx_REG (TFmode, regno);
  	do_restore (gen_fr_restore_x, reg, cfa_off);
  	cfa_off -= 16;
        }
*************** ia64_function_arg (cum, mode, type, name
*** 2304,2310 ****
  				      gen_rtx_REG (hfa_mode, (FR_ARG_FIRST
  							      + fp_regs)),
  				      GEN_INT (offset));
- 	  /* ??? Padding for XFmode type?  */
  	  offset += hfa_size;
  	  args_byte_size += hfa_size;
  	  fp_regs++;
--- 2444,2449 ----
*************** ia64_function_arg_advance (cum, mode, ty
*** 2484,2490 ****
        for (; (offset < byte_size && fp_regs < MAX_ARGUMENT_SLOTS
  	      && args_byte_size < (MAX_ARGUMENT_SLOTS * UNITS_PER_WORD));)
  	{
- 	  /* ??? Padding for XFmode type?  */
  	  offset += hfa_size;
  	  args_byte_size += hfa_size;
  	  fp_regs++;
--- 2623,2628 ----
*************** ia64_return_in_memory (valtype)
*** 2586,2592 ****
      {
        int hfa_size = GET_MODE_SIZE (hfa_mode);
  
-       /* ??? Padding for XFmode type?  */
        if (byte_size / hfa_size > MAX_ARGUMENT_SLOTS)
  	return 1;
        else
--- 2724,2729 ----
*************** ia64_function_value (valtype, func)
*** 2629,2635 ****
  	  loc[i] = gen_rtx_EXPR_LIST (VOIDmode,
  				      gen_rtx_REG (hfa_mode, FR_ARG_FIRST + i),
  				      GEN_INT (offset));
- 	  /* ??? Padding for XFmode type?  */
  	  offset += hfa_size;
  	}
  
--- 2766,2771 ----
*************** ia64_print_operand (file, x, code)
*** 2782,2800 ****
  
  	  case POST_INC:
  	    value = GET_MODE_SIZE (GET_MODE (x));
- 
- 	    /* ??? This is for ldf.fill and stf.spill which use XFmode,
- 	       but which actually need 16 bytes increments.  Perhaps we
- 	       can change them to use TFmode instead.  Or don't use
- 	       POST_DEC/POST_INC for them.  */
- 	    if (value == 12)
- 	      value = 16;
  	    break;
  
  	  case POST_DEC:
  	    value = - (HOST_WIDE_INT) GET_MODE_SIZE (GET_MODE (x));
- 	    if (value == -12)
- 	      value = -16;
  	    break;
  	  }
  
--- 2918,2927 ----
*************** ia64_register_move_cost (from, to)
*** 2930,2946 ****
--- 3057,3084 ----
  {
    int from_hard, to_hard;
    int from_gr, to_gr;
+   int from_fr, to_fr;
  
    from_hard = (from == BR_REGS || from == AR_M_REGS || from == AR_I_REGS);
    to_hard = (to == BR_REGS || to == AR_M_REGS || to == AR_I_REGS);
    from_gr = (from == GENERAL_REGS);
    to_gr = (to == GENERAL_REGS);
+   from_fr = (from == FR_REGS);
+   to_fr = (to == FR_REGS);
  
    if (from_hard && to_hard)
      return 8;
    else if ((from_hard && !to_gr) || (!from_gr && to_hard))
      return 6;
  
+   /* ??? Moving from FR<->GR must be more expensive than 2, so that we get
+      secondary memory reloads for TFmode moves.  Unfortunately, we don't
+      have the mode here, so we can't check that.  */
+   /* Moreover, we have to make this at least as high as MEMORY_MOVE_COST
+      to avoid spectacularly poor register class preferencing for TFmode.  */
+   else if (from_fr != to_fr)
+     return 5;
+ 
    return 2;
  }
  
*************** ia64_secondary_reload_class (class, mode
*** 3015,3020 ****
--- 3153,3165 ----
  	 common for C++ programs that use exceptions.  To reproduce,
  	 return NO_REGS and compile libstdc++.  */
        if (GET_CODE (x) == MEM)
+ 	return GR_REGS;
+       break;
+ 
+     case GR_REGS:
+       /* Since we have no offsettable memory addresses, we need a temporary
+ 	 to hold the address of the second word.  */
+       if (mode == TImode)
  	return GR_REGS;
        break;
  
Index: config/ia64/ia64.h
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/ia64/ia64.h,v
retrieving revision 1.35
diff -c -p -d -r1.35 ia64.h
*** ia64.h	2000/08/14 20:18:17	1.35
--- ia64.h	2000/08/14 21:00:06
*************** while (0)
*** 383,405 ****
     a field, not crossing a boundary for it.  */
  #define PCC_BITFIELD_TYPE_MATTERS 1
  
- /* Define this macro as an expression for the overall size of a structure
-    (given by STRUCT as a tree node) when the size computed from the fields is
-    SIZE and the alignment is ALIGN.
- 
-    The default is to round SIZE up to a multiple of ALIGN.  */
- /* ??? Might need this for 80-bit double-extended floats.  */
- /* #define ROUND_TYPE_SIZE(STRUCT, SIZE, ALIGN) */
- 
- /* Define this macro as an expression for the alignment of a structure (given
-    by STRUCT as a tree node) if the alignment computed in the usual way is
-    COMPUTED and the alignment explicitly specified was SPECIFIED.
- 
-    The default is to use SPECIFIED if it is larger; otherwise, use the smaller
-    of COMPUTED and `BIGGEST_ALIGNMENT' */
- /* ??? Might need this for 80-bit double-extended floats.  */
- /* #define ROUND_TYPE_ALIGN(STRUCT, COMPUTED, SPECIFIED) */
- 
  /* An integer expression for the size in bits of the largest integer machine
     mode that should actually be used.  */
  
--- 383,388 ----
*************** while (0)
*** 465,473 ****
  
  /* A C expression for the size in bits of the type `long double' on the target
     machine.  If you don't define this, the default is two words.  */
! /* ??? We have an 80 bit extended double format.  */
! #define LONG_DOUBLE_TYPE_SIZE 64
  
  /* An expression whose value is 1 or 0, according to whether the type `char'
     should be signed or unsigned by default.  The user can always override this
     default with the options `-fsigned-char' and `-funsigned-char'.  */
--- 448,459 ----
  
  /* A C expression for the size in bits of the type `long double' on the target
     machine.  If you don't define this, the default is two words.  */
! #define LONG_DOUBLE_TYPE_SIZE 128
  
+ /* Tell real.c that this is the 80-bit Intel extended float format
+    packaged in a 128-bit entity.  */
+ #define INTEL_EXTENDED_IEEE_FORMAT
+ 
  /* An expression whose value is 1 or 0, according to whether the type `char'
     should be signed or unsigned by default.  The user can always override this
     default with the options `-fsigned-char' and `-funsigned-char'.  */
*************** while (0)
*** 812,818 ****
  /* A C expression for the number of consecutive hard registers, starting at
     register number REGNO, required to hold a value of mode MODE.  */
  
- /* ??? x86 80-bit FP values only require 1 register.  */
  /* ??? We say that CCmode values require two registers.  This allows us to
     easily store the normal and inverted values.  We use CCImode to indicate
     a single predicate register.  */
--- 798,803 ----
*************** while (0)
*** 821,839 ****
    ((REGNO) == PR_REG (0) && (MODE) == DImode ? 64			\
     : PR_REGNO_P (REGNO) && (MODE) == CCmode ? 2				\
     : PR_REGNO_P (REGNO) && (MODE) == CCImode ? 1			\
!    : FR_REGNO_P (REGNO) && (MODE) == XFmode ? 1				\
     : (GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
  
  /* A C expression that is nonzero if it is permissible to store a value of mode
     MODE in hard register number REGNO (or in several registers starting with
     that one).  */
  
! #define HARD_REGNO_MODE_OK(REGNO, MODE) \
!   (FR_REGNO_P (REGNO) ? GET_MODE_CLASS (MODE) != MODE_CC		\
     : PR_REGNO_P (REGNO) ? GET_MODE_CLASS (MODE) == MODE_CC		\
!    : GR_REGNO_P (REGNO) ? (MODE) != XFmode && (MODE) != CCImode		\
     : AR_REGNO_P (REGNO) ? (MODE) == DImode				\
!    : 1)
  
  /* A C expression that is nonzero if it is desirable to choose register
     allocation so as to avoid move instructions between a value of mode MODE1
--- 806,825 ----
    ((REGNO) == PR_REG (0) && (MODE) == DImode ? 64			\
     : PR_REGNO_P (REGNO) && (MODE) == CCmode ? 2				\
     : PR_REGNO_P (REGNO) && (MODE) == CCImode ? 1			\
!    : FR_REGNO_P (REGNO) && (MODE) == TFmode ? 1				\
     : (GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
  
  /* A C expression that is nonzero if it is permissible to store a value of mode
     MODE in hard register number REGNO (or in several registers starting with
     that one).  */
  
! #define HARD_REGNO_MODE_OK(REGNO, MODE)					\
!   (FR_REGNO_P (REGNO) ? GET_MODE_CLASS (MODE) != MODE_CC && (MODE) != TImode \
     : PR_REGNO_P (REGNO) ? GET_MODE_CLASS (MODE) == MODE_CC		\
!    : GR_REGNO_P (REGNO) ? (MODE) != CCImode && (MODE) != TFmode		\
     : AR_REGNO_P (REGNO) ? (MODE) == DImode				\
!    : BR_REGNO_P (REGNO) ? (MODE) == DImode				\
!    : 0)
  
  /* A C expression that is nonzero if it is desirable to choose register
     allocation so as to avoid move instructions between a value of mode MODE1
*************** while (0)
*** 846,856 ****
     INTEGRAL_MODE_P or FLOAT_MODE_P and the other is not.  Otherwise, it is
     true.  */
  /* Don't tie integer and FP modes, as that causes us to get integer registers
!    allocated for FP instructions.  XFmode only supported in FP registers at
!    the moment, so we can't tie it with any other modes.  */
  #define MODES_TIEABLE_P(MODE1, MODE2) \
    ((GET_MODE_CLASS (MODE1) == GET_MODE_CLASS (MODE2)) \
!    && (((MODE1) == XFmode) == ((MODE2) == XFmode)))
  
  /* Define this macro if the compiler should avoid copies to/from CCmode
     registers.  You should only define this macro if support fo copying to/from
--- 832,842 ----
     INTEGRAL_MODE_P or FLOAT_MODE_P and the other is not.  Otherwise, it is
     true.  */
  /* Don't tie integer and FP modes, as that causes us to get integer registers
!    allocated for FP instructions.  TFmode only supported in FP registers so
!    we can't tie it with any other modes.  */
  #define MODES_TIEABLE_P(MODE1, MODE2) \
    ((GET_MODE_CLASS (MODE1) == GET_MODE_CLASS (MODE2)) \
!    && (((MODE1) == TFmode) == ((MODE2) == TFmode)))
  
  /* Define this macro if the compiler should avoid copies to/from CCmode
     registers.  You should only define this macro if support fo copying to/from
*************** enum reg_class
*** 1044,1053 ****
     registers of CLASS1 can only be copied to registers of class CLASS2 by
     storing a register of CLASS1 into memory and loading that memory location
     into a register of CLASS2.  */
! /* ??? We may need this for XFmode moves between FR and GR regs.  Using
!    getf.sig/getf.exp almost works, but the result in the GR regs is not
!    properly formatted and has two extra bits.  */
! /* #define SECONDARY_MEMORY_NEEDED(CLASS1, CLASS2, M) */
  
  /* A C expression for the maximum number of consecutive registers of
     class CLASS needed to hold a value of mode MODE.
--- 1030,1045 ----
     registers of CLASS1 can only be copied to registers of class CLASS2 by
     storing a register of CLASS1 into memory and loading that memory location
     into a register of CLASS2.  */
! 
! #if 0
! /* ??? May need this, but since we've disallowed TFmode in GR_REGS,
!    I'm not quite sure how it could be invoked.  The normal problems
!    with unions should be solved with the addressof fiddling done by
!    movtf and friends.  */
! #define SECONDARY_MEMORY_NEEDED(CLASS1, CLASS2, MODE)			\
!   ((MODE) == TFmode && (((CLASS1) == GR_REGS && (CLASS2) == FR_REGS)	\
! 			|| ((CLASS1) == FR_REGS && (CLASS2) == GR_REGS)))
! #endif
  
  /* A C expression for the maximum number of consecutive registers of
     class CLASS needed to hold a value of mode MODE.
*************** enum reg_class
*** 1055,1061 ****
  
  #define CLASS_MAX_NREGS(CLASS, MODE) \
    ((MODE) == CCmode && (CLASS) == PR_REGS ? 2			\
!    : ((CLASS) == FR_REGS && (MODE) == XFmode) ? 1		\
     : (GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
  
  /* If defined, gives a class of registers that cannot be used as the
--- 1047,1053 ----
  
  #define CLASS_MAX_NREGS(CLASS, MODE) \
    ((MODE) == CCmode && (CLASS) == PR_REGS ? 2			\
!    : ((CLASS) == FR_REGS && (MODE) == TFmode) ? 1		\
     : (GET_MODE_SIZE (MODE) + UNITS_PER_WORD - 1) / UNITS_PER_WORD)
  
  /* If defined, gives a class of registers that cannot be used as the
*************** do {									\
*** 1785,1796 ****
     executed if memory address X (an RTX) can have different meanings depending
     on the machine mode of the memory reference it is used for or if the address
     is valid for some modes but not others.  */
- 
- /* ??? Strictly speaking this isn't true, because we can use any increment with
-    any mode.  Unfortunately, the RTL implies that the increment depends on the
-    mode, so we need this for now.  */
  
! #define GO_IF_MODE_DEPENDENT_ADDRESS(ADDR, LABEL) \
    if (GET_CODE (ADDR) == POST_DEC || GET_CODE (ADDR) == POST_INC)	\
      goto LABEL;
  
--- 1777,1784 ----
     executed if memory address X (an RTX) can have different meanings depending
     on the machine mode of the memory reference it is used for or if the address
     is valid for some modes but not others.  */
  
! #define GO_IF_MODE_DEPENDENT_ADDRESS(ADDR, LABEL)			\
    if (GET_CODE (ADDR) == POST_DEC || GET_CODE (ADDR) == POST_INC)	\
      goto LABEL;
  
*************** do {									\
*** 1996,2015 ****
  /* Output of Data.  */
  
  /* A C statement to output to the stdio stream STREAM an assembler instruction
!    to assemble a floating-point constant of `XFmode', `DFmode', `SFmode',
     respectively, whose value is VALUE.  */
  
- /* ??? This has not been tested.  Long doubles are really 10 bytes not 12
-    bytes on ia64.  */
- 
  /* ??? Must reverse the word order for big-endian code?  */
  
  #define ASM_OUTPUT_LONG_DOUBLE(FILE, VALUE) \
  do {									\
    long t[3];								\
    REAL_VALUE_TO_TARGET_LONG_DOUBLE (VALUE, t);				\
!   fprintf (FILE, "\tdata8 0x%08lx, 0x%08lx, 0x%08lx\n",			\
! 	   t[0] & 0xffffffff, t[1] & 0xffffffff, t[2] & 0xffffffff);	\
  } while (0)
  
  /* ??? Must reverse the word order for big-endian code?  */
--- 1984,2000 ----
  /* Output of Data.  */
  
  /* A C statement to output to the stdio stream STREAM an assembler instruction
!    to assemble a floating-point constant of `TFmode', `DFmode', `SFmode',
     respectively, whose value is VALUE.  */
  
  /* ??? Must reverse the word order for big-endian code?  */
  
  #define ASM_OUTPUT_LONG_DOUBLE(FILE, VALUE) \
  do {									\
    long t[3];								\
    REAL_VALUE_TO_TARGET_LONG_DOUBLE (VALUE, t);				\
!   fprintf (FILE, "\tdata4 0x%08lx, 0x%08lx, 0x%08lx, 0x%08lx\n",	\
! 	   t[0] & 0xffffffff, t[1] & 0xffffffff, t[2] & 0xffffffff, 0L);	\
  } while (0)
  
  /* ??? Must reverse the word order for big-endian code?  */
*************** do {									\
*** 2667,2679 ****
  				  CONSTANT_P_RTX}},			\
  { "shladd_operand", {CONST_INT}},					\
  { "fetchadd_operand", {CONST_INT}},					\
! { "reg_or_fp01_operand", {SUBREG, REG, CONST_DOUBLE, CONSTANT_P_RTX}},	\
  { "normal_comparison_operator", {EQ, NE, GT, LE, GTU, LEU}},		\
  { "adjusted_comparison_operator", {LT, GE, LTU, GEU}},			\
  { "call_multiple_values_operation", {PARALLEL}},			\
  { "predicate_operator", {NE, EQ}},					\
  { "ar_lc_reg_operand", {REG}},						\
! { "ar_ccv_reg_operand", {REG}},
  
  /* An alias for a machine mode name.  This is the machine mode that elements of
     a jump-table should have.  */
--- 2652,2667 ----
  				  CONSTANT_P_RTX}},			\
  { "shladd_operand", {CONST_INT}},					\
  { "fetchadd_operand", {CONST_INT}},					\
! { "reg_or_fp01_operand", {SUBREG, REG, CONST_DOUBLE}},			\
  { "normal_comparison_operator", {EQ, NE, GT, LE, GTU, LEU}},		\
  { "adjusted_comparison_operator", {LT, GE, LTU, GEU}},			\
  { "call_multiple_values_operation", {PARALLEL}},			\
  { "predicate_operator", {NE, EQ}},					\
  { "ar_lc_reg_operand", {REG}},						\
! { "ar_ccv_reg_operand", {REG}},						\
! { "general_tfmode_operand", {SUBREG, REG, CONST_DOUBLE, MEM}},		\
! { "destination_tfmode_operand", {SUBREG, REG, MEM}},			\
! { "tfreg_or_fp01_operand", {REG, CONST_DOUBLE}},
  
  /* An alias for a machine mode name.  This is the machine mode that elements of
     a jump-table should have.  */
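
A note on the ASM_OUTPUT_LONG_DOUBLE hunk above: the new version emits four 32-bit words, that is, the three words that carry the 80-bit value plus one word of zero padding, so each constant fills a full 16-byte TFmode slot. The following is a minimal standalone sketch of that formatting, not GCC code. The helper name `format_long_double_words` is hypothetical, and its three input words stand in for whatever REAL_VALUE_TO_TARGET_LONG_DOUBLE would produce.

```c
#include <stdio.h>

/* Hypothetical helper (an illustration, not part of GCC): format the
   three 32-bit words of an 80-bit extended value, followed by one zero
   word of padding, as a single "data4" directive covering 16 bytes.
   Returns the value returned by snprintf.  */
int
format_long_double_words (char *buf, size_t len, const unsigned long t[3])
{
  /* Every argument matching %lx is an unsigned long, including the
     padding word.  */
  return snprintf (buf, len,
                   "\tdata4 0x%08lx, 0x%08lx, 0x%08lx, 0x%08lx\n",
                   t[0] & 0xffffffffUL, t[1] & 0xffffffffUL,
                   t[2] & 0xffffffffUL, 0UL);
}
```

For the little-endian encoding of 1.0 (words 0x00000000, 0x80000000, 0x00003fff) this should produce "data4 0x00000000, 0x80000000, 0x00003fff, 0x00000000".
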
Index: config/ia64/ia64.md
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/ia64/ia64.md,v
retrieving revision 1.33
diff -c -p -d -r1.33 ia64.md
*** ia64.md	2000/08/14 20:28:11	1.33
--- ia64.md	2000/08/14 21:00:06
***************
*** 22,29 ****
  
  ;;- See file "rtl.def" for documentation on define_insn, match_*, et. al.
  
- ;; ??? Add support for long double XFmode patterns.
- 
  ;; ??? register_operand accepts (subreg:DI (mem:SI X)) which forces later
  ;; reload.  This will be fixed once scheduling support is turned on.
  
--- 22,27 ----
***************
*** 575,600 ****
    "addl %0 = @ltoff(%1), gp"
    [(set_attr "type" "A")])
  
! ;; ??? These patterns exist to make SSA happy.  We can get TImode values
! ;; because of structure moves generated for parameter and return value
! ;; loads and stores.
  
! (define_insn "*movti_internal"
!   [(set (match_operand:TI 0 "register_operand" "=r")
! 	(match_operand:TI 1 "register_operand" "r"))]
!   "flag_ssa"
    "#"
    [(set_attr "type" "unknown")
     (set_attr "predicable" "no")])
  
! (define_split
!   [(set (match_operand:TI 0 "register_operand" "")
! 	(match_operand:TI 1 "register_operand" ""))]
!   "flag_ssa && reload_completed"
!   [(set (subreg:DI (match_dup 0) 0) (subreg:DI (match_dup 1) 0))
!    (set (subreg:DI (match_dup 0) 1) (subreg:DI (match_dup 1) 1))]
!   "")
  
  ;; Floating Point Moves
  ;;
  ;; Note - Patterns for SF mode moves are compulsory, but
--- 573,688 ----
    "addl %0 = @ltoff(%1), gp"
    [(set_attr "type" "A")])
  
! ;; With no offsettable memory references, we've got to have a scratch
! ;; around to play with the second word.
! (define_expand "movti"
!   [(parallel [(set (match_operand:TI 0 "general_operand" "")
! 		   (match_operand:TI 1 "general_operand" ""))
! 	      (clobber (match_scratch:DI 2 ""))])]
!   ""
!   "
! {
!   if (! reload_in_progress && ! reload_completed
!       && ! ia64_move_ok (operands[0], operands[1]))
!     operands[1] = force_reg (TImode, operands[1]);
! }")
  
! (define_insn_and_split "*movti_internal"
!   [(set (match_operand:TI 0 "nonimmediate_operand" "=r,r,m")
! 	(match_operand:TI 1 "general_operand"      "ri,m,r"))
!    (clobber (match_scratch:DI 2 "=X,&r,&r"))]
!   "ia64_move_ok (operands[0], operands[1])"
    "#"
+   "reload_completed"
+   [(const_int 0)]
+   "
+ {
+   rtx adj1, adj2, in[2], out[2];
+   int first;
+ 
+   adj1 = ia64_split_timode (in, operands[1], operands[2]);
+   adj2 = ia64_split_timode (out, operands[0], operands[2]);
+ 
+   first = 0;
+   if (reg_overlap_mentioned_p (out[0], in[1]))
+     {
+       if (reg_overlap_mentioned_p (out[1], in[0]))
+ 	abort ();
+       first = 1;
+     }
+ 
+   if (adj1 && adj2)
+     abort ();
+   if (adj1)
+     emit_insn (adj1);
+   if (adj2)
+     emit_insn (adj2);
+   emit_insn (gen_rtx_SET (VOIDmode, out[first], in[first]));
+   emit_insn (gen_rtx_SET (VOIDmode, out[!first], in[!first]));
+   DONE;
+ }"
    [(set_attr "type" "unknown")
     (set_attr "predicable" "no")])
  
! ;; ??? SSA creates these.  Can't allow memories since we don't have
! ;; the scratch register.  Fortunately combine will know how to add
! ;; the clobber and scratch.
! (define_insn_and_split "*movti_internal_reg"
!   [(set (match_operand:TI 0 "register_operand"  "=r")
! 	(match_operand:TI 1 "nonmemory_operand" "ri"))]
!   ""
!   "#"
!   "reload_completed"
!   [(const_int 0)]
!   "
! {
!   rtx in[2], out[2];
!   int first;
! 
!   ia64_split_timode (in, operands[1], NULL_RTX);
!   ia64_split_timode (out, operands[0], NULL_RTX);
! 
!   first = 0;
!   if (reg_overlap_mentioned_p (out[0], in[1]))
!     {
!       if (reg_overlap_mentioned_p (out[1], in[0]))
! 	abort ();
!       first = 1;
!     }
! 
!   emit_insn (gen_rtx_SET (VOIDmode, out[first], in[first]));
!   emit_insn (gen_rtx_SET (VOIDmode, out[!first], in[!first]));
!   DONE;
! }"
!   [(set_attr "type" "unknown")
!    (set_attr "predicable" "no")])
  
+ (define_expand "reload_inti"
+   [(parallel [(set (match_operand:TI 0 "register_operand" "=r")
+ 		   (match_operand:TI 1 "" "m"))
+ 	      (clobber (match_operand:DI 2 "register_operand" "=&r"))])]
+   ""
+   "
+ {
+   /* ??? Should now be enforced by tweaks to push_secondary_reload.  */
+   if (reg_overlap_mentioned_p (operands[2], operands[0])
+       || reg_overlap_mentioned_p (operands[2], operands[1]))
+     abort ();
+ }")
+ 
+ (define_expand "reload_outti"
+   [(parallel [(set (match_operand:TI 0 "" "=m")
+ 		   (match_operand:TI 1 "register_operand" "r"))
+ 	      (clobber (match_operand:DI 2 "register_operand" "=&r"))])]
+   ""
+   "
+ {
+   /* ??? Should now be enforced by tweaks to push_secondary_reload.  */
+   if (reg_overlap_mentioned_p (operands[2], operands[0])
+       || reg_overlap_mentioned_p (operands[2], operands[1]))
+     abort ();
+ }")
+ 
  ;; Floating Point Moves
  ;;
  ;; Note - Patterns for SF mode moves are compulsory, but
***************
*** 621,630 ****
  	  (match_operand:SF 1 "nonmemory_operand" "fG,fG,*r,*r")))]
    "TARGET_A_STEP && ia64_move_ok (operands[0], operands[1])"
    "@
!   mov %0 = %F1
!   getf.s %0 = %F1
!   setf.s %0 = %1
!   mov %0 = %1"
    [(set_attr "type" "F,M,M,A")
     (set_attr "predicable" "no")])
  
--- 709,718 ----
  	  (match_operand:SF 1 "nonmemory_operand" "fG,fG,*r,*r")))]
    "TARGET_A_STEP && ia64_move_ok (operands[0], operands[1])"
    "@
!   (%J2) mov %0 = %F1
!   (%J2) getf.s %0 = %F1
!   (%J2) setf.s %0 = %1
!   (%J2) mov %0 = %1"
    [(set_attr "type" "F,M,M,A")
     (set_attr "predicable" "no")])
  
***************
*** 680,689 ****
  	  (match_operand:DF 1 "nonmemory_operand" "fG,fG,*r,*r")))]
    "TARGET_A_STEP && ia64_move_ok (operands[0], operands[1])"
    "@
!   mov %0 = %F1
!   getf.d %0 = %F1
!   setf.d %0 = %1
!   mov %0 = %1"
    [(set_attr "type" "F,M,M,A")
     (set_attr "predicable" "no")])
  
--- 768,777 ----
  	  (match_operand:DF 1 "nonmemory_operand" "fG,fG,*r,*r")))]
    "TARGET_A_STEP && ia64_move_ok (operands[0], operands[1])"
    "@
!   (%J2) mov %0 = %F1
!   (%J2) getf.d %0 = %F1
!   (%J2) setf.d %0 = %1
!   (%J2) mov %0 = %1"
    [(set_attr "type" "F,M,M,A")
     (set_attr "predicable" "no")])
  
***************
*** 718,752 ****
    st8%Q0 %0 = %1%P0"
    [(set_attr "type" "F,M,M,M,M,A,M,M")])
  
! (define_expand "movxf"
!   [(set (match_operand:XF 0 "general_operand" "")
! 	(match_operand:XF 1 "general_operand" ""))]
    ""
    "
  {
!   if (! reload_in_progress && ! reload_completed
!       && ! ia64_move_ok (operands[0], operands[1]))
!     operands[1] = force_reg (XFmode, operands[1]);
  }")
  
  ;; ??? There's no easy way to mind volatile acquire/release semantics.
  
  ;; Errata 72 workaround.
! (define_insn "*movxfcc_astep"
    [(cond_exec
       (match_operator 2 "predicate_operator"
         [(match_operand:CC 3 "register_operand" "c")
          (const_int 0)])
!      (set (match_operand:XF 0 "register_operand"  "=f")
! 	  (match_operand:XF 1 "nonmemory_operand" "fG")))]
    "TARGET_A_STEP && ia64_move_ok (operands[0], operands[1])"
!   "mov %0 = %F1"
    [(set_attr "type" "F")
     (set_attr "predicable" "no")])
  
! (define_insn "*movxf_internal_astep"
!   [(set (match_operand:XF 0 "destination_operand" "=f,f, m")
! 	(match_operand:XF 1 "general_operand"     "fG,m,fG"))]
    "TARGET_A_STEP && ia64_move_ok (operands[0], operands[1])"
    "@
    mov %0 = %F1
--- 806,901 ----
    st8%Q0 %0 = %1%P0"
    [(set_attr "type" "F,M,M,M,M,A,M,M")])
  
! ;; With no offsettable memory references, we've got to have a scratch
! ;; around to play with the second word if the variable winds up in GRs.
! (define_expand "movtf"
!   [(set (match_operand:TF 0 "general_operand" "")
! 	(match_operand:TF 1 "general_operand" ""))]
    ""
    "
  {
!   /* We must support TFmode loads into general registers for stdarg/vararg
!      and unprototyped calls.  We split them into DImode loads for convenience.
!      We don't need TFmode stores from general regs, because a stdarg/vararg
!      routine does a block store to memory of unnamed arguments.  */
!   if (GET_CODE (operands[0]) == REG
!       && GR_REGNO_P (REGNO (operands[0])))
!     {
!       /* We're hoping to transform everything that deals with TFmode
! 	 quantities and GR registers early in the compiler.  */
!       if (no_new_pseudos)
! 	abort ();
! 
!       /* Struct to register can just use TImode instead.  */
!       if ((GET_CODE (operands[1]) == SUBREG
! 	   && GET_MODE (SUBREG_REG (operands[1])) == TImode)
! 	  || (GET_CODE (operands[1]) == REG
! 	      && GR_REGNO_P (REGNO (operands[1]))))
! 	{
! 	  emit_move_insn (gen_rtx_REG (TImode, REGNO (operands[0])),
! 			  SUBREG_REG (operands[1]));
! 	  DONE;
! 	}
! 
!       if (GET_CODE (operands[1]) == CONST_DOUBLE)
! 	{
! 	  emit_move_insn (gen_rtx_REG (DImode, REGNO (operands[0])),
! 			  operand_subword (operands[1], 0, 0, DImode));
! 	  emit_move_insn (gen_rtx_REG (DImode, REGNO (operands[0]) + 1),
! 			  operand_subword (operands[1], 1, 0, DImode));
! 	  DONE;
! 	}
! 
!       /* If the quantity is in a register not known to be GR, spill it.  */
!       if (register_operand (operands[1], TFmode))
! 	operands[1] = spill_tfmode_operand (operands[1], 1);
! 
!       if (GET_CODE (operands[1]) == MEM)
! 	{
! 	  rtx out[2];
! 
! 	  out[WORDS_BIG_ENDIAN] = gen_rtx_REG (DImode, REGNO (operands[0]));
! 	  out[!WORDS_BIG_ENDIAN] = gen_rtx_REG (DImode, REGNO (operands[0])+1);
! 
! 	  emit_move_insn (out[0], change_address (operands[1], DImode, NULL));
! 	  emit_move_insn (out[1],
! 			  change_address (operands[1], DImode,
! 					  plus_constant (XEXP (operands[1], 0),
! 							 8)));
! 	  DONE;
! 	}
! 
!       abort ();
!     }
! 
!   if (! reload_in_progress && ! reload_completed)
!     {
!       operands[0] = spill_tfmode_operand (operands[0], 0);
!       operands[1] = spill_tfmode_operand (operands[1], 0);
! 
!       if (! ia64_move_ok (operands[0], operands[1]))
! 	operands[1] = force_reg (TFmode, operands[1]);
!     }
  }")
  
  ;; ??? There's no easy way to mind volatile acquire/release semantics.
  
  ;; Errata 72 workaround.
! (define_insn "*movtfcc_astep"
    [(cond_exec
       (match_operator 2 "predicate_operator"
         [(match_operand:CC 3 "register_operand" "c")
          (const_int 0)])
!      (set (match_operand:TF 0 "register_operand"  "=f")
! 	  (match_operand:TF 1 "nonmemory_operand" "fG")))]
    "TARGET_A_STEP && ia64_move_ok (operands[0], operands[1])"
!   "(%J2) mov %0 = %F1"
    [(set_attr "type" "F")
     (set_attr "predicable" "no")])
  
! (define_insn "*movtf_internal_astep"
!   [(set (match_operand:TF 0 "destination_tfmode_operand" "=f,f, m")
! 	(match_operand:TF 1 "general_tfmode_operand"     "fG,m,fG"))]
    "TARGET_A_STEP && ia64_move_ok (operands[0], operands[1])"
    "@
    mov %0 = %F1
***************
*** 755,763 ****
    [(set_attr "type" "F,M,M")
     (set_attr "predicable" "no")])
  
! (define_insn "*movxf_internal"
!   [(set (match_operand:XF 0 "destination_operand" "=f,f, m")
! 	(match_operand:XF 1 "general_operand"     "fG,m,fG"))]
    "! TARGET_A_STEP && ia64_move_ok (operands[0], operands[1])"
    "@
    mov %0 = %F1
--- 904,912 ----
    [(set_attr "type" "F,M,M")
     (set_attr "predicable" "no")])
  
! (define_insn "*movtf_internal"
!   [(set (match_operand:TF 0 "destination_tfmode_operand" "=f,f, m")
! 	(match_operand:TF 1 "general_tfmode_operand"     "fG,m,fG"))]
    "! TARGET_A_STEP && ia64_move_ok (operands[0], operands[1])"
    "@
    mov %0 = %F1
***************
*** 843,848 ****
--- 992,1017 ----
    "if (true_regnum (operands[0]) == true_regnum (operands[1])) DONE;"
    [(set_attr "type" "F")])
  
+ (define_insn_and_split "extendsftf2"
+   [(set (match_operand:TF 0 "register_operand" "=f,f")
+ 	(float_extend:TF (match_operand:SF 1 "register_operand" "0,f")))]
+   ""
+   "mov %0 = %1"
+   "reload_completed"
+   [(set (match_dup 0) (float_extend:TF (match_dup 1)))]
+   "if (true_regnum (operands[0]) == true_regnum (operands[1])) DONE;"
+   [(set_attr "type" "F")])
+ 
+ (define_insn_and_split "extenddftf2"
+   [(set (match_operand:TF 0 "register_operand" "=f,f")
+ 	(float_extend:TF (match_operand:DF 1 "register_operand" "0,f")))]
+   ""
+   "mov %0 = %1"
+   "reload_completed"
+   [(set (match_dup 0) (float_extend:TF (match_dup 1)))]
+   "if (true_regnum (operands[0]) == true_regnum (operands[1])) DONE;"
+   [(set_attr "type" "F")])
+ 
  (define_insn "truncdfsf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
  	(float_truncate:SF (match_operand:DF 1 "register_operand" "f")))]
***************
*** 850,874 ****
    "fnorm.s %0 = %1%B0"
    [(set_attr "type" "F")])
  
! (define_insn "truncxfsf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
! 	(float_truncate:SF (match_operand:XF 1 "register_operand" "f")))]
    ""
    "fnorm.s %0 = %1%B0"
    [(set_attr "type" "F")])
  
! (define_insn "truncxfdf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
! 	(float_truncate:DF (match_operand:XF 1 "register_operand" "f")))]
    ""
    "fnorm.d %0 = %1%B0"
    [(set_attr "type" "F")])
  
  ;; Convert between signed integer types and floating point.
  
! (define_insn "floatdixf2"
!   [(set (match_operand:XF 0 "register_operand" "=f")
! 	(float:XF (match_operand:DI 1 "register_operand" "f")))]
    ""
    "fcvt.xf %0 = %1"
    [(set_attr "type" "F")])
--- 1019,1043 ----
    "fnorm.s %0 = %1%B0"
    [(set_attr "type" "F")])
  
! (define_insn "trunctfsf2"
    [(set (match_operand:SF 0 "register_operand" "=f")
! 	(float_truncate:SF (match_operand:TF 1 "register_operand" "f")))]
    ""
    "fnorm.s %0 = %1%B0"
    [(set_attr "type" "F")])
  
! (define_insn "trunctfdf2"
    [(set (match_operand:DF 0 "register_operand" "=f")
! 	(float_truncate:DF (match_operand:TF 1 "register_operand" "f")))]
    ""
    "fnorm.d %0 = %1%B0"
    [(set_attr "type" "F")])
  
  ;; Convert between signed integer types and floating point.
  
! (define_insn "floatditf2"
!   [(set (match_operand:TF 0 "register_operand" "=f")
! 	(float:TF (match_operand:DI 1 "register_operand" "f")))]
    ""
    "fcvt.xf %0 = %1"
    [(set_attr "type" "F")])
***************
*** 887,892 ****
--- 1056,1068 ----
    "fcvt.fx.trunc %0 = %1%B0"
    [(set_attr "type" "F")])
  
+ (define_insn "fix_trunctfdi2"
+   [(set (match_operand:DI 0 "register_operand" "=f")
+ 	(fix:DI (match_operand:TF 1 "register_operand" "f")))]
+   ""
+   "fcvt.fx.trunc %0 = %1%B0"
+   [(set_attr "type" "F")])
+ 
  ;; Convert between unsigned integer types and floating point.
  
  (define_insn "floatunsdisf2"
***************
*** 903,908 ****
--- 1079,1091 ----
    "fcvt.xuf.d %0 = %1%B0"
    [(set_attr "type" "F")])
  
+ (define_insn "floatunsditf2"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(unsigned_float:TF (match_operand:DI 1 "register_operand" "f")))]
+   ""
+   "fcvt.xuf %0 = %1%B0"
+   [(set_attr "type" "F")])
+ 
  (define_insn "fixuns_truncsfdi2"
    [(set (match_operand:DI 0 "register_operand" "=f")
  	(unsigned_fix:DI (match_operand:SF 1 "register_operand" "f")))]
***************
*** 917,922 ****
--- 1100,1111 ----
    "fcvt.fxu.trunc %0 = %1%B0"
    [(set_attr "type" "F")])
  
+ (define_insn "fixuns_trunctfdi2"
+   [(set (match_operand:DI 0 "register_operand" "=f")
+ 	(unsigned_fix:DI (match_operand:TF 1 "register_operand" "f")))]
+   ""
+   "fcvt.fxu.trunc %0 = %1%B0"
+   [(set_attr "type" "F")])
  
  ;; ::::::::::::::::::::
  ;; ::
***************
*** 1702,1708 ****
--- 1891,2001 ----
    ""
    "fnma.d %0 = %1, %2, %F3%B0"
    [(set_attr "type" "F")])
+ 
+ ;; ::::::::::::::::::::
+ ;; ::
+ ;; :: 80 bit floating point arithmetic
+ ;; ::
+ ;; ::::::::::::::::::::
+ 
+ (define_insn "addtf3"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(plus:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")
+ 		 (match_operand:TF 2 "tfreg_or_fp01_operand" "fG")))]
+   ""
+   "fadd %0 = %F1, %F2%B0"
+   [(set_attr "type" "F")])
+ 
+ (define_insn "subtf3"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(minus:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")
+ 		  (match_operand:TF 2 "tfreg_or_fp01_operand" "fG")))]
+   ""
+   "fsub %0 = %F1, %F2%B0"
+   [(set_attr "type" "F")])
  
+ (define_insn "multf3"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(mult:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")
+ 		 (match_operand:TF 2 "tfreg_or_fp01_operand" "fG")))]
+   ""
+   "fmpy %0 = %F1, %F2%B0"
+   [(set_attr "type" "F")])
+ 
+ (define_insn "abstf2"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(abs:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")))]
+   ""
+   "fabs %0 = %F1%B0"
+   [(set_attr "type" "F")])
+ 
+ (define_insn "negtf2"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(neg:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")))]
+   ""
+   "fneg %0 = %F1%B0"
+   [(set_attr "type" "F")])
+ 
+ (define_insn "*nabstf2"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(neg:TF (abs:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG"))))]
+   ""
+   "fnegabs %0 = %F1%B0"
+   [(set_attr "type" "F")])
+ 
+ (define_insn "mintf3"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(smin:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")
+ 		 (match_operand:TF 2 "tfreg_or_fp01_operand" "fG")))]
+   ""
+   "fmin %0 = %F1, %F2%B0"
+   [(set_attr "type" "F")])
+ 
+ (define_insn "maxtf3"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(smax:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")
+ 		 (match_operand:TF 2 "tfreg_or_fp01_operand" "fG")))]
+   ""
+   "fmax %0 = %F1, %F2%B0"
+   [(set_attr "type" "F")])
+ 
+ (define_insn "*maddtf3"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(plus:TF (mult:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")
+ 			  (match_operand:TF 2 "tfreg_or_fp01_operand" "fG"))
+ 		 (match_operand:TF 3 "tfreg_or_fp01_operand" "fG")))]
+   ""
+   "fma %0 = %F1, %F2, %F3%B0"
+   [(set_attr "type" "F")])
+ 
+ (define_insn "*msubtf3"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(minus:TF (mult:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")
+ 			   (match_operand:TF 2 "tfreg_or_fp01_operand" "fG"))
+ 		  (match_operand:TF 3 "tfreg_or_fp01_operand" "fG")))]
+   ""
+   "fms %0 = %F1, %F2, %F3%B0"
+   [(set_attr "type" "F")])
+ 
+ (define_insn "*nmultf3"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(neg:TF (mult:TF (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")
+ 			 (match_operand:TF 2 "tfreg_or_fp01_operand" "fG"))))]
+   ""
+   "fnmpy %0 = %F1, %F2%B0"
+   [(set_attr "type" "F")])
+ 
+ ;; ??? Is it possible to canonicalize this as (minus (reg) (mult))?
+ 
+ (define_insn "*nmaddtf3"
+   [(set (match_operand:TF 0 "register_operand" "=f")
+ 	(plus:TF (neg:TF (mult:TF
+ 			  (match_operand:TF 1 "tfreg_or_fp01_operand" "fG")
+ 			  (match_operand:TF 2 "tfreg_or_fp01_operand" "fG")))
+ 		 (match_operand:TF 3 "tfreg_or_fp01_operand" "fG")))]
+   ""
+   "fnma %0 = %F1, %F2, %F3%B0"
+   [(set_attr "type" "F")])
  
  ;; ::::::::::::::::::::
  ;; ::
***************
*** 2037,2050 ****
    ia64_compare_op1 = operands[1];
    DONE;
  }")
- 
- ;; ??? Enable this for XFmode support.
  
! (define_expand "cmpxf"
    [(set (cc0)
!         (compare (match_operand:XF 0 "reg_or_fp01_operand" "")
!   		 (match_operand:XF 1 "reg_or_fp01_operand" "")))]
!   "0"
    "
  {
    ia64_compare_op0 = operands[0];
--- 2330,2341 ----
    ia64_compare_op1 = operands[1];
    DONE;
  }")
  
! (define_expand "cmptf"
    [(set (cc0)
!         (compare (match_operand:TF 0 "tfreg_or_fp01_operand" "")
!   		 (match_operand:TF 1 "tfreg_or_fp01_operand" "")))]
!   ""
    "
  {
    ia64_compare_op0 = operands[0];
***************
*** 2108,2113 ****
--- 2399,2413 ----
    "fcmp.%D1 %0, %I0 = %F2, %F3"
    [(set_attr "type" "F")])
  
+ (define_insn "*cmptf_internal"
+   [(set (match_operand:CC 0 "register_operand" "=c")
+ 	(match_operator:CC 1 "comparison_operator"
+ 		   [(match_operand:TF 2 "tfreg_or_fp01_operand" "fG")
+ 		    (match_operand:TF 3 "tfreg_or_fp01_operand" "fG")]))]
+   ""
+   "fcmp.%D1 %0, %I0 = %F2, %F3"
+   [(set_attr "type" "F")])
+ 
  ;; ??? Can this pattern be generated?
  
  (define_insn "*bit_zero"
***************
*** 3383,3397 ****
    [(set_attr "type" "M")])
  
  (define_insn "fr_spill"
!   [(set (match_operand:XF 0 "memory_operand" "=m")
! 	(unspec:XF [(match_operand:XF 1 "register_operand" "f")] 3))]
    ""
    "stf.spill %0 = %1%P0"
    [(set_attr "type" "M")])
  
  (define_insn "fr_restore"
!   [(set (match_operand:XF 0 "register_operand" "=f")
! 	(unspec:XF [(match_operand:XF 1 "memory_operand" "m")] 4))]
    ""
    "ldf.fill %0 = %1%P1"
    [(set_attr "type" "M")])
--- 3683,3697 ----
    [(set_attr "type" "M")])
  
  (define_insn "fr_spill"
!   [(set (match_operand:TF 0 "memory_operand" "=m")
! 	(unspec:TF [(match_operand:TF 1 "register_operand" "f")] 3))]
    ""
    "stf.spill %0 = %1%P0"
    [(set_attr "type" "M")])
  
  (define_insn "fr_restore"
!   [(set (match_operand:TF 0 "register_operand" "=f")
! 	(unspec:TF [(match_operand:TF 1 "memory_operand" "m")] 4))]
    ""
    "ldf.fill %0 = %1%P1"
    [(set_attr "type" "M")])
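
A note on the movti splitters in the ia64.md changes above: both pick the order of the two DImode moves so that the first move does not overwrite a source word that the second move still needs, and they abort when neither order is safe. The following is a toy model of that decision in plain C, not GCC code. Registers are modelled as indices into an array, and the names `first_word_to_move` and `move_pair` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model (not GCC code): decide which half of a two-word
   move to copy first.  Returns 0 to copy word 0 first, 1 to copy word 1
   first, or -1 when each destination half overlaps the other source
   half, which is the case where the patterns call abort.  */
int
first_word_to_move (int out0, int out1, int in0, int in1)
{
  if (out0 == in1)
    {
      if (out1 == in0)
        return -1;
      return 1;
    }
  return 0;
}

/* Perform the move on a toy register file in the order chosen above.
   Returns 0 on success, -1 if no safe order exists.  */
int
move_pair (uint64_t *regs, int out0, int out1, int in0, int in1)
{
  int out[2], in[2];
  int first = first_word_to_move (out0, out1, in0, in1);

  if (first < 0)
    return -1;

  out[0] = out0;
  out[1] = out1;
  in[0] = in0;
  in[1] = in1;

  regs[out[first]] = regs[in[first]];
  regs[out[!first]] = regs[in[!first]];
  return 0;
}
```

Moving the pair held in registers (1, 2) into (2, 3) copies word 1 first, so register 2 is read before it is overwritten. A true swap such as (1, 2) into (2, 1) has no safe order without a temporary, which is why the model reports failure there.
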
Index: config/ia64/lib1funcs.asm
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/ia64/lib1funcs.asm,v
retrieving revision 1.5
diff -c -p -d -r1.5 lib1funcs.asm
*** lib1funcs.asm	2000/08/08 10:01:20	1.5
--- lib1funcs.asm	2000/08/14 21:00:06
***************
*** 1,3 ****
--- 1,45 ----
+ #ifdef L__divtf3
+ // Compute an 80-bit IEEE double-extended quotient.
+ //
+ // From the Intel IA-64 Optimization Guide, choose the minimum latency
+ // alternative.
+ //
+ // farg0 holds the dividend.  farg1 holds the divisor.
+ 
+ 	.text
+ 	.align 16
+ 	.global __divtf3
+ 	.proc __divtf3
+ __divtf3:
+ 	frcpa f10, p6 = farg0, farg1
+ 	;;
+ (p6)	fnma.s1 f11 = farg1, f10, f1
+ 	;;
+ (p6)	fma.s1 f12 = f11, f10, f10
+ (p6)	fma.s1 f11 = f11, f11, f0
+ 	;;
+ (p6)	fma.s1 f11 = f11, f12, f12
+ 	;;
+ (p6)	fnma.s1 f12 = farg1, f11, f1
+ (p6)	fma.s1 f10 = farg0, f10, f0
+ 	;;
+ (p6)	fma.s1 f11 = f12, f11, f11
+ (p6)	fnma.s1 f12 = farg1, f10, farg0
+ 	;;
+ (p6)	fma.s1 f10 = f12, f11, f10
+ (p6)	fnma.s1 f12 = farg1, f11, f1
+ 	;;
+ (p6)	fnma.s1 f8 = farg1, f10, farg0
+ (p6)	fma.s1 f9 = f12, f11, f11
+ 	;;
+ (p6)	fma f10 = f8, f9, f10
+ 	;;
+ 	mov fret0 = f10
+ 	br.ret.sptk rp
+ 	;;
+ 	.endp __divtf3
+ #endif
+ 
  #ifdef L__divdf3
  // Compute a 64-bit IEEE double quotient.
  //
Index: config/ia64/t-ia64
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/ia64/t-ia64,v
retrieving revision 1.4
diff -c -p -d -r1.4 t-ia64
*** t-ia64	2000/08/08 10:01:20	1.4
--- t-ia64	2000/08/14 21:00:07
*************** LIB1ASMSRC    = ia64/lib1funcs.asm
*** 8,14 ****
  # ??? We change the names of the DImode div/mod files so that they won't
  # accidentally be overridden by libgcc2.c files.  We used to use __ia64 as
  # a prefix, now we use __ as the prefix.
! LIB1ASMFUNCS  = __divdf3 __divsf3 \
  	__divdi3 __moddi3 __udivdi3 __umoddi3 \
  	__divsi3 __modsi3 __udivsi3 __umodsi3 __save_stack_nonlocal \
  	__nonlocal_goto __restore_stack_nonlocal __trampoline
--- 8,14 ----
  # ??? We change the names of the DImode div/mod files so that they won't
  # accidentally be overridden by libgcc2.c files.  We used to use __ia64 as
  # a prefix, now we use __ as the prefix.
! LIB1ASMFUNCS  = __divtf3 __divdf3 __divsf3 \
  	__divdi3 __moddi3 __udivdi3 __umoddi3 \
  	__divsi3 __modsi3 __udivsi3 __umodsi3 __save_stack_nonlocal \
  	__nonlocal_goto __restore_stack_nonlocal __trampoline

