This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: -mtune=generic for i386 backend


> On Wed, Jan 18, 2006 at 08:35:12AM +0100, Jan Hubicka wrote:
> > Perhaps I can add two "generic" entries in the first to keep numeric
> > values in sync?
> 
> Or just a comment, and we'll remember why the missing 32/64,
> and skip a number later.

If we skipped a number, the string array would get out of sync.
The TARGET_CPU_DEFAULT_* constants and TARGET_CPU_DEFAULT_NAMES are really
independent of the PROCESSOR_* enumeration, which is what needs to stay in
sync with the "cpu" attribute.
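
To illustrate the point about the string array, here is a minimal C sketch.
The names CPU_DEFAULT_*, cpu_default_names and cpu_default_name are
shortened, hypothetical stand-ins for the TARGET_CPU_DEFAULT_* constants and
TARGET_CPU_DEFAULT_NAMES; the real table in i386.h is longer and is not
necessarily consumed this way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened stand-ins for TARGET_CPU_DEFAULT_*.  */
#define CPU_DEFAULT_prescott 0
#define CPU_DEFAULT_nocona 1
#define CPU_DEFAULT_generic 2          /* Next free index: stays in sync.  */
#define CPU_DEFAULT_generic_skipped 3  /* Index 2 skipped: out of sync.  */

/* Stand-in for TARGET_CPU_DEFAULT_NAMES: entry N names constant N.  */
static const char *const cpu_default_names[] =
  { "prescott", "nocona", "generic" };

#define N_CPU_DEFAULT_NAMES \
  (sizeof cpu_default_names / sizeof cpu_default_names[0])

/* Return the name stored at IDX, or NULL when IDX is past the end of the
   table.  */
static const char *
cpu_default_name (size_t idx)
{
  return idx < N_CPU_DEFAULT_NAMES ? cpu_default_names[idx] : NULL;
}

/* Check that the densely numbered constant finds its own name.  */
static void
check_names_in_sync (void)
{
  const char *name = cpu_default_name (CPU_DEFAULT_generic);
  assert (name != NULL && strcmp (name, "generic") == 0);
}
```

With the skipped constant the lookup lands past the end of the table (NULL in
this sketch; an unchecked array access would read out of bounds), which is
why the constants have to stay densely numbered.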

I am testing the attached patch with the new comments and the duplicated
entry (with a comment), but those enumerations are really out of sync already:

enum processor_type
{
  PROCESSOR_I386,			/* 80386 */
  PROCESSOR_I486,			/* 80486DX, 80486SX, 80486DX[24] */
  PROCESSOR_PENTIUM,
  PROCESSOR_PENTIUMPRO,
  PROCESSOR_K6,
  PROCESSOR_ATHLON,
  PROCESSOR_PENTIUM4,
  PROCESSOR_K8,
  PROCESSOR_NOCONA,
  PROCESSOR_GENERIC32,
  PROCESSOR_GENERIC64,
  PROCESSOR_max
};

versus

#define TARGET_CPU_DEFAULT_i386 0
#define TARGET_CPU_DEFAULT_i486 1
#define TARGET_CPU_DEFAULT_pentium 2
#define TARGET_CPU_DEFAULT_pentium_mmx 3
#define TARGET_CPU_DEFAULT_pentiumpro 4
#define TARGET_CPU_DEFAULT_pentium2 5
#define TARGET_CPU_DEFAULT_pentium3 6
#define TARGET_CPU_DEFAULT_pentium4 7
#define TARGET_CPU_DEFAULT_k6 8
#define TARGET_CPU_DEFAULT_k6_2 9
#define TARGET_CPU_DEFAULT_k6_3 10
#define TARGET_CPU_DEFAULT_athlon 11
#define TARGET_CPU_DEFAULT_athlon_sse 12
#define TARGET_CPU_DEFAULT_k8 13
#define TARGET_CPU_DEFAULT_pentium_m 14
#define TARGET_CPU_DEFAULT_prescott 15
#define TARGET_CPU_DEFAULT_nocona 16
#define TARGET_CPU_DEFAULT_generic 17

One enumerates internal names (which include the two generics), the other
external names for the config machinery, so perhaps we don't really need to
worry here.  I also added the comments as requested and fixed typos.
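
As a rough illustration of how the external and internal names relate, the
following C sketch maps an external tuning name onto the internal
enumeration quoted above.  The helper resolve_tune_name and its is_64bit
parameter are hypothetical; they only mirror what the ChangeLog below says
about override_options translating "generic" to generic32/generic64 and
"i686" to "generic32", and are not the code in the patch.

```c
#include <assert.h>
#include <string.h>

/* Internal enumeration, as quoted in this message.  */
enum processor_type
{
  PROCESSOR_I386,
  PROCESSOR_I486,
  PROCESSOR_PENTIUM,
  PROCESSOR_PENTIUMPRO,
  PROCESSOR_K6,
  PROCESSOR_ATHLON,
  PROCESSOR_PENTIUM4,
  PROCESSOR_K8,
  PROCESSOR_NOCONA,
  PROCESSOR_GENERIC32,
  PROCESSOR_GENERIC64,
  PROCESSOR_max
};

/* Hypothetical helper: map an external tuning name to the internal
   enumeration.  Only a few names are handled; PROCESSOR_max means
   "not recognized" in this sketch.  */
static enum processor_type
resolve_tune_name (const char *name, int is_64bit)
{
  if (strcmp (name, "generic") == 0)
    return is_64bit ? PROCESSOR_GENERIC64 : PROCESSOR_GENERIC32;
  if (strcmp (name, "i686") == 0)
    return PROCESSOR_GENERIC32;
  if (strcmp (name, "k8") == 0)
    return PROCESSOR_K8;
  if (strcmp (name, "nocona") == 0)
    return PROCESSOR_NOCONA;
  return PROCESSOR_max;
}
```

Here the single external name "generic" resolves to internal value 9 or 10
depending on the mode, while TARGET_CPU_DEFAULT_generic is 17, so the two
numberings need not agree.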

2006-01-17  Jan Hubicka  <jh@suse.cz>
            H.J. Lu  <hongjiu.lu@intel.com>
	    Evandro Menezes <evandro.menezes@amd.com>
	* invoke.texi (generic): Document.
	(i686): Update.
	* config.gcc: Make x86_64-* and i686-* default to generic tuning.
	* i386.h (TARGET_GENERIC32, TARGET_GENERIC64, TARGET_GENERIC,
	TARGET_USE_INCDEC, TARGET_PAD_RETURNS): New macros.
	(x86_use_incdec, x86_pad_returns): New variables.
	(TARGET_CPU_DEFAULT_generic): New constant.
	(TARGET_CPU_DEFAULT_NAMES): Add generic.
	(enum processor_type): Add generic32 and generic64.
	* i386.md (cpu attribute): Add generic32/generic64.
	(movhi splitter): Behave sanely when both partial_reg_dependency and
	partial_reg_stall are set.
	(K8 splitters): Enable for generic as well.
	* predicates.md (incdec_operand): Use TARGET_USE_INCDEC.
	(aligned_operand): Avoid memory mismatch stalls.
	* athlon.md: Enable for generic64, new patterns for 128bit moves.
	* ppro.md: Enable for generic32.
	* i386.c (generic64_cost, generic32_cost): New.
	(m_GENERIC32, m_GENERIC64, m_GENERIC): New macros.
	(x86_use_leave): Enable for generic64.  (x86_use_sahf,
	x86_ext_80387_constants): Enable for generic32.  (x86_push_memory,
	x86_movx, x86_unroll_strlen, x86_deep_branch, x86_use_simode_fiop,
	x86_use_cltd, x86_promote_QImode, x86_sub_esp_4, x86_sub_esp_8,
	x86_add_esp_4, x86_add_esp_8, x86_integer_DFmode_moves,
	x86_partial_reg_dependency, x86_memory_mismatch_stall,
	x86_accumulate_outgoing_args, x86_prologue_using_move,
	x86_epilogue_using_move, x86_arch_always_fancy_math_387,
	x86_sse_partial_reg_dependency, x86_four_jump_limit, x86_schedule):
	Enable for generic.
	(x86_use_incdec, x86_pad_returns): New.
	(override_options): Add generic32 and generic64, translate "generic"
	to generic32/generic64 and "i686" to "generic32", refuse
	"generic32"/"generic64" as arch target.
	(ix86_issue_rate, ix86_adjust_cost): Handle generic as athlon.
	(ix86_reorg): Honor PAD_RETURNS.

Index: doc/invoke.texi
===================================================================
*** doc/invoke.texi	(revision 109820)
--- doc/invoke.texi	(working copy)
*************** Tune to @var{cpu-type} everything applic
*** 9037,9042 ****
--- 9037,9053 ----
  for the ABI and the set of available instructions.  The choices for
  @var{cpu-type} are:
  @table @emph
+ @item generic
+ Produce code working well on most common x86 and x86-64 CPUs.  The set of CPUs
+ the option tunes for differs in 32-bit and 64-bit compilation and is expected to
+ evolve in future versions of GCC as new CPU models are introduced and others
+ become obsolete.  At present this option in 32-bit mode generates code tuned for
+ Athlon, Dothan, Nocona, Northwood, Opteron, PentiumPro, Pentium2, Pentium3,
+ Prescott and Yonah.  In 64-bit mode code is tuned for Opteron and Nocona.  This
+ "virtual CPU" cannot be used as an @option{-march} operand.  In 64-bit mode
+ @option{x86-64} should be used instead.  In 32-bit mode @option{i686} is
+ available for i686 family chips (instruction set of PentiumPro, and tuning
+ defaults to @code{generic}).
  @item i386
  Original Intel's i386 CPU@.
  @item i486
*************** Intel's i486 CPU@.  (No scheduling is im
*** 9045,9052 ****
  Intel Pentium CPU with no MMX support.
  @item pentium-mmx
  Intel PentiumMMX CPU based on Pentium core with MMX instruction set support.
! @item i686, pentiumpro
  Intel PentiumPro CPU@.
  @item pentium2
  Intel Pentium2 CPU based on PentiumPro core with MMX instruction set support.
  @item pentium3, pentium3m
--- 9056,9066 ----
  Intel Pentium CPU with no MMX support.
  @item pentium-mmx
  Intel PentiumMMX CPU based on Pentium core with MMX instruction set support.
! @item pentiumpro
  Intel PentiumPro CPU@.
+ @item i686
+ Same as @code{generic}, but when used as the @code{march} option, the PentiumPro
+ instruction set will be used, so the code will run on all i686 family chips.
  @item pentium2
  Intel Pentium2 CPU based on PentiumPro core with MMX instruction set support.
  @item pentium3, pentium3m
Index: cgraph.c
===================================================================
*** cgraph.c	(revision 109820)
--- cgraph.c	(working copy)
*************** static GTY((param_is (struct cgraph_varp
*** 132,138 ****
  struct cgraph_varpool_node *cgraph_varpool_nodes_queue, *cgraph_varpool_first_unanalyzed_node;
  
  /* The linked list of cgraph varpool nodes.  */
! static GTY(()) struct cgraph_varpool_node *cgraph_varpool_nodes;
  
  /* End of the varpool queue.  Needs to be QTYed to work with PCH.  */
  static GTY(()) struct cgraph_varpool_node *cgraph_varpool_last_needed_node;
--- 132,138 ----
  struct cgraph_varpool_node *cgraph_varpool_nodes_queue, *cgraph_varpool_first_unanalyzed_node;
  
  /* The linked list of cgraph varpool nodes.  */
! struct cgraph_varpool_node *cgraph_varpool_nodes;
  
  /* End of the varpool queue.  Needs to be QTYed to work with PCH.  */
  static GTY(()) struct cgraph_varpool_node *cgraph_varpool_last_needed_node;
*************** bool
*** 838,845 ****
  decide_is_variable_needed (struct cgraph_varpool_node *node, tree decl)
  {
    /* If the user told us it is used, then it must be so.  */
!   if (node->externally_visible
!       || lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
      return true;
  
    /* ??? If the assembler name is set by hand, it is possible to assemble
--- 838,844 ----
  decide_is_variable_needed (struct cgraph_varpool_node *node, tree decl)
  {
    /* If the user told us it is used, then it must be so.  */
!   if (node->externally_visible)
      return true;
  
    /* ??? If the assembler name is set by hand, it is possible to assemble
Index: cgraph.h
===================================================================
*** cgraph.h	(revision 109820)
--- cgraph.h	(working copy)
*************** extern GTY(()) struct cgraph_node *cgrap
*** 242,247 ****
--- 242,248 ----
  
  extern GTY(()) struct cgraph_varpool_node *cgraph_varpool_first_unanalyzed_node;
  extern GTY(()) struct cgraph_varpool_node *cgraph_varpool_nodes_queue;
+ extern GTY(()) struct cgraph_varpool_node *cgraph_varpool_nodes;
  extern GTY(()) struct cgraph_asm_node *cgraph_asm_nodes;
  extern GTY(()) int cgraph_order;
  
Index: cgraphunit.c
===================================================================
*** cgraphunit.c	(revision 109820)
--- cgraphunit.c	(working copy)
*************** decide_is_function_needed (struct cgraph
*** 198,205 ****
      }
  
    /* If the user told us it is used, then it must be so.  */
!   if (node->local.externally_visible
!       || lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
      return true;
  
    /* ??? If the assembler name is set by hand, it is possible to assemble
--- 198,204 ----
      }
  
    /* If the user told us it is used, then it must be so.  */
!   if (node->local.externally_visible)
      return true;
  
    /* ??? If the assembler name is set by hand, it is possible to assemble
*************** cgraph_analyze_function (struct cgraph_n
*** 906,911 ****
--- 905,950 ----
    current_function_decl = NULL;
  }
  
+ /* Look for externally_visible and used attributes and mark cgraph nodes
+    accordingly.
+ 
+    This is not easily doable earlier in handle_*_attribute because they might
+    be passed a different copy of the decl before merging.  We can't do that in
+    cgraph_finalize_function either because we want to allow defining the attributes
+    later, so we do that in a separate pass at the end of the unit.  */
+ 
+ static void
+ process_function_and_variable_attributes (void)
+ {
+   struct cgraph_node *node;
+   struct cgraph_varpool_node *vnode;
+ 
+   for (node = cgraph_nodes; node; node = node->next)
+     {
+       tree decl = node->decl;
+       if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
+ 	mark_decl_referenced (decl);
+       if (lookup_attribute ("externally_visible", DECL_ATTRIBUTES (decl)))
+ 	{
+ 	  if (node->local.finalized)
+ 	    cgraph_mark_needed_node (node);
+ 	  node->local.externally_visible = true;
+ 	}
+     }
+   for (vnode = cgraph_varpool_nodes; vnode; vnode = vnode->next)
+     {
+       tree decl = vnode->decl;
+       if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
+ 	mark_decl_referenced (decl);
+       if (lookup_attribute ("externally_visible", DECL_ATTRIBUTES (decl)))
+ 	{
+ 	  if (vnode->finalized)
+ 	    cgraph_varpool_mark_needed_node (vnode);
+ 	  vnode->externally_visible = true;
+ 	}
+     }
+ }
+ 
  /* Analyze the whole compilation unit once it is parsed completely.  */
  
  void
*************** cgraph_finalize_compilation_unit (void)
*** 916,925 ****
--- 955,966 ----
       intermodule optimization.  */
    static struct cgraph_node *first_analyzed;
  
+   process_function_and_variable_attributes ();
    finish_aliases_1 ();
  
    if (!flag_unit_at_a_time)
      {
+       process_function_and_variable_attributes ();
        cgraph_output_pending_asms ();
        cgraph_assemble_pending_functions ();
        return;
Index: testsuite/gcc.target/i386/lea.c
===================================================================
*** testsuite/gcc.target/i386/lea.c	(revision 109820)
--- testsuite/gcc.target/i386/lea.c	(working copy)
***************
*** 1,6 ****
  /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
  /* { dg-require-effective-target ilp32 } */
! /* { dg-options "-O2 -march=i686" } */
  /* { dg-final { scan-assembler "leal" } } */
  typedef struct {
    char **visbuf;
--- 1,6 ----
  /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
  /* { dg-require-effective-target ilp32 } */
! /* { dg-options "-O2 -march=pentiumpro" } */
  /* { dg-final { scan-assembler "leal" } } */
  typedef struct {
    char **visbuf;
Index: c-decl.c
===================================================================
*** c-decl.c	(revision 109820)
--- c-decl.c	(working copy)
*************** finish_decl (tree decl, tree init, tree 
*** 3498,3507 ****
  	}
      }
  
-   /* If this was marked 'used', be sure it will be output.  */
-   if (lookup_attribute ("used", DECL_ATTRIBUTES (decl)))
-     mark_decl_referenced (decl);
- 
    if (TREE_CODE (decl) == TYPE_DECL)
      {
        if (!DECL_FILE_SCOPE_P (decl)
--- 3498,3503 ----
Index: cfgexpand.c
===================================================================
*** cfgexpand.c	(revision 109820)
--- cfgexpand.c	(working copy)
*************** stack_var_conflict_p (size_t x, size_t y
*** 275,281 ****
    
  /* A subroutine of expand_used_vars.  If two variables X and Y have alias
     sets that do not conflict, then do add a conflict for these variables
!    in the interference graph.  We also have to mind MEM_IN_STRUCT_P and
     MEM_SCALAR_P.  */
  
  static void
--- 275,282 ----
    
  /* A subroutine of expand_used_vars.  If two variables X and Y have alias
     sets that do not conflict, then do add a conflict for these variables
!    in the interference graph.  We also need to make sure to add conflicts
!    for unions of the same type.  We also have to mind MEM_IN_STRUCT_P and
     MEM_SCALAR_P.  */
  
  static void
*************** add_alias_set_conflicts (void)
*** 292,298 ****
  	{
  	  tree type_j = TREE_TYPE (stack_vars[j].decl);
  	  bool aggr_j = AGGREGATE_TYPE_P (type_j);
! 	  if (aggr_i != aggr_j || !objects_must_conflict_p (type_i, type_j))
  	    add_stack_var_conflict (i, j);
  	}
      }
--- 293,303 ----
  	{
  	  tree type_j = TREE_TYPE (stack_vars[j].decl);
  	  bool aggr_j = AGGREGATE_TYPE_P (type_j);
! 	  if (aggr_i != aggr_j
! 	      || !objects_must_conflict_p (type_i, type_j)
! 	      || ((TREE_CODE (type_i) == UNION_TYPE
! 		   || TREE_CODE (type_i) == QUAL_UNION_TYPE)
! 		  && type_i == type_j))
  	    add_stack_var_conflict (i, j);
  	}
      }
Index: c-common.c
===================================================================
*** c-common.c	(revision 109820)
--- c-common.c	(working copy)
*************** handle_externally_visible_attribute (tre
*** 4274,4293 ****
  	       "%qE attribute have effect only on public objects", name);
        *no_add_attrs = true;
      }
!   else if (TREE_CODE (node) == FUNCTION_DECL)
!     {
!       struct cgraph_node *n = cgraph_node (node);
!       n->local.externally_visible = true;
!       if (n->local.finalized)
! 	cgraph_mark_needed_node (n);
!     }
!   else if (TREE_CODE (node) == VAR_DECL)
!     {
!       struct cgraph_varpool_node *n = cgraph_varpool_node (node);
!       n->externally_visible = true;
!       if (n->finalized)
! 	cgraph_varpool_mark_needed_node (n);
!     }
    else
      {
        warning (OPT_Wattributes, "%qE attribute ignored", name);
--- 4274,4282 ----
  	       "%qE attribute have effect only on public objects", name);
        *no_add_attrs = true;
      }
!   else if (TREE_CODE (node) == FUNCTION_DECL
! 	   || TREE_CODE (node) == VAR_DECL)
!     ;
    else
      {
        warning (OPT_Wattributes, "%qE attribute ignored", name);
Index: config.gcc
===================================================================
*** config.gcc	(revision 109820)
--- config.gcc	(working copy)
*************** if test x$with_cpu = x ; then
*** 2407,2419 ****
          pentium_m-*)
            with_cpu=pentium-m
            ;;
!         *)
            with_cpu=pentiumpro
            ;;
        esac
        ;;
      x86_64-*-*)
!       with_cpu=k8
        ;;
      alphaev6[78]*-*-*)
        with_cpu=ev67
--- 2407,2432 ----
          pentium_m-*)
            with_cpu=pentium-m
            ;;
!         pentiumpro-*)
            with_cpu=pentiumpro
            ;;
+         *)
+           with_cpu=generic
+           ;;
        esac
        ;;
      x86_64-*-*)
!       case ${target_noncanonical} in
!         k8-*|opteron-*|athlon_64-*)
!           with_cpu=k8
!           ;;
!         nocona-*)
!           with_cpu=nocona
!           ;;
!         *)
!           with_cpu=generic
!           ;;
!       esac
        ;;
      alphaev6[78]*-*-*)
        with_cpu=ev67
*************** case "${target}" in
*** 2619,2631 ****
  		for which in arch cpu tune; do
  			eval "val=\$with_$which"
  			case ${val} in
! 			"" | i386 | i486 \
  			| i586 | pentium | pentium-mmx | winchip-c6 | winchip2 \
  			| c3 | c3-2 | i686 | pentiumpro | pentium2 | pentium3 \
  			| pentium4 | k6 | k6-2 | k6-3 | athlon | athlon-tbird \
! 			| athlon-4 | athlon-xp | athlon-mp | k8 | opteron \
! 			| athlon64 | athlon-fx | prescott | pentium-m \
! 			| pentium4m | pentium3m| nocona)
  				# OK
  				;;
  			*)
--- 2632,2652 ----
  		for which in arch cpu tune; do
  			eval "val=\$with_$which"
  			case ${val} in
! 			i386 | i486 \
  			| i586 | pentium | pentium-mmx | winchip-c6 | winchip2 \
  			| c3 | c3-2 | i686 | pentiumpro | pentium2 | pentium3 \
  			| pentium4 | k6 | k6-2 | k6-3 | athlon | athlon-tbird \
! 			| athlon-4 | athlon-xp | athlon-mp \
! 			| prescott | pentium-m | pentium4m | pentium3m)
! 				case "${target}" in
! 				  x86_64-*-*)
! 				      echo "CPU given in --with-$which=$val doesn't support 64-bit mode." 1>&2
! 				      exit 1
! 				      ;;
! 				esac
! 				# OK
! 				;;
! 			"" | k8 | opteron | athlon64 | athlon-fx | nocona | generic)
  				# OK
  				;;
  			*)
Index: config/i386/i386.h
===================================================================
*** config/i386/i386.h	(revision 109820)
--- config/i386/i386.h	(working copy)
*************** extern const struct processor_costs *ix8
*** 140,145 ****
--- 140,148 ----
  #define TARGET_K8 (ix86_tune == PROCESSOR_K8)
  #define TARGET_ATHLON_K8 (TARGET_K8 || TARGET_ATHLON)
  #define TARGET_NOCONA (ix86_tune == PROCESSOR_NOCONA)
+ #define TARGET_GENERIC32 (ix86_tune == PROCESSOR_GENERIC32)
+ #define TARGET_GENERIC64 (ix86_tune == PROCESSOR_GENERIC64)
+ #define TARGET_GENERIC (TARGET_GENERIC32 || TARGET_GENERIC64)
  
  #define TUNEMASK (1 << ix86_tune)
  extern const int x86_use_leave, x86_push_memory, x86_zero_extend_with_and;
*************** extern const int x86_use_ffreep;
*** 163,168 ****
--- 166,173 ----
  extern const int x86_inter_unit_moves, x86_schedule;
  extern const int x86_use_bt;
  extern const int x86_cmpxchg, x86_cmpxchg8b, x86_cmpxchg16b, x86_xadd;
+ extern const int x86_use_incdec;
+ extern const int x86_pad_returns;
  extern int x86_prefetch_sse;
  
  #define TARGET_USE_LEAVE (x86_use_leave & TUNEMASK)
*************** extern int x86_prefetch_sse;
*** 217,222 ****
--- 222,229 ----
  #define TARGET_FOUR_JUMP_LIMIT (x86_four_jump_limit & TUNEMASK)
  #define TARGET_SCHEDULE (x86_schedule & TUNEMASK)
  #define TARGET_USE_BT (x86_use_bt & TUNEMASK)
+ #define TARGET_USE_INCDEC (x86_use_incdec & TUNEMASK)
+ #define TARGET_PAD_RETURNS (x86_pad_returns & TUNEMASK)
  
  #define ASSEMBLER_DIALECT (ix86_asm_dialect)
  
*************** extern int x86_prefetch_sse;
*** 462,473 ****
  #define TARGET_CPU_DEFAULT_pentium_m 14
  #define TARGET_CPU_DEFAULT_prescott 15
  #define TARGET_CPU_DEFAULT_nocona 16
  
  #define TARGET_CPU_DEFAULT_NAMES {"i386", "i486", "pentium", "pentium-mmx",\
  				  "pentiumpro", "pentium2", "pentium3", \
  				  "pentium4", "k6", "k6-2", "k6-3",\
  				  "athlon", "athlon-4", "k8", \
! 				  "pentium-m", "prescott", "nocona"}
  
  #ifndef CC1_SPEC
  #define CC1_SPEC "%(cc1_cpu) "
--- 469,486 ----
  #define TARGET_CPU_DEFAULT_pentium_m 14
  #define TARGET_CPU_DEFAULT_prescott 15
  #define TARGET_CPU_DEFAULT_nocona 16
+ /* Internally "generic" CPU is actually handled as two CPUs "generic32" and
+    "generic64".  In order to stay in sync with "cpu" attribute, allocate
+    two slots for generic.  */
+ #define TARGET_CPU_DEFAULT_generic 17
+       /*TARGET_CPU_DEFAULT_generic 18*/
  
  #define TARGET_CPU_DEFAULT_NAMES {"i386", "i486", "pentium", "pentium-mmx",\
  				  "pentiumpro", "pentium2", "pentium3", \
  				  "pentium4", "k6", "k6-2", "k6-3",\
  				  "athlon", "athlon-4", "k8", \
! 				  "pentium-m", "prescott", "nocona", \
! 				  "generic" /*32*/, "generic" /*64*/}
  
  #ifndef CC1_SPEC
  #define CC1_SPEC "%(cc1_cpu) "
*************** enum processor_type
*** 2117,2122 ****
--- 2130,2137 ----
    PROCESSOR_PENTIUM4,
    PROCESSOR_K8,
    PROCESSOR_NOCONA,
+   PROCESSOR_GENERIC32,
+   PROCESSOR_GENERIC64,
    PROCESSOR_max
  };
  
Index: config/i386/i386.md
===================================================================
*** config/i386/i386.md	(revision 109820)
--- config/i386/i386.md	(working copy)
***************
*** 186,192 ****
  
  ;; Processor type.  This attribute must exactly match the processor_type
  ;; enumeration in i386.h.
! (define_attr "cpu" "i386,i486,pentium,pentiumpro,k6,athlon,pentium4,k8,nocona"
    (const (symbol_ref "ix86_tune")))
  
  ;; A basic instruction type.  Refinements due to arguments to be
--- 186,192 ----
  
  ;; Processor type.  This attribute must exactly match the processor_type
  ;; enumeration in i386.h.
! (define_attr "cpu" "i386,i486,pentium,pentiumpro,k6,athlon,pentium4,k8,nocona,generic32,generic64"
    (const (symbol_ref "ix86_tune")))
  
  ;; A basic instruction type.  Refinements due to arguments to be
***************
*** 1510,1517 ****
  	       (const_string "SI")
  	     (and (eq_attr "type" "imov")
  		  (and (eq_attr "alternative" "0,1")
! 		       (ne (symbol_ref "TARGET_PARTIAL_REG_DEPENDENCY")
! 			   (const_int 0))))
  	       (const_string "SI")
  	     ;; Avoid partial register stalls when not using QImode arithmetic
  	     (and (eq_attr "type" "imov")
--- 1510,1519 ----
  	       (const_string "SI")
  	     (and (eq_attr "type" "imov")
  		  (and (eq_attr "alternative" "0,1")
! 		       (and (ne (symbol_ref "TARGET_PARTIAL_REG_DEPENDENCY")
! 				(const_int 0))
! 			    (eq (symbol_ref "TARGET_PARTIAL_REG_STALL")
! 				(const_int 0)))))
  	       (const_string "SI")
  	     ;; Avoid partial register stalls when not using QImode arithmetic
  	     (and (eq_attr "type" "imov")
***************
*** 4144,4150 ****
    [(match_scratch:DF 2 "Y")
     (set (match_operand:SSEMODEI24 0 "register_operand" "")
  	(fix:SSEMODEI24 (match_operand:DF 1 "memory_operand" "")))]
!   "TARGET_K8 && !optimize_size"
    [(set (match_dup 2) (match_dup 1))
     (set (match_dup 0) (fix:SSEMODEI24 (match_dup 2)))]
    "")
--- 4146,4152 ----
    [(match_scratch:DF 2 "Y")
     (set (match_operand:SSEMODEI24 0 "register_operand" "")
  	(fix:SSEMODEI24 (match_operand:DF 1 "memory_operand" "")))]
!   "(TARGET_K8 || TARGET_GENERIC64) && !optimize_size"
    [(set (match_dup 2) (match_dup 1))
     (set (match_dup 0) (fix:SSEMODEI24 (match_dup 2)))]
    "")
***************
*** 4153,4159 ****
    [(match_scratch:SF 2 "x")
     (set (match_operand:SSEMODEI24 0 "register_operand" "")
  	(fix:SSEMODEI24 (match_operand:SF 1 "memory_operand" "")))]
!   "TARGET_K8 && !optimize_size"
    [(set (match_dup 2) (match_dup 1))
     (set (match_dup 0) (fix:SSEMODEI24 (match_dup 2)))]
    "")
--- 4155,4161 ----
    [(match_scratch:SF 2 "x")
     (set (match_operand:SSEMODEI24 0 "register_operand" "")
  	(fix:SSEMODEI24 (match_operand:SF 1 "memory_operand" "")))]
!   "(TARGET_K8 || TARGET_GENERIC64) && !optimize_size"
    [(set (match_dup 2) (match_dup 1))
     (set (match_dup 0) (fix:SSEMODEI24 (match_dup 2)))]
    "")
***************
*** 19731,19737 ****
  		   (mult:DI (match_operand:DI 1 "memory_operand" "")
  			    (match_operand:DI 2 "immediate_operand" "")))
  	      (clobber (reg:CC FLAGS_REG))])]
!   "TARGET_K8 && !optimize_size
     && (GET_CODE (operands[2]) != CONST_INT
         || !CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'K'))"
    [(set (match_dup 3) (match_dup 1))
--- 19733,19739 ----
  		   (mult:DI (match_operand:DI 1 "memory_operand" "")
  			    (match_operand:DI 2 "immediate_operand" "")))
  	      (clobber (reg:CC FLAGS_REG))])]
!   "(TARGET_K8 || TARGET_GENERIC64) && !optimize_size
     && (GET_CODE (operands[2]) != CONST_INT
         || !CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'K'))"
    [(set (match_dup 3) (match_dup 1))
***************
*** 19745,19751 ****
  		   (mult:SI (match_operand:SI 1 "memory_operand" "")
  			    (match_operand:SI 2 "immediate_operand" "")))
  	      (clobber (reg:CC FLAGS_REG))])]
!   "TARGET_K8 && !optimize_size
     && (GET_CODE (operands[2]) != CONST_INT
         || !CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'K'))"
    [(set (match_dup 3) (match_dup 1))
--- 19747,19753 ----
  		   (mult:SI (match_operand:SI 1 "memory_operand" "")
  			    (match_operand:SI 2 "immediate_operand" "")))
  	      (clobber (reg:CC FLAGS_REG))])]
!   "(TARGET_K8 || TARGET_GENERIC64) && !optimize_size
     && (GET_CODE (operands[2]) != CONST_INT
         || !CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'K'))"
    [(set (match_dup 3) (match_dup 1))
***************
*** 19760,19766 ****
  		     (mult:SI (match_operand:SI 1 "memory_operand" "")
  			      (match_operand:SI 2 "immediate_operand" ""))))
  	      (clobber (reg:CC FLAGS_REG))])]
!   "TARGET_K8 && !optimize_size
     && (GET_CODE (operands[2]) != CONST_INT
         || !CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'K'))"
    [(set (match_dup 3) (match_dup 1))
--- 19762,19768 ----
  		     (mult:SI (match_operand:SI 1 "memory_operand" "")
  			      (match_operand:SI 2 "immediate_operand" ""))))
  	      (clobber (reg:CC FLAGS_REG))])]
!   "(TARGET_K8 || TARGET_GENERIC64) && !optimize_size
     && (GET_CODE (operands[2]) != CONST_INT
         || !CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'K'))"
    [(set (match_dup 3) (match_dup 1))
***************
*** 19778,19784 ****
  			    (match_operand:DI 2 "const_int_operand" "")))
  	      (clobber (reg:CC FLAGS_REG))])
     (match_scratch:DI 3 "r")]
!   "TARGET_K8 && !optimize_size
     && CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'K')"
    [(set (match_dup 3) (match_dup 2))
     (parallel [(set (match_dup 0) (mult:DI (match_dup 0) (match_dup 3)))
--- 19780,19786 ----
  			    (match_operand:DI 2 "const_int_operand" "")))
  	      (clobber (reg:CC FLAGS_REG))])
     (match_scratch:DI 3 "r")]
!   "(TARGET_K8 || TARGET_GENERIC64) && !optimize_size
     && CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'K')"
    [(set (match_dup 3) (match_dup 2))
     (parallel [(set (match_dup 0) (mult:DI (match_dup 0) (match_dup 3)))
***************
*** 19794,19800 ****
  			    (match_operand:SI 2 "const_int_operand" "")))
  	      (clobber (reg:CC FLAGS_REG))])
     (match_scratch:SI 3 "r")]
!   "TARGET_K8 && !optimize_size
     && CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'K')"
    [(set (match_dup 3) (match_dup 2))
     (parallel [(set (match_dup 0) (mult:SI (match_dup 0) (match_dup 3)))
--- 19796,19802 ----
  			    (match_operand:SI 2 "const_int_operand" "")))
  	      (clobber (reg:CC FLAGS_REG))])
     (match_scratch:SI 3 "r")]
!   "(TARGET_K8 || TARGET_GENERIC64) && !optimize_size
     && CONST_OK_FOR_LETTER_P (INTVAL (operands[2]), 'K')"
    [(set (match_dup 3) (match_dup 2))
     (parallel [(set (match_dup 0) (mult:SI (match_dup 0) (match_dup 3)))
***************
*** 19810,19816 ****
  			    (match_operand:HI 2 "immediate_operand" "")))
  	      (clobber (reg:CC FLAGS_REG))])
     (match_scratch:HI 3 "r")]
!   "TARGET_K8 && !optimize_size"
    [(set (match_dup 3) (match_dup 2))
     (parallel [(set (match_dup 0) (mult:HI (match_dup 0) (match_dup 3)))
  	      (clobber (reg:CC FLAGS_REG))])]
--- 19812,19818 ----
  			    (match_operand:HI 2 "immediate_operand" "")))
  	      (clobber (reg:CC FLAGS_REG))])
     (match_scratch:HI 3 "r")]
!   "(TARGET_K8 || TARGET_GENERIC64) && !optimize_size"
    [(set (match_dup 3) (match_dup 2))
     (parallel [(set (match_dup 0) (mult:HI (match_dup 0) (match_dup 3)))
  	      (clobber (reg:CC FLAGS_REG))])]
Index: config/i386/predicates.md
===================================================================
*** config/i386/predicates.md	(revision 109820)
--- config/i386/predicates.md	(working copy)
***************
*** 619,625 ****
  {
    /* On Pentium4, the inc and dec operations causes extra dependency on flag
       registers, since carry flag is not set.  */
!   if ((TARGET_PENTIUM4 || TARGET_NOCONA) && !optimize_size)
      return 0;
    return op == const1_rtx || op == constm1_rtx;
  })
--- 619,625 ----
  {
    /* On Pentium4, the inc and dec operations causes extra dependency on flag
       registers, since carry flag is not set.  */
!   if (!TARGET_USE_INCDEC && !optimize_size)
      return 0;
    return op == const1_rtx || op == constm1_rtx;
  })
***************
*** 697,702 ****
--- 697,707 ----
    /* Registers and immediate operands are always "aligned".  */
    if (GET_CODE (op) != MEM)
      return 1;
+ 
+   /* All patterns using aligned_operand on memory operands end up
+      promoting the memory operand to 64bit and thus causing memory mismatch.  */
+   if (TARGET_MEMORY_MISMATCH_STALL)
+     return 0;
  
    /* Don't even try to do any aligned optimizations with volatiles.  */
    if (MEM_VOLATILE_P (op))
Index: config/i386/athlon.md
===================================================================
*** config/i386/athlon.md	(revision 109820)
--- config/i386/athlon.md	(working copy)
***************
*** 123,129 ****
  (define_cpu_unit "athlon-fmul" "athlon_fp")
  (define_cpu_unit "athlon-fstore" "athlon_fp")
  (define_reservation "athlon-fany" "(athlon-fstore | athlon-fmul | athlon-fadd)")
! (define_reservation "athlon-faddmul" "(athlon-fmul | athlon-fadd)")
  
  ;; Vector operations usually consume many of pipes.
  (define_reservation "athlon-fvector" "(athlon-fadd + athlon-fmul + athlon-fstore)")
--- 123,129 ----
  (define_cpu_unit "athlon-fmul" "athlon_fp")
  (define_cpu_unit "athlon-fstore" "athlon_fp")
  (define_reservation "athlon-fany" "(athlon-fstore | athlon-fmul | athlon-fadd)")
! (define_reservation "athlon-faddmul" "(athlon-fadd | athlon-fmul)")
  
  ;; Vector operations usually consume many of pipes.
  (define_reservation "athlon-fvector" "(athlon-fadd + athlon-fmul + athlon-fstore)")
***************
*** 131,156 ****
  
  ;; Jump instructions are executed in the branch unit completely transparent to us
  (define_insn_reservation "athlon_branch" 0
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "ibr"))
  			 "athlon-direct,athlon-ieu")
  (define_insn_reservation "athlon_call" 0
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "call,callv"))
  			 "athlon-vector,athlon-ieu")
  
  ;; Latency of push operation is 3 cycles, but ESP value is available
  ;; earlier
  (define_insn_reservation "athlon_push" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "push"))
  			 "athlon-direct,athlon-agu,athlon-store")
  (define_insn_reservation "athlon_pop" 4
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "pop"))
  			 "athlon-vector,athlon-load,athlon-ieu")
  (define_insn_reservation "athlon_pop_k8" 3
! 			 (and (eq_attr "cpu" "k8")
  			      (eq_attr "type" "pop"))
  			 "athlon-double,(athlon-ieu+athlon-load)")
  (define_insn_reservation "athlon_leave" 3
--- 131,156 ----
  
  ;; Jump instructions are executed in the branch unit completely transparent to us
  (define_insn_reservation "athlon_branch" 0
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "ibr"))
  			 "athlon-direct,athlon-ieu")
  (define_insn_reservation "athlon_call" 0
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "call,callv"))
  			 "athlon-vector,athlon-ieu")
  
  ;; Latency of push operation is 3 cycles, but ESP value is available
  ;; earlier
  (define_insn_reservation "athlon_push" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "push"))
  			 "athlon-direct,athlon-agu,athlon-store")
  (define_insn_reservation "athlon_pop" 4
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "pop"))
  			 "athlon-vector,athlon-load,athlon-ieu")
  (define_insn_reservation "athlon_pop_k8" 3
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (eq_attr "type" "pop"))
  			 "athlon-double,(athlon-ieu+athlon-load)")
  (define_insn_reservation "athlon_leave" 3
***************
*** 158,170 ****
  			      (eq_attr "type" "leave"))
  			 "athlon-vector,(athlon-ieu+athlon-load)")
  (define_insn_reservation "athlon_leave_k8" 3
! 			 (and (eq_attr "cpu" "k8")
  			      (eq_attr "type" "leave"))
  			 "athlon-double,(athlon-ieu+athlon-load)")
  
  ;; Lea executes in AGU unit with 2 cycles latency.
  (define_insn_reservation "athlon_lea" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "lea"))
  			 "athlon-direct,athlon-agu,nothing")
  
--- 158,170 ----
  			      (eq_attr "type" "leave"))
  			 "athlon-vector,(athlon-ieu+athlon-load)")
  (define_insn_reservation "athlon_leave_k8" 3
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (eq_attr "type" "leave"))
  			 "athlon-double,(athlon-ieu+athlon-load)")
  
  ;; Lea executes in AGU unit with 2 cycles latency.
  (define_insn_reservation "athlon_lea" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "lea"))
  			 "athlon-direct,athlon-agu,nothing")
  
***************
*** 176,188 ****
  			 "athlon-vector,athlon-ieu0,athlon-mult,nothing,nothing,athlon-ieu0")
  ;; ??? Widening multiply is vector or double.
  (define_insn_reservation "athlon_imul_k8_DI" 4
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "imul")
  				   (and (eq_attr "mode" "DI")
  					(eq_attr "memory" "none,unknown"))))
  			 "athlon-direct0,athlon-ieu0,athlon-mult,nothing,athlon-ieu0")
  (define_insn_reservation "athlon_imul_k8" 3
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "imul")
  				   (eq_attr "memory" "none,unknown")))
  			 "athlon-direct0,athlon-ieu0,athlon-mult,athlon-ieu0")
--- 176,188 ----
  			 "athlon-vector,athlon-ieu0,athlon-mult,nothing,nothing,athlon-ieu0")
  ;; ??? Widening multiply is vector or double.
  (define_insn_reservation "athlon_imul_k8_DI" 4
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "imul")
  				   (and (eq_attr "mode" "DI")
  					(eq_attr "memory" "none,unknown"))))
  			 "athlon-direct0,athlon-ieu0,athlon-mult,nothing,athlon-ieu0")
  (define_insn_reservation "athlon_imul_k8" 3
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "imul")
  				   (eq_attr "memory" "none,unknown")))
  			 "athlon-direct0,athlon-ieu0,athlon-mult,athlon-ieu0")
***************
*** 192,204 ****
  				   (eq_attr "memory" "load,both")))
  			 "athlon-vector,athlon-load,athlon-ieu,athlon-mult,nothing,nothing,athlon-ieu")
  (define_insn_reservation "athlon_imul_mem_k8_DI" 7
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "imul")
  				   (and (eq_attr "mode" "DI")
  					(eq_attr "memory" "load,both"))))
  			 "athlon-vector,athlon-load,athlon-ieu,athlon-mult,nothing,athlon-ieu")
  (define_insn_reservation "athlon_imul_mem_k8" 6
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "imul")
  				   (eq_attr "memory" "load,both")))
  			 "athlon-vector,athlon-load,athlon-ieu,athlon-mult,athlon-ieu")
--- 192,204 ----
  				   (eq_attr "memory" "load,both")))
  			 "athlon-vector,athlon-load,athlon-ieu,athlon-mult,nothing,nothing,athlon-ieu")
  (define_insn_reservation "athlon_imul_mem_k8_DI" 7
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "imul")
  				   (and (eq_attr "mode" "DI")
  					(eq_attr "memory" "load,both"))))
  			 "athlon-vector,athlon-load,athlon-ieu,athlon-mult,nothing,athlon-ieu")
  (define_insn_reservation "athlon_imul_mem_k8" 6
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "imul")
  				   (eq_attr "memory" "load,both")))
  			 "athlon-vector,athlon-load,athlon-ieu,athlon-mult,athlon-ieu")
***************
*** 211,269 ****
  ;; of the other code
  
  (define_insn_reservation "athlon_idiv" 6
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "idiv")
  				   (eq_attr "memory" "none,unknown")))
  			 "athlon-vector,(athlon-ieu0*6+(athlon-fpsched,athlon-fvector))")
  (define_insn_reservation "athlon_idiv_mem" 9
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "idiv")
  				   (eq_attr "memory" "load,both")))
  			 "athlon-vector,((athlon-load,athlon-ieu0*6)+(athlon-fpsched,athlon-fvector))")
  ;; The parallelism of string instructions is not documented.  Model it same way
  ;; as idiv to create smaller automata.  This probably does not matter much.
  (define_insn_reservation "athlon_str" 6
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "str")
  				   (eq_attr "memory" "load,both,store")))
  			 "athlon-vector,athlon-load,athlon-ieu0*6")
  
  (define_insn_reservation "athlon_idirect" 1
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "athlon_decode" "direct")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "none,unknown"))))
  			 "athlon-direct,athlon-ieu")
  (define_insn_reservation "athlon_ivector" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "athlon_decode" "vector")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "none,unknown"))))
  			 "athlon-vector,athlon-ieu,athlon-ieu")
  (define_insn_reservation "athlon_idirect_loadmov" 3
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "imov")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-load")
  (define_insn_reservation "athlon_idirect_load" 4
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "athlon_decode" "direct")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-load,athlon-ieu")
  (define_insn_reservation "athlon_ivector_load" 6
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "athlon_decode" "vector")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "load"))))
  			 "athlon-vector,athlon-load,athlon-ieu,athlon-ieu")
  (define_insn_reservation "athlon_idirect_movstore" 1
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "imov")
  				   (eq_attr "memory" "store")))
  			 "athlon-direct,athlon-agu,athlon-store")
  (define_insn_reservation "athlon_idirect_both" 4
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "athlon_decode" "direct")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "both"))))
--- 211,269 ----
  ;; of the other code
  
  (define_insn_reservation "athlon_idiv" 6
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "idiv")
  				   (eq_attr "memory" "none,unknown")))
  			 "athlon-vector,(athlon-ieu0*6+(athlon-fpsched,athlon-fvector))")
  (define_insn_reservation "athlon_idiv_mem" 9
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "idiv")
  				   (eq_attr "memory" "load,both")))
  			 "athlon-vector,((athlon-load,athlon-ieu0*6)+(athlon-fpsched,athlon-fvector))")
  ;; The parallelism of string instructions is not documented.  Model it same way
  ;; as idiv to create smaller automata.  This probably does not matter much.
  (define_insn_reservation "athlon_str" 6
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "str")
  				   (eq_attr "memory" "load,both,store")))
  			 "athlon-vector,athlon-load,athlon-ieu0*6")
  
  (define_insn_reservation "athlon_idirect" 1
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "athlon_decode" "direct")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "none,unknown"))))
  			 "athlon-direct,athlon-ieu")
  (define_insn_reservation "athlon_ivector" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "athlon_decode" "vector")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "none,unknown"))))
  			 "athlon-vector,athlon-ieu,athlon-ieu")
  (define_insn_reservation "athlon_idirect_loadmov" 3
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "imov")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-load")
  (define_insn_reservation "athlon_idirect_load" 4
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "athlon_decode" "direct")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-load,athlon-ieu")
  (define_insn_reservation "athlon_ivector_load" 6
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "athlon_decode" "vector")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "load"))))
  			 "athlon-vector,athlon-load,athlon-ieu,athlon-ieu")
  (define_insn_reservation "athlon_idirect_movstore" 1
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "imov")
  				   (eq_attr "memory" "store")))
  			 "athlon-direct,athlon-agu,athlon-store")
  (define_insn_reservation "athlon_idirect_both" 4
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "athlon_decode" "direct")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "both"))))
***************
*** 271,277 ****
  			  athlon-ieu,athlon-store,
  			  athlon-store")
  (define_insn_reservation "athlon_ivector_both" 6
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "athlon_decode" "vector")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "both"))))
--- 271,277 ----
  			  athlon-ieu,athlon-store,
  			  athlon-store")
  (define_insn_reservation "athlon_ivector_both" 6
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "athlon_decode" "vector")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "both"))))
***************
*** 280,293 ****
  			  athlon-ieu,
  			  athlon-store")
  (define_insn_reservation "athlon_idirect_store" 1
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "athlon_decode" "direct")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "store"))))
  			 "athlon-direct,(athlon-ieu+athlon-agu),
  			  athlon-store")
  (define_insn_reservation "athlon_ivector_store" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "athlon_decode" "vector")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "store"))))
--- 280,293 ----
  			  athlon-ieu,
  			  athlon-store")
  (define_insn_reservation "athlon_idirect_store" 1
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "athlon_decode" "direct")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "store"))))
  			 "athlon-direct,(athlon-ieu+athlon-agu),
  			  athlon-store")
  (define_insn_reservation "athlon_ivector_store" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "athlon_decode" "vector")
  				   (and (eq_attr "unit" "integer,unknown")
  					(eq_attr "memory" "store"))))
***************
*** 302,308 ****
  					(eq_attr "mode" "XF"))))
  			 "athlon-vector,athlon-fpload2,athlon-fvector*9")
  (define_insn_reservation "athlon_fldxf_k8" 13
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "fmov")
  				   (and (eq_attr "memory" "load")
  					(eq_attr "mode" "XF"))))
--- 302,308 ----
  					(eq_attr "mode" "XF"))))
  			 "athlon-vector,athlon-fpload2,athlon-fvector*9")
  (define_insn_reservation "athlon_fldxf_k8" 13
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "fmov")
  				   (and (eq_attr "memory" "load")
  					(eq_attr "mode" "XF"))))
***************
*** 314,320 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fany")
  (define_insn_reservation "athlon_fld_k8" 2
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "fmov")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fstore")
--- 314,320 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fany")
  (define_insn_reservation "athlon_fld_k8" 2
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "fmov")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fstore")
***************
*** 326,332 ****
  					(eq_attr "mode" "XF"))))
  			 "athlon-vector,(athlon-fpsched+athlon-agu),(athlon-store2+(athlon-fvector*7))")
  (define_insn_reservation "athlon_fstxf_k8" 8
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "fmov")
  				   (and (eq_attr "memory" "store,both")
  					(eq_attr "mode" "XF"))))
--- 326,332 ----
  					(eq_attr "mode" "XF"))))
  			 "athlon-vector,(athlon-fpsched+athlon-agu),(athlon-store2+(athlon-fvector*7))")
  (define_insn_reservation "athlon_fstxf_k8" 8
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "fmov")
  				   (and (eq_attr "memory" "store,both")
  					(eq_attr "mode" "XF"))))
***************
*** 337,352 ****
  				   (eq_attr "memory" "store,both")))
  			 "athlon-direct,(athlon-fpsched+athlon-agu),(athlon-fstore+athlon-store)")
  (define_insn_reservation "athlon_fst_k8" 2
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "fmov")
  				   (eq_attr "memory" "store,both")))
  			 "athlon-direct,(athlon-fpsched+athlon-agu),(athlon-fstore+athlon-store)")
  (define_insn_reservation "athlon_fist" 4
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "fistp"))
  			 "athlon-direct,(athlon-fpsched+athlon-agu),(athlon-fstore+athlon-store)")
  (define_insn_reservation "athlon_fmov" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "fmov"))
  			 "athlon-direct,athlon-fpsched,athlon-faddmul")
  (define_insn_reservation "athlon_fadd_load" 4
--- 337,352 ----
  				   (eq_attr "memory" "store,both")))
  			 "athlon-direct,(athlon-fpsched+athlon-agu),(athlon-fstore+athlon-store)")
  (define_insn_reservation "athlon_fst_k8" 2
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "fmov")
  				   (eq_attr "memory" "store,both")))
  			 "athlon-direct,(athlon-fpsched+athlon-agu),(athlon-fstore+athlon-store)")
  (define_insn_reservation "athlon_fist" 4
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "fistp"))
  			 "athlon-direct,(athlon-fpsched+athlon-agu),(athlon-fstore+athlon-store)")
  (define_insn_reservation "athlon_fmov" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "fmov"))
  			 "athlon-direct,athlon-fpsched,athlon-faddmul")
  (define_insn_reservation "athlon_fadd_load" 4
***************
*** 355,366 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_fadd_load_k8" 6
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "fop")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_fadd" 4
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "fop"))
  			 "athlon-direct,athlon-fpsched,athlon-fadd")
  (define_insn_reservation "athlon_fmul_load" 4
--- 355,366 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_fadd_load_k8" 6
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "fop")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_fadd" 4
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "fop"))
  			 "athlon-direct,athlon-fpsched,athlon-fadd")
  (define_insn_reservation "athlon_fmul_load" 4
***************
*** 369,384 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fmul")
  (define_insn_reservation "athlon_fmul_load_k8" 6
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "fmul")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fmul")
  (define_insn_reservation "athlon_fmul" 4
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "fmul"))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
  (define_insn_reservation "athlon_fsgn" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "fsgn"))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
  (define_insn_reservation "athlon_fdiv_load" 24
--- 369,384 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fmul")
  (define_insn_reservation "athlon_fmul_load_k8" 6
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "fmul")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fmul")
  (define_insn_reservation "athlon_fmul" 4
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "fmul"))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
  (define_insn_reservation "athlon_fsgn" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "fsgn"))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
  (define_insn_reservation "athlon_fdiv_load" 24
***************
*** 387,393 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fmul")
  (define_insn_reservation "athlon_fdiv_load_k8" 13
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "fdiv")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fmul")
--- 387,393 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fmul")
  (define_insn_reservation "athlon_fdiv_load_k8" 13
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "fdiv")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fmul")
***************
*** 396,411 ****
  			      (eq_attr "type" "fdiv"))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
  (define_insn_reservation "athlon_fdiv_k8" 11
! 			 (and (eq_attr "cpu" "k8")
  			      (eq_attr "type" "fdiv"))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
  (define_insn_reservation "athlon_fpspc_load" 103
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "fpspc")
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload,athlon-fvector")
  (define_insn_reservation "athlon_fpspc" 100
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "fpspc"))
  			 "athlon-vector,athlon-fpsched,athlon-fvector")
  (define_insn_reservation "athlon_fcmov_load" 7
--- 396,411 ----
  			      (eq_attr "type" "fdiv"))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
  (define_insn_reservation "athlon_fdiv_k8" 11
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (eq_attr "type" "fdiv"))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
  (define_insn_reservation "athlon_fpspc_load" 103
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "fpspc")
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload,athlon-fvector")
  (define_insn_reservation "athlon_fpspc" 100
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "fpspc"))
  			 "athlon-vector,athlon-fpsched,athlon-fvector")
  (define_insn_reservation "athlon_fcmov_load" 7
***************
*** 418,429 ****
  			      (eq_attr "type" "fcmov"))
  			 "athlon-vector,athlon-fpsched,athlon-fvector")
  (define_insn_reservation "athlon_fcmov_load_k8" 17
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "fcmov")
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fploadk8,athlon-fvector")
  (define_insn_reservation "athlon_fcmov_k8" 15
! 			 (and (eq_attr "cpu" "k8")
  			      (eq_attr "type" "fcmov"))
  			 "athlon-vector,athlon-fpsched,athlon-fvector")
  ;; fcomi is vector decoded by uses only one pipe.
--- 418,429 ----
  			      (eq_attr "type" "fcmov"))
  			 "athlon-vector,athlon-fpsched,athlon-fvector")
  (define_insn_reservation "athlon_fcmov_load_k8" 17
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "fcmov")
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fploadk8,athlon-fvector")
  (define_insn_reservation "athlon_fcmov_k8" 15
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (eq_attr "type" "fcmov"))
  			 "athlon-vector,athlon-fpsched,athlon-fvector")
  ;; fcomi is vector decoded by uses only one pipe.
***************
*** 434,446 ****
  				        (eq_attr "memory" "load"))))
  			 "athlon-vector,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_fcomi_load_k8" 5
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "fcmp")
  				   (and (eq_attr "athlon_decode" "vector")
  				        (eq_attr "memory" "load"))))
  			 "athlon-vector,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_fcomi" 3
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "athlon_decode" "vector")
  				   (eq_attr "type" "fcmp")))
  			 "athlon-vector,athlon-fpsched,athlon-fadd")
--- 434,446 ----
  				        (eq_attr "memory" "load"))))
  			 "athlon-vector,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_fcomi_load_k8" 5
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "fcmp")
  				   (and (eq_attr "athlon_decode" "vector")
  				        (eq_attr "memory" "load"))))
  			 "athlon-vector,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_fcomi" 3
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "athlon_decode" "vector")
  				   (eq_attr "type" "fcmp")))
  			 "athlon-vector,athlon-fpsched,athlon-fadd")
***************
*** 450,467 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_fcom_load_k8" 4
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "fcmp")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_fcom" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "fcmp"))
  			 "athlon-direct,athlon-fpsched,athlon-fadd")
  ;; Never seen by the scheduler because we still don't do post reg-stack
  ;; scheduling.
  ;(define_insn_reservation "athlon_fxch" 2
! ;			 (and (eq_attr "cpu" "athlon,k8")
  ;			      (eq_attr "type" "fxch"))
  ;			 "athlon-direct,athlon-fpsched,athlon-fany")
  
--- 450,467 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_fcom_load_k8" 4
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "fcmp")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_fcom" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "fcmp"))
  			 "athlon-direct,athlon-fpsched,athlon-fadd")
  ;; Never seen by the scheduler because we still don't do post reg-stack
  ;; scheduling.
  ;(define_insn_reservation "athlon_fxch" 2
! ;			 (and (eq_attr "cpu" "athlon,k8,generic64")
  ;			      (eq_attr "type" "fxch"))
  ;			 "athlon-direct,athlon-fpsched,athlon-fany")
  
***************
*** 477,484 ****
  			      (and (eq_attr "type" "ssemov")
  				   (match_operand:DF 1 "memory_operand" "")))
  			 "athlon-direct,athlon-fploadk8,athlon-fstore")
  (define_insn_reservation "athlon_movaps_load_k8" 2
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "ssemov")
  				   (and (eq_attr "mode" "V4SF,V2DF,TI")
  					(eq_attr "memory" "load"))))
--- 477,489 ----
  			      (and (eq_attr "type" "ssemov")
  				   (match_operand:DF 1 "memory_operand" "")))
  			 "athlon-direct,athlon-fploadk8,athlon-fstore")
+ (define_insn_reservation "athlon_movsd_load_generic64" 2
+ 			 (and (eq_attr "cpu" "generic64")
+ 			      (and (eq_attr "type" "ssemov")
+ 				   (match_operand:DF 1 "memory_operand" "")))
+ 			 "athlon-double,athlon-fploadk8,(athlon-fstore+athlon-fmul)")
  (define_insn_reservation "athlon_movaps_load_k8" 2
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "ssemov")
  				   (and (eq_attr "mode" "V4SF,V2DF,TI")
  					(eq_attr "memory" "load"))))
***************
*** 496,502 ****
  					(eq_attr "memory" "load"))))
  			 "athlon-vector,athlon-fpload,(athlon-fany*2)")
  (define_insn_reservation "athlon_movss_load_k8" 1
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "ssemov")
  				   (and (eq_attr "mode" "SF,DI")
  					(eq_attr "memory" "load"))))
--- 501,507 ----
  					(eq_attr "memory" "load"))))
  			 "athlon-vector,athlon-fpload,(athlon-fany*2)")
  (define_insn_reservation "athlon_movss_load_k8" 1
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "ssemov")
  				   (and (eq_attr "mode" "SF,DI")
  					(eq_attr "memory" "load"))))
***************
*** 507,563 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fany")
  (define_insn_reservation "athlon_mmxsseld_k8" 2
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "mmxmov,ssemov")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fstore")
  (define_insn_reservation "athlon_mmxssest" 3
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "mmxmov,ssemov")
  				   (and (eq_attr "mode" "V4SF,V2DF,TI")
  					(eq_attr "memory" "store,both"))))
  			 "athlon-vector,(athlon-fpsched+athlon-agu),((athlon-fstore+athlon-store2)*2)")
  (define_insn_reservation "athlon_mmxssest_k8" 3
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "mmxmov,ssemov")
  				   (and (eq_attr "mode" "V4SF,V2DF,TI")
  					(eq_attr "memory" "store,both"))))
  			 "athlon-double,(athlon-fpsched+athlon-agu),((athlon-fstore+athlon-store2)*2)")
  (define_insn_reservation "athlon_mmxssest_short" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "mmxmov,ssemov")
  				   (eq_attr "memory" "store,both")))
  			 "athlon-direct,(athlon-fpsched+athlon-agu),(athlon-fstore+athlon-store)")
! (define_insn_reservation "athlon_movaps" 2
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "ssemov")
  				   (eq_attr "mode" "V4SF,V2DF,TI")))
! 			 "athlon-double,athlon-fpsched,(athlon-faddmul+athlon-faddmul)")
! (define_insn_reservation "athlon_movaps_k8" 2
  			 (and (eq_attr "cpu" "athlon")
  			      (and (eq_attr "type" "ssemov")
  				   (eq_attr "mode" "V4SF,V2DF,TI")))
  			 "athlon-vector,athlon-fpsched,(athlon-faddmul+athlon-faddmul)")
  (define_insn_reservation "athlon_mmxssemov" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "mmxmov,ssemov"))
  			 "athlon-direct,athlon-fpsched,athlon-faddmul")
  (define_insn_reservation "athlon_mmxmul_load" 4
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "mmxmul")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fmul")
  (define_insn_reservation "athlon_mmxmul" 3
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "mmxmul"))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
  (define_insn_reservation "athlon_mmx_load" 3
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "unit" "mmx")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-faddmul")
  (define_insn_reservation "athlon_mmx" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "unit" "mmx"))
  			 "athlon-direct,athlon-fpsched,athlon-faddmul")
  ;; SSE operations are handled by the i387 unit as well.  The latency
--- 512,568 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fany")
  (define_insn_reservation "athlon_mmxsseld_k8" 2
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "mmxmov,ssemov")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fploadk8,athlon-fstore")
  (define_insn_reservation "athlon_mmxssest" 3
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "mmxmov,ssemov")
  				   (and (eq_attr "mode" "V4SF,V2DF,TI")
  					(eq_attr "memory" "store,both"))))
  			 "athlon-vector,(athlon-fpsched+athlon-agu),((athlon-fstore+athlon-store2)*2)")
  (define_insn_reservation "athlon_mmxssest_k8" 3
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "mmxmov,ssemov")
  				   (and (eq_attr "mode" "V4SF,V2DF,TI")
  					(eq_attr "memory" "store,both"))))
  			 "athlon-double,(athlon-fpsched+athlon-agu),((athlon-fstore+athlon-store2)*2)")
  (define_insn_reservation "athlon_mmxssest_short" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "mmxmov,ssemov")
  				   (eq_attr "memory" "store,both")))
  			 "athlon-direct,(athlon-fpsched+athlon-agu),(athlon-fstore+athlon-store)")
! (define_insn_reservation "athlon_movaps_k8" 2
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "ssemov")
  				   (eq_attr "mode" "V4SF,V2DF,TI")))
! 			 "athlon-double,athlon-fpsched,((athlon-faddmul+athlon-faddmul) | (athlon-faddmul, athlon-faddmul))")
! (define_insn_reservation "athlon_movaps" 2
  			 (and (eq_attr "cpu" "athlon")
  			      (and (eq_attr "type" "ssemov")
  				   (eq_attr "mode" "V4SF,V2DF,TI")))
  			 "athlon-vector,athlon-fpsched,(athlon-faddmul+athlon-faddmul)")
  (define_insn_reservation "athlon_mmxssemov" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "mmxmov,ssemov"))
  			 "athlon-direct,athlon-fpsched,athlon-faddmul")
  (define_insn_reservation "athlon_mmxmul_load" 4
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "mmxmul")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-fmul")
  (define_insn_reservation "athlon_mmxmul" 3
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "mmxmul"))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
  (define_insn_reservation "athlon_mmx_load" 3
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "unit" "mmx")
  				   (eq_attr "memory" "load")))
  			 "athlon-direct,athlon-fpload,athlon-faddmul")
  (define_insn_reservation "athlon_mmx" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "unit" "mmx"))
  			 "athlon-direct,athlon-fpsched,athlon-faddmul")
  ;; SSE operations are handled by the i387 unit as well.  The latency
***************
*** 569,575 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload2,(athlon-fmul*2)")
  (define_insn_reservation "athlon_sselog_load_k8" 5
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "sselog,sselog1")
  				   (eq_attr "memory" "load")))
  			 "athlon-double,athlon-fpload2k8,(athlon-fmul*2)")
--- 574,580 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload2,(athlon-fmul*2)")
  (define_insn_reservation "athlon_sselog_load_k8" 5
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "sselog,sselog1")
  				   (eq_attr "memory" "load")))
  			 "athlon-double,athlon-fpload2k8,(athlon-fmul*2)")
***************
*** 578,584 ****
  			      (eq_attr "type" "sselog,sselog1"))
  			 "athlon-vector,athlon-fpsched,athlon-fmul*2")
  (define_insn_reservation "athlon_sselog_k8" 3
! 			 (and (eq_attr "cpu" "k8")
  			      (eq_attr "type" "sselog,sselog1"))
  			 "athlon-double,athlon-fpsched,athlon-fmul")
  ;; ??? pcmp executes in addmul, probably not worthwhile to bother about that.
--- 583,589 ----
  			      (eq_attr "type" "sselog,sselog1"))
  			 "athlon-vector,athlon-fpsched,athlon-fmul*2")
  (define_insn_reservation "athlon_sselog_k8" 3
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (eq_attr "type" "sselog,sselog1"))
  			 "athlon-double,athlon-fpsched,athlon-fmul")
  ;; ??? pcmp executes in addmul, probably not worthwhile to bother about that.
***************
*** 589,601 ****
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_ssecmp_load_k8" 4
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "ssecmp")
  				   (and (eq_attr "mode" "SF,DF,DI,TI")
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_ssecmp" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "ssecmp")
  				   (eq_attr "mode" "SF,DF,DI,TI")))
  			 "athlon-direct,athlon-fpsched,athlon-fadd")
--- 594,606 ----
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_ssecmp_load_k8" 4
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "ssecmp")
  				   (and (eq_attr "mode" "SF,DF,DI,TI")
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_ssecmp" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "ssecmp")
  				   (eq_attr "mode" "SF,DF,DI,TI")))
  			 "athlon-direct,athlon-fpsched,athlon-fadd")
***************
*** 605,611 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload2,(athlon-fadd*2)")
  (define_insn_reservation "athlon_ssecmpvector_load_k8" 5
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "ssecmp")
  				   (eq_attr "memory" "load")))
  			 "athlon-double,athlon-fpload2k8,(athlon-fadd*2)")
--- 610,616 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload2,(athlon-fadd*2)")
  (define_insn_reservation "athlon_ssecmpvector_load_k8" 5
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "ssecmp")
  				   (eq_attr "memory" "load")))
  			 "athlon-double,athlon-fpload2k8,(athlon-fadd*2)")
***************
*** 614,620 ****
  			      (eq_attr "type" "ssecmp"))
  			 "athlon-vector,athlon-fpsched,(athlon-fadd*2)")
  (define_insn_reservation "athlon_ssecmpvector_k8" 3
! 			 (and (eq_attr "cpu" "k8")
  			      (eq_attr "type" "ssecmp"))
  			 "athlon-double,athlon-fpsched,(athlon-fadd*2)")
  (define_insn_reservation "athlon_ssecomi_load" 4
--- 619,625 ----
  			      (eq_attr "type" "ssecmp"))
  			 "athlon-vector,athlon-fpsched,(athlon-fadd*2)")
  (define_insn_reservation "athlon_ssecmpvector_k8" 3
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (eq_attr "type" "ssecmp"))
  			 "athlon-double,athlon-fpsched,(athlon-fadd*2)")
  (define_insn_reservation "athlon_ssecomi_load" 4
***************
*** 623,634 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_ssecomi_load_k8" 6
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "ssecomi")
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_ssecomi" 4
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (eq_attr "type" "ssecmp"))
  			 "athlon-vector,athlon-fpsched,athlon-fadd")
  (define_insn_reservation "athlon_sseadd_load" 4
--- 628,639 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_ssecomi_load_k8" 6
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "ssecomi")
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_ssecomi" 4
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (eq_attr "type" "ssecmp"))
  			 "athlon-vector,athlon-fpsched,athlon-fadd")
  (define_insn_reservation "athlon_sseadd_load" 4
***************
*** 638,650 ****
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_sseadd_load_k8" 6
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "sseadd")
  				   (and (eq_attr "mode" "SF,DF,DI")
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_sseadd" 4
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "sseadd")
  				   (eq_attr "mode" "SF,DF,DI")))
  			 "athlon-direct,athlon-fpsched,athlon-fadd")
--- 643,655 ----
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fpload,athlon-fadd")
  (define_insn_reservation "athlon_sseadd_load_k8" 6
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "sseadd")
  				   (and (eq_attr "mode" "SF,DF,DI")
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fploadk8,athlon-fadd")
  (define_insn_reservation "athlon_sseadd" 4
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "sseadd")
  				   (eq_attr "mode" "SF,DF,DI")))
  			 "athlon-direct,athlon-fpsched,athlon-fadd")
***************
*** 654,660 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload2,(athlon-fadd*2)")
  (define_insn_reservation "athlon_sseaddvector_load_k8" 7
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "sseadd")
  				   (eq_attr "memory" "load")))
  			 "athlon-double,athlon-fpload2k8,(athlon-fadd*2)")
--- 659,665 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload2,(athlon-fadd*2)")
  (define_insn_reservation "athlon_sseaddvector_load_k8" 7
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "sseadd")
  				   (eq_attr "memory" "load")))
  			 "athlon-double,athlon-fpload2k8,(athlon-fadd*2)")
***************
*** 663,669 ****
  			      (eq_attr "type" "sseadd"))
  			 "athlon-vector,athlon-fpsched,(athlon-fadd*2)")
  (define_insn_reservation "athlon_sseaddvector_k8" 5
! 			 (and (eq_attr "cpu" "k8")
  			      (eq_attr "type" "sseadd"))
  			 "athlon-double,athlon-fpsched,(athlon-fadd*2)")
  
--- 668,674 ----
  			      (eq_attr "type" "sseadd"))
  			 "athlon-vector,athlon-fpsched,(athlon-fadd*2)")
  (define_insn_reservation "athlon_sseaddvector_k8" 5
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (eq_attr "type" "sseadd"))
  			 "athlon-double,athlon-fpsched,(athlon-fadd*2)")
  
***************
*** 673,700 ****
  
  ;; cvtss2sd
  (define_insn_reservation "athlon_ssecvt_cvtss2sd_load_k8" 4
! 			 (and (eq_attr "cpu" "k8,athlon")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "direct")
  					(and (eq_attr "mode" "DF")
  					     (eq_attr "memory" "load")))))
  			 "athlon-direct,athlon-fploadk8,athlon-fstore")
  (define_insn_reservation "athlon_ssecvt_cvtss2sd" 2
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "direct")
  					(eq_attr "mode" "DF"))))
  			 "athlon-direct,athlon-fpsched,athlon-fstore")
  ;; cvtps2pd.  Model same way the other double decoded FP conversions.
  (define_insn_reservation "athlon_ssecvt_cvtps2pd_load_k8" 5
! 			 (and (eq_attr "cpu" "k8,athlon")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(and (eq_attr "mode" "V2DF,V4SF,TI")
  					     (eq_attr "memory" "load")))))
  			 "athlon-double,athlon-fpload2k8,(athlon-fstore*2)")
  (define_insn_reservation "athlon_ssecvt_cvtps2pd_k8" 3
! 			 (and (eq_attr "cpu" "k8,athlon")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(eq_attr "mode" "V2DF,V4SF,TI"))))
--- 678,705 ----
  
  ;; cvtss2sd
  (define_insn_reservation "athlon_ssecvt_cvtss2sd_load_k8" 4
! 			 (and (eq_attr "cpu" "k8,athlon,generic64")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "direct")
  					(and (eq_attr "mode" "DF")
  					     (eq_attr "memory" "load")))))
  			 "athlon-direct,athlon-fploadk8,athlon-fstore")
  (define_insn_reservation "athlon_ssecvt_cvtss2sd" 2
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "direct")
  					(eq_attr "mode" "DF"))))
  			 "athlon-direct,athlon-fpsched,athlon-fstore")
  ;; cvtps2pd.  Model same way the other double decoded FP conversions.
  (define_insn_reservation "athlon_ssecvt_cvtps2pd_load_k8" 5
! 			 (and (eq_attr "cpu" "k8,athlon,generic64")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(and (eq_attr "mode" "V2DF,V4SF,TI")
  					     (eq_attr "memory" "load")))))
  			 "athlon-double,athlon-fpload2k8,(athlon-fstore*2)")
  (define_insn_reservation "athlon_ssecvt_cvtps2pd_k8" 3
! 			 (and (eq_attr "cpu" "k8,athlon,generic64")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(eq_attr "mode" "V2DF,V4SF,TI"))))
***************
*** 717,723 ****
  					     (eq_attr "memory" "load")))))
  			 "athlon-vector,athlon-fpload,(athlon-fstore*2)")
  (define_insn_reservation "athlon_sseicvt_cvtsi2ss_load_k8" 9
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "sseicvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(and (eq_attr "mode" "SF,DF")
--- 722,728 ----
  					     (eq_attr "memory" "load")))))
  			 "athlon-vector,athlon-fpload,(athlon-fstore*2)")
  (define_insn_reservation "athlon_sseicvt_cvtsi2ss_load_k8" 9
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "sseicvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(and (eq_attr "mode" "SF,DF")
***************
*** 725,731 ****
  			 "athlon-double,athlon-fploadk8,(athlon-fstore*2)")
  ;; cvtsi2sd reg,reg is double decoded (vector on Athlon)
  (define_insn_reservation "athlon_sseicvt_cvtsi2sd_k8" 11
! 			 (and (eq_attr "cpu" "k8,athlon")
  			      (and (eq_attr "type" "sseicvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(and (eq_attr "mode" "SF,DF")
--- 730,736 ----
  			 "athlon-double,athlon-fploadk8,(athlon-fstore*2)")
  ;; cvtsi2sd reg,reg is double decoded (vector on Athlon)
  (define_insn_reservation "athlon_sseicvt_cvtsi2sd_k8" 11
! 			 (and (eq_attr "cpu" "k8,athlon,generic64")
  			      (and (eq_attr "type" "sseicvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(and (eq_attr "mode" "SF,DF")
***************
*** 733,739 ****
  			 "athlon-double,athlon-fploadk8,athlon-fstore")
  ;; cvtsi2ss reg, reg is doublepath
  (define_insn_reservation "athlon_sseicvt_cvtsi2ss" 14
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "sseicvt")
  				   (and (eq_attr "athlon_decode" "vector")
  					(and (eq_attr "mode" "SF,DF")
--- 738,744 ----
  			 "athlon-double,athlon-fploadk8,athlon-fstore")
  ;; cvtsi2ss reg, reg is doublepath
  (define_insn_reservation "athlon_sseicvt_cvtsi2ss" 14
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "sseicvt")
  				   (and (eq_attr "athlon_decode" "vector")
  					(and (eq_attr "mode" "SF,DF")
***************
*** 741,747 ****
  			 "athlon-vector,athlon-fploadk8,(athlon-fvector*2)")
  ;; cvtsd2ss mem,reg is doublepath, troughput unknown, latency 9
  (define_insn_reservation "athlon_ssecvt_cvtsd2ss_load_k8" 9
! 			 (and (eq_attr "cpu" "k8,athlon")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(and (eq_attr "mode" "SF")
--- 746,752 ----
  			 "athlon-vector,athlon-fploadk8,(athlon-fvector*2)")
  ;; cvtsd2ss mem,reg is doublepath, troughput unknown, latency 9
  (define_insn_reservation "athlon_ssecvt_cvtsd2ss_load_k8" 9
! 			 (and (eq_attr "cpu" "k8,athlon,generic64")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(and (eq_attr "mode" "SF")
***************
*** 749,762 ****
  			 "athlon-double,athlon-fploadk8,(athlon-fstore*3)")
  ;; cvtsd2ss reg,reg is vectorpath, troughput unknown, latency 12
  (define_insn_reservation "athlon_ssecvt_cvtsd2ss" 12
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "vector")
  					(and (eq_attr "mode" "SF")
  					     (eq_attr "memory" "none")))))
  			 "athlon-vector,athlon-fpsched,(athlon-fvector*3)")
  (define_insn_reservation "athlon_ssecvt_cvtpd2ps_load_k8" 8
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "vector")
  					(and (eq_attr "mode" "V4SF,V2DF,TI")
--- 754,767 ----
  			 "athlon-double,athlon-fploadk8,(athlon-fstore*3)")
  ;; cvtsd2ss reg,reg is vectorpath, troughput unknown, latency 12
  (define_insn_reservation "athlon_ssecvt_cvtsd2ss" 12
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "vector")
  					(and (eq_attr "mode" "SF")
  					     (eq_attr "memory" "none")))))
  			 "athlon-vector,athlon-fpsched,(athlon-fvector*3)")
  (define_insn_reservation "athlon_ssecvt_cvtpd2ps_load_k8" 8
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "vector")
  					(and (eq_attr "mode" "V4SF,V2DF,TI")
***************
*** 765,771 ****
  ;; cvtpd2ps mem,reg is vectorpath, troughput unknown, latency 10
  ;; ??? Why it is fater than cvtsd2ss?
  (define_insn_reservation "athlon_ssecvt_cvtpd2ps" 8
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "vector")
  					(and (eq_attr "mode" "V4SF,V2DF,TI")
--- 770,776 ----
  ;; cvtpd2ps mem,reg is vectorpath, troughput unknown, latency 10
  ;; ??? Why it is fater than cvtsd2ss?
  (define_insn_reservation "athlon_ssecvt_cvtpd2ps" 8
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "ssecvt")
  				   (and (eq_attr "athlon_decode" "vector")
  					(and (eq_attr "mode" "V4SF,V2DF,TI")
***************
*** 773,779 ****
  			 "athlon-vector,athlon-fpsched,athlon-fvector*2")
  ;; cvtsd2si mem,reg is doublepath, troughput 1, latency 9
  (define_insn_reservation "athlon_secvt_cvtsX2si_load" 9
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "sseicvt")
  				   (and (eq_attr "athlon_decode" "vector")
  					(and (eq_attr "mode" "SI,DI")
--- 778,784 ----
  			 "athlon-vector,athlon-fpsched,athlon-fvector*2")
  ;; cvtsd2si mem,reg is doublepath, troughput 1, latency 9
  (define_insn_reservation "athlon_secvt_cvtsX2si_load" 9
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "sseicvt")
  				   (and (eq_attr "athlon_decode" "vector")
  					(and (eq_attr "mode" "SI,DI")
***************
*** 788,794 ****
  					     (eq_attr "memory" "none")))))
  			 "athlon-vector,athlon-fpsched,athlon-fvector")
  (define_insn_reservation "athlon_ssecvt_cvtsX2si_k8" 9
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "sseicvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(and (eq_attr "mode" "SI,DI")
--- 793,799 ----
  					     (eq_attr "memory" "none")))))
  			 "athlon-vector,athlon-fpsched,athlon-fvector")
  (define_insn_reservation "athlon_ssecvt_cvtsX2si_k8" 9
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "sseicvt")
  				   (and (eq_attr "athlon_decode" "double")
  					(and (eq_attr "mode" "SI,DI")
***************
*** 803,815 ****
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fpload,athlon-fmul")
  (define_insn_reservation "athlon_ssemul_load_k8" 6
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "ssemul")
  				   (and (eq_attr "mode" "SF,DF")
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fploadk8,athlon-fmul")
  (define_insn_reservation "athlon_ssemul" 4
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "ssemul")
  				   (eq_attr "mode" "SF,DF")))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
--- 808,820 ----
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fpload,athlon-fmul")
  (define_insn_reservation "athlon_ssemul_load_k8" 6
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "ssemul")
  				   (and (eq_attr "mode" "SF,DF")
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fploadk8,athlon-fmul")
  (define_insn_reservation "athlon_ssemul" 4
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "ssemul")
  				   (eq_attr "mode" "SF,DF")))
  			 "athlon-direct,athlon-fpsched,athlon-fmul")
***************
*** 819,825 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload2,(athlon-fmul*2)")
  (define_insn_reservation "athlon_ssemulvector_load_k8" 7
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "ssemul")
  				   (eq_attr "memory" "load")))
  			 "athlon-double,athlon-fpload2k8,(athlon-fmul*2)")
--- 824,830 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload2,(athlon-fmul*2)")
  (define_insn_reservation "athlon_ssemulvector_load_k8" 7
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "ssemul")
  				   (eq_attr "memory" "load")))
  			 "athlon-double,athlon-fpload2k8,(athlon-fmul*2)")
***************
*** 828,834 ****
  			      (eq_attr "type" "ssemul"))
  			 "athlon-vector,athlon-fpsched,(athlon-fmul*2)")
  (define_insn_reservation "athlon_ssemulvector_k8" 5
! 			 (and (eq_attr "cpu" "k8")
  			      (eq_attr "type" "ssemul"))
  			 "athlon-double,athlon-fpsched,(athlon-fmul*2)")
  ;; divsd timings.  divss is faster
--- 833,839 ----
  			      (eq_attr "type" "ssemul"))
  			 "athlon-vector,athlon-fpsched,(athlon-fmul*2)")
  (define_insn_reservation "athlon_ssemulvector_k8" 5
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (eq_attr "type" "ssemul"))
  			 "athlon-double,athlon-fpsched,(athlon-fmul*2)")
  ;; divsd timings.  divss is faster
***************
*** 839,851 ****
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fpload,athlon-fmul*17")
  (define_insn_reservation "athlon_ssediv_load_k8" 22
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "ssediv")
  				   (and (eq_attr "mode" "SF,DF")
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fploadk8,athlon-fmul*17")
  (define_insn_reservation "athlon_ssediv" 20
! 			 (and (eq_attr "cpu" "athlon,k8")
  			      (and (eq_attr "type" "ssediv")
  				   (eq_attr "mode" "SF,DF")))
  			 "athlon-direct,athlon-fpsched,athlon-fmul*17")
--- 844,856 ----
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fpload,athlon-fmul*17")
  (define_insn_reservation "athlon_ssediv_load_k8" 22
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "ssediv")
  				   (and (eq_attr "mode" "SF,DF")
  					(eq_attr "memory" "load"))))
  			 "athlon-direct,athlon-fploadk8,athlon-fmul*17")
  (define_insn_reservation "athlon_ssediv" 20
! 			 (and (eq_attr "cpu" "athlon,k8,generic64")
  			      (and (eq_attr "type" "ssediv")
  				   (eq_attr "mode" "SF,DF")))
  			 "athlon-direct,athlon-fpsched,athlon-fmul*17")
***************
*** 855,861 ****
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload2,athlon-fmul*34")
  (define_insn_reservation "athlon_ssedivvector_load_k8" 35
! 			 (and (eq_attr "cpu" "k8")
  			      (and (eq_attr "type" "ssediv")
  				   (eq_attr "memory" "load")))
  			 "athlon-double,athlon-fpload2k8,athlon-fmul*34")
--- 860,866 ----
  				   (eq_attr "memory" "load")))
  			 "athlon-vector,athlon-fpload2,athlon-fmul*34")
  (define_insn_reservation "athlon_ssedivvector_load_k8" 35
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (and (eq_attr "type" "ssediv")
  				   (eq_attr "memory" "load")))
  			 "athlon-double,athlon-fpload2k8,athlon-fmul*34")
***************
*** 864,869 ****
  			      (eq_attr "type" "ssediv"))
  			 "athlon-vector,athlon-fmul*34")
  (define_insn_reservation "athlon_ssedivvector_k8" 39
! 			 (and (eq_attr "cpu" "k8")
  			      (eq_attr "type" "ssediv"))
  			 "athlon-double,athlon-fmul*34")
--- 869,874 ----
  			      (eq_attr "type" "ssediv"))
  			 "athlon-vector,athlon-fmul*34")
  (define_insn_reservation "athlon_ssedivvector_k8" 39
! 			 (and (eq_attr "cpu" "k8,generic64")
  			      (eq_attr "type" "ssediv"))
  			 "athlon-double,athlon-fmul*34")
Index: config/i386/ppro.md
===================================================================
*** config/i386/ppro.md	(revision 109820)
--- config/i386/ppro.md	(working copy)
***************
*** 137,161 ****
  ;; on decoder 0, and say that it takes a little while before the result
  ;; is available.
  (define_insn_reservation "ppro_complex_insn" 6
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (eq_attr "type" "other,multi,call,callv,str"))
  			 "decoder0")
  
  ;; imov with memory operands does not use the integer units.
  (define_insn_reservation "ppro_imov" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "imov")))
  			 "decodern,(p0|p1)")
  
  (define_insn_reservation "ppro_imov_load" 4
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (eq_attr "type" "imov")))
  			 "decodern,p2")
  
  (define_insn_reservation "ppro_imov_store" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "store")
  				   (eq_attr "type" "imov")))
  			 "decoder0,p4+p3")
--- 137,161 ----
  ;; on decoder 0, and say that it takes a little while before the result
  ;; is available.
  (define_insn_reservation "ppro_complex_insn" 6
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (eq_attr "type" "other,multi,call,callv,str"))
  			 "decoder0")
  
  ;; imov with memory operands does not use the integer units.
  (define_insn_reservation "ppro_imov" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "imov")))
  			 "decodern,(p0|p1)")
  
  (define_insn_reservation "ppro_imov_load" 4
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "load")
  				   (eq_attr "type" "imov")))
  			 "decodern,p2")
  
  (define_insn_reservation "ppro_imov_store" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "store")
  				   (eq_attr "type" "imov")))
  			 "decoder0,p4+p3")
***************
*** 163,182 ****
  ;; imovx always decodes to one uop, and also doesn't use the integer
  ;; units if it has memory operands.
  (define_insn_reservation "ppro_imovx" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "imovx")))
  			 "decodern,(p0|p1)")
  
  (define_insn_reservation "ppro_imovx_load" 4
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (eq_attr "type" "imovx")))
  			 "decodern,p2")
  
  ;; lea executes on port 0 with latency one and throughput 1.
  (define_insn_reservation "ppro_lea" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "lea")))
  			 "decodern,p0")
--- 163,182 ----
  ;; imovx always decodes to one uop, and also doesn't use the integer
  ;; units if it has memory operands.
  (define_insn_reservation "ppro_imovx" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "imovx")))
  			 "decodern,(p0|p1)")
  
  (define_insn_reservation "ppro_imovx_load" 4
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "load")
  				   (eq_attr "type" "imovx")))
  			 "decodern,p2")
  
  ;; lea executes on port 0 with latency one and throughput 1.
  (define_insn_reservation "ppro_lea" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "lea")))
  			 "decodern,p0")
***************
*** 185,203 ****
  ;; The load and store units need to be reserved when memory operands
  ;; are involved.
  (define_insn_reservation "ppro_shift_rotate" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "ishift,ishift1,rotate,rotate1")))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_shift_rotate_mem" 4
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "!none")
  				   (eq_attr "type" "ishift,ishift1,rotate,rotate1")))
  			 "decoder0,p2+p0,p4+p3")
  
  (define_insn_reservation "ppro_cld" 2
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (eq_attr "type" "cld"))
  			 "decoder0,(p0+p1)*2")
  
--- 185,203 ----
  ;; The load and store units need to be reserved when memory operands
  ;; are involved.
  (define_insn_reservation "ppro_shift_rotate" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "ishift,ishift1,rotate,rotate1")))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_shift_rotate_mem" 4
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "!none")
  				   (eq_attr "type" "ishift,ishift1,rotate,rotate1")))
  			 "decoder0,p2+p0,p4+p3")
  
  (define_insn_reservation "ppro_cld" 2
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (eq_attr "type" "cld"))
  			 "decoder0,(p0+p1)*2")
  
***************
*** 219,250 ****
  ;; results because we can assume these instructions can decode on all
  ;; decoders.
  (define_insn_reservation "ppro_branch" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "ibr")))
  			 "decodern,p1")
  
  ;; ??? Indirect branches probably have worse latency than this.
  (define_insn_reservation "ppro_indirect_branch" 6
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "!none")
  				   (eq_attr "type" "ibr")))
  			 "decoder0,p2+p1")
  
  (define_insn_reservation "ppro_leave" 4
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (eq_attr "type" "leave"))
  			 "decoder0,p2+(p0|p1),(p0|p1)")
  
  ;; imul has throughput one, but latency 4, and can only execute on port 0.
  (define_insn_reservation "ppro_imul" 4
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "imul")))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_imul_mem" 4
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "!none")
  				   (eq_attr "type" "imul")))
  			 "decoder0,p2+p0")
--- 219,250 ----
  ;; results because we can assume these instructions can decode on all
  ;; decoders.
  (define_insn_reservation "ppro_branch" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "ibr")))
  			 "decodern,p1")
  
  ;; ??? Indirect branches probably have worse latency than this.
  (define_insn_reservation "ppro_indirect_branch" 6
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "!none")
  				   (eq_attr "type" "ibr")))
  			 "decoder0,p2+p1")
  
  (define_insn_reservation "ppro_leave" 4
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (eq_attr "type" "leave"))
  			 "decoder0,p2+(p0|p1),(p0|p1)")
  
  ;; imul has throughput one, but latency 4, and can only execute on port 0.
  (define_insn_reservation "ppro_imul" 4
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "imul")))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_imul_mem" 4
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "!none")
  				   (eq_attr "type" "imul")))
  			 "decoder0,p2+p0")
***************
*** 253,294 ****
  ;; QI, HI, and SI have issue latency 12, 21, and 37, respectively.
  ;; These issue latencies are modelled via the ppro_div automaton.
  (define_insn_reservation "ppro_idiv_QI" 19
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (and (eq_attr "mode" "QI")
  					(eq_attr "type" "idiv"))))
  			 "decoder0,(p0+idiv)*2,(p0|p1)+idiv,idiv*9")
  
  (define_insn_reservation "ppro_idiv_QI_load" 19
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "QI")
  					(eq_attr "type" "idiv"))))
  			 "decoder0,p2+p0+idiv,p0+idiv,(p0|p1)+idiv,idiv*9")
  
  (define_insn_reservation "ppro_idiv_HI" 23
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (and (eq_attr "mode" "HI")
  					(eq_attr "type" "idiv"))))
  			 "decoder0,(p0+idiv)*3,(p0|p1)+idiv,idiv*17")
  
  (define_insn_reservation "ppro_idiv_HI_load" 23
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "HI")
  					(eq_attr "type" "idiv"))))
  			 "decoder0,p2+p0+idiv,p0+idiv,(p0|p1)+idiv,idiv*18")
  
  (define_insn_reservation "ppro_idiv_SI" 39
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (and (eq_attr "mode" "SI")
  					(eq_attr "type" "idiv"))))
  			 "decoder0,(p0+idiv)*3,(p0|p1)+idiv,idiv*33")
  
  (define_insn_reservation "ppro_idiv_SI_load" 39
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "SI")
  					(eq_attr "type" "idiv"))))
--- 253,294 ----
  ;; QI, HI, and SI have issue latency 12, 21, and 37, respectively.
  ;; These issue latencies are modelled via the ppro_div automaton.
  (define_insn_reservation "ppro_idiv_QI" 19
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (and (eq_attr "mode" "QI")
  					(eq_attr "type" "idiv"))))
  			 "decoder0,(p0+idiv)*2,(p0|p1)+idiv,idiv*9")
  
  (define_insn_reservation "ppro_idiv_QI_load" 19
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "QI")
  					(eq_attr "type" "idiv"))))
  			 "decoder0,p2+p0+idiv,p0+idiv,(p0|p1)+idiv,idiv*9")
  
  (define_insn_reservation "ppro_idiv_HI" 23
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (and (eq_attr "mode" "HI")
  					(eq_attr "type" "idiv"))))
  			 "decoder0,(p0+idiv)*3,(p0|p1)+idiv,idiv*17")
  
  (define_insn_reservation "ppro_idiv_HI_load" 23
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "HI")
  					(eq_attr "type" "idiv"))))
  			 "decoder0,p2+p0+idiv,p0+idiv,(p0|p1)+idiv,idiv*18")
  
  (define_insn_reservation "ppro_idiv_SI" 39
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (and (eq_attr "mode" "SI")
  					(eq_attr "type" "idiv"))))
  			 "decoder0,(p0+idiv)*3,(p0|p1)+idiv,idiv*33")
  
  (define_insn_reservation "ppro_idiv_SI_load" 39
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "SI")
  					(eq_attr "type" "idiv"))))
***************
*** 299,383 ****
  ;;     has throughput "1/cycle (align with FADD)".  What do they
  ;;     mean and how can we model that?
  (define_insn_reservation "ppro_fop" 3
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none,unknown")
  				   (eq_attr "type" "fop")))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_fop_load" 5
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (eq_attr "type" "fop")))
  			 "decoder0,p2+p0,p0")
  
  (define_insn_reservation "ppro_fop_store" 3
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "store")
  				   (eq_attr "type" "fop")))
  			 "decoder0,p0,p0,p0+p4+p3")
  
  (define_insn_reservation "ppro_fop_both" 5
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "both")
  				   (eq_attr "type" "fop")))
  			 "decoder0,p2+p0,p0+p4+p3")
  
  (define_insn_reservation "ppro_fsgn" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (eq_attr "type" "fsgn"))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_fistp" 5
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (eq_attr "type" "fistp"))
  			 "decoder0,p0*2,p4+p3")
  
  (define_insn_reservation "ppro_fcmov" 2
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (eq_attr "type" "fcmov"))
  			 "decoder0,p0*2")
  
  (define_insn_reservation "ppro_fcmp" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "fcmp")))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_fcmp_load" 4
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (eq_attr "type" "fcmp")))
  			 "decoder0,p2+p0")
  
  (define_insn_reservation "ppro_fmov" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "fmov")))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_fmov_load" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "!XF")
  					(eq_attr "type" "fmov"))))
  			 "decodern,p2")
  
  (define_insn_reservation "ppro_fmov_XF_load" 3
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "XF")
  					(eq_attr "type" "fmov"))))
  			 "decoder0,(p2+p0)*2")
  
  (define_insn_reservation "ppro_fmov_store" 1
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "store")
  				   (and (eq_attr "mode" "!XF")
  					(eq_attr "type" "fmov"))))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_fmov_XF_store" 3
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "store")
  				   (and (eq_attr "mode" "XF")
  					(eq_attr "type" "fmov"))))
--- 299,383 ----
  ;;     has throughput "1/cycle (align with FADD)".  What do they
  ;;     mean and how can we model that?
  (define_insn_reservation "ppro_fop" 3
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none,unknown")
  				   (eq_attr "type" "fop")))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_fop_load" 5
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "load")
  				   (eq_attr "type" "fop")))
  			 "decoder0,p2+p0,p0")
  
  (define_insn_reservation "ppro_fop_store" 3
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "store")
  				   (eq_attr "type" "fop")))
  			 "decoder0,p0,p0,p0+p4+p3")
  
  (define_insn_reservation "ppro_fop_both" 5
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "both")
  				   (eq_attr "type" "fop")))
  			 "decoder0,p2+p0,p0+p4+p3")
  
  (define_insn_reservation "ppro_fsgn" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (eq_attr "type" "fsgn"))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_fistp" 5
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (eq_attr "type" "fistp"))
  			 "decoder0,p0*2,p4+p3")
  
  (define_insn_reservation "ppro_fcmov" 2
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (eq_attr "type" "fcmov"))
  			 "decoder0,p0*2")
  
  (define_insn_reservation "ppro_fcmp" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "fcmp")))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_fcmp_load" 4
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "load")
  				   (eq_attr "type" "fcmp")))
  			 "decoder0,p2+p0")
  
  (define_insn_reservation "ppro_fmov" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "fmov")))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_fmov_load" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "!XF")
  					(eq_attr "type" "fmov"))))
  			 "decodern,p2")
  
  (define_insn_reservation "ppro_fmov_XF_load" 3
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "XF")
  					(eq_attr "type" "fmov"))))
  			 "decoder0,(p2+p0)*2")
  
  (define_insn_reservation "ppro_fmov_store" 1
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "store")
  				   (and (eq_attr "mode" "!XF")
  					(eq_attr "type" "fmov"))))
  			 "decodern,p0")
  
  (define_insn_reservation "ppro_fmov_XF_store" 3
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "store")
  				   (and (eq_attr "mode" "XF")
  					(eq_attr "type" "fmov"))))
***************
*** 386,398 ****
  ;; fmul executes on port 0 with latency 5.  It has issue latency 2,
  ;; but we don't model this.
  (define_insn_reservation "ppro_fmul" 5
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "fmul")))
  			 "decoder0,p0*2")
  
  (define_insn_reservation "ppro_fmul_load" 6
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (eq_attr "type" "fmul")))
  			 "decoder0,p2+p0,p0")
--- 386,398 ----
  ;; fmul executes on port 0 with latency 5.  It has issue latency 2,
  ;; but we don't model this.
  (define_insn_reservation "ppro_fmul" 5
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "none")
  				   (eq_attr "type" "fmul")))
  			 "decoder0,p0*2")
  
  (define_insn_reservation "ppro_fmul_load" 6
! 			 (and (eq_attr "cpu" "pentiumpro,generic32")
  			      (and (eq_attr "memory" "load")
  				   (eq_attr "type" "fmul")))
  			 "decoder0,p2+p0,p0")
***************
*** 403,444 ****
  ;; that.  Throughput is equal to latency - 1, which we model using the
  ;; ppro_div automaton.
  (define_insn_reservation "ppro_fdiv_SF" 18
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (and (eq_attr "mode" "SF")
  					(eq_attr "type" "fdiv,fpspc"))))
  			 "decodern,p0+fdiv,fdiv*16")
  
  (define_insn_reservation "ppro_fdiv_SF_load" 19
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "SF")
  					(eq_attr "type" "fdiv,fpspc"))))
  			 "decoder0,p2+p0+fdiv,fdiv*16")
  
  (define_insn_reservation "ppro_fdiv_DF" 32
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "none")
  				   (and (eq_attr "mode" "DF")
  					(eq_attr "type" "fdiv,fpspc"))))
  			 "decodern,p0+fdiv,fdiv*30")
  
  (define_insn_reservation "ppro_fdiv_DF_load" 33
! 			 (and (eq_attr "cpu" "pentiumpro")
  			      (and (eq_attr "memory" "load")
  				   (and (eq_attr "mode" "DF")
  					(eq_attr "type" "fdiv,fpspc"))))
  			 "decoder0,p2+p0+fdiv,fdiv*30")
  
  (define_insn_reserva