This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



head: MIPS: Complete the R4000 multiply/shift errata workaround


Hello,

 As you may know there are errata in the original R4000 processors, both
the initial public revision 2.2 and in the subsequent revision 3.0, that
under certain conditions lead to incorrect execution of certain shift
instructions while an integer multiplication is being processed.  There
are initial hooks to handle this problem currently in gcc, but the
workaround was never finished.
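
 For illustration, the symptom described in erratum 16 (quoted in the
patch comment below) can be modelled in a few lines of Python.  This is
only a sketch of the documented effect on dsra32, not a simulation of the
processor; the function names are made up for this example.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value):
    """Interpret a 64-bit pattern as a signed two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value & (1 << 63) else value


def dsra32(rt, sa):
    """Architecturally correct dsra32: arithmetic right shift by sa + 32."""
    return (to_signed64(rt) >> (sa + 32)) & MASK64


def dsra32_during_mult(rt, sa):
    """Faulty result per erratum 16: only sa bits are shifted, not sa + 32."""
    return (to_signed64(rt) >> sa) & MASK64


if __name__ == "__main__":
    value = 0x0000123400000000
    print(hex(dsra32(value, 0)))              # shift by 0 + 32 bits
    print(hex(dsra32_during_mult(value, 0)))  # shift by 0 bits only
```

 With rt = 0x0000123400000000 and a shift amount of 0 the correct result
is 0x1234, whereas the faulty variant leaves the value unshifted.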

 I've been able to trigger the errata when I ported the 64-bit Linux
kernel to a system using an R4000SC rev.3.0 processor -- they caused the
vm subsystem to malfunction due to "impossible" conditions.  After
studying the problem in detail I'm now able to reliably reproduce failing
instruction sequences.

 Here is a patch that I've prepared to take care of the unhandled cases.  
As with the bits already present, these cases are handled implicitly
when the R4000 processor is selected for code generation (-march=r4000).  
With a predecessor of this patch that was prepared for gcc 2.95, I've been
running 64-bit Linux kernels successfully and reliably for over a year.
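
 The selection the patch makes can be summarised with a small Python
sketch.  It mirrors the C condition in the muldi3 expander of the patch
below; the other expanders use the same "R4000 and not mips16" test.  The
Python function name and its boolean parameters are hypothetical stand-ins
for the GENERATE_MULT3_DI, TARGET_MIPS4000 and TARGET_MIPS16 macros.

```python
def select_muldi3_pattern(generate_mult3_di, target_mips4000, target_mips16):
    """Return the name of the insn pattern the muldi3 expander would emit."""
    if generate_mult3_di:
        # A three-operand multiply is available; no separate mflo is emitted.
        return "muldi3_mult3"
    if not target_mips4000 or target_mips16:
        # No workaround: plain dmult, result left in LO.
        return "muldi3_internal"
    # R4000 without mips16: dmult immediately followed by mflo.
    return "muldi3_r4000"


if __name__ == "__main__":
    for flags in [(False, True, False), (False, True, True),
                  (False, False, False), (True, True, False)]:
        print(flags, select_muldi3_pattern(*flags))
```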

2004-02-19  Maciej W. Rozycki  <macro@ds2.pg.gda.pl>

	* config/mips/mips.md: Complete the unfinished R4000 
	multiply/shift errata workaround.  Improve documentation.
	(hazard): Take the mips16 setting into account for the "imul" 
	type.
	(muldi3): Take the mips16 setting into account.
	(muldi3_mult3, muldi3_r4000): New patterns, replacing 
	muldi3_internal2.
	(muldi3_internal2): Removed.
	(mulsidi3_32bit_internal, mulsidi3_32bit_r4000): New patterns, 
	replacing mulsidi3_32bit.
	(mulsidi3_32bit): Removed.
	(mulsidi3, mulsidi3_64bit, mulsidi3_64bit_parts): Take the errata 
	into account.
	(mulsidi3_64bit_r4000, mulsidi3_64bit_parts_r4000): New patterns.
	(umulsidi3_32bit_internal, umulsidi3_32bit_r4000): New patterns,
	replacing umulsidi3_32bit.
	(umulsidi3): Take the errata into account.
	(umulsi3_highpart, umulsi3_highpart_internal): Take the errata 
	into account.
	(umulsi3_highpart_r4000): New pattern.
	(smulsi3_highpart, smulsi3_highpart_internal): Take the errata
	into account.
	(smulsi3_highpart_r4000): New pattern.
	(smuldi3_highpart): Take the errata into account.
	(smuldi3_highpart_internal, smuldi3_highpart_r4000): New patterns.
	(umuldi3_highpart): Take the errata into account.
	(umuldi3_highpart_internal, umuldi3_highpart_r4000): New patterns.

 The patch was tested with a gcc 3.4 snapshot from Nov 7th, 2003.  A build
of a bare-bones C cross-compiler (I have no suitable C library yet) for
the i386-linux host and the mips64el-linux target completed successfully.  
A 64-bit Linux kernel binary built with this compiler works correctly.  
Manual inspection of that binary revealed that all multiplications emitted
by gcc were correctly followed by an "mflo" instruction (a few inline
assembly fragments use an "mfhi" instead), thereby ensuring that a possible
subsequent shift operation wouldn't start before the respective
multiplications have completed.  The patch applies cleanly to the current
trunk and I have no reason to believe it needs updates due to changes
made in the meantime.
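
 A check of this kind can be automated.  The following Python sketch is
not the procedure used above; it assumes the instruction mnemonics have
already been extracted from a disassembly into a list, in program order,
and reports every two-operand multiply that is not immediately followed by
a move from HI or LO.  The function name is hypothetical.

```python
MULTIPLIES = {"mult", "multu", "dmult", "dmultu"}
HILO_READS = {"mflo", "mfhi"}


def unguarded_multiplies(mnemonics):
    """Return indices of multiplies not directly followed by mflo/mfhi."""
    unguarded = []
    for index, mnemonic in enumerate(mnemonics):
        if mnemonic not in MULTIPLIES:
            continue
        following = mnemonics[index + 1] if index + 1 < len(mnemonics) else None
        if following not in HILO_READS:
            unguarded.append(index)
    return unguarded


if __name__ == "__main__":
    print(unguarded_multiplies(["dmult", "mflo", "dsll32"]))
    print(unguarded_multiplies(["mult", "addu", "mflo", "multu"]))
```

 An empty list means every multiply in the sequence is guarded; a multiply
at the very end of the sequence is reported because nothing follows it.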

 There is room for improvement here, but I give priority to correctness
over performance and flexibility.  Please apply.

 Possible improvements:

1. Differentiate between R4000 and R4400 as the latter doesn't suffer from 
the problem.

2. Add an option to toggle the workaround regardless of the processor 
selected.

3. Add a hazard lasting for 11 instructions between a multiplication and 
the affected shifts instead of forcing a move from lo.

4. Maybe others.

I'll look into them as my time permits.
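
 To make item 3 more concrete, here is a Python sketch of the condition
such a hazard would have to rule out: an affected shift among the 11
instructions following a multiply, with no intervening move from HI or LO.
The list of affected shifts follows erratum 28 (extended and variable
shifts).  The scan is straight-line only and ignores branches, and the
names are hypothetical.

```python
MULTIPLIES = {"mult", "multu", "dmult", "dmultu"}
HILO_READS = {"mflo", "mfhi"}
AFFECTED_SHIFTS = {
    "dsll32", "dsrl32", "dsra32",   # extended shifts (shift by n + 32)
    "sllv", "srlv", "srav",         # 32-bit variable shifts
    "dsllv", "dsrlv", "dsrav",      # 64-bit variable shifts
}


def shifts_in_hazard_window(mnemonics, window=11):
    """Return (multiply_index, shift_index) pairs that violate the hazard."""
    hits = []
    for i, mnemonic in enumerate(mnemonics):
        if mnemonic not in MULTIPLIES:
            continue
        for j in range(i + 1, min(i + 1 + window, len(mnemonics))):
            if mnemonics[j] in HILO_READS:
                # Reading HI/LO waits for the multiply to complete.
                break
            if mnemonics[j] in AFFECTED_SHIFTS:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    print(shifts_in_hazard_window(["mult"] + ["nop"] * 10 + ["dsra32"]))
    print(shifts_in_hazard_window(["mult", "mflo", "dsra32"]))
```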

  Maciej

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

gcc-3.4-20031107-mips-r4000-mult.patch
diff -up --recursive --new-file gcc-3.4-20031107.macro/gcc/config/mips/mips.md gcc-3.4-20031107/gcc/config/mips/mips.md
--- gcc-3.4-20031107.macro/gcc/config/mips/mips.md	2003-11-07 08:14:32.000000000 +0000
+++ gcc-3.4-20031107/gcc/config/mips/mips.md	2004-02-19 12:19:18.000000000 +0000
@@ -232,7 +232,8 @@
 
 	 ;; The r4000 multiplication patterns include an mflo instruction.
 	 (and (eq_attr "type" "imul")
-	      (ne (symbol_ref "TARGET_MIPS4000") (const_int 0)))
+	      (and (ne (symbol_ref "TARGET_MIPS4000") (const_int 0))
+		   (eq (symbol_ref "TARGET_MIPS16") (const_int 0))))
 	 (const_string "hilo")
 
 	 (and (eq_attr "type" "hilo")
@@ -1419,9 +1420,48 @@
    (set_attr "length"	"8")])
 
 
-;; ??? The R4000 (only) has a cpu bug.  If a double-word shift executes while
-;; a multiply is in progress, it may give an incorrect result.  Avoid
-;; this by keeping the mflo with the mult on the R4000.
+;; The original R4000 has a cpu bug.  If a double-word or a variable
+;; shift executes while a multiply is in progress, it may give an
+;; incorrect result.  Avoid this by keeping the mflo with the mult on
+;; the R4000.
+;;
+;; From "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0"
+;; (also valid for MIPS R4000MC processors):
+;;
+;; "16. R4000PC, R4000SC: Please refer to errata 28 for an update to
+;;	this errata description.
+;;	The following code sequence causes the R4000 to incorrectly
+;;	execute the Double Shift Right Arithmetic 32 (dsra32)
+;;	instruction.  If the dsra32 instruction is executed during an
+;;	integer multiply, the dsra32 will only shift by the amount in
+;;	specified in the instruction rather than the amount plus 32
+;;	bits.
+;;	instruction 1:		mult	rs,rt		integer multiply
+;;	instruction 2-12:	dsra32	rd,rt,rs	doubleword shift
+;;							right arithmetic + 32
+;;	Workaround: A dsra32 instruction placed after an integer
+;;	multiply should not be one of the 11 instructions after the
+;;	multiply instruction."
+;;
+;; and:
+;;
+;; "28. R4000PC, R4000SC: The text from errata 16 should be replaced by
+;;	the following description.
+;;	All extended shifts (shift by n+32) and variable shifts (32 and
+;;	64-bit versions) may produce incorrect results under the
+;;	following conditions:
+;;	1) An integer multiply is currently executing
+;;	2) These types of shift instructions are executed immediately
+;;	   following an integer divide instruction.
+;;	Workaround:
+;;	1) Make sure no integer multiply is running when these
+;;	   instruction are executed.  If this cannot be predicted at
+;;	   compile time, then insert a "mfhi" to R0 instruction
+;;	   immediately after the integer multiply instruction.  This
+;;	   will cause the integer multiply to complete before the shift
+;;	   is executed.
+;;	2) Separate integer divide and these two classes of shift
+;;	   instructions by another instruction or a noop."
 
 (define_expand "mulsi3"
   [(set (match_operand:SI 0 "register_operand" "")
@@ -1828,42 +1868,47 @@
 		 (match_operand:DI 2 "register_operand" "")))]
   "TARGET_64BIT"
 {
-  if (GENERATE_MULT3_DI || TARGET_MIPS4000)
-    emit_insn (gen_muldi3_internal2 (operands[0], operands[1], operands[2]));
-  else
+  if (GENERATE_MULT3_DI)
+    emit_insn (gen_muldi3_mult3 (operands[0], operands[1], operands[2]));
+  else if (!TARGET_MIPS4000 || TARGET_MIPS16)
     emit_insn (gen_muldi3_internal (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_muldi3_r4000 (operands[0], operands[1], operands[2]));
   DONE;
 })
 
+(define_insn "muldi3_mult3"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(mult:DI (match_operand:DI 1 "register_operand" "d")
+		 (match_operand:DI 2 "register_operand" "d")))
+   (clobber (match_scratch:DI 3 "=h"))
+   (clobber (match_scratch:DI 4 "=l"))]
+  "TARGET_64BIT && GENERATE_MULT3_DI"
+  "dmult\t%0,%1,%2"
+  [(set_attr "type"	"imul")
+   (set_attr "mode"	"DI")])
+
 (define_insn "muldi3_internal"
   [(set (match_operand:DI 0 "register_operand" "=l")
 	(mult:DI (match_operand:DI 1 "register_operand" "d")
 		 (match_operand:DI 2 "register_operand" "d")))
    (clobber (match_scratch:DI 3 "=h"))]
-  "TARGET_64BIT && !TARGET_MIPS4000"
+  "TARGET_64BIT && (!TARGET_MIPS4000 || TARGET_MIPS16)"
   "dmult\t%1,%2"
   [(set_attr "type"	"imul")
    (set_attr "mode"	"DI")])
 
-(define_insn "muldi3_internal2"
+(define_insn "muldi3_r4000"
   [(set (match_operand:DI 0 "register_operand" "=d")
 	(mult:DI (match_operand:DI 1 "register_operand" "d")
 		 (match_operand:DI 2 "register_operand" "d")))
    (clobber (match_scratch:DI 3 "=h"))
    (clobber (match_scratch:DI 4 "=l"))]
-  "TARGET_64BIT && (GENERATE_MULT3_DI || TARGET_MIPS4000)"
-{
-  if (GENERATE_MULT3_DI)
-    return "dmult\t%0,%1,%2";
-  else
-    return "dmult\t%1,%2\;mflo\t%0";
-}
+  "TARGET_64BIT && (TARGET_MIPS4000 && !TARGET_MIPS16)"
+  "dmult\t%1,%2\;mflo\t%0"
   [(set_attr "type"	"imul")
    (set_attr "mode"	"DI")
-   (set (attr "length")
-	(if_then_else (ne (symbol_ref "GENERATE_MULT3_DI") (const_int 0))
-		      (const_int 4)
-		      (const_int 8)))])
+   (set_attr "length"	"8")])
 
 ;; ??? We could define a mulditi3 pattern when TARGET_64BIT.
 
@@ -1880,21 +1925,37 @@
 {
   if (!TARGET_64BIT)
     {
-      emit_insn (gen_mulsidi3_32bit (operands[0], operands[1], operands[2]));
+      if (!TARGET_MIPS4000 || TARGET_MIPS16)
+	emit_insn (gen_mulsidi3_32bit_internal (operands[0], operands[1],
+						operands[2]));
+      else
+	emit_insn (gen_mulsidi3_32bit_r4000 (operands[0], operands[1],
+					     operands[2]));
       DONE;
     }
 })
 
-(define_insn "mulsidi3_32bit"
+(define_insn "mulsidi3_32bit_internal"
   [(set (match_operand:DI 0 "register_operand" "=x")
 	(mult:DI
 	   (sign_extend:DI (match_operand:SI 1 "register_operand" "d"))
 	   (sign_extend:DI (match_operand:SI 2 "register_operand" "d"))))]
-  "!TARGET_64BIT"
+  "!TARGET_64BIT && (!TARGET_MIPS4000 || TARGET_MIPS16)"
   "mult\t%1,%2"
   [(set_attr "type"	"imul")
    (set_attr "mode"	"SI")])
 
+(define_insn "mulsidi3_32bit_r4000"
+  [(set (match_operand:DI 0 "register_operand" "=x")
+	(mult:DI
+	   (sign_extend:DI (match_operand:SI 1 "register_operand" "d"))
+	   (sign_extend:DI (match_operand:SI 2 "register_operand" "d"))))]
+  "!TARGET_64BIT && (TARGET_MIPS4000 && !TARGET_MIPS16)"
+  "mult\t%1,%2\;mflo\t%."
+  [(set_attr "type"	"imul")
+   (set_attr "mode"	"SI")
+   (set_attr "length"	"8")])
+
 (define_insn_and_split "*mulsidi3_64bit"
   [(set (match_operand:DI 0 "register_operand" "=d")
 	(mult:DI (match_operator:DI 1 "extend_operator"
@@ -1904,7 +1965,8 @@
    (clobber (match_scratch:DI 5 "=l"))
    (clobber (match_scratch:DI 6 "=h"))
    (clobber (match_scratch:DI 7 "=d"))]
-  "TARGET_64BIT && GET_CODE (operands[1]) == GET_CODE (operands[2])"
+  "TARGET_64BIT && (!TARGET_MIPS4000 || TARGET_MIPS16)
+   && GET_CODE (operands[1]) == GET_CODE (operands[2])"
   "#"
   "&& reload_completed"
   [(parallel
@@ -1955,7 +2017,8 @@
 	      (match_operator:DI 4 "extend_operator" [(match_dup 2)])
 	      (match_operator:DI 5 "extend_operator" [(match_dup 3)]))
 	   (const_int 32)))]
-  "TARGET_64BIT && GET_CODE (operands[4]) == GET_CODE (operands[5])"
+  "TARGET_64BIT && (!TARGET_MIPS4000 || TARGET_MIPS16)
+   && GET_CODE (operands[4]) == GET_CODE (operands[5])"
 {
   if (GET_CODE (operands[4]) == SIGN_EXTEND)
     return "mult\t%2,%3";
@@ -1965,6 +2028,80 @@
   [(set_attr "type" "imul")
    (set_attr "mode" "SI")])
 
+(define_insn_and_split "*mulsidi3_64bit_r4000"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(mult:DI (match_operator:DI 1 "extend_operator"
+		    [(match_operand:SI 3 "register_operand" "d")])
+		 (match_operator:DI 2 "extend_operator"
+		    [(match_operand:SI 4 "register_operand" "d")])))
+   (clobber (match_scratch:DI 5 "=l"))
+   (clobber (match_scratch:DI 6 "=h"))
+   (clobber (match_scratch:DI 7 "=d"))]
+  "TARGET_64BIT && (TARGET_MIPS4000 && !TARGET_MIPS16)
+   && GET_CODE (operands[1]) == GET_CODE (operands[2])"
+  "#"
+  "&& reload_completed"
+  [(parallel
+       [(set (match_dup 7)
+	     (sign_extend:DI
+		(mult:SI (match_dup 3)
+		         (match_dup 4))))
+	(set (match_dup 6)
+	     (ashiftrt:DI
+		(mult:DI (match_dup 1)
+			 (match_dup 2))
+		(const_int 32)))
+	(clobber (match_dup 5))])
+
+   ;; OP0 <- HI
+   (set (match_dup 0) (match_dup 6))
+
+   ;; Zero-extend OP7.
+   (set (match_dup 7)
+	(ashift:DI (match_dup 7)
+		   (const_int 32)))
+   (set (match_dup 7)
+	(lshiftrt:DI (match_dup 7)
+		     (const_int 32)))
+
+   ;; Shift OP0 into place.
+   (set (match_dup 0)
+	(ashift:DI (match_dup 0)
+		   (const_int 32)))
+
+   ;; OR the two halves together
+   (set (match_dup 0)
+	(ior:DI (match_dup 0)
+		(match_dup 7)))]
+  ""
+  [(set_attr "type"	"imul")
+   (set_attr "mode"	"SI")
+   (set_attr "length"	"24")])
+
+(define_insn "*mulsidi3_64bit_parts_r4000"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(sign_extend:DI
+	   (mult:SI (match_operand:SI 2 "register_operand" "d")
+		    (match_operand:SI 3 "register_operand" "d"))))
+   (set (match_operand:DI 1 "register_operand" "=h")
+	(ashiftrt:DI
+	   (mult:DI
+	      (match_operator:DI 4 "extend_operator" [(match_dup 2)])
+	      (match_operator:DI 5 "extend_operator" [(match_dup 3)]))
+	   (const_int 32)))
+   (clobber (match_operand:DI 6 "register_operand" "=l"))]
+  "TARGET_64BIT && (TARGET_MIPS4000 && !TARGET_MIPS16)
+   && GET_CODE (operands[4]) == GET_CODE (operands[5])"
+{
+  if (GET_CODE (operands[4]) == SIGN_EXTEND)
+    return "mult\t%2,%3\;mflo\t%0";
+  else
+    return "multu\t%2,%3\;mflo\t%0";
+}
+  [(set_attr "type"	"imul")
+   (set_attr "mode"	"SI")
+   (set_attr "length"	"8")])
+
 (define_expand "umulsidi3"
   [(parallel
       [(set (match_operand:DI 0 "register_operand" "")
@@ -1978,22 +2115,37 @@
 {
   if (!TARGET_64BIT)
     {
-      emit_insn (gen_umulsidi3_32bit (operands[0], operands[1],
-				      operands[2]));
+      if (!TARGET_MIPS4000 || TARGET_MIPS16)
+	emit_insn (gen_umulsidi3_32bit_internal (operands[0], operands[1],
+						 operands[2]));
+      else
+	emit_insn (gen_umulsidi3_32bit_r4000 (operands[0], operands[1],
+					      operands[2]));
       DONE;
     }
 })
 
-(define_insn "umulsidi3_32bit"
+(define_insn "umulsidi3_32bit_internal"
   [(set (match_operand:DI 0 "register_operand" "=x")
 	(mult:DI
 	   (zero_extend:DI (match_operand:SI 1 "register_operand" "d"))
 	   (zero_extend:DI (match_operand:SI 2 "register_operand" "d"))))]
-  "!TARGET_64BIT"
+  "!TARGET_64BIT && (!TARGET_MIPS4000 || TARGET_MIPS16)"
   "multu\t%1,%2"
   [(set_attr "type"	"imul")
    (set_attr "mode"	"SI")])
 
+(define_insn "umulsidi3_32bit_r4000"
+  [(set (match_operand:DI 0 "register_operand" "=x")
+	(mult:DI
+	   (zero_extend:DI (match_operand:SI 1 "register_operand" "d"))
+	   (zero_extend:DI (match_operand:SI 2 "register_operand" "d"))))]
+  "!TARGET_64BIT && (TARGET_MIPS4000 && !TARGET_MIPS16)"
+  "multu\t%1,%2\;mflo\t%."
+  [(set_attr "type"	"imul")
+   (set_attr "mode"	"SI")
+   (set_attr "length"	"8")])
+
 ;; Widening multiply with negation.
 (define_insn "*muls_di"
   [(set (match_operand:DI 0 "register_operand" "=x")
@@ -2068,9 +2220,12 @@
   if (ISA_HAS_MULHI)
     emit_insn (gen_umulsi3_highpart_mulhi_internal (operands[0], operands[1],
 						    operands[2]));
-  else
+  else if (!TARGET_MIPS4000 || TARGET_MIPS16)
     emit_insn (gen_umulsi3_highpart_internal (operands[0], operands[1],
 					      operands[2]));
+  else
+    emit_insn (gen_umulsi3_highpart_r4000 (operands[0], operands[1],
+					   operands[2]));
   DONE;
 })
 
@@ -2082,7 +2237,7 @@
 		   (zero_extend:DI (match_operand:SI 2 "register_operand" "d")))
 	  (const_int 32))))
    (clobber (match_scratch:SI 3 "=l"))]
-  "!ISA_HAS_MULHI"
+  "!ISA_HAS_MULHI && (!TARGET_MIPS4000 || TARGET_MIPS16)"
   "multu\t%1,%2"
   [(set_attr "type"   "imul")
    (set_attr "mode"   "SI")
@@ -2105,6 +2260,21 @@
    (set_attr "mode"   "SI")
    (set_attr "length" "4")])
 
+(define_insn "umulsi3_highpart_r4000"
+  [(set (match_operand:SI 0 "register_operand" "=d")
+	(truncate:SI
+	 (lshiftrt:DI
+	  (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "d"))
+		   (zero_extend:DI (match_operand:SI 2 "register_operand" "d")))
+	  (const_int 32))))
+   (clobber (match_scratch:SI 3 "=l"))
+   (clobber (match_scratch:SI 4 "=h"))]
+  "!ISA_HAS_MULHI && (TARGET_MIPS4000 && !TARGET_MIPS16)"
+  "multu\t%1,%2\;mfhi\t%0"
+  [(set_attr "type"	"imul")
+   (set_attr "mode"	"SI")
+   (set_attr "length"	"8")])
+
 (define_insn "umulsi3_highpart_neg_mulhi_internal"
   [(set (match_operand:SI 0 "register_operand" "=h,d")
         (truncate:SI
@@ -2135,9 +2305,12 @@
   if (ISA_HAS_MULHI)
     emit_insn (gen_smulsi3_highpart_mulhi_internal (operands[0], operands[1],
 						    operands[2]));
-  else
+  else if (!TARGET_MIPS4000 || TARGET_MIPS16)
     emit_insn (gen_smulsi3_highpart_internal (operands[0], operands[1],
 					      operands[2]));
+  else
+    emit_insn (gen_smulsi3_highpart_r4000 (operands[0], operands[1],
+					   operands[2]));
   DONE;
 })
 
@@ -2149,7 +2322,7 @@
 		   (sign_extend:DI (match_operand:SI 2 "register_operand" "d")))
 	  (const_int 32))))
    (clobber (match_scratch:SI 3 "=l"))]
-  "!ISA_HAS_MULHI"
+  "!ISA_HAS_MULHI && (!TARGET_MIPS4000 || TARGET_MIPS16)"
   "mult\t%1,%2"
   [(set_attr "type"	"imul")
    (set_attr "mode"	"SI")
@@ -2172,6 +2345,21 @@
    (set_attr "mode"   "SI")
    (set_attr "length" "4")])
 
+(define_insn "smulsi3_highpart_r4000"
+  [(set (match_operand:SI 0 "register_operand" "=d")
+	(truncate:SI
+	 (lshiftrt:DI
+	  (mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "d"))
+		   (sign_extend:DI (match_operand:SI 2 "register_operand" "d")))
+	  (const_int 32))))
+   (clobber (match_scratch:SI 3 "=l"))
+   (clobber (match_scratch:SI 4 "=h"))]
+  "!ISA_HAS_MULHI && (TARGET_MIPS4000 && !TARGET_MIPS16)"
+  "mult\t%1,%2\;mfhi\t%0"
+  [(set_attr "type"	"imul")
+   (set_attr "mode"	"SI")
+   (set_attr "length"   "8")])
+
 (define_insn "smulsi3_highpart_neg_mulhi_internal"
   [(set (match_operand:SI 0 "register_operand" "=h,d")
         (truncate:SI
@@ -2189,7 +2377,26 @@
   [(set_attr "type"   "imul")
    (set_attr "mode"   "SI")])
 
-(define_insn "smuldi3_highpart"
+(define_expand "smuldi3_highpart"
+  [(set (match_operand:DI 0 "register_operand" "")
+	(truncate:DI
+	 (lshiftrt:TI
+	  (mult:TI
+	   (sign_extend:TI (match_operand:DI 1 "register_operand" "d"))
+	   (sign_extend:TI (match_operand:DI 2 "register_operand" "d")))
+         (const_int 64))))]
+  "TARGET_64BIT"
+{
+  if (!TARGET_MIPS4000 || TARGET_MIPS16)
+    emit_insn (gen_smuldi3_highpart_internal (operands[0], operands[1],
+					      operands[2]));
+  else
+    emit_insn (gen_smuldi3_highpart_r4000 (operands[0], operands[1],
+					   operands[2]));
+  DONE;
+})
+
+(define_insn "smuldi3_highpart_internal"
   [(set (match_operand:DI 0 "register_operand" "=h")
 	(truncate:DI
 	 (lshiftrt:TI
@@ -2198,12 +2405,47 @@
 	   (sign_extend:TI (match_operand:DI 2 "register_operand" "d")))
          (const_int 64))))
    (clobber (match_scratch:DI 3 "=l"))]
-  "TARGET_64BIT"
+  "TARGET_64BIT && (!TARGET_MIPS4000 || TARGET_MIPS16)"
   "dmult\t%1,%2"
   [(set_attr "type"	"imul")
    (set_attr "mode"	"DI")])
 
-(define_insn "umuldi3_highpart"
+(define_insn "smuldi3_highpart_r4000"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(truncate:DI
+	 (lshiftrt:TI
+	  (mult:TI
+	   (sign_extend:TI (match_operand:DI 1 "register_operand" "d"))
+	   (sign_extend:TI (match_operand:DI 2 "register_operand" "d")))
+         (const_int 64))))
+   (clobber (match_scratch:DI 3 "=l"))
+   (clobber (match_scratch:DI 4 "=h"))]
+  "TARGET_64BIT && (TARGET_MIPS4000 && !TARGET_MIPS16)"
+  "dmult\t%1,%2\;mfhi\t%0"
+  [(set_attr "type"	"imul")
+   (set_attr "mode"	"DI")
+   (set_attr "length"	"8")])
+
+(define_expand "umuldi3_highpart"
+  [(set (match_operand:DI 0 "register_operand" "")
+	(truncate:DI
+	 (lshiftrt:TI
+	  (mult:TI
+	   (zero_extend:TI (match_operand:DI 1 "register_operand" "d"))
+	   (zero_extend:TI (match_operand:DI 2 "register_operand" "d")))
+	  (const_int 64))))]
+  "TARGET_64BIT"
+{
+  if (!TARGET_MIPS4000 || TARGET_MIPS16)
+    emit_insn (gen_umuldi3_highpart_internal (operands[0], operands[1],
+					      operands[2]));
+  else
+    emit_insn (gen_umuldi3_highpart_r4000 (operands[0], operands[1],
+					   operands[2]));
+  DONE;
+})
+
+(define_insn "umuldi3_highpart_internal"
   [(set (match_operand:DI 0 "register_operand" "=h")
 	(truncate:DI
 	 (lshiftrt:TI
@@ -2212,11 +2454,27 @@
 	   (zero_extend:TI (match_operand:DI 2 "register_operand" "d")))
 	  (const_int 64))))
    (clobber (match_scratch:DI 3 "=l"))]
-  "TARGET_64BIT"
+  "TARGET_64BIT && (!TARGET_MIPS4000 || TARGET_MIPS16)"
   "dmultu\t%1,%2"
   [(set_attr "type"	"imul")
    (set_attr "mode"	"DI")])
 
+(define_insn "umuldi3_highpart_r4000"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(truncate:DI
+	 (lshiftrt:TI
+	  (mult:TI
+	   (zero_extend:TI (match_operand:DI 1 "register_operand" "d"))
+	   (zero_extend:TI (match_operand:DI 2 "register_operand" "d")))
+	  (const_int 64))))
+   (clobber (match_scratch:DI 3 "=l"))
+   (clobber (match_scratch:DI 4 "=h"))]
+  "TARGET_64BIT && (TARGET_MIPS4000 && !TARGET_MIPS16)"
+  "dmultu\t%1,%2\;mfhi\t%0"
+  [(set_attr "type"	"imul")
+   (set_attr "mode"	"DI")
+   (set_attr "length"	"8")])
+
 
 ;; The R4650 supports a 32 bit multiply/ 64 bit accumulate
 ;; instruction.  The HI/LO registers are used as a 64 bit accumulator.

