This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: head: MIPS: Complete the R4000 multiply/shift errata workaround
On Tue, 24 Feb 2004, Richard Sandiford wrote:
> If Eric doesn't object, I'll install it with that change once
> we get the assignment thing sorted out.
I can see you haven't applied the change yet, so I'm sending the patch
again with these fixes applied:
1. Lengths changed to "12" for mulsidi3_32bit_r4000 and
umulsidi3_32bit_r4000.
2. Missing clobbers added to mulsidi3_32bit_r4000 and
umulsidi3_32bit_r4000.
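
For reference, the "12" in item 1 is a byte count: the two r4000 variants emit three instructions (mult or multu, then mflo, then mfhi), and the "length" attribute is given in bytes. A minimal sketch of that arithmetic in C, assuming 4-byte (non-MIPS16) instruction encodings and ';' as the instruction separator in the already-unescaped template; the helper name template_length is hypothetical and not part of GCC:

```c
#include <stddef.h>

/* Hypothetical helper, not part of GCC: estimate the "length" attribute
   (in bytes) of an assembler output template by counting the
   instructions it contains.  Instructions are assumed to be separated
   by ';' and to occupy 4 bytes each (standard, non-MIPS16 encodings).  */
static size_t
template_length (const char *tmpl)
{
  size_t insns = 1;

  if (tmpl == NULL || *tmpl == '\0')
    return 0;
  for (; *tmpl != '\0'; tmpl++)
    if (*tmpl == ';')
      insns++;
  return insns * 4;
}
```

Under these assumptions the three-instruction template of mulsidi3_32bit_r4000 comes out at 12 bytes, and a two-instruction template such as the one in muldi3_r4000 at 8.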
The updated patch was tested by building a cross-compiler for the
mips64el-linux target, building a 64-bit Linux kernel, inspecting it with
objdump and running it on an R4000 and an R4400.
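
For anyone wanting to repeat the objdump inspection on something smaller than a kernel, a test case along the following lines might be used. This is only a sketch and was not part of the testing described above: the function name is made up, and whether a given compiler actually places a shift close to the multiply depends on the target and optimisation level. With the workaround enabled, the disassembly would be expected to show the mflo directly after the dmult, ahead of any doubleword or variable shift.

```c
#include <stdint.h>

/* Hypothetical test case: a multiplication whose result is combined with
   a variable doubleword shift, so that the two may end up close together
   in the generated code.  */
int64_t
mul_then_shift (int64_t a, int64_t b, uint64_t v, unsigned int n)
{
  int64_t product = a * b;            /* dmult on a 64-bit MIPS target */
  uint64_t shifted = v >> (n & 63);   /* variable shift, e.g. dsrlv */

  return product + (int64_t) shifted;
}
```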
Here's the ChangeLog entry, unchanged but included for completeness.
2004-02-26 Maciej W. Rozycki <macro@ds2.pg.gda.pl>
* config/mips/mips.md: Complete the unfinished R4000
multiply/shift errata workaround. Improve documentation.
(hazard): Use TARGET_FIX_4000 for the "imul" type.
(mulsi3, mulsi3_internal, mulsi3_r4000): Use TARGET_FIX_4000.
(muldi3, muldi3_internal): Use TARGET_FIX_4000.
(muldi3_mult3, muldi3_r4000): New patterns, replacing
muldi3_internal2.
(muldi3_internal2): Removed.
(mulsidi3): Take the errata into account.
(mulsidi3_32bit_internal, mulsidi3_32bit_r4000): New patterns,
replacing mulsidi3_32bit.
(mulsidi3_32bit): Removed.
(mulsidi3_64bit, mulsidi3_64bit_parts): Disable if
TARGET_FIX_4000.
(umulsidi3): Take the errata into account.
(umulsidi3_32bit_internal, umulsidi3_32bit_r4000): New patterns,
replacing umulsidi3_32bit.
(umulsidi3_32bit): Removed.
(umulsi3_highpart, umulsi3_highpart_internal): Disable if
TARGET_FIX_4000.
(smulsi3_highpart, smulsi3_highpart_internal): Likewise.
(smuldi3_highpart, umuldi3_highpart): Likewise.
* doc/invoke.texi: Document the errata workaround.
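
The muldi3 entries above amount to a three-way choice in the expander. A rough C model of that choice, mirroring the conditions used in the patch; the enum and function names are illustrative only, and the two flags stand in for GENERATE_MULT3_DI and TARGET_FIX_4000:

```c
#include <stdbool.h>

/* Illustrative names; they only model which pattern the muldi3 expander
   in the patch would emit.  */
enum muldi3_pattern
{
  MULDI3_MULT3,     /* three-operand dmult, result goes straight to a GPR */
  MULDI3_INTERNAL,  /* plain dmult, result left in LO */
  MULDI3_R4000      /* dmult immediately followed by mflo */
};

static enum muldi3_pattern
select_muldi3 (bool generate_mult3_di, bool target_fix_4000)
{
  if (generate_mult3_di)
    return MULDI3_MULT3;
  if (!target_fix_4000)
    return MULDI3_INTERNAL;
  return MULDI3_R4000;
}
```

The three-operand form is checked first, so the r4000 pattern is only reached when that form is unavailable and the workaround is requested.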
Thanks for your patience and please apply.
Maciej
--
+ Maciej W. Rozycki, Technical University of Gdansk, Poland +
+--------------------------------------------------------------+
+ e-mail: macro@ds2.pg.gda.pl, PGP key available +
gcc-3.4-20031107-mips-r4000-mult.patch
diff -up --recursive --new-file gcc-3.4-20031107.macro/gcc/config/mips/mips.md gcc-3.4-20031107/gcc/config/mips/mips.md
--- gcc-3.4-20031107.macro/gcc/config/mips/mips.md 2004-02-24 06:31:22.000000000 +0000
+++ gcc-3.4-20031107/gcc/config/mips/mips.md 2004-02-26 00:13:25.000000000 +0000
@@ -229,7 +229,7 @@
;; The r4000 multiplication patterns include an mflo instruction.
(and (eq_attr "type" "imul")
- (ne (symbol_ref "TARGET_MIPS4000") (const_int 0)))
+ (ne (symbol_ref "TARGET_FIX_4000") (const_int 0)))
(const_string "hilo")
(and (eq_attr "type" "hilo")
@@ -1416,9 +1416,51 @@
(set_attr "length" "8")])
-;; ??? The R4000 (only) has a cpu bug. If a double-word shift executes while
-;; a multiply is in progress, it may give an incorrect result. Avoid
-;; this by keeping the mflo with the mult on the R4000.
+;; The original R4000 has a cpu bug. If a double-word or a variable
+;; shift executes while an integer multiplication is in progress, the
+;; shift may give an incorrect result. Avoid this by keeping the mflo
+;; with the mult on the R4000.
+;;
+;; From "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0"
+;; (also valid for MIPS R4000MC processors):
+;;
+;; "16. R4000PC, R4000SC: Please refer to errata 28 for an update to
+;; this errata description.
+;; The following code sequence causes the R4000 to incorrectly
+;; execute the Double Shift Right Arithmetic 32 (dsra32)
+;; instruction. If the dsra32 instruction is executed during an
+;; integer multiply, the dsra32 will only shift by the amount in
+;; specified in the instruction rather than the amount plus 32
+;; bits.
+;; instruction 1: mult rs,rt integer multiply
+;; instruction 2-12: dsra32 rd,rt,rs doubleword shift
+;; right arithmetic + 32
+;; Workaround: A dsra32 instruction placed after an integer
+;; multiply should not be one of the 11 instructions after the
+;; multiply instruction."
+;;
+;; and:
+;;
+;; "28. R4000PC, R4000SC: The text from errata 16 should be replaced by
+;; the following description.
+;; All extended shifts (shift by n+32) and variable shifts (32 and
+;; 64-bit versions) may produce incorrect results under the
+;; following conditions:
+;; 1) An integer multiply is currently executing
+;; 2) These types of shift instructions are executed immediately
+;; following an integer divide instruction.
+;; Workaround:
+;; 1) Make sure no integer multiply is running when these
+;; instruction are executed. If this cannot be predicted at
+;; compile time, then insert a "mfhi" to R0 instruction
+;; immediately after the integer multiply instruction. This
+;; will cause the integer multiply to complete before the shift
+;; is executed.
+;; 2) Separate integer divide and these two classes of shift
+;; instructions by another instruction or a noop."
+;;
+;; These processors have PRId values of 0x00004220 and 0x00004300,
+;; respectively.
(define_expand "mulsi3"
[(set (match_operand:SI 0 "register_operand" "")
@@ -1428,7 +1470,7 @@
{
if (GENERATE_MULT3_SI || TARGET_MAD)
emit_insn (gen_mulsi3_mult3 (operands[0], operands[1], operands[2]));
- else if (!TARGET_MIPS4000 || TARGET_MIPS16)
+ else if (!TARGET_FIX_4000)
emit_insn (gen_mulsi3_internal (operands[0], operands[1], operands[2]));
else
emit_insn (gen_mulsi3_r4000 (operands[0], operands[1], operands[2]));
@@ -1494,7 +1536,7 @@
(mult:SI (match_operand:SI 1 "register_operand" "d")
(match_operand:SI 2 "register_operand" "d")))
(clobber (match_scratch:SI 3 "=h"))]
- "!TARGET_MIPS4000 || TARGET_MIPS16"
+ "!TARGET_FIX_4000"
"mult\t%1,%2"
[(set_attr "type" "imul")
(set_attr "mode" "SI")])
@@ -1505,7 +1547,7 @@
(match_operand:SI 2 "register_operand" "d")))
(clobber (match_scratch:SI 3 "=h"))
(clobber (match_scratch:SI 4 "=l"))]
- "TARGET_MIPS4000 && !TARGET_MIPS16"
+ "TARGET_FIX_4000"
"mult\t%1,%2\;mflo\t%0"
[(set_attr "type" "imul")
(set_attr "mode" "SI")
@@ -1825,42 +1867,47 @@
(match_operand:DI 2 "register_operand" "")))]
"TARGET_64BIT"
{
- if (GENERATE_MULT3_DI || TARGET_MIPS4000)
- emit_insn (gen_muldi3_internal2 (operands[0], operands[1], operands[2]));
- else
+ if (GENERATE_MULT3_DI)
+ emit_insn (gen_muldi3_mult3 (operands[0], operands[1], operands[2]));
+ else if (!TARGET_FIX_4000)
emit_insn (gen_muldi3_internal (operands[0], operands[1], operands[2]));
+ else
+ emit_insn (gen_muldi3_r4000 (operands[0], operands[1], operands[2]));
DONE;
})
+(define_insn "muldi3_mult3"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (mult:DI (match_operand:DI 1 "register_operand" "d")
+ (match_operand:DI 2 "register_operand" "d")))
+ (clobber (match_scratch:DI 3 "=h"))
+ (clobber (match_scratch:DI 4 "=l"))]
+ "TARGET_64BIT && GENERATE_MULT3_DI"
+ "dmult\t%0,%1,%2"
+ [(set_attr "type" "imul")
+ (set_attr "mode" "DI")])
+
(define_insn "muldi3_internal"
[(set (match_operand:DI 0 "register_operand" "=l")
(mult:DI (match_operand:DI 1 "register_operand" "d")
(match_operand:DI 2 "register_operand" "d")))
(clobber (match_scratch:DI 3 "=h"))]
- "TARGET_64BIT && !TARGET_MIPS4000"
+ "TARGET_64BIT && !TARGET_FIX_4000"
"dmult\t%1,%2"
[(set_attr "type" "imul")
(set_attr "mode" "DI")])
-(define_insn "muldi3_internal2"
+(define_insn "muldi3_r4000"
[(set (match_operand:DI 0 "register_operand" "=d")
(mult:DI (match_operand:DI 1 "register_operand" "d")
(match_operand:DI 2 "register_operand" "d")))
(clobber (match_scratch:DI 3 "=h"))
(clobber (match_scratch:DI 4 "=l"))]
- "TARGET_64BIT && (GENERATE_MULT3_DI || TARGET_MIPS4000)"
-{
- if (GENERATE_MULT3_DI)
- return "dmult\t%0,%1,%2";
- else
- return "dmult\t%1,%2\;mflo\t%0";
-}
+ "TARGET_64BIT && TARGET_FIX_4000"
+ "dmult\t%1,%2\;mflo\t%0"
[(set_attr "type" "imul")
(set_attr "mode" "DI")
- (set (attr "length")
- (if_then_else (ne (symbol_ref "GENERATE_MULT3_DI") (const_int 0))
- (const_int 4)
- (const_int 8)))])
+ (set_attr "length" "8")])
;; ??? We could define a mulditi3 pattern when TARGET_64BIT.
@@ -1873,25 +1920,43 @@
(clobber (scratch:DI))
(clobber (scratch:DI))
(clobber (scratch:DI))])]
- ""
+ "!TARGET_64BIT || !TARGET_FIX_4000"
{
if (!TARGET_64BIT)
{
- emit_insn (gen_mulsidi3_32bit (operands[0], operands[1], operands[2]));
+ if (!TARGET_FIX_4000)
+ emit_insn (gen_mulsidi3_32bit_internal (operands[0], operands[1],
+ operands[2]));
+ else
+ emit_insn (gen_mulsidi3_32bit_r4000 (operands[0], operands[1],
+ operands[2]));
DONE;
}
})
-(define_insn "mulsidi3_32bit"
+(define_insn "mulsidi3_32bit_internal"
[(set (match_operand:DI 0 "register_operand" "=x")
(mult:DI
(sign_extend:DI (match_operand:SI 1 "register_operand" "d"))
(sign_extend:DI (match_operand:SI 2 "register_operand" "d"))))]
- "!TARGET_64BIT"
+ "!TARGET_64BIT && !TARGET_FIX_4000"
"mult\t%1,%2"
[(set_attr "type" "imul")
(set_attr "mode" "SI")])
+(define_insn "mulsidi3_32bit_r4000"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (mult:DI
+ (sign_extend:DI (match_operand:SI 1 "register_operand" "d"))
+ (sign_extend:DI (match_operand:SI 2 "register_operand" "d"))))
+ (clobber (match_scratch:DI 3 "=l"))
+ (clobber (match_scratch:DI 4 "=h"))]
+ "!TARGET_64BIT && TARGET_FIX_4000"
+ "mult\t%1,%2\;mflo\t%L0;mfhi\t%M0"
+ [(set_attr "type" "imul")
+ (set_attr "mode" "SI")
+ (set_attr "length" "12")])
+
(define_insn_and_split "*mulsidi3_64bit"
[(set (match_operand:DI 0 "register_operand" "=d")
(mult:DI (match_operator:DI 1 "extend_operator"
@@ -1901,7 +1966,8 @@
(clobber (match_scratch:DI 5 "=l"))
(clobber (match_scratch:DI 6 "=h"))
(clobber (match_scratch:DI 7 "=d"))]
- "TARGET_64BIT && GET_CODE (operands[1]) == GET_CODE (operands[2])"
+ "TARGET_64BIT && !TARGET_FIX_4000
+ && GET_CODE (operands[1]) == GET_CODE (operands[2])"
"#"
"&& reload_completed"
[(parallel
@@ -1952,7 +2018,8 @@
(match_operator:DI 4 "extend_operator" [(match_dup 2)])
(match_operator:DI 5 "extend_operator" [(match_dup 3)]))
(const_int 32)))]
- "TARGET_64BIT && GET_CODE (operands[4]) == GET_CODE (operands[5])"
+ "TARGET_64BIT && !TARGET_FIX_4000
+ && GET_CODE (operands[4]) == GET_CODE (operands[5])"
{
if (GET_CODE (operands[4]) == SIGN_EXTEND)
return "mult\t%2,%3";
@@ -1971,26 +2038,43 @@
(clobber (scratch:DI))
(clobber (scratch:DI))
(clobber (scratch:DI))])]
- ""
+ "!TARGET_64BIT || !TARGET_FIX_4000"
{
if (!TARGET_64BIT)
{
- emit_insn (gen_umulsidi3_32bit (operands[0], operands[1],
- operands[2]));
+ if (!TARGET_FIX_4000)
+ emit_insn (gen_umulsidi3_32bit_internal (operands[0], operands[1],
+ operands[2]));
+ else
+ emit_insn (gen_umulsidi3_32bit_r4000 (operands[0], operands[1],
+ operands[2]));
DONE;
}
})
-(define_insn "umulsidi3_32bit"
+(define_insn "umulsidi3_32bit_internal"
[(set (match_operand:DI 0 "register_operand" "=x")
(mult:DI
(zero_extend:DI (match_operand:SI 1 "register_operand" "d"))
(zero_extend:DI (match_operand:SI 2 "register_operand" "d"))))]
- "!TARGET_64BIT"
+ "!TARGET_64BIT && !TARGET_FIX_4000"
"multu\t%1,%2"
[(set_attr "type" "imul")
(set_attr "mode" "SI")])
+(define_insn "umulsidi3_32bit_r4000"
+ [(set (match_operand:DI 0 "register_operand" "=d")
+ (mult:DI
+ (zero_extend:DI (match_operand:SI 1 "register_operand" "d"))
+ (zero_extend:DI (match_operand:SI 2 "register_operand" "d"))))
+ (clobber (match_scratch:DI 3 "=l"))
+ (clobber (match_scratch:DI 4 "=h"))]
+ "!TARGET_64BIT && TARGET_FIX_4000"
+ "multu\t%1,%2\;mflo\t%L0;mfhi\t%M0"
+ [(set_attr "type" "imul")
+ (set_attr "mode" "SI")
+ (set_attr "length" "12")])
+
;; Widening multiply with negation.
(define_insn "*muls_di"
[(set (match_operand:DI 0 "register_operand" "=x")
@@ -2060,7 +2144,7 @@
(mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" ""))
(zero_extend:DI (match_operand:SI 2 "register_operand" "")))
(const_int 32))))]
- ""
+ "ISA_HAS_MULHI || !TARGET_FIX_4000"
{
if (ISA_HAS_MULHI)
emit_insn (gen_umulsi3_highpart_mulhi_internal (operands[0], operands[1],
@@ -2079,7 +2163,7 @@
(zero_extend:DI (match_operand:SI 2 "register_operand" "d")))
(const_int 32))))
(clobber (match_scratch:SI 3 "=l"))]
- "!ISA_HAS_MULHI"
+ "!ISA_HAS_MULHI && !TARGET_FIX_4000"
"multu\t%1,%2"
[(set_attr "type" "imul")
(set_attr "mode" "SI")
@@ -2127,7 +2211,7 @@
(mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" ""))
(sign_extend:DI (match_operand:SI 2 "register_operand" "")))
(const_int 32))))]
- ""
+ "ISA_HAS_MULHI || !TARGET_FIX_4000"
{
if (ISA_HAS_MULHI)
emit_insn (gen_smulsi3_highpart_mulhi_internal (operands[0], operands[1],
@@ -2146,7 +2230,7 @@
(sign_extend:DI (match_operand:SI 2 "register_operand" "d")))
(const_int 32))))
(clobber (match_scratch:SI 3 "=l"))]
- "!ISA_HAS_MULHI"
+ "!ISA_HAS_MULHI && !TARGET_FIX_4000"
"mult\t%1,%2"
[(set_attr "type" "imul")
(set_attr "mode" "SI")
@@ -2195,7 +2279,7 @@
(sign_extend:TI (match_operand:DI 2 "register_operand" "d")))
(const_int 64))))
(clobber (match_scratch:DI 3 "=l"))]
- "TARGET_64BIT"
+ "TARGET_64BIT && !TARGET_FIX_4000"
"dmult\t%1,%2"
[(set_attr "type" "imul")
(set_attr "mode" "DI")])
@@ -2209,7 +2293,7 @@
(zero_extend:TI (match_operand:DI 2 "register_operand" "d")))
(const_int 64))))
(clobber (match_scratch:DI 3 "=l"))]
- "TARGET_64BIT"
+ "TARGET_64BIT && !TARGET_FIX_4000"
"dmultu\t%1,%2"
[(set_attr "type" "imul")
(set_attr "mode" "DI")])
diff -up --recursive --new-file gcc-3.4-20031107.macro/gcc/doc/invoke.texi gcc-3.4-20031107/gcc/doc/invoke.texi
--- gcc-3.4-20031107.macro/gcc/doc/invoke.texi 2004-02-23 21:31:42.000000000 +0000
+++ gcc-3.4-20031107/gcc/doc/invoke.texi 2004-02-23 22:02:26.000000000 +0000
@@ -8456,6 +8456,9 @@ Work around certain R4000 CPU errata:
@item
A double-word or a variable shift may give an incorrect result if executed
immediately after starting an integer division.
+@item
+A double-word or a variable shift may give an incorrect result if executed
+while an integer multiplication is in progress.
@end itemize
@item -no-crt0