Avoiding MIPS mthi/mflo and mtlo/mfhi hazards

Richard Sandiford rsandifo@redhat.com
Sat Apr 17 10:02:00 GMT 2004


On MIPS targets, moving an mthi before an mflo will invalidate the
result of the mflo.  The same goes for mtlo and mfhi.
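
To make the hazard concrete, here is a toy C model of the HI/LO pair.
It is only a sketch: the "hilo" type, the function names and the
"unread" flag are invented for illustration, and the rule it encodes
(an mthi issued while a multiply result is still unread makes LO
unpredictable) is a simplifying assumption.  Real implementations
differ in the details, and the mtlo/mfhi pair would be modelled
symmetrically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the HI/LO pair, invented for this example.  */
typedef struct
{
  uint32_t hi, lo;
  bool hi_ok, lo_ok;  /* false once that half is unpredictable */
  bool unread;        /* a multiply result is still waiting to be read */
} hilo;

static void
mult (hilo *r, int32_t a, int32_t b)
{
  int64_t product = (int64_t) a * b;

  r->hi = (uint32_t) ((uint64_t) product >> 32);
  r->lo = (uint32_t) product;
  r->hi_ok = r->lo_ok = r->unread = true;
}

/* Assumed rule: writing HI while a multiply result is unread spoils LO.  */
static void
mthi (hilo *r, uint32_t value)
{
  r->hi = value;
  r->hi_ok = true;
  if (r->unread)
    r->lo_ok = false;
  r->unread = false;
}

/* Read LO into *DEST.  Return false if LO was unpredictable.  */
static bool
mflo (hilo *r, uint32_t *dest)
{
  r->unread = false;
  *dest = r->lo;
  return r->lo_ok;
}

/* mult; mflo; mthi: the order the source code asked for.  */
static bool
intended_order (void)
{
  hilo r = { 0 };
  uint32_t t;
  bool ok;

  mult (&r, 1, 2);
  ok = mflo (&r, &t);
  mthi (&r, 40);
  return ok && t == 2;
}

/* mult; mthi; mflo: the order after an invalid swap.  */
static bool
swapped_order (void)
{
  hilo r = { 0 };
  uint32_t t;

  mult (&r, 1, 2);
  mthi (&r, 40);
  return mflo (&r, &t);
}

/* The intended order reads a valid LO; the swapped order does not.  */
static void
check_orders (void)
{
  assert (intended_order ());
  assert (!swapped_order ());
}
```

Calling check_orders () from a test driver exercises both orderings.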

Unfortunately, GCC isn't aware of this, so it's possible to convince it
to do an invalid swap with certain combinations of inline asm.  Whether
this happens or not depends very much on the scheduler and on the code
surrounding the asm.  One example that seems to fail reliably is attached.

It's also possible to trigger an invalid swap on architectures that use
multiply-accumulate instructions.  For example, if we have a highpart
multiplication followed by a multiply-accumulate chain, the highpart
multiplication might end with an mfhi and the multiply-accumulate chain
might start with an mtlo.  There's nothing to stop the scheduler
swapping them around.
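
As a rough illustration of the kind of source involved, the sketch
below pairs a highpart multiplication with an independent
multiply-accumulate chain.  The function names are made up for this
example, and whether GCC really emits mfhi, mtlo and multiply-accumulate
instructions for it depends on the target and options, so treat the
comments about instruction selection as assumptions rather than
guaranteed output.

```c
#include <assert.h>
#include <stdint.h>

/* Highpart multiplication: keep only the upper 32 bits of the 64-bit
   product.  A MIPS compiler might implement this as multu then mfhi.  */
static uint32_t
mul_highpart (uint32_t a, uint32_t b)
{
  return (uint32_t) (((uint64_t) a * b) >> 32);
}

/* Multiply-accumulate chain seeded with ACC.  On targets that have
   multiply-accumulate instructions, the seed might be loaded with mtlo.  */
static uint32_t
mac_chain (uint32_t acc, const uint32_t *x, const uint32_t *y, int n)
{
  int i;

  for (i = 0; i < n; i++)
    acc += x[i] * y[i];
  return acc;
}

/* The two computations are independent in the source, so only the
   HI/LO hazard constrains the order of the mfhi and the mtlo.  */
static uint32_t
highpart_then_mac (uint32_t a, uint32_t b, uint32_t acc,
                   const uint32_t *x, const uint32_t *y, int n)
{
  uint32_t high = mul_highpart (a, b);

  return high + mac_chain (acc, x, y, n);
}

static void
check_example (void)
{
  static const uint32_t x[] = { 1, 2, 3 }, y[] = { 4, 5, 6 };

  assert (mul_highpart (0x80000000u, 4) == 2);
  assert (mac_chain (10, x, y, 3) == 42);
  assert (highpart_then_mac (0x80000000u, 4, 10, x, y, 3) == 44);
}
```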

There really needs to be some sort of rtl dependence between mflo
and mthi, and between mfhi and mtlo.  This means that at least one
instruction from each pair can no longer be treated as a simple move.

From an optimisation standpoint, I thought the best approach would be to
change the two least-used instructions.  The question is: which are they?
I think it's safe to say that mthi is the least used of the four and that
mflo is the most used.  On the other hand, the choice between mtlo and mfhi
depends quite a lot on the application and target architecture.  (Note that
a future patch will convert "mult;mflo" into "mtlo;macc" when generating
VR41xx code, at which point the choice becomes even murkier.)

The problem with changing mthi is that it somehow needs to set LO
(in order to introduce a WAR dependence with previous mflos), while at
the same time not making previous mtlos be seen as dead.  The obvious
choice seemed to be:

    (parallel
       [(set (hi) (src))
        (set (lo) (unspec [(lo)] UNSPEC_MTHI))])

And this does indeed seem to work.  The problem is that double sets are
harder to optimise than single sets, and patterns like these will introduce
a fake (and unwanted) RAW dependence with mtlos.  This extra dependence can
harm scheduling.

After a bit of experimentation, I think the best approach is to change
mflo and mfhi so that they read both accumulator registers.  The patterns
will still be single sets; it's just that they'll have two source operands
rather than one:

    (set (dest) (unspec [(lo) (hi)] UNSPEC_MFHILO))  for mflo
    (set (dest) (unspec [(hi) (lo)] UNSPEC_MFHILO))  for mfhi
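
The effect of the three representations on dependence analysis can be
sketched with a toy model.  This is not GCC's scheduler: the "insn"
type and the helper functions are hypothetical, general-purpose
registers are ignored, and each instruction is reduced to the set of
accumulator registers it reads and writes.

```c
#include <assert.h>
#include <stdbool.h>

enum { HI = 1 << 0, LO = 1 << 1 };

/* Hypothetical summary of an instruction: accumulator registers only.  */
typedef struct
{
  unsigned reads, writes;
} insn;

/* Dependences that force SECOND to stay after FIRST.  */
static bool raw (insn first, insn second)
{ return (first.writes & second.reads) != 0; }
static bool war (insn first, insn second)
{ return (first.reads & second.writes) != 0; }
static bool waw (insn first, insn second)
{ return (first.writes & second.writes) != 0; }

static bool
must_stay_ordered (insn first, insn second)
{
  return raw (first, second) || war (first, second) || waw (first, second);
}

/* Original representation: all four instructions are simple moves.  */
static const insn old_mflo = { LO, 0 };
static const insn mtlo = { 0, LO };
static const insn mthi = { 0, HI };

/* Rejected idea: mthi as a parallel that also reads and sets LO.  */
static const insn par_mthi = { LO, HI | LO };

/* Chosen idea: mflo and mfhi read both accumulator registers.  */
static const insn new_mflo = { HI | LO, 0 };
static const insn new_mfhi = { HI | LO, 0 };

static void
check_models (void)
{
  /* Original: nothing stops an mthi moving above an earlier mflo.  */
  assert (!must_stay_ordered (old_mflo, mthi));

  /* Parallel mthi: the wanted WAR dependence with a previous mflo
     appears, but so does a fake RAW dependence with a previous mtlo.  */
  assert (war (old_mflo, par_mthi));
  assert (raw (mtlo, par_mthi));

  /* Two-source mflo/mfhi: both hazards are covered by WAR dependences,
     and mtlo and mthi stay independent of each other.  */
  assert (war (new_mflo, mthi));
  assert (war (new_mfhi, mtlo));
  assert (!must_stay_ordered (mtlo, mthi));
}
```

In this model the two-source form gives the required ordering without
touching the mtlo and mthi patterns, which is the property argued for
above.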

One potential drawback of this is that pre-reload move instructions
can no longer become mflos or mfhis.  However, I don't think it should
affect optimisation too much, since very few mflos and mfhis are
instances of pre-reload move instructions.  Almost all of them are
introduced when reloading multiplications or divisions.

I compared the -O2 output for c-torture before and after this change
and only three files changed.  All three changes were cases in which
double mflos or double mfhis were converted into single mflos or mfhis
followed by a move.  For example, "mflo r1...mflo r2" would become
"mflo r1...move r2,r1".  This is certainly an improvement on some
architectures, and in one case, the "move" could be put into a
branch delay slot whereas the "mflo" couldn't (because of potential
hazards with the branch target).

I also tested the change on a proprietary benchmark suite and saw no
change in performance.  (FWIW, this was using a *-elf configuration and
a "bare-metal" environment, so there's almost no variation between runs.)

Bootstrapped & regression tested on mips64{,el}-linux-gnu.  Also tested
on mips64vrel-elf.  Eric, does this look OK to you?

Richard


	* config/mips/mips.c (mips_legitimize_move): Generate special patterns
	for mflo and mfhi instructions.
	(mips_output_move): Remove mflo and mfhi handling.
	* config/mips/mips.md (UNSPEC_MFHILO): New unspec.
	(*mulsidi3_64bit): Update for new mfhi/mflo representation.
	Likewise various define_peephole2s.
	(*movdi_32bit, *movdi_64bit, *movsi_internal): Merge x<-J and x<-d
	alternatives.
	(*movdi_64bit, *movdi_64bit_mips16, *mov[shq]i_internal)
	(*mov[shq]i_mips16): Remove mflo and mfhi alternatives.
	(mfhilo_di, mfhilo_si): New patterns.

gcc/testsuite/
	* gcc.dg/torture/mips-hilo-1.c: New test.

Index: config/mips/mips.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/mips/mips.c,v
retrieving revision 1.402
diff -u -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.402 mips.c
--- config/mips/mips.c	17 Apr 2004 07:02:29 -0000	1.402
+++ config/mips/mips.c	17 Apr 2004 08:37:45 -0000
@@ -2005,6 +2005,23 @@ mips_legitimize_move (enum machine_mode 
       return true;
     }
 
+  /* Check for individual, fully-reloaded mflo and mfhi instructions.  */
+  if (GET_MODE_SIZE (mode) <= UNITS_PER_WORD
+      && REG_P (src) && MD_REG_P (REGNO (src))
+      && REG_P (dest) && GP_REG_P (REGNO (dest)))
+    {
+      int other_regno = REGNO (src) == HI_REGNUM ? LO_REGNUM : HI_REGNUM;
+      if (GET_MODE_SIZE (mode) <= 4)
+	emit_insn (gen_mfhilo_si (gen_rtx_REG (SImode, REGNO (dest)),
+				  gen_rtx_REG (SImode, REGNO (src)),
+				  gen_rtx_REG (SImode, other_regno)));
+      else
+	emit_insn (gen_mfhilo_di (gen_rtx_REG (DImode, REGNO (dest)),
+				  gen_rtx_REG (DImode, REGNO (src)),
+				  gen_rtx_REG (DImode, other_regno)));
+      return true;
+    }
+
   /* We need to deal with constants that would be legitimate
      immediate_operands but not legitimate move_operands.  */
   if (CONSTANT_P (src) && !move_operand (src, mode))
@@ -2664,9 +2681,6 @@ mips_output_move (rtx dest, rtx src)
     {
       if (src_code == REG)
 	{
-	  if (MD_REG_P (REGNO (src)))
-	    return "mf%1\t%0";
-
 	  if (ST_REG_P (REGNO (src)) && ISA_HAS_8CC)
 	    return "lui\t%0,0x3f80\n\tmovf\t%0,%.,%1";
 
Index: config/mips/mips.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/mips/mips.md,v
retrieving revision 1.231
diff -u -p -F^\([(a-zA-Z0-9_]\|#define\) -r1.231 mips.md
--- config/mips/mips.md	17 Apr 2004 07:02:29 -0000	1.231
+++ config/mips/mips.md	17 Apr 2004 08:37:46 -0000
@@ -57,6 +57,7 @@ (define_constants
    (UNSPEC_LOAD_CALL		27)
    (UNSPEC_LOAD_GOT		28)
    (UNSPEC_GP			29)
+   (UNSPEC_MFHILO		30)
 
    (UNSPEC_ADDRESS_FIRST	100)
 
@@ -1550,11 +1551,8 @@ (define_peephole2
         (clobber (match_operand:SI 3 "register_operand" ""))
         (clobber (scratch:SI))])
    (set (match_operand:SI 4 "register_operand" "")
-        (match_dup 0))]
-  "GENERATE_MULT3_SI
-   && true_regnum (operands[0]) == LO_REGNUM
-   && GP_REG_P (true_regnum (operands[4]))
-   && peep2_reg_dead_p (2, operands[0])"
+	(unspec [(match_dup 0) (match_dup 3)] UNSPEC_MFHILO))]
+  "GENERATE_MULT3_SI && peep2_reg_dead_p (2, operands[0])"
   [(parallel
        [(set (match_dup 4)
 	     (mult:SI (match_dup 1)
@@ -1743,9 +1741,8 @@ (define_peephole2
 	(clobber (match_operand:SI 2 "register_operand" ""))
 	(clobber (scratch:SI))])
    (set (match_operand:SI 3 "register_operand" "")
-	(match_dup 0))]
-  "true_regnum (operands[0]) == LO_REGNUM
-   && GP_REG_P (true_regnum (operands[3]))"
+	(unspec:SI [(match_dup 0) (match_dup 2)] UNSPEC_MFHILO))]
+  ""
   [(parallel [(set (match_dup 0)
 		   (match_dup 1))
 	      (set (match_dup 3)
@@ -1816,11 +1813,8 @@ (define_peephole2
 	(clobber (scratch:SI))])
    (match_dup 0)
    (set (match_operand:SI 5 "register_operand" "")
-	(match_dup 1))]
-  "GENERATE_MULT3_SI
-   && true_regnum (operands[1]) == LO_REGNUM
-   && peep2_reg_dead_p (3, operands[1])
-   && GP_REG_P (true_regnum (operands[5]))"
+	(unspec:SI [(match_dup 1) (match_dup 4)] UNSPEC_MFHILO))]
+  "GENERATE_MULT3_SI && peep2_reg_dead_p (3, operands[1])"
   [(parallel [(set (match_dup 0)
 		   (match_dup 6))
 	      (clobber (match_dup 4))
@@ -2024,8 +2018,8 @@ (define_insn_and_split "*mulsidi3_64bit"
 		(const_int 32)))])
 
    ;; OP7 <- LO, OP0 <- HI
-   (set (match_dup 7) (match_dup 5))
-   (set (match_dup 0) (match_dup 6))
+   (set (match_dup 7) (unspec:DI [(match_dup 5) (match_dup 6)] UNSPEC_MFHILO))
+   (set (match_dup 0) (unspec:DI [(match_dup 6) (match_dup 5)] UNSPEC_MFHILO))
 
    ;; Zero-extend OP7.
    (set (match_dup 7)
@@ -4562,15 +4556,15 @@ (define_insn ""
    (set_attr "mode"	"DI")])
 
 (define_insn "*movdi_32bit"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=d,d,d,m,*x,*d,*x,*B*C*D,*B*C*D,*d,*m")
-	(match_operand:DI 1 "move_operand" "d,i,m,d,J,*x,*d,*d,*m,*B*C*D,*B*C*D"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=d,d,d,m,*x,*d,*B*C*D,*B*C*D,*d,*m")
+	(match_operand:DI 1 "move_operand" "d,i,m,d,*J*d,*x,*d,*m,*B*C*D,*B*C*D"))]
   "!TARGET_64BIT && !TARGET_MIPS16
    && (register_operand (operands[0], DImode)
        || reg_or_0_operand (operands[1], DImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "type"	"arith,arith,load,store,mthilo,mfhilo,mthilo,xfer,load,xfer,store")
+  [(set_attr "type"	"arith,arith,load,store,mthilo,mfhilo,xfer,load,xfer,store")
    (set_attr "mode"	"DI")
-   (set_attr "length"   "8,16,*,*,8,8,8,8,*,8,*")])
+   (set_attr "length"   "8,16,*,*,8,8,8,*,8,*")])
 
 (define_insn "*movdi_32bit_mips16"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=d,y,d,d,d,d,m,*d")
@@ -4584,24 +4578,24 @@ (define_insn "*movdi_32bit_mips16"
    (set_attr "length"	"8,8,8,8,12,*,*,8")])
 
 (define_insn "*movdi_64bit"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=d,d,e,d,m,*f,*f,*f,*d,*m,*x,*d,*x,*B*C*D,*B*C*D,*d,*m")
-	(match_operand:DI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*m,*f,*f,*J,*x,*d,*d,*m,*B*C*D,*B*C*D"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=d,d,e,d,m,*f,*f,*f,*d,*m,*x,*B*C*D,*B*C*D,*d,*m")
+	(match_operand:DI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*m,*f,*f,*J*d,*d,*m,*B*C*D,*B*C*D"))]
   "TARGET_64BIT && !TARGET_MIPS16
    && (register_operand (operands[0], DImode)
        || reg_or_0_operand (operands[1], DImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "type"	"arith,const,const,load,store,fmove,xfer,fpload,xfer,fpstore,mthilo,mfhilo,mthilo,xfer,load,xfer,store")
+  [(set_attr "type"	"arith,const,const,load,store,fmove,xfer,fpload,xfer,fpstore,mthilo,xfer,load,xfer,store")
    (set_attr "mode"	"DI")
-   (set_attr "length"	"4,*,*,*,*,4,4,*,4,*,4,4,4,8,*,8,*")])
+   (set_attr "length"	"4,*,*,*,*,4,4,*,4,*,4,8,*,8,*")])
 
 (define_insn "*movdi_64bit_mips16"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=d,y,d,d,d,d,d,m,*d")
-	(match_operand:DI 1 "move_operand" "d,d,y,K,N,U,m,d,*x"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=d,y,d,d,d,d,d,m")
+	(match_operand:DI 1 "move_operand" "d,d,y,K,N,U,m,d"))]
   "TARGET_64BIT && TARGET_MIPS16
    && (register_operand (operands[0], DImode)
        || register_operand (operands[1], DImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "type"	"arith,arith,arith,arith,arith,const,load,store,mfhilo")
+  [(set_attr "type"	"arith,arith,arith,arith,arith,const,load,store")
    (set_attr "mode"	"DI")
    (set_attr_alternative "length"
 		[(const_int 4)
@@ -4615,8 +4609,7 @@ (define_insn "*movdi_64bit_mips16"
 			       (const_int 12))
 		 (const_string "*")
 		 (const_string "*")
-		 (const_string "*")
-		 (const_int 4)])])
+		 (const_string "*")])])
 
 
 ;; On the mips16, we can split ld $r,N($r) into an add and a load,
@@ -4708,24 +4701,24 @@ (define_insn ""
 ;; in FP registers (off by default, use -mdebugh to enable).
 
 (define_insn "*movsi_internal"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,d,e,d,m,*f,*f,*f,*d,*m,*d,*z,*x,*d,*x,*B*C*D,*B*C*D,*d,*m")
-	(match_operand:SI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*m,*f,*f,*z,*d,J,*x,*d,*d,*m,*B*C*D,*B*C*D"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,d,e,d,m,*f,*f,*f,*d,*m,*d,*z,*x,*B*C*D,*B*C*D,*d,*m")
+	(match_operand:SI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*m,*f,*f,*z,*d,*J*d,*d,*m,*B*C*D,*B*C*D"))]
   "!TARGET_MIPS16
    && (register_operand (operands[0], SImode)
        || reg_or_0_operand (operands[1], SImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "type"	"arith,const,const,load,store,fmove,xfer,fpload,xfer,fpstore,xfer,xfer,mthilo,mfhilo,mthilo,xfer,load,xfer,store")
+  [(set_attr "type"	"arith,const,const,load,store,fmove,xfer,fpload,xfer,fpstore,xfer,xfer,mthilo,xfer,load,xfer,store")
    (set_attr "mode"	"SI")
-   (set_attr "length"	"4,*,*,*,*,4,4,*,4,*,4,4,4,4,4,4,*,4,*")])
+   (set_attr "length"	"4,*,*,*,*,4,4,*,4,*,4,4,4,4,*,4,*")])
 
 (define_insn "*movsi_mips16"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,y,d,d,d,d,d,m,*d")
-	(match_operand:SI 1 "move_operand" "d,d,y,K,N,U,m,d,*x"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,y,d,d,d,d,d,m")
+	(match_operand:SI 1 "move_operand" "d,d,y,K,N,U,m,d"))]
   "TARGET_MIPS16
    && (register_operand (operands[0], SImode)
        || register_operand (operands[1], SImode))"
   { return mips_output_move (operands[0], operands[1]); }
-  [(set_attr "type"	"arith,arith,arith,arith,arith,const,load,store,mfhilo")
+  [(set_attr "type"	"arith,arith,arith,arith,arith,const,load,store")
    (set_attr "mode"	"SI")
    (set_attr_alternative "length"
 		[(const_int 4)
@@ -4739,8 +4732,7 @@ (define_insn "*movsi_mips16"
 			       (const_int 12))
 		 (const_string "*")
 		 (const_string "*")
-		 (const_string "*")
-		 (const_int 4)])])
+		 (const_string "*")])])
 
 ;; On the mips16, we can split lw $r,N($r) into an add and a load,
 ;; when the original load is a 4 byte instruction but the add and the
@@ -4980,8 +4972,8 @@ (define_expand "movhi"
 })
 
 (define_insn "*movhi_internal"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,d,d,m,*d,*f,*f,*x,*d")
-	(match_operand:HI 1 "move_operand"         "d,I,m,dJ,*f,*d,*f,*d,*x"))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,d,d,m,*d,*f,*f,*x")
+	(match_operand:HI 1 "move_operand"         "d,I,m,dJ,*f,*d,*f,*d"))]
   "!TARGET_MIPS16
    && (register_operand (operands[0], HImode)
        || reg_or_0_operand (operands[1], HImode))"
@@ -4993,15 +4985,14 @@ (define_insn "*movhi_internal"
     mfc1\t%0,%1
     mtc1\t%1,%0
     mov.s\t%0,%1
-    mt%0\t%1
-    mf%1\t%0"
-  [(set_attr "type"	"arith,arith,load,store,xfer,xfer,fmove,mthilo,mfhilo")
+    mt%0\t%1"
+  [(set_attr "type"	"arith,arith,load,store,xfer,xfer,fmove,mthilo")
    (set_attr "mode"	"HI")
-   (set_attr "length"	"4,4,*,*,4,4,4,4,4")])
+   (set_attr "length"	"4,4,*,*,4,4,4,4")])
 
 (define_insn "*movhi_mips16"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,y,d,d,d,d,m,*d")
-	(match_operand:HI 1 "move_operand"         "d,d,y,K,N,m,d,*x"))]
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,y,d,d,d,d,m")
+	(match_operand:HI 1 "move_operand"         "d,d,y,K,N,m,d"))]
   "TARGET_MIPS16
    && (register_operand (operands[0], HImode)
        || register_operand (operands[1], HImode))"
@@ -5012,9 +5003,8 @@ (define_insn "*movhi_mips16"
     li\t%0,%1
     li\t%0,%n1\;neg\t%0
     lhu\t%0,%1
-    sh\t%1,%0
-    mf%1\t%0"
-  [(set_attr "type"	"arith,arith,arith,arith,arith,load,store,mfhilo")
+    sh\t%1,%0"
+  [(set_attr "type"	"arith,arith,arith,arith,arith,load,store")
    (set_attr "mode"	"HI")
    (set_attr_alternative "length"
 		[(const_int 4)
@@ -5027,8 +5017,7 @@ (define_insn "*movhi_mips16"
 			       (const_int 8)
 			       (const_int 12))
 		 (const_string "*")
-		 (const_string "*")
-		 (const_int 4)])])
+		 (const_string "*")])])
 
 
 ;; On the mips16, we can split lh $r,N($r) into an add and a load,
@@ -5090,8 +5079,8 @@ (define_expand "movqi"
 })
 
 (define_insn "*movqi_internal"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,d,d,m,*d,*f,*f,*x,*d")
-	(match_operand:QI 1 "move_operand"         "d,I,m,dJ,*f,*d,*f,*d,*x"))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,d,d,m,*d,*f,*f,*x")
+	(match_operand:QI 1 "move_operand"         "d,I,m,dJ,*f,*d,*f,*d"))]
   "!TARGET_MIPS16
    && (register_operand (operands[0], QImode)
        || reg_or_0_operand (operands[1], QImode))"
@@ -5103,15 +5092,14 @@ (define_insn "*movqi_internal"
     mfc1\t%0,%1
     mtc1\t%1,%0
     mov.s\t%0,%1
-    mt%0\t%1
-    mf%1\t%0"
-  [(set_attr "type"	"arith,arith,load,store,xfer,xfer,fmove,mthilo,mfhilo")
+    mt%0\t%1"
+  [(set_attr "type"	"arith,arith,load,store,xfer,xfer,fmove,mthilo")
    (set_attr "mode"	"QI")
-   (set_attr "length"	"4,4,*,*,4,4,4,4,4")])
+   (set_attr "length"	"4,4,*,*,4,4,4,4")])
 
 (define_insn "*movqi_mips16"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,y,d,d,d,d,m,*d")
-	(match_operand:QI 1 "move_operand"         "d,d,y,K,N,m,d,*x"))]
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,y,d,d,d,d,m")
+	(match_operand:QI 1 "move_operand"         "d,d,y,K,N,m,d"))]
   "TARGET_MIPS16
    && (register_operand (operands[0], QImode)
        || register_operand (operands[1], QImode))"
@@ -5122,11 +5110,10 @@ (define_insn "*movqi_mips16"
     li\t%0,%1
     li\t%0,%n1\;neg\t%0
     lbu\t%0,%1
-    sb\t%1,%0
-    mf%1\t%0"
-  [(set_attr "type"	"arith,arith,arith,arith,arith,load,store,mfhilo")
+    sb\t%1,%0"
+  [(set_attr "type"	"arith,arith,arith,arith,arith,load,store")
    (set_attr "mode"	"QI")
-   (set_attr "length"	"4,4,4,4,8,*,*,4")])
+   (set_attr "length"	"4,4,4,4,8,*,*")])
 
 ;; On the mips16, we can split lb $r,N($r) into an add and a load,
 ;; when the original load is a 4 byte instruction but the add and the
@@ -5279,6 +5266,31 @@ (define_split
   mips_split_64bit_move (operands[0], operands[1]);
   DONE;
 })
+
+;; The HI and LO registers are not truly independent.  If we move an mthi
+;; instruction before an mflo instruction, it will make the result of the
+;; mflo unpredictable.  The same goes for mtlo and mfhi.
+;;
+;; We cope with this by making the mflo and mfhi patterns use both HI and LO.
+;; Operand 1 is the register we want, operand 2 is the other one.
+
+(define_insn "mfhilo_di"
+  [(set (match_operand:DI 0 "register_operand" "=d,d")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "h,l")
+		    (match_operand:DI 2 "register_operand" "l,h")]
+		   UNSPEC_MFHILO))]
+  "TARGET_64BIT"
+  "mf%1\t%0"
+  [(set_attr "type" "mfhilo")])
+
+(define_insn "mfhilo_si"
+  [(set (match_operand:SI 0 "register_operand" "=d,d")
+	(unspec:SI [(match_operand:SI 1 "register_operand" "h,l")
+		    (match_operand:SI 2 "register_operand" "l,h")]
+		   UNSPEC_MFHILO))]
+  ""
+  "mf%1\t%0"
+  [(set_attr "type" "mfhilo")])
 
 ;; Patterns for loading or storing part of a paired floating point
 ;; register.  We need them because odd-numbered floating-point registers
--- /dev/null	Tue Jun 17 23:06:41 2003
+++ testsuite/gcc.dg/torture/mips-hilo-1.c	Sat Apr 17 09:38:04 2004
@@ -0,0 +1,75 @@
+/* f1 checks that an mtlo is not moved before an mfhi.  f2 does the same
+   for an mthi and an mflo.  */
+/* { dg-do run { target mips*-*-* } } */
+/* { dg-options "-mtune=rm7000" } */
+
+#if !defined(__mips16)
+
+#define DECLARE(TYPE)							\
+  TYPE __attribute__ ((noinline))					\
+  f1##TYPE (TYPE x1, TYPE x2, TYPE x3)					\
+  {									\
+    TYPE t1, t2;							\
+									\
+    asm ("mult\t%1,%2" : "=h" (t1) : "d" (x1), "d" (x2) : "lo");	\
+    asm ("mflo\t%0" : "=r" (t2) : "l" (x3) : "hi");			\
+    return t1 + t2;							\
+  }									\
+									\
+  TYPE __attribute__ ((noinline))					\
+  f2##TYPE (TYPE x1, TYPE x2, TYPE x3)					\
+  {									\
+    TYPE t1, t2;							\
+									\
+    asm ("mult\t%1,%2" : "=l" (t1) : "d" (x1), "d" (x2) : "hi");	\
+    asm ("mfhi\t%0" : "=r" (t2) : "h" (x3) : "lo");			\
+    return t1 + t2;							\
+  }
+
+#define TEST(TYPE)							\
+  if (f1##TYPE (1, 2, 10) != 10)					\
+    abort ();								\
+  if (f2##TYPE (1, 2, 40) != 42)					\
+    abort ()
+
+typedef char c;
+typedef signed char sc;
+typedef unsigned char uc;
+typedef short s;
+typedef unsigned short us;
+typedef int i;
+typedef unsigned int ui;
+typedef long long ll;
+typedef unsigned long long ull;
+
+DECLARE (c)
+DECLARE (sc)
+DECLARE (uc)
+DECLARE (s)
+DECLARE (us)
+DECLARE (i)
+DECLARE (ui)
+#if defined (__mips64)
+DECLARE (ll)
+DECLARE (ull)
+#endif
+
+int
+main ()
+{
+  TEST (c);
+  TEST (sc);
+  TEST (uc);
+  TEST (s);
+  TEST (us);
+  TEST (i);
+  TEST (ui);
+#if defined (__mips64)
+  TEST (ll);
+  TEST (ull);
+#endif
+  exit (0);
+}
+#else
+int main () { return 0; }
+#endif


