This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: RFC: MIPS: Workaround the "daddi" and "daddiu" errata

From: "Maciej W. Rozycki" <macro at ds2 dot pg dot gda dot pl>
To: Richard Sandiford <rsandifo at redhat dot com>
Cc: gcc-patches at gcc dot gnu dot org
Date: Mon, 31 May 2004 16:04:02 +0200 (CEST)
Subject: Re: RFC: MIPS: Workaround the "daddi" and "daddiu" errata
Organization: Technical University of Gdansk
References: <Pine.LNX.4.55.0403031531360.3561@jurand.ds.pg.gda.pl><877jxyz19w.fsf@redhat.com> <Pine.LNX.4.55.0403180100020.14525@jurand.ds.pg.gda.pl><871xnqvckt.fsf@redhat.com>
On Thu, 18 Mar 2004, Richard Sandiford wrote:

> >  I'm not so sure.  I wanted to make absolutely sure the output is
> > verifiable for correctness and this is achieved.  You can run e.g. `objdump
> > -S <file> | grep 'daddi[u]*'' on a binary and see a problem immediately.
> 
> I don't agree that that's a good enough reason.  You can't generally
> expect to verify compiler changes with a simple check like this.  For
> one thing, even if the output is free of daddius, how do you know the
> replacements work correctly?  You still have to test the binary at the
> end of the day.

 Of course it does not check the functional correctness of the code, but it
verifies there are no instructions that are known to be unreliable.  One
can normally determine if generated code is bad by inspecting it and
comparing it to the respective source, but with the errata the code may be
correct according to the ISA and still produce bad results.  One has to be
aware of the errata, with all their details, and of the data used to
decide if code is bad or not.  By ruling out the instructions we get rid
of this problem -- consider it a variation of the MIPS III ISA that lacks
the "daddi"/"daddiu" instructions.

> > It can also be handled quite easily by the kernel or the dynamic linker
> > when mmap()ping executable pages -- simply by scanning for the bad
> > opcodes.  I'd say your proposal is more fragile -- you have to study
> > generated code thoroughly for every "daddiu" instruction to check if the
> > erratum can be triggered.  And run-time validation is essentially
> > impossible (we don't want to do static analysis in the kernel, do we?).
> 
> Are you talking about the kernel refusing to run suspect binaries?
> Or at least warning about them somehow?  If so, ELF offers far better
> ways of detecting a "safe" binary than scanning all executable pages.

 Yes, I am thinking of a kind of debugging facility that would do such
validation.

 How can any ELF feature tell you there are no bad instructions in the code
of a binary?  Any flags or special sections can at most show that an effort
was made to deal with the problem.  This is useful, but not completely safe.
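
 To make the scanning idea concrete, here is a minimal C sketch of such a
check.  It is not part of the patch and not kernel code: the helper name
count_daddi and the assumption that the instruction words have already been
converted to host byte order are mine.  The only facts relied upon are the
MIPS III primary opcodes, 0x18 for "daddi" and 0x19 for "daddiu", held in
bits 31..26 of the instruction word.

```c
#include <stddef.h>
#include <stdint.h>

/* Primary opcode field (bits 31..26) of the two affected instructions.  */
#define OP_DADDI  0x18u
#define OP_DADDIU 0x19u

/* Hypothetical helper: count the words whose primary opcode is "daddi"
   or "daddiu".  WORDS holds N instruction words in host byte order.  */
static size_t
count_daddi (const uint32_t *words, size_t n)
{
  size_t hits = 0;

  for (size_t i = 0; i < n; i++)
    {
      uint32_t op = words[i] >> 26;

      if (op == OP_DADDI || op == OP_DADDIU)
	hits++;
    }
  return hits;
}
```

 A scan like this treats every word as an instruction, so data embedded in
an executable page could produce false positives; for a debugging facility
that errs on the safe side this is probably acceptable.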

> >  Note that even if the address is invalid, code should work correctly --
> > the address may be further offsetted later.  Or code may expect a bad
> > address and deal with it somehow.  Or there may be a real bug in code --
> > it should really get an exception then, instead of silently using a 
> > different address.
> 
> But you seem to be missing my point.  Within GCC, I think all symbolic
> constants are associated with pointers.  I don't know of any well-defined
> way of creating a legitimate address by using an offset from an invalid
> pointer.  And besides, in the normal run of things, an invalid constant
> address could only be defined explicitly by the user, perhaps using a
> linker script.

 Well, something like:

((void (*)(void))0xffffffffbfc00000)();

used to work for me to call the reset vector from Linux. ;-)  There's no
need to resort to linker scripts to create arbitrary pointers -- the
expression above could be changed to define e.g. an array.

 Anyway, one would expect code to execute deterministically, and this
includes e.g. getting SIGSEGV with an invalid pointer, which may no longer
happen because the erratum crops the result to a sign-extended 32-bit value.
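
 As an illustration, the behaviour described by erratum #41 (quoted in the
patch below) can be modelled in C.  This is only a sketch based on the
wording of the erratum, not a verified description of the silicon; the
function names are hypothetical.

```c
#include <stdint.h>

/* Architecturally correct "daddiu": 64-bit addition of a sign-extended
   16-bit immediate, with no overflow trap.  */
static uint64_t
daddiu_isa (uint64_t rs, int16_t imm)
{
  return rs + (uint64_t) (int64_t) imm;
}

/* Model of erratum #41: if the addition overflows as a signed 64-bit
   operation, bits 0-31 of the result are correct, but bit 31 is
   replicated through bits 32-63.  */
static uint64_t
daddiu_erratum41 (uint64_t rs, int16_t imm)
{
  uint64_t simm = (uint64_t) (int64_t) imm;
  uint64_t sum = rs + simm;
  /* Signed overflow: both operands share a sign the sum does not have.  */
  int overflow = (int) ((~(rs ^ simm) & (rs ^ sum)) >> 63);
  uint64_t low;

  if (!overflow)
    return sum;

  low = sum & 0xffffffffu;
  return (low & 0x80000000u) ? (low | 0xffffffff00000000ull) : low;
}
```

 With rs = 0x7ffffffffffffff0 and imm = 0x7fff the correct sum is
0x8000000000007fef, while the model yields 0x0000000000007fef, i.e. a
completely different address that may well be mapped.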

> >> Your patch also includes changes to patterns like ffsdi.  This seems
> >> a bit over-the-top; there's no way counting from 0 to 64 can trigger
> >> overflow.

 This is of course overkill -- a careful inspection of the code leads to
the conclusion that "addiu" can be safely used here, avoiding the problem
altogether.

> >  It's just the consequence of getting rid of the instructions universally, 
> > which was my goal.
> 
> But this for me is the main example of why this goal seems wrong.
> We know full well that stack pointer allocations can safely be done
> with daddiu.  Why double the number of instructions needed just so
> that a simple grep will yield no matches?

 To assure the user that code will behave as specified under any
circumstances.  The subtlety of the errata makes one try to avoid hitting
them, as they may lead to data corruption that may long go unnoticed.
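
 For reference, the replacement the patch emits for "daddiu rd,rs,imm" is
"addiu $at,$0,imm" followed by "daddu rd,rs,$at".  The C model below is a
sketch (not part of the patch, with hypothetical function names) of why the
pair computes the same value: on a 64-bit MIPS "addiu" produces a 32-bit
result sign-extended to 64 bits, so adding the immediate to $0 simply
yields the sign-extended immediate.

```c
#include <stdint.h>

/* "addiu rt,$0,imm" on a 64-bit MIPS: a 32-bit addition whose result is
   sign-extended to 64 bits.  */
static uint64_t
addiu_from_zero (int16_t imm)
{
  uint32_t w = (uint32_t) (int32_t) imm;	/* 0 + imm in 32 bits.  */

  return (w & 0x80000000u) ? (w | 0xffffffff00000000ull) : w;
}

/* "daddu rd,rs,rt": 64-bit addition, never traps.  */
static uint64_t
daddu (uint64_t rs, uint64_t rt)
{
  return rs + rt;
}

/* The two-instruction sequence used in place of "daddiu rd,rs,imm".  */
static uint64_t
daddiu_workaround (uint64_t rs, int16_t imm)
{
  uint64_t at = addiu_from_zero (imm);	/* addiu $at,$0,imm */

  return daddu (rs, at);		/* daddu rd,rs,$at */
}
```

 The cost is one extra instruction and the use of $at, which is why the
templates in the patch wrap the sequence in a noat block.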

 Here's the next iteration of the patch.  After doing more research I
discovered that defining a new constraint helps avoid a lot of code
duplication.  While functionally equivalent, the new version leads to
significantly shorter code and improves maintainability.

 The patch is tested with a snapshot of Linux 2.4.24 and appears to work
fine -- I'm still debugging a problem with the fcntl syscall, but I think
it's unrelated, being more a generic gcc 3.5.x compatibility problem, like
a few I've already resolved.

 What do you think?

  Maciej

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

gcc-3.5.0-20040524-mips-nodaddi.patch
diff -up --recursive --new-file gcc-3.5.0-20040524.macro/gcc/config/mips/mips.c gcc-3.5.0-20040524/gcc/config/mips/mips.c
--- gcc-3.5.0-20040524.macro/gcc/config/mips/mips.c	2004-05-14 01:32:53.000000000 +0000
+++ gcc-3.5.0-20040524/gcc/config/mips/mips.c	2004-05-24 14:07:58.000000000 +0000
@@ -1153,7 +1153,7 @@ mips_symbol_insns (enum mips_symbol_type
 
 	 The final address is then $at + %lo(symbol).  With 32-bit
 	 symbols we just need a preparatory lui.  */
-      return (ABI_HAS_64BIT_SYMBOLS ? 6 : 2);
+      return (ABI_HAS_64BIT_SYMBOLS ? (!TARGET_NO_DADDI ? 6 : 8) : 2);
 
     case SYMBOL_SMALL_DATA:
       return 1;
@@ -5110,6 +5110,12 @@ override_options (void)
   if ((target_flags_explicit & MASK_FIX_R4400) == 0
       && mips_matching_cpu_name_p (mips_arch_info->name, "r4400"))
     target_flags |= MASK_FIX_R4400;
+
+  /* Default to working around daddi/daddiu errata when either R4000
+     or R4400 errata workarounds are enabled.  */
+  if ((target_flags_explicit & MASK_NO_DADDI) == 0
+      && (TARGET_FIX_R4000 || TARGET_FIX_R4400))
+    target_flags |= MASK_NO_DADDI;
 }
 
 /* Implement CONDITIONAL_REGISTER_USAGE.  */
@@ -5166,6 +5172,9 @@ mips_conditional_register_usage (void)
       for (regno = FP_REG_FIRST + 21; regno <= FP_REG_FIRST + 31; regno+=2)
 	call_really_used_regs[regno] = call_used_regs[regno] = 1;
     }
+  /* t8 is used for unconditional jumps when -mno-daddi is active.  */
+  if (Pmode == DImode && TARGET_NO_DADDI)
+    fixed_regs[24] = 1;
 }
 
 /* Allocate a chunk of memory for per-function machine-dependent data.  */
@@ -7624,7 +7633,10 @@ mips_secondary_reload_class (enum reg_cl
       if (GET_CODE (x) == MEM)
 	{
 	  /* In this case we can use lwc1, swc1, ldc1 or sdc1.  */
-	  return NO_REGS;
+	  if (Pmode == DImode && TARGET_NO_DADDI && mips_fetch_insns (x) != 1)
+	    return gr_regs;
+	  else
+	    return NO_REGS;
 	}
       else if (CONSTANT_P (x) && GET_MODE_CLASS (mode) == MODE_FLOAT)
 	{
@@ -7668,6 +7680,12 @@ mips_secondary_reload_class (enum reg_cl
 	}
     }
 
+  if (Pmode == DImode && TARGET_NO_DADDI
+      && GET_CODE (x) == MEM && mips_fetch_insns (x) != 1
+      && ((! in_p && class == GR_REGS)
+	  || class == COP0_REGS || class == COP2_REGS || class == COP3_REGS))
+    return gr_regs;
+
   return NO_REGS;
 }
 
@@ -9131,8 +9149,7 @@ mips_adjust_insn_length (rtx insn, int l
 }
 
 
-/* Return an asm sequence to start a noat block and load the address
-   of a label into $1.  */
+/* Return an asm sequence to load the address of a label into %1.  */
 
 const char *
 mips_output_load_label (void)
@@ -9141,22 +9158,26 @@ mips_output_load_label (void)
     switch (mips_abi)
       {
       case ABI_N32:
-	return "%[lw\t%@,%%got_page(%0)(%+)\n\taddiu\t%@,%@,%%got_ofst(%0)";
+	return "lw\t%1,%%got_page(%0)(%+)\n\taddiu\t%1,%1,%%got_ofst(%0)";
 
       case ABI_64:
-	return "%[ld\t%@,%%got_page(%0)(%+)\n\tdaddiu\t%@,%@,%%got_ofst(%0)";
+	if (!TARGET_NO_DADDI)
+	  return "ld\t%1,%%got_page(%0)(%+)\n\tdaddiu\t%1,%1,%%got_ofst(%0)";
+	else
+	  return "ld\t%1,%%got_page(%0)(%+)\n"
+		 "\t%[addiu\t%@,%.,%%got_ofst(%0)\n\tdaddu\t%1,%1,%@%]";
 
       default:
 	if (ISA_HAS_LOAD_DELAY)
-	  return "%[lw\t%@,%%got(%0)(%+)%#\n\taddiu\t%@,%@,%%lo(%0)";
-	return "%[lw\t%@,%%got(%0)(%+)\n\taddiu\t%@,%@,%%lo(%0)";
+	  return "lw\t%1,%%got(%0)(%+)%#\n\taddiu\t%1,%1,%%lo(%0)";
+	return "lw\t%1,%%got(%0)(%+)\n\taddiu\t%1,%1,%%lo(%0)";
       }
   else
     {
       if (Pmode == DImode)
-	return "%[dla\t%@,%0";
+	return "dla\t%1,%0\n\t";
       else
-	return "%[la\t%@,%0";
+	return "la\t%1,%0";
     }
 }
 
@@ -9344,8 +9365,21 @@ mips_output_conditional_branch (rtx insn
 	  output_asm_insn ("j\t%0", &orig_target);
 	else
 	  {
-	    output_asm_insn (mips_output_load_label (), &orig_target);
-	    output_asm_insn ("jr\t%@%]", 0);
+	    rtx la_operands[2];
+	    la_operands[0] = orig_target;
+	    if (Pmode != DImode || !TARGET_NO_DADDI)
+	      {
+		la_operands[1] = gen_rtx_REG (SImode, 1);
+		sprintf (buffer, "%s%s", "%[", mips_output_load_label ());
+		output_asm_insn (buffer, la_operands);
+		output_asm_insn ("jr\t$1%]", la_operands);
+	      }
+	    else
+	      {
+		la_operands[1] = gen_rtx_REG (SImode, 24);
+		output_asm_insn (mips_output_load_label (), la_operands);
+		output_asm_insn ("jr\t$24", la_operands);
+	      }
 	  }
 
         if (length != 16 && length != 28 && mips_branch_likely)
diff -up --recursive --new-file gcc-3.5.0-20040524.macro/gcc/config/mips/mips.h gcc-3.5.0-20040524/gcc/config/mips/mips.h
--- gcc-3.5.0-20040524.macro/gcc/config/mips/mips.h	2004-05-23 01:30:03.000000000 +0000
+++ gcc-3.5.0-20040524/gcc/config/mips/mips.h	2004-05-27 11:24:11.000000000 +0000
@@ -173,6 +173,7 @@ extern const struct mips_cpu_info *mips_
 #define MASK_FIX_VR4120	   0x04000000   /* Work around VR4120 errata.  */
 #define MASK_VR4130_ALIGN  0x08000000	/* Perform VR4130 alignment opts.  */
 #define MASK_FP_EXCEPTIONS 0x10000000   /* FP exceptions are enabled.  */
+#define MASK_NO_DADDI	   0x20000000	/* Don't use "daddi" and "daddiu".  */
 
 					/* Debug switches, not documented */
 #define MASK_DEBUG	0		/* unused */
@@ -255,6 +256,9 @@ extern const struct mips_cpu_info *mips_
 
 #define TARGET_FP_EXCEPTIONS	(target_flags & MASK_FP_EXCEPTIONS)
 
+					/* Don't use "daddi" and "daddiu".  */
+#define TARGET_NO_DADDI		(target_flags & MASK_NO_DADDI)
+
 /* True if we should use NewABI-style relocation operators for
    symbolic addresses.  This is never true for mips16 code,
    which has its own conventions.  */
@@ -646,6 +650,10 @@ extern const struct mips_cpu_info *mips_
      N_("Work around certain VR4120 errata")},				\
   {"no-fix-vr4120",	 -MASK_FIX_VR4120,				\
      N_("Don't work around certain VR4120 errata")},			\
+  {"daddi",		 -MASK_NO_DADDI,				\
+     N_("Use ""daddi"" and ""daddiu""")},				\
+  {"no-daddi",		  MASK_NO_DADDI,				\
+     N_("Don't use ""daddi"" and ""daddiu"" (for 4000 and early 4400 errata)")}, \
   {"check-zero-division",-MASK_NO_CHECK_ZERO_DIV,			\
      N_("Trap on integer divide by zero")},				\
   {"no-check-zero-division", MASK_NO_CHECK_ZERO_DIV,			\
@@ -2077,7 +2085,9 @@ extern enum reg_class mips_char_to_class
    `W' is for memory references that are based on a member of BASE_REG_CLASS.
 	 This is true for all non-mips16 references (although it can sometimes
 	 be indirect if !TARGET_EXPLICIT_RELOCS).  For mips16, it excludes
-	 stack and constant-pool references.  */
+	 stack and constant-pool references.
+   `Y' is for memory references that require at most a single scratch register
+	 to be loaded.  */
 
 #define EXTRA_CONSTRAINT(OP,CODE)					\
   (((CODE) == 'Q')	  ? const_arith_operand (OP, VOIDmode)		\
@@ -2096,10 +2106,14 @@ extern enum reg_class mips_char_to_class
 			     && (!TARGET_MIPS16				\
 				 || (!stack_operand (OP, VOIDmode)	\
 				     && !CONSTANT_P (XEXP (OP, 0)))))	\
+   : ((CODE) == 'Y')	  ? (GET_CODE (OP) == MEM			\
+			     && (!(Pmode == DImode && TARGET_NO_DADDI)	\
+				 || mips_fetch_insns (OP) == 1))	\
    : FALSE)
 
 /* Say which of the above are memory constraints.  */
-#define EXTRA_MEMORY_CONSTRAINT(C, STR) ((C) == 'R' || (C) == 'W')
+#define EXTRA_MEMORY_CONSTRAINT(C, STR)					\
+  ((C) == 'R' || (C) == 'W' || (C) == 'Y')
 
 #define PREFERRED_RELOAD_CLASS(X,CLASS)					\
   mips_preferred_reload_class (X, CLASS)
@@ -3318,13 +3332,33 @@ do {									\
 #define ASM_OUTPUT_REG_PUSH(STREAM,REGNO)				\
 do									\
   {									\
-    fprintf (STREAM, "\t%s\t%s,%s,8\n\t%s\t%s,0(%s)\n",			\
-	     TARGET_64BIT ? "dsubu" : "subu",				\
-	     reg_names[STACK_POINTER_REGNUM],				\
-	     reg_names[STACK_POINTER_REGNUM],				\
-	     TARGET_64BIT ? "sd" : "sw",				\
-	     reg_names[REGNO],						\
-	     reg_names[STACK_POINTER_REGNUM]);				\
+    if (!TARGET_64BIT || !TARGET_NO_DADDI)				\
+      {									\
+	fprintf (STREAM, "\t%s\t%s,%s,-8\n\t%s\t%s,0(%s)\n",		\
+		 TARGET_64BIT ? "daddiu" : "addiu",			\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 TARGET_64BIT ? "sd" : "sw",				\
+		 reg_names[REGNO],					\
+		 reg_names[STACK_POINTER_REGNUM]);			\
+      }									\
+    else								\
+      {									\
+	if (! set_noat)							\
+	  fprintf (STREAM, "\t.set\tnoat\n");				\
+									\
+	fprintf (STREAM, "\taddiu\t%s,%s,-8\n\tdaddu\t%s,%s\n"		\
+			 "\tsd\t%s,0(%s)\n",				\
+		 reg_names[AT_REGNUM],					\
+		 reg_names[GP_REG_FIRST],				\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 reg_names[AT_REGNUM],					\
+		 reg_names[REGNO],					\
+		 reg_names[STACK_POINTER_REGNUM]);			\
+									\
+	if (! set_noat)							\
+	  fprintf (STREAM, "\t.set\tat\n");				\
+      }									\
   }									\
 while (0)
 
@@ -3334,13 +3368,33 @@ do									\
     if (! set_noreorder)						\
       fprintf (STREAM, "\t.set\tnoreorder\n");				\
 									\
-    fprintf (STREAM, "\t%s\t%s,0(%s)\n\t%s\t%s,%s,8\n",			\
-	     TARGET_64BIT ? "ld" : "lw",				\
-	     reg_names[REGNO],						\
-	     reg_names[STACK_POINTER_REGNUM],				\
-	     TARGET_64BIT ? "daddu" : "addu",				\
-	     reg_names[STACK_POINTER_REGNUM],				\
-	     reg_names[STACK_POINTER_REGNUM]);				\
+    if (!TARGET_64BIT || !TARGET_NO_DADDI)				\
+      {									\
+	fprintf (STREAM, "\t%s\t%s,0(%s)\n\t%s\t%s,%s,8\n",		\
+		 TARGET_64BIT ? "ld" : "lw",				\
+		 reg_names[REGNO],					\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 TARGET_64BIT ? "daddiu" : "addiu",			\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 reg_names[STACK_POINTER_REGNUM]);			\
+      }									\
+    else								\
+      {									\
+	if (! set_noat)							\
+	  fprintf (STREAM, "\t.set\tnoat\n");				\
+									\
+	fprintf (STREAM, "\tld\t%s,0(%s)\n"				\
+			 "\taddiu\t%s,%s,8\n\tdaddu\t%s,%s\n",		\
+		 reg_names[REGNO],					\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 reg_names[AT_REGNUM],					\
+		 reg_names[GP_REG_FIRST],				\
+		 reg_names[STACK_POINTER_REGNUM],			\
+		 reg_names[AT_REGNUM]);					\
+									\
+	if (! set_noat)							\
+	  fprintf (STREAM, "\t.set\tat\n");				\
+      }									\
 									\
     if (! set_noreorder)						\
       fprintf (STREAM, "\t.set\treorder\n");				\
diff -up --recursive --new-file gcc-3.5.0-20040524.macro/gcc/config/mips/mips.md gcc-3.5.0-20040524/gcc/config/mips/mips.md
--- gcc-3.5.0-20040524.macro/gcc/config/mips/mips.md	2004-05-18 01:31:11.000000000 +0000
+++ gcc-3.5.0-20040524/gcc/config/mips/mips.md	2004-05-27 11:30:04.000000000 +0000
@@ -889,6 +889,42 @@
     }
 })
 
+;; The original R4000 and the initial revision of the R4400 have a cpu
+;; bug.  Under an overflow condition double-word immediate addition
+;; may give an incorrect result.  We handle the problem by using a
+;; sequence of a single-word immediate addition to load a constant to
+;; a temporary register and then a double-word register addition.  We
+;; also provide an aid for the assembler to deal with this problem by
+;; avoiding macros that require a "daddiu" instruction in their
+;; expansion.
+;;
+;; From "MIPS R4000PC/SC Errata, Processor Revision 2.2 and 3.0"
+;; (also valid for MIPS R4000MC processors):
+;;
+;; "23. R4000PC, R4000SC: The 64-bit instruction, daddi, fails to take
+;;	an overflow exception.
+;;	Workaround: There is no workaround for this problem."
+;;
+;; and:
+;;
+;; "41. R4000PC, R4000SC: Under the following condition, the DADDIU
+;;	instruction can produce an incorrect result.  If this
+;;	instruction generates a result value that would cause an
+;;	overflow condition to occur (even though this instruction does
+;;	not take an overflow exception) then the result value will be
+;;	correct in bits 0-31 but bit 31 will be replicated through
+;;	bits 32-63 (so it looks like a 32bit signextended value).  The
+;;	overflow condition is defined when the carries out of bits 62
+;;	and 63 differ (two's compliment overflow).
+;;	Workaround: There is no workaround for this problem."
+;;
+;; Erratum #41 is also present in "MIPS R4400PC/SC Errata, Processor
+;; Revision 1.0" (also valid for MIPS R4400MC processors) as erratum
+;; #7.
+;;
+;; These processors have PRId values of 0x00004220 and 0x00004300 for
+;; the R4000 and 0x00004400 for the R4400.
+
 (define_expand "adddi3"
   [(set (match_operand:DI 0 "register_operand")
 	(plus:DI (match_operand:DI 1 "register_operand")
@@ -917,17 +953,29 @@
     }
 })
 
-(define_insn "adddi3_internal"
+(define_insn "adddi3_internal_daddi"
   [(set (match_operand:DI 0 "register_operand" "=d,d")
 	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ,dJ")
 		 (match_operand:DI 2 "arith_operand" "d,Q")))]
-  "TARGET_64BIT && !TARGET_MIPS16"
+  "TARGET_64BIT && !TARGET_MIPS16 && !TARGET_NO_DADDI"
   "@
     daddu\t%0,%z1,%2
     daddiu\t%0,%z1,%2"
   [(set_attr "type"	"arith")
    (set_attr "mode"	"DI")])
 
+(define_insn "adddi3_internal_3_no_daddi"
+  [(set (match_operand:DI 0 "register_operand" "=d,d")
+	(plus:DI (match_operand:DI 1 "reg_or_0_operand" "dJ,dJ")
+		 (match_operand:DI 2 "arith_operand" "d,Q")))]
+  "TARGET_64BIT && !TARGET_MIPS16 && TARGET_NO_DADDI"
+  "@
+    daddu\t%0,%z1,%2
+    %[addiu\t%@,%.,%2\;daddu\t%0,%z1,%@%]"
+  [(set_attr "type"	"arith")
+   (set_attr "mode"	"DI")
+   (set_attr "length"	"4,8")])
+
 ;; For the mips16, we need to recognize stack pointer additions
 ;; explicitly, since we don't have a constraint for $sp.  These insns
 ;; will be generated by the save_restore_insns functions.
@@ -2692,7 +2740,7 @@ srl\t%3,%3,1\n\
 move\t%0,%.\;\
 beq\t%1,%.,2f\n\
 %~1:\tand\t%2,%1,0x0001\;\
-daddu\t%0,%0,1\;\
+addiu\t%0,%0,1\;\
 beq\t%2,%.,1b\;\
 dsrl\t%1,%1,1\n\
 %~2:%)";
@@ -2702,7 +2750,7 @@ move\t%0,%.\;\
 move\t%3,%1\;\
 beq\t%3,%.,2f\n\
 %~1:\tand\t%2,%3,0x0001\;\
-daddu\t%0,%0,1\;\
+addiu\t%0,%0,1\;\
 beq\t%2,%.,1b\;\
 dsrl\t%3,%3,1\n\
 %~2:%)";
@@ -3075,7 +3123,7 @@ dsrl\t%3,%3,1\n\
 ;; Step A needs a real instruction but step B does not.
 
 (define_insn "truncdisi2"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,m")
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,Y")
         (truncate:SI (match_operand:DI 1 "register_operand" "d,d")))]
   "TARGET_64BIT"
   "@
@@ -3086,7 +3134,7 @@ dsrl\t%3,%3,1\n\
    (set_attr "extended_mips16" "yes,*")])
 
 (define_insn "truncdihi2"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,m")
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,Y")
         (truncate:HI (match_operand:DI 1 "register_operand" "d,d")))]
   "TARGET_64BIT"
   "@
@@ -3097,7 +3145,7 @@ dsrl\t%3,%3,1\n\
    (set_attr "extended_mips16" "yes,*")])
 
 (define_insn "truncdiqi2"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,m")
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,Y")
         (truncate:QI (match_operand:DI 1 "register_operand" "d,d")))]
   "TARGET_64BIT"
   "@
@@ -3991,8 +4039,8 @@ dsrl\t%3,%3,1\n\
 
 (define_insn "mov_lwl"
   [(set (match_operand:SI 0 "register_operand" "=d")
-	(unspec:SI [(match_operand:BLK 1 "memory_operand" "m")
-		    (match_operand:QI 2 "memory_operand" "m")]
+	(unspec:SI [(match_operand:BLK 1 "memory_operand" "Y")
+		    (match_operand:QI 2 "memory_operand" "Y")]
 		   UNSPEC_LWL))]
   "!TARGET_MIPS16"
   "lwl\t%0,%2"
@@ -4002,8 +4050,8 @@ dsrl\t%3,%3,1\n\
 
 (define_insn "mov_lwr"
   [(set (match_operand:SI 0 "register_operand" "=d")
-	(unspec:SI [(match_operand:BLK 1 "memory_operand" "m")
-		    (match_operand:QI 2 "memory_operand" "m")
+	(unspec:SI [(match_operand:BLK 1 "memory_operand" "Y")
+		    (match_operand:QI 2 "memory_operand" "Y")
 		    (match_operand:SI 3 "register_operand" "0")]
 		   UNSPEC_LWR))]
   "!TARGET_MIPS16"
@@ -4013,9 +4061,9 @@ dsrl\t%3,%3,1\n\
 
 
 (define_insn "mov_swl"
-  [(set (match_operand:BLK 0 "memory_operand" "=m")
+  [(set (match_operand:BLK 0 "memory_operand" "=Y")
 	(unspec:BLK [(match_operand:SI 1 "reg_or_0_operand" "dJ")
-		     (match_operand:QI 2 "memory_operand" "m")]
+		     (match_operand:QI 2 "memory_operand" "Y")]
 		    UNSPEC_SWL))]
   "!TARGET_MIPS16"
   "swl\t%z1,%2"
@@ -4023,9 +4071,9 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode" "SI")])
 
 (define_insn "mov_swr"
-  [(set (match_operand:BLK 0 "memory_operand" "+m")
+  [(set (match_operand:BLK 0 "memory_operand" "+Y")
 	(unspec:BLK [(match_operand:SI 1 "reg_or_0_operand" "dJ")
-		     (match_operand:QI 2 "memory_operand" "m")
+		     (match_operand:QI 2 "memory_operand" "Y")
 		     (match_dup 0)]
 		    UNSPEC_SWR))]
   "!TARGET_MIPS16"
@@ -4036,8 +4084,8 @@ dsrl\t%3,%3,1\n\
 
 (define_insn "mov_ldl"
   [(set (match_operand:DI 0 "register_operand" "=d")
-	(unspec:DI [(match_operand:BLK 1 "memory_operand" "m")
-		    (match_operand:QI 2 "memory_operand" "m")]
+	(unspec:DI [(match_operand:BLK 1 "memory_operand" "Y")
+		    (match_operand:QI 2 "memory_operand" "Y")]
 		   UNSPEC_LDL))]
   "TARGET_64BIT && !TARGET_MIPS16"
   "ldl\t%0,%2"
@@ -4046,8 +4094,8 @@ dsrl\t%3,%3,1\n\
 
 (define_insn "mov_ldr"
   [(set (match_operand:DI 0 "register_operand" "=d")
-	(unspec:DI [(match_operand:BLK 1 "memory_operand" "m")
-		    (match_operand:QI 2 "memory_operand" "m")
+	(unspec:DI [(match_operand:BLK 1 "memory_operand" "Y")
+		    (match_operand:QI 2 "memory_operand" "Y")
 		    (match_operand:DI 3 "register_operand" "0")]
 		   UNSPEC_LDR))]
   "TARGET_64BIT && !TARGET_MIPS16"
@@ -4057,9 +4105,9 @@ dsrl\t%3,%3,1\n\
 
 
 (define_insn "mov_sdl"
-  [(set (match_operand:BLK 0 "memory_operand" "=m")
+  [(set (match_operand:BLK 0 "memory_operand" "=Y")
 	(unspec:BLK [(match_operand:DI 1 "reg_or_0_operand" "dJ")
-		     (match_operand:QI 2 "memory_operand" "m")]
+		     (match_operand:QI 2 "memory_operand" "Y")]
 		    UNSPEC_SDL))]
   "TARGET_64BIT && !TARGET_MIPS16"
   "sdl\t%z1,%2"
@@ -4067,9 +4115,9 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode" "DI")])
 
 (define_insn "mov_sdr"
-  [(set (match_operand:BLK 0 "memory_operand" "+m")
+  [(set (match_operand:BLK 0 "memory_operand" "+Y")
 	(unspec:BLK [(match_operand:DI 1 "reg_or_0_operand" "dJ")
-		     (match_operand:QI 2 "memory_operand" "m")
+		     (match_operand:QI 2 "memory_operand" "Y")
 		     (match_dup 0)]
 		    UNSPEC_SDR))]
   "TARGET_64BIT && !TARGET_MIPS16"
@@ -4289,15 +4337,25 @@ dsrl\t%3,%3,1\n\
   [(set_attr "type"	"arith")
    (set_attr "mode"	"SI")])
 
-(define_insn "*lowdi"
+(define_insn "*lowdi_daddi"
   [(set (match_operand:DI 0 "register_operand" "=d")
 	(lo_sum:DI (match_operand:DI 1 "register_operand" "d")
 		   (match_operand:DI 2 "immediate_operand" "")))]
-  "!TARGET_MIPS16 && TARGET_64BIT"
+  "!TARGET_MIPS16 && !TARGET_NO_DADDI && TARGET_64BIT"
   "daddiu\t%0,%1,%R2"
   [(set_attr "type"	"arith")
    (set_attr "mode"	"DI")])
 
+(define_insn "*lowdi_no_daddi"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+	(lo_sum:DI (match_operand:DI 1 "register_operand" "d")
+		   (match_operand:DI 2 "immediate_operand" "")))]
+  "!TARGET_MIPS16 && TARGET_NO_DADDI && TARGET_64BIT"
+  "%[addiu\t%@,%.,%R2\;daddu\t%0,%1,%@%]"
+  [(set_attr "type"	"arith")
+   (set_attr "mode"	"DI")
+   (set_attr "length"	"8")])
+
 (define_insn "*lowsi_mips16"
   [(set (match_operand:SI 0 "register_operand" "=d")
 	(lo_sum:SI (match_operand:SI 1 "register_operand" "0")
@@ -4368,8 +4426,8 @@ dsrl\t%3,%3,1\n\
    (set_attr "length"	"8,8,8,8,12,*,*,8")])
 
 (define_insn "*movdi_64bit"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=d,d,e,d,m,*f,*f,*f,*d,*m,*x,*B*C*D,*B*C*D,*d,*m")
-	(match_operand:DI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*m,*f,*f,*J*d,*d,*m,*B*C*D,*B*C*D"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=d,d,e,d,Y,*f,*f,*f,*d,*Y,*x,*B*C*D,*B*C*D,*d,*Y")
+	(match_operand:DI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*Y,*f,*f,*J*d,*d,*Y,*B*C*D,*B*C*D"))]
   "TARGET_64BIT && !TARGET_MIPS16
    && (register_operand (operands[0], DImode)
        || reg_or_0_operand (operands[1], DImode))"
@@ -4401,6 +4459,25 @@ dsrl\t%3,%3,1\n\
 		 (const_string "*")
 		 (const_string "*")])])
 
+(define_expand "reload_outdi"
+  [(set (match_operand:DI 0 "memory_operand" "=m")
+	(match_operand:DI 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  "TARGET_64BIT && !TARGET_MIPS16"
+  "
+{
+  if (hilo_operand (operands[1], GET_MODE (operands[1])))
+    {
+      emit_move_insn (operands[2], operands[1]);
+      operands[1] = operands[2];
+    }
+  else
+    {
+      emit_move_insn (operands[2], XEXP (operands[0], 0));
+      operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+    }
+}")
+
 
 ;; On the mips16, we can split ld $r,N($r) into an add and a load,
 ;; when the original load is a 4 byte instruction but the add and the
@@ -4474,8 +4551,8 @@ dsrl\t%3,%3,1\n\
 ;; in FP registers (off by default, use -mdebugh to enable).
 
 (define_insn "*movsi_internal"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,d,e,d,m,*f,*f,*f,*d,*m,*d,*z,*x,*B*C*D,*B*C*D,*d,*m")
-	(match_operand:SI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*m,*f,*f,*z,*d,*J*d,*d,*m,*B*C*D,*B*C*D"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=d,d,e,d,Y,*f,*f,*f,*d,*Y,*d,*z,*x,*B*C*D,*B*C*D,*d,*Y")
+	(match_operand:SI 1 "move_operand" "d,U,T,m,dJ,*f,*d*J,*Y,*f,*f,*z,*d,*J*d,*d,*Y,*B*C*D,*B*C*D"))]
   "!TARGET_MIPS16
    && (register_operand (operands[0], SImode)
        || reg_or_0_operand (operands[1], SImode))"
@@ -4507,6 +4584,27 @@ dsrl\t%3,%3,1\n\
 		 (const_string "*")
 		 (const_string "*")])])
 
+(define_expand "reload_outsi"
+  [(set (match_operand:SI 0 "memory_operand" "=m")
+	(match_operand:SI 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  "TARGET_64BIT && !TARGET_MIPS16"
+  "
+{
+  if (hilo_operand (operands[1], GET_MODE (operands[1])))
+    {
+      operands[2] = gen_rtx_REG (GET_MODE (operands[1]), REGNO (operands[2]));
+      emit_move_insn (operands[2], operands[1]);
+      operands[1] = operands[2];
+    }
+  else
+    {
+      emit_move_insn (operands[2], XEXP (operands[0], 0));
+      operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+    }
+}")
+
+
 ;; On the mips16, we can split lw $r,N($r) into an add and a load,
 ;; when the original load is a 4 byte instruction but the add and the
 ;; load are 2 2 byte instructions.
@@ -4745,7 +4843,7 @@ dsrl\t%3,%3,1\n\
 })
 
 (define_insn "*movhi_internal"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,d,d,m,*d,*f,*f,*x")
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=d,d,d,Y,*d,*f,*f,*x")
 	(match_operand:HI 1 "move_operand"         "d,I,m,dJ,*f,*d,*f,*d"))]
   "!TARGET_MIPS16
    && (register_operand (operands[0], HImode)
@@ -4792,6 +4890,17 @@ dsrl\t%3,%3,1\n\
 		 (const_string "*")
 		 (const_string "*")])])
 
+(define_expand "reload_outhi"
+  [(set (match_operand:HI 0 "memory_operand" "=m")
+	(match_operand:HI 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  "TARGET_64BIT && !TARGET_MIPS16"
+  "
+{
+  emit_move_insn (operands[2], XEXP (operands[0], 0));
+  operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+}")
+
 
 ;; On the mips16, we can split lh $r,N($r) into an add and a load,
 ;; when the original load is a 4 byte instruction but the add and the
@@ -4852,7 +4961,7 @@ dsrl\t%3,%3,1\n\
 })
 
 (define_insn "*movqi_internal"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,d,d,m,*d,*f,*f,*x")
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=d,d,d,Y,*d,*f,*f,*x")
 	(match_operand:QI 1 "move_operand"         "d,I,m,dJ,*f,*d,*f,*d"))]
   "!TARGET_MIPS16
    && (register_operand (operands[0], QImode)
@@ -4888,6 +4997,18 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode"	"QI")
    (set_attr "length"	"4,4,4,4,8,*,*")])
 
+(define_expand "reload_outqi"
+  [(set (match_operand:QI 0 "memory_operand" "=m")
+	(match_operand:QI 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  "TARGET_64BIT && !TARGET_MIPS16"
+  "
+{
+  emit_move_insn (operands[2], XEXP (operands[0], 0));
+  operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+}")
+
+
 ;; On the mips16, we can split lb $r,N($r) into an add and a load,
 ;; when the original load is a 4 byte instruction but the add and the
 ;; load are 2 2 byte instructions.
@@ -4930,8 +5051,8 @@ dsrl\t%3,%3,1\n\
 })
 
 (define_insn "*movsf_hardfloat"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,*f,*d,*d,*d,*m")
-	(match_operand:SF 1 "move_operand" "f,G,m,fG,*d,*f,*G*d,*m,*d"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,Y,*f,*d,*d,*d,*Y")
+	(match_operand:SF 1 "move_operand" "f,G,Y,fG,*d,*f,*G*d,*m,*d"))]
   "TARGET_HARD_FLOAT
    && (register_operand (operands[0], SFmode)
        || reg_or_0_operand (operands[1], SFmode))"
@@ -4941,7 +5062,7 @@ dsrl\t%3,%3,1\n\
    (set_attr "length"	"4,4,*,*,4,4,4,*,*")])
 
 (define_insn "*movsf_softfloat"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=d,d,m")
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=d,d,Y")
 	(match_operand:SF 1 "move_operand" "Gd,m,d"))]
   "TARGET_SOFT_FLOAT && !TARGET_MIPS16
    && (register_operand (operands[0], SFmode)
@@ -4962,6 +5083,28 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode"	"SF")
    (set_attr "length"	"4,4,4,*,*")])
 
+(define_expand "reload_insf"
+  [(set (match_operand:SF 0 "" "=b")
+	(match_operand:SF 1 "memory_operand" "m"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  ""
+  "
+{
+  emit_move_insn (operands[2], XEXP (operands[1], 0));
+  operands[1] = gen_rtx_MEM (GET_MODE (operands[1]), operands[2]);
+}")
+
+(define_expand "reload_outsf"
+  [(set (match_operand:SF 0 "memory_operand" "=m")
+	(match_operand:SF 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  ""
+  "
+{
+  emit_move_insn (operands[2], XEXP (operands[0], 0));
+  operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+}")
+
 
 ;; 64-bit floating point moves
 
@@ -4975,8 +5118,8 @@ dsrl\t%3,%3,1\n\
 })
 
 (define_insn "*movdf_hardfloat_64bit"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,*f,*d,*d,*d,*m")
-	(match_operand:DF 1 "move_operand" "f,G,m,fG,*d,*f,*d*G,*m,*d"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,Y,*f,*d,*d,*d,*Y")
+	(match_operand:DF 1 "move_operand" "f,G,Y,fG,*d,*f,*d*G,*m,*d"))]
   "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_64BIT
    && (register_operand (operands[0], DFmode)
        || reg_or_0_operand (operands[1], DFmode))"
@@ -4997,7 +5140,7 @@ dsrl\t%3,%3,1\n\
    (set_attr "length"	"4,8,*,*,8,8,8,*,*")])
 
 (define_insn "*movdf_softfloat"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=d,d,m,d,f,f")
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=d,d,Y,d,f,f")
 	(match_operand:DF 1 "move_operand" "dG,m,dG,f,d,f"))]
   "(TARGET_SOFT_FLOAT || TARGET_SINGLE_FLOAT) && !TARGET_MIPS16
    && (register_operand (operands[0], DFmode)
@@ -5018,6 +5161,29 @@ dsrl\t%3,%3,1\n\
    (set_attr "mode"	"DF")
    (set_attr "length"	"8,8,8,*,*")])
 
+(define_expand "reload_indf"
+  [(set (match_operand:DF 0 "" "=b")
+	(match_operand:DF 1 "memory_operand" "m"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  ""
+  "
+{
+  emit_move_insn (operands[2], XEXP (operands[1], 0));
+  operands[1] = gen_rtx_MEM (GET_MODE (operands[1]), operands[2]);
+}")
+
+(define_expand "reload_outdf"
+  [(set (match_operand:DF 0 "memory_operand" "=m")
+	(match_operand:DF 1 "" "b"))
+   (clobber (match_operand:DI 2 "" "=&d"))]
+  ""
+  "
+{
+  emit_move_insn (operands[2], XEXP (operands[0], 0));
+  operands[0] = gen_rtx_MEM (GET_MODE (operands[0]), operands[2]);
+}")
+
+
 (define_split
   [(set (match_operand:DI 0 "nonimmediate_operand")
 	(match_operand:DI 1 "move_operand"))]
@@ -7345,8 +7511,20 @@ dsrl\t%3,%3,1\n\
 	return "%*b\t%l0%/";
       else
 	{
-	  output_asm_insn (mips_output_load_label (), operands);
-	  return "%*jr\t%@%/%]";
+	  if (Pmode != DImode || !TARGET_NO_DADDI)
+	    {
+	      static char buffer[200];
+	      operands[1] = gen_rtx_REG (SImode, 1);
+	      sprintf (buffer, "%s%s", "%[", mips_output_load_label ());
+	      output_asm_insn (buffer, operands);
+	      return "%*jr\t$1%/%]";
+	    }
+	  else
+	    {
+	      operands[1] = gen_rtx_REG (SImode, 24);
+	      output_asm_insn (mips_output_load_label (), operands);
+	      return "%*jr\t$24%/";
+	    }
 	}
     }
   else
diff -up --recursive --new-file gcc-3.5.0-20040524.macro/gcc/doc/invoke.texi gcc-3.5.0-20040524/gcc/doc/invoke.texi
--- gcc-3.5.0-20040524.macro/gcc/doc/invoke.texi	2004-05-23 01:30:17.000000000 +0000
+++ gcc-3.5.0-20040524/gcc/doc/invoke.texi	2004-05-24 14:07:58.000000000 +0000
@@ -497,7 +497,7 @@ in the following sections.
 -mmad  -mno-mad  -mfused-madd  -mno-fused-madd  -nocpp @gol
 -mfix-r4000  -mno-fix-r4000  -mfix-r4400  -mno-fix-r4400 @gol
 -mfix-vr4120  -mno-fix-vr4120  -mfix-sb1  -mno-fix-sb1 @gol
--mflush-func=@var{func}  -mno-flush-func @gol
+-mno-daddi  -mdaddi  -mflush-func=@var{func}  -mno-flush-func @gol
 -mbranch-likely  -mno-branch-likely @gol
 -mfp-exceptions -mno-fp-exceptions @gol
 -mvr4130-align -mno-vr4130-align}
@@ -8447,6 +8447,30 @@ Work around certain SB-1 CPU core errata
 (This flag currently works around the SB-1 revision 2
 ``F1'' and ``F2'' floating point errata.)
 
+@item -mno-daddi
+@itemx -mdaddi
+@opindex mno-daddi
+@opindex mdaddi
+Provide support for eliminating the @samp{daddiu} instruction from generated
+code by taking the following precautions:
+@itemize @minus
+@item
+Do not emit @samp{daddiu} instructions.
+@item
+Do not emit macros that expand to @samp{daddiu} instructions.
+@item
+Emit only such address references that can be expanded by the assembler
+without the use of @samp{daddiu} instructions.
+@end itemize
+This option requires appropriate support from the assembler to be
+effective.  Otherwise the generated code will still be correct, but
+@samp{daddiu} instructions may appear.
+
+This is needed for the R4000 processor and the initial revision of the
+R4400 processor as they have errata leading to @samp{daddi} and @samp{daddiu}
+instructions being executed incorrectly.  The @option{-mno-daddi} setting is
+implied by @option{-mfix-r4000} and @option{-mfix-r4400}.
+
 @item -mflush-func=@var{func}
 @itemx -mno-flush-func
 @opindex mflush-func
Follow-Ups:
- Re: RFC: MIPS: Workaround the "daddi" and "daddiu" errata
  - From: Eric Christopher