[mips] fix $gp restore bug

Richard Sandiford rdsandiford@googlemail.com
Mon Aug 31 21:38:00 GMT 2009


Remember this?  After many false starts, I think I've finally got
something that seems to work and that doesn't impact the normal
case too much.

To recap, the problem is that PIC long-branch sequences need access to
the GOT, and we only know at the end of compilation whether a function
has any long branches.  On the other hand, we try to optimise away
GP-related instructions soon after prologue and epilogue creation.
The two ideas I suggested before were:

  (a) Use placeholder instructions to manipulate pic_offset_table_rtx,
      until we know for sure that we need it.  Make branches use
      pic_offset_table_rtx.

  (b) Avoid using the GOT in long-branch sequences.  Use a BAL
      to get a PC value, then add "TARGET-." to the result.

The problem with (a) was that -- quite reasonably -- unconditional branch
patterns can't use registers.  We need to able to insert unconditional
branches without perturbing liveness.

The problem with (b) was that GAS doesn't support %hi(TARGET-.) or
%lo(TARGET-.), even if TARGET is in the same section.  Seems a bit
dumb, and it's probably worth fixing, but I don't want to tie the
GCC fix to an unreleased version of binutils.

I finally ended up with a variant of (a) in which we use EH_USES
to make the global pointer _value_ available at every block boundary
bar function entry and exit.  In NewABI functions and leaf OldABI
functions, this value is guaranteed be in the global pointer register
itself.  In OldABI non-leaf functions, it is guaranteed to be in
the cprestore slot.

See the big comment above mips_chosen_to_use_gp_p in the patch for
details.  I'm trying not to explain too much here since that comment
really needs to be understandable on its own.  If it isn't, let me know.

One historical note, though: the patch waits until epilogue_completed
rather than reload_completed before exposing loads from the cprestore
slot.  If I remember right, that was the original intention of the code,
because I think at the time it was added, there were no splits between
reload and prologue/epilogue generation.  In the patch, we have the
added reason that we need to test mips_chosen_to_use_gp_p.

Requoting the original discussion about (a):

Adam Nemet <anemet@caviumnetworks.com> writes:
> Richard Sandiford writes:
>> FWIW, the idea I'm toying with at the moment is:
>> 
>>   - Make every branch use pic_offset_table_rtx.
>> 
>>   - Try to detect cases where a function only needs a global pointer
>>     because of branches.  In this case, emit easily-identifiable "ghost"
>>     instructions (in the .md "type" attribute sense) to set, save and
>>     restore the gp.
>> 
>>     These ghost instructions would replace the sequences we'd normally use.
>>     They would have 0 length at this stage, and in theory shouldn't interfere
>>     with things like scheduling.
>> 
>>   - At the end of md_reorg, run shorten_branches to see whether
>>     all branches are in range.  This should be safe because we run
>>     dbr_schedule within mips_reorg.  Nothing after mips_reorg should
>>     change the insn stream besides shorten_branches itself.
>> 
>>   - If some branches are not in range, go through the insn stream and
>>     replace all the ghost instructions with real implementations.
>>     (This will of course invalidate the results of the shorten_branches
>>     call, but the "real" shorten_branches pass will then calculate new
>>     lengths.)
>
> This sounds like a good plan to me, FWIW.
>
>> Besides the complication, the main drawback I can see is that we'd be
>> unable to fill the delay slots of branches with insns that restore $gp,
>> even in the usual case where no long branches are needed.
>> 
>> That's certainly a problem for -Os.  I'm not sure it's much of a concern
>> for other options.  dbr_schedule doesn't take the pipeline into account
>> at all when filling delay slots, and postponing a restoration of $gp
>> isn't always a good idea performance-wise.
>
> I guess this applies more to the old ABIs.  For the new ABIs all we lose is
> the possibility to optimally schedule the pic_offset_table_rtx instructions
> and that the register in pic_offset_table_rtx won't be available to the
> register allocator even though it might be never used or used for a shorter
> time than assumed.

Using EH_USES instead of a (use ...) in the branch patterns avoids
the delay-slot problem.  Unfortunately, the problem Adam mentioned
is still there: for non-absolute abicalls, we must assume that the
function might need a global pointer, so we have to allocate one.
There are three cases:

  (a) For NewABI leaf functions that have a spare call-clobbered
      register, the patch claims one such register as a local GP,
      which stops the register from being used for anything else
      after reload (such as in regrename).  This can pessimise
      functions that didn't need a global pointer at all.

  (b) For OldABI leaf functions, there's no problem.  $28 is always
      reserved anyway, and there's no need to save and restore it.

  (c) Otherwise, we'll need to create a frame so that $28 can be
      saved and restored.  This pessimises functions that didn't
      need a global pointer and didn't need a frame.

So the three main affected functions are:

  (1) NewABI leaf functions that didn't need all call-clobbered
      registers for register allocation, but which benefit from
      using all call-clobbered registers after reload (such as
      through regrename).

  (2) NewABI leaf functions that need all call-clobbered registers
      but that don't need a global pointer or a frame.

  (3) Non-leaf functions that don't need a global pointer and don't
      need a frame.  (This means that the function must be noreturn
      and make all calls indirectly.  Note that sibcalls don't make
      a function non-leaf.)

All these cases ought to be rare.  (2) and (3) simply mean a function
allocates and deallocates a bit of stack, so shouldn't affect
performance too much.  (1) -- the problem Adam mentioned -- is a bit
more suspect, because it could stop us from scheduling something
as well as we might.  But I still think this is the best compromise
between correctness and efficiency.

I diffed the testsuite output for:

   -mshared
   -mno-shared -mno-plt
   -mno-shared -mplt

to get a feel for how the patch was working.  The results looked
reasonable.  (I included the mips_variable_issue hunk in the
pre-patch version so that there were fewer unrelated scheduling
differences.)

The length calculation for long branches is now a bit more complicated
than before, because there are more cases to consider.  Ideally we'd
simply use (symbol_ref ...) to escape to a C function, but that isn't
possible here, because if a length is PC-dependent, genattrtab needs
tod know what the maximum value is.  (This maximum length becomes the
initial length in shorten_branches.)  I've therefore used 100 as a
placeholder and made mips_adjust_insn_length substitute the correct
value.  Bit of a hack, but still.

The patch makes a few other improvements:

  - We no longer count ghost instructions towards the issue rate.

  - We use J rather than LA/JR -mno-shared -mabicalls code.

  - The GP patterns no longer need to be unspec_volatile, so there's
    more scheduling freedom.

  - We no longer use df_regs_ever_live on $28, which was giving
    too many false positives.

Bootstrapped & regression-tested on mips64octeon-linux-gnu.  It's a big
and potentially-contentious change, so I'll leave a few days for objections
before checking in.

Richard


gcc/
	* config/mips/mips-protos.h (mips_cfun_has_cprestore_slot_p): Declare.
	(mips_cprestore_sp_address_p): Likewise.
	(mips_large_cprestore_offset_p): Likewise.
	(mips_save_gp_to_cprestore_slot): Likewise.
	(mips_restore_gp): Rename to...
	(mips_restore_gp_from_cprestore_slot): ...this.
	(mips_mips_chosen_to_use_gp_p): Declare.
	(mips_emit_save_slot_move): Likewise.
	(mips_output_load_label): Return nothing.
	(mips_eh_uses): Declare.
	* config/mips/mips.h (TARGET_SPLIT_CALLS): Require epilogue_completed.
	(TARGET_CPRESTORE_DIRECTIVE): New macro.
	(TARGET_ABSOLUTE_JUMPS): Likewise.
	(EH_USES): Likewise.
	(FIRST_PSEUDO_REGISTER): Update comment.
	(MIPS_ABSOLUTE_JUMP): New macro, extracted from...
	(MIPS_CALL): ...here.
	(REGISTER_NAMES): Add $cprestore.
	* config/mips/mips.c (machine_function): Remove has_gp_insn_p.
	Add load_label_length, has_inflexible_gp_insn_p,
	has_flexible_gp_insn_p and chosen_to_use_gp_p.
	(mips_expand_call): Don't generate split instructions here.
	(mips_split_call): Update the call to mips_restore_gp after
	the above name change.
	(mips16_cfun_returns_in_fpr_p): Move earlier in file.
	(mips_find_gp_ref): New function.
	(mips_insn_has_inflexible_gp_ref_p): Likewise.
	(mips_cfun_has_inflexible_gp_ref_p): Likewise.
	(mips_insn_has_flexible_gp_ref_p): Likewise.
	(mips_cfun_has_flexible_gp_ref_p): Likewise.
	(mips_function_has_gp_insn): Delete.
	(mips_global_pointer): Drop the df_regs_ever_live_p check.
	Use the new functions above.  Only return INVALID_REGNUM
	for TARGET_ABSOLUTE_JUMPS.
	(mips_chosen_to_use_gp_p): New function.
	(mips_get_cprestore_base_and_offset): New function, extracted from...
	(mips_cprestore_slot): ...here.
	(mips_cfun_has_cprestore_slot_p): New function.
	(mips_cprestore_sp_address_p): Likewise.
	(mips_large_cprestore_offset_p): Likewise.
	(mips_save_gp_to_cprestore_slot): Likewise.
	(mips_restore_gp): Rename to...
	(mips_restore_gp_from_cprestore_slot): ...this.  Assert
	epilogue_completed.  Check mips_chosen_to_use_gp_p instead
	of cfun->machine->global_pointer.
	(mips_direct_save_slot_move_p): New function.
	(mips_emit_save_slot_move): Likewise.
	(mips_output_cplocal): Test mips_chosen_to_use_gp_p () instead
	of cfun->machine->global_pointer.
	(mips_output_function_prologue): Check mips_chosen_to_use_gp_p ().
	(mips_save_reg): Use mips_emit_save_slot_move.
	(mips_expand_prologue): Set chosen_to_use_gp_p.
	Use mips_cfun_has_cprestore_slot_p.  Use gen_potential_cprestore
	for all cprestore saves.  Emit a use_cprestore instruction after
	setting up the cprestore slot.
	(mips_restore_reg): Use mips_emit_save_slot_move.
	(mips_process_load_label): New function.
	(mips_load_label_length): Likewise.
	(mips_output_load_label): Don't return asm: output it here instead.
	Use mips_process_load_label.
	(mips_adjust_insn_length): Adjust the length of branch instructions
	that have length MAX_PIC_BRANCH_LENGTH.
	(mips_output_conditional_branch): Update the call to
	mips_output_load_label.  Assume the branch target is OPERANDS[0]
	rather than OPERANDS[1].  Use MIPS_ABSOLUTE_JUMP for absolute jumps.
	(mips_output_order_conditional_branch): Swap the meaning of
	OPERANDS[0] and OPERANDS[1].
	(mips_variable_issue): Don't count ghost instructions.
	(mips_expand_ghost_gp_insns): New function.
	(mips_reorg): Rerun mips_reorg_process_insns if it returns true.
	(mips_output_mi_thunk): Set chosen_to_use_gp_p.
	(mips_eh_uses): New function.
	* config/mips/predicates.md (cprestore_slot_operand): New predicate.
	* config/mips/mips.md (UNSPEC_POTENTIAL_CPRESTORE): New unspec.
	(UNSPEC_MOVE_GP): Likewise.
	(UNSPEC_CPRESTORE, UNSPEC_RESTORE_GP, UNSPEC_EH_RETURN)
	(UNSPEC_CONSTTABLE_INT, UNSPEC_CONSTTABLE_FLOAT): Bump to make room.
	(CPRESTORE_SLOT_REGNUM): New register.
	(MAX_PIC_BRANCH_LENGTH): New constant.
	(jal_macro): Use MIPS_ABSOLUTE_JUMPS.
	(length): Use MAX_PIC_BRANCH_LENGTH as a placeholder for PIC long
	branches.  Fix commentary.
	(loadgp_newabi_<mode>): Change from unspec_volatile to unspec.
	Only split if mips_chosen_to_use_gp_p; expand to nothing otherwise.
	Change type to "ghost".
	(loadgp_absolute_<mode>): Likewise.
	(loadgp_rtp_<mode>): Likewise.
	(copygp_mips16): Likewise.
	(loadgp_blockage): Remove redundant mode attribute.
	(potential_cprestore): New instruction.
	(cprestore): Turn into an unspec set.
	(use_cprestore): New instruction.
	(*branch_fp): Swap operands 0 and 1.  Remove redundant mode attribute.
	(*branch_fp_inverted): Likewise.
	(*branch_order<mode>): Likewise.
	(*branch_order<mode>_inverted): Likewise.
	(*branch_equality<mode>): Likewise.
	(*branch_equality<mode>_inverted): Likewise.
	(*branch_bit<bbv><mode>): Likewise.
	(*branch_bit<bbv><mode>_inverted): Likewise.
	(*branch_equality<mode>_mips16): Remove redundant mode.
	(jump): Turn into a define_expand.
	(*jump_absolute): New instruction.
	(*jump_pic): Likewise.
	(*jump_mips16): Rename previously-unnamed pattern.  Remove
	redundant mode attribute.
	(restore_gp): Split on epilogue_completed rather than
	reload_completed.  Change type to "ghost".
	(move_gp<mode>): New instruction.
	* config/mips/mips-dsp.md (mips_bposge): Swap operands 0 and 1.
	Remove redundant mode attribute.
	* config/mips/mips-ps-3d.md (bc1any4t): Likewise.
	(bc1any4f, bc1any2t, bc1any2f): Likewise.
	(*branch_upper_lower, *branch_upper_lower_inverted): Likewise.

gcc/testsuite/
	* gcc.target/mips/branch-helper.h: New file.
	* gcc.target/mips/branch-2.c,
	* gcc.target/mips/branch-3.c,
	* gcc.target/mips/branch-4.c,
	* gcc.target/mips/branch-5.c,
	* gcc.target/mips/branch-6.c,
	* gcc.target/mips/branch-7.c,
	* gcc.target/mips/branch-8.c,
	* gcc.target/mips/branch-9.c,
	* gcc.target/mips/branch-10.c,
	* gcc.target/mips/branch-11.c,
	* gcc.target/mips/branch-12.c,
	* gcc.target/mips/branch-13.c,
	* gcc.target/mips/branch-14.c,
	* gcc.target/mips/branch-15.c: New tests.

Index: gcc/config/mips/mips-protos.h
===================================================================
--- gcc/config/mips/mips-protos.h	2009-08-31 08:38:27.000000000 +0100
+++ gcc/config/mips/mips-protos.h	2009-08-31 08:39:25.000000000 +0100
@@ -219,7 +219,11 @@ extern rtx mips_subword (rtx, bool);
 extern bool mips_split_64bit_move_p (rtx, rtx);
 extern void mips_split_doubleword_move (rtx, rtx);
 extern const char *mips_output_move (rtx, rtx);
-extern void mips_restore_gp (rtx);
+extern bool mips_cfun_has_cprestore_slot_p (void);
+extern bool mips_cprestore_sp_address_p (rtx);
+extern bool mips_large_cprestore_offset_p (void);
+extern void mips_save_gp_to_cprestore_slot (rtx, rtx, rtx, rtx);
+extern void mips_restore_gp_from_cprestore_slot (rtx);
 #ifdef RTX_CODE
 extern void mips_expand_scc (rtx *);
 extern void mips_expand_conditional_branch (rtx *);
@@ -276,7 +280,9 @@ extern bool mips_small_data_pattern_p (r
 extern rtx mips_rewrite_small_data (rtx);
 extern HOST_WIDE_INT mips_initial_elimination_offset (int, int);
 extern rtx mips_return_addr (int, rtx);
+extern bool mips_chosen_to_use_gp_p (void);
 extern enum mips_loadgp_style mips_current_loadgp_style (void);
+extern void mips_emit_save_slot_move (rtx, rtx, rtx);
 extern void mips_expand_prologue (void);
 extern void mips_expand_before_return (void);
 extern void mips_expand_epilogue (bool);
@@ -296,7 +302,7 @@ extern int mips_register_move_cost (enum
 				    enum reg_class);
 
 extern int mips_adjust_insn_length (rtx, int);
-extern const char *mips_output_load_label (void);
+extern void mips_output_load_label (rtx);
 extern const char *mips_output_conditional_branch (rtx, rtx *, const char *,
 						   const char *);
 extern const char *mips_output_order_conditional_branch (rtx, rtx *, bool);
@@ -334,6 +340,7 @@ extern void mips_expand_atomic_qihi (uni
 
 extern void mips_expand_vector_init (rtx, rtx);
 
+extern bool mips_eh_uses (unsigned int);
 extern bool mips_epilogue_uses (unsigned int);
 extern void mips_final_prescan_insn (rtx, rtx *, int);
 
Index: gcc/config/mips/mips.h
===================================================================
--- gcc/config/mips/mips.h	2009-08-31 08:38:27.000000000 +0100
+++ gcc/config/mips/mips.h	2009-08-31 08:39:25.000000000 +0100
@@ -161,10 +161,13 @@ #define TARGET_ABICALLS_PIC2 \
 
 /* True if the call patterns should be split into a jalr followed by
    an instruction to restore $gp.  It is only safe to split the load
-   from the call when every use of $gp is explicit.  */
+   from the call when every use of $gp is explicit.
+
+   See mips_chosen_to_use_gp_p for details about how we manage the
+   global pointer.  */
 
 #define TARGET_SPLIT_CALLS \
-  (TARGET_EXPLICIT_RELOCS && TARGET_CALL_CLOBBERED_GP)
+  (TARGET_EXPLICIT_RELOCS && TARGET_CALL_CLOBBERED_GP && epilogue_completed)
 
 /* True if we're generating a form of -mabicalls in which we can use
    operators like %hi and %lo to refer to locally-binding symbols.
@@ -202,6 +205,16 @@ #define TARGET_CALL_CLOBBERED_GP (TARGET
 /* True if TARGET_USE_GOT and if $gp is a call-saved register.  */
 #define TARGET_CALL_SAVED_GP (TARGET_USE_GOT && !TARGET_CALL_CLOBBERED_GP)
 
+/* True if we should use .cprestore to store to the cprestore slot.
+
+   We continue to use .cprestore for explicit-reloc code so that JALs
+   inside inline asms will work correctly.  */
+#define TARGET_CPRESTORE_DIRECTIVE (TARGET_ABICALLS_PIC2 && !TARGET_MIPS16)
+
+/* True if we can use the J and JAL instructions.  */
+#define TARGET_ABSOLUTE_JUMPS \
+  (!flag_pic || TARGET_ABSOLUTE_ABICALLS)
+
 /* True if indirect calls must use register class PIC_FN_ADDR_REG.
    This is true for both the PIC and non-PIC VxWorks RTP modes.  */
 #define TARGET_USE_PIC_FN_ADDR_REG (TARGET_ABICALLS || TARGET_VXWORKS_RTP)
@@ -1300,6 +1313,8 @@ #define EH_RETURN_DATA_REGNO(N) \
 
 #define EH_RETURN_STACKADJ_RTX  gen_rtx_REG (Pmode, GP_REG_FIRST + 3)
 
+#define EH_USES(N) mips_eh_uses (N)
+
 /* Offsets recorded in opcodes are a multiple of this alignment factor.
    The default for this in 64-bit mode is 8, which causes problems with
    SFmode register saves.  */
@@ -1543,11 +1558,12 @@ #define CLZ_DEFINED_VALUE_AT_ZERO(MODE, 
    - 8 condition code registers
    - 2 accumulator registers (hi and lo)
    - 32 registers each for coprocessors 0, 2 and 3
-   - 3 fake registers:
+   - 4 fake registers:
 	- ARG_POINTER_REGNUM
 	- FRAME_POINTER_REGNUM
 	- GOT_VERSION_REGNUM (see the comment above load_call<mode> for details)
-   - 3 dummy entries that were used at various times in the past.
+	- CPRESTORE_SLOT_REGNUM
+   - 2 dummy entries that were used at various times in the past.
    - 6 DSP accumulator registers (3 hi-lo pairs) for MIPS DSP ASE
    - 6 DSP control registers  */
 
@@ -2661,6 +2677,13 @@ #define ADJUST_INSN_LENGTH(INSN, LENGTH)
 #define MIPS_BRANCH(OPCODE, OPERANDS) \
   "%*" OPCODE "%?\t" OPERANDS "%/"
 
+/* Return an asm string that forces INSN to be treated as an absolute
+   J or JAL instruction instead of an assembler macro.  */
+#define MIPS_ABSOLUTE_JUMP(INSN) \
+  (TARGET_ABICALLS_PIC2						\
+   ? ".option\tpic0\n\t" INSN "\n\t.option\tpic2"		\
+   : INSN)
+
 /* Return the asm template for a call.  INSN is the instruction's mnemonic
    ("j" or "jal"), OPERANDS are its operands, and OPNO is the operand number
    of the target.
@@ -2675,11 +2698,7 @@ #define MIPS_CALL(INSN, OPERANDS, OPNO)	
    ? "%*" INSN "\t%" #OPNO "%/"					\
    : REG_P (OPERANDS[OPNO])					\
    ? "%*" INSN "r\t%" #OPNO "%/"				\
-   : TARGET_ABICALLS_PIC2					\
-   ? (".option\tpic0\n\t"					\
-      "%*" INSN "\t%" #OPNO "%/\n\t"				\
-      ".option\tpic2")						\
-   : "%*" INSN "\t%" #OPNO "%/")
+   : MIPS_ABSOLUTE_JUMP ("%*" INSN "\t%" #OPNO "%/"))
 
 /* Control the assembler format that we output.  */
 
@@ -2707,7 +2726,7 @@ #define REGISTER_NAMES							   \
   "$f16", "$f17", "$f18", "$f19", "$f20", "$f21", "$f22", "$f23",	   \
   "$f24", "$f25", "$f26", "$f27", "$f28", "$f29", "$f30", "$f31",	   \
   "hi",   "lo",   "",     "$fcc0","$fcc1","$fcc2","$fcc3","$fcc4",	   \
-  "$fcc5","$fcc6","$fcc7","", "", "$arg", "$frame", "$fakec",		   \
+  "$fcc5","$fcc6","$fcc7","", "$cprestore", "$arg", "$frame", "$fakec",	   \
   "$c0r0", "$c0r1", "$c0r2", "$c0r3", "$c0r4", "$c0r5", "$c0r6", "$c0r7",  \
   "$c0r8", "$c0r9", "$c0r10","$c0r11","$c0r12","$c0r13","$c0r14","$c0r15", \
   "$c0r16","$c0r17","$c0r18","$c0r19","$c0r20","$c0r21","$c0r22","$c0r23", \
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c	2009-08-31 08:38:27.000000000 +0100
+++ gcc/config/mips/mips.c	2009-08-31 08:48:38.000000000 +0100
@@ -307,6 +307,10 @@ struct GTY(())  machine_function {
      if the function doesn't need one.  */
   unsigned int global_pointer;
 
+  /* How many instructions it takes to load a label into $AT, or 0 if
+     this property hasn't yet been calculated.  */
+  unsigned int load_label_length;
+
   /* True if mips_adjust_insn_length should ignore an instruction's
      hazard attribute.  */
   bool ignore_hazard_length_p;
@@ -315,8 +319,16 @@ struct GTY(())  machine_function {
      .set nomacro.  */
   bool all_noreorder_p;
 
-  /* True if the function is known to have an instruction that needs $gp.  */
-  bool has_gp_insn_p;
+  /* True if the function has "inflexible" and "flexible" references
+     to the global pointer.  See mips_cfun_has_inflexible_gp_ref_p
+     and mips_cfun_has_flexible_gp_ref_p for details.  */
+  bool has_inflexible_gp_insn_p;
+  bool has_flexible_gp_insn_p;
+
+  /* True if we have committed to using a global pointer for the current
+     function.  Even if this value is false, we may later decide to use
+     one; see mips_chosen_to_use_gp_p.  */
+  bool chosen_to_use_gp_p;
 
   /* True if we have emitted an instruction to initialize
      mips16_gp_pseudo_rtx.  */
@@ -6313,9 +6325,7 @@ mips_expand_call (enum mips_call_type ty
     {
       rtx (*fn) (rtx, rtx);
 
-      if (type == MIPS_CALL_EPILOGUE && TARGET_SPLIT_CALLS)
-	fn = gen_call_split;
-      else if (type == MIPS_CALL_SIBCALL)
+      if (type == MIPS_CALL_SIBCALL)
 	fn = gen_sibcall_internal;
       else
 	fn = gen_call_internal;
@@ -6328,9 +6338,7 @@ mips_expand_call (enum mips_call_type ty
       rtx (*fn) (rtx, rtx, rtx, rtx);
       rtx reg1, reg2;
 
-      if (type == MIPS_CALL_EPILOGUE && TARGET_SPLIT_CALLS)
-	fn = gen_call_value_multiple_split;
-      else if (type == MIPS_CALL_SIBCALL)
+      if (type == MIPS_CALL_SIBCALL)
 	fn = gen_sibcall_value_multiple_internal;
       else
 	fn = gen_call_value_multiple_internal;
@@ -6343,9 +6351,7 @@ mips_expand_call (enum mips_call_type ty
     {
       rtx (*fn) (rtx, rtx, rtx);
 
-      if (type == MIPS_CALL_EPILOGUE && TARGET_SPLIT_CALLS)
-	fn = gen_call_value_split;
-      else if (type == MIPS_CALL_SIBCALL)
+      if (type == MIPS_CALL_SIBCALL)
 	fn = gen_sibcall_value_internal;
       else
 	fn = gen_call_value_internal;
@@ -6375,7 +6381,7 @@ mips_split_call (rtx insn, rtx call_patt
     /* Pick a temporary register that is suitable for both MIPS16 and
        non-MIPS16 code.  $4 and $5 are used for returning complex double
        values in soft-float code, so $6 is the first suitable candidate.  */
-    mips_restore_gp (gen_rtx_REG (Pmode, GP_ARG_FIRST + 2));
+    mips_restore_gp_from_cprestore_slot (gen_rtx_REG (Pmode, GP_ARG_FIRST + 2));
 }
 
 /* Implement TARGET_FUNCTION_OK_FOR_SIBCALL.  */
@@ -8566,42 +8572,131 @@ mips16e_output_save_restore (rtx pattern
   return buffer;
 }
 
-/* Return true if the current function has an insn that implicitly
-   refers to $gp.  */
+/* Return true if the current function returns its value in a floating-point
+   register in MIPS16 mode.  */
 
 static bool
-mips_function_has_gp_insn (void)
+mips16_cfun_returns_in_fpr_p (void)
 {
-  /* Don't bother rechecking if we found one last time.  */
-  if (!cfun->machine->has_gp_insn_p)
-    {
-      rtx insn;
+  tree return_type = DECL_RESULT (current_function_decl);
+  return (TARGET_MIPS16
+	  && TARGET_HARD_FLOAT_ABI
+	  && !aggregate_value_p (return_type, current_function_decl)
+ 	  && mips_return_mode_in_fpr_p (DECL_MODE (return_type)));
+}
+
+/* Return true if predicate PRED is true for at least one instruction.
+   Cache the result in *CACHE, and assume that the result is true
+   if *CACHE is already true.  */
+
+static bool
+mips_find_gp_ref (bool *cache, bool (*pred) (rtx))
+{
+  rtx insn;
 
+  if (!*cache)
+    {
       push_topmost_sequence ();
       for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
-	if (USEFUL_INSN_P (insn)
-	    && (get_attr_got (insn) != GOT_UNSET
-		|| mips_small_data_pattern_p (PATTERN (insn))))
+	if (USEFUL_INSN_P (insn) && pred (insn))
 	  {
-	    cfun->machine->has_gp_insn_p = true;
+	    *cache = true;
 	    break;
 	  }
       pop_topmost_sequence ();
     }
-  return cfun->machine->has_gp_insn_p;
+  return *cache;
 }
 
-/* Return true if the current function returns its value in a floating-point
-   register in MIPS16 mode.  */
+/* Return true if INSN refers to the global pointer in an "inflexible" way.
+   See mips_cfun_has_inflexible_gp_ref_p for details.  */
 
 static bool
-mips16_cfun_returns_in_fpr_p (void)
+mips_insn_has_inflexible_gp_ref_p (rtx insn)
 {
-  tree return_type = DECL_RESULT (current_function_decl);
-  return (TARGET_MIPS16
-	  && TARGET_HARD_FLOAT_ABI
-	  && !aggregate_value_p (return_type, current_function_decl)
- 	  && mips_return_mode_in_fpr_p (DECL_MODE (return_type)));
+  /* Uses of pic_offset_table_rtx in CALL_INSN_FUNCTION_USAGE
+     indicate that the target could be a traditional MIPS
+     lazily-binding stub.  */
+  return find_reg_fusage (insn, USE, pic_offset_table_rtx);
+}
+
+/* Return true if the current function refers to the global pointer
+   in a way that forces $28 to be valid.  This means that we can't
+   change the choice of global pointer, even for NewABI code.
+
+   One example of this (and one which needs several checks) is that
+   $28 must be valid when calling traditional MIPS lazy-binding stubs.
+   (This restriction does not apply to PLTs.)  */
+
+static bool
+mips_cfun_has_inflexible_gp_ref_p (void)
+{
+  /* If the function has a nonlocal goto, $28 must hold the correct
+     global pointer for the target function.  That is, the target
+     of the goto implicitly uses $28.  */
+  if (crtl->has_nonlocal_goto)
+    return true;
+
+  if (TARGET_ABICALLS_PIC2)
+    {
+      /* Symbolic accesses implicitly use the global pointer unless
+	 -mexplicit-relocs is in effect.  JAL macros to symbolic addresses
+	 might go to traditional MIPS lazy-binding stubs.  */
+      if (!TARGET_EXPLICIT_RELOCS)
+	return true;
+
+      /* FUNCTION_PROFILER includes a JAL to _mcount, which again
+	 can be lazily-bound.  */
+      if (crtl->profile)
+	return true;
+
+      /* MIPS16 functions that return in FPRs need to call an
+	 external libgcc routine.  This call is only made explict
+	 during mips_expand_epilogue, and it too might be lazily bound.  */
+      if (mips16_cfun_returns_in_fpr_p ())
+	return true;
+    }
+
+  return mips_find_gp_ref (&cfun->machine->has_inflexible_gp_insn_p,
+			   mips_insn_has_inflexible_gp_ref_p);
+}
+
+/* Return true if INSN refers to the global pointer in a "flexible" way.
+   See mips_cfun_has_flexible_gp_ref_p for details.  */
+
+static bool
+mips_insn_has_flexible_gp_ref_p (rtx insn)
+{
+  return (get_attr_got (insn) != GOT_UNSET
+	  || mips_small_data_pattern_p (PATTERN (insn))
+	  || reg_overlap_mentioned_p (pic_offset_table_rtx, PATTERN (insn)));
+}
+
+/* Return true if the current function references the global pointer,
+   but if those references do not inherently require the global pointer
+   to be $28.  Assume !mips_cfun_has_inflexible_gp_ref_p ().  */
+
+static bool
+mips_cfun_has_flexible_gp_ref_p (void)
+{
+  /* Reload can sometimes introduce constant pool references
+     into a function that otherwise didn't need them.  For example,
+     suppose we have an instruction like:
+
+	(set (reg:DF R1) (float:DF (reg:SI R2)))
+
+     If R2 turns out to be a constant such as 1, the instruction may
+     have a REG_EQUAL note saying that R1 == 1.0.  Reload then has
+     the option of using this constant if R2 doesn't get allocated
+     to a register.
+
+     In cases like these, reload will have added the constant to the
+     pool but no instruction will yet refer to it.  */
+  if (TARGET_ABICALLS_PIC2 && !reload_completed && crtl->uses_const_pool)
+    return true;
+
+  return mips_find_gp_ref (&cfun->machine->has_flexible_gp_insn_p,
+			   mips_insn_has_flexible_gp_ref_p);
 }
 
 /* Return the register that should be used as the global pointer
@@ -8617,57 +8712,18 @@ mips_global_pointer (void)
   if (!TARGET_USE_GOT)
     return GLOBAL_POINTER_REGNUM;
 
-  /* We must always provide $gp when it is used implicitly.  */
-  if (!TARGET_EXPLICIT_RELOCS)
-    return GLOBAL_POINTER_REGNUM;
-
-  /* FUNCTION_PROFILER includes a jal macro, so we need to give it
-     a valid gp.  */
-  if (crtl->profile)
-    return GLOBAL_POINTER_REGNUM;
-
-  /* If the function has a nonlocal goto, $gp must hold the correct
-     global pointer for the target function.  */
-  if (crtl->has_nonlocal_goto)
+  /* If there are inflexible references to $gp, we must use the
+     standard register.  */
+  if (mips_cfun_has_inflexible_gp_ref_p ())
     return GLOBAL_POINTER_REGNUM;
 
-  /* There's no need to initialize $gp if it isn't referenced now,
-     and if we can be sure that no new references will be added during
-     or after reload.  */
-  if (!df_regs_ever_live_p (GLOBAL_POINTER_REGNUM)
-      && !mips_function_has_gp_insn ())
-    {
-      /* The function doesn't use $gp at the moment.  If we're generating
-	 -call_nonpic code, no new uses will be introduced during or after
-	 reload.  */
-      if (TARGET_ABICALLS_PIC0)
-	return INVALID_REGNUM;
-
-      /* We need to handle the following implicit gp references:
-
-	 - Reload can sometimes introduce constant pool references
-	   into a function that otherwise didn't need them.  For example,
-	   suppose we have an instruction like:
+  /* If there are no current references to $gp, then the only uses
+     we can introduce later are those involved in long branches.  */
+  if (TARGET_ABSOLUTE_JUMPS && !mips_cfun_has_flexible_gp_ref_p ())
+    return INVALID_REGNUM;
 
-	       (set (reg:DF R1) (float:DF (reg:SI R2)))
-
-	   If R2 turns out to be constant such as 1, the instruction may
-	   have a REG_EQUAL note saying that R1 == 1.0.  Reload then has
-	   the option of using this constant if R2 doesn't get allocated
-	   to a register.
-
-	   In cases like these, reload will have added the constant to the
-	   pool but no instruction will yet refer to it.
-
-	 - MIPS16 functions that return in FPRs need to call an
-	   external libgcc routine.  */
-      if (!crtl->uses_const_pool
-	  && !mips16_cfun_returns_in_fpr_p ())
-	return INVALID_REGNUM;
-    }
-
-  /* We need a global pointer, but perhaps we can use a call-clobbered
-     register instead of $gp.  */
+  /* If the global pointer is call-saved, try to use a call-clobbered
+     alternative.  */
   if (TARGET_CALL_SAVED_GP && current_function_is_leaf)
     for (regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
       if (!df_regs_ever_live_p (regno)
@@ -8679,6 +8735,107 @@ mips_global_pointer (void)
   return GLOBAL_POINTER_REGNUM;
 }
 
+/* Return true if we have chosen to use a global pointer for the current
+   function.
+
+   One problem we have to deal with is that, when emitting GOT-based
+   position independent code, long-branch sequences will need to load
+   the address of the branch target from the GOT.  We don't know until
+   the very end of compilation whether (and where) the function needs
+   long branches, so we must ensure that _any_ branch can access the
+   global pointer in some form.  However, we do not want to pessimize
+   the usual case in which all branches are short.
+
+   We handle this as follows:
+
+   - During reload, we set cfun->machine->global_pointer to
+     INVALID_REGNUM if we _know_ that the current function
+     doesn't need a global pointer.  This is only valid if
+     long branches don't need the GOT.
+
+     Otherwise, we assume that we might need a global pointer
+     and pick an appropriate register.
+
+   - During prologue generation, we set cfun->machine->chosen_to_use_gp_p
+     if we already know that the function needs a global pointer.
+     (There is no need to set it earlier than this, and doing it
+     as late as possible leads to fewer false positives.)
+
+   - During prologue and epilogue generation, we emit "ghost"
+     placeholder instructions to manipulate the global pointer.
+
+   - If cfun->machine->global_pointer != INVALID_REGNUM,
+     we ensure that the global pointer is available at every block
+     boundary bar entry and exit.  We do this in one of two ways:
+
+     - If the function has a cprestore slot, we ensure that this
+       slot is valid at every branch.  However, we do not guarantee that
+       the global pointer register itself is valid.  (We don't make the
+       latter guarantee because we want to be able to delete redundant
+       loads from the cprestore slot in the usual case where long
+       branches aren't needed.)
+
+       Long branches must therefore load the global pointer value from
+       the cprestore slot before using it.
+
+       We guarantee that the cprestore slot is valid by loading it
+       into a fake register, CPRESTORE_SLOT_REGNUM.  We then make
+       this register live at every block boundary bar function entry
+       and exit.  It is then invalid to move the load (and thus the
+       preceding store) across a block boundary.
+
+     - If the function has no cprestore slot, then we guarantee that
+       the global pointer register itself is valid at every branch.
+
+     See mips_eh_uses for the handling of the register liveness.
+
+   - After prologue and epilogue creation (epilogue_completed),
+     we have two choses:
+
+     - If cfun->machine->chosen_to_use_gp_p, we split the ghost
+       instructions into real instructions.  On old-ABI targets,
+       we also split calls and restore_gp patterns into instructions
+       that explicitly load the global pointer from the cprestore slot.
+       These split instructions can then be optimized in the usual way.
+
+     - Otherwise, we keep the ghost instructions intact.  On old-ABI
+       targets, we split calls and restore_gp patterns into instructions
+       that _don't_ load from the cprestore slot.  This optimizes for the
+       case where no global pointer is needed.
+
+     For example, the scheduler ignores a ghost instruction that
+     stores $gp to the stack, but it handles the split form of
+     the ghost instruction as an ordinary store.
+
+   - If cfun->machine->chosen_to_use_gp_p is still false at the end
+     of md_reorg, we decide whether the global pointer is needed for
+     long branches.  If so, we split the ghost instructions into
+     real instructions at that stage.
+
+   Note that the ghost instructions must have a zero length for three reasons:
+
+   - Giving the length of the underlying $gp sequence might cause
+     us to use long branches in cases where they aren't really needed.
+
+   - They would perturb things like alignment calculations.
+
+   - More importantly, the hazard detection in md_reorg relies on
+     empty instructions having a zero length.
+
+   If we find a long branch and split the ghost instructions at the
+   end of md_reorg, the split could introduce more long branches.
+   That isn't a problem though, because we still do the split before
+   the final shorten_branches pass.
+
+   This is extremely ugly, but it seems like the best compromise between
+   correctness and efficiency.  */
+
+bool
+mips_chosen_to_use_gp_p (void)
+{
+  return cfun->machine->chosen_to_use_gp_p;
+}
+
 /* Return true if REGNO is a register that is ordinarily call-clobbered
    but must nevertheless be preserved by an interrupt handler.  */
 
@@ -9198,39 +9355,115 @@ mips_set_return_address (rtx address, rt
   mips_emit_move (gen_frame_mem (GET_MODE (address), slot_address), address);
 }
 
-/* Return a MEM rtx for the cprestore slot, using TEMP as a temporary base
+/* Return true if the current function has a cprestore slot.  */
+
+bool
+mips_cfun_has_cprestore_slot_p (void)
+{
+  return (cfun->machine->global_pointer != INVALID_REGNUM
+	  && cfun->machine->frame.cprestore_size > 0);
+}
+
+/* Fill *BASE and *OFFSET such that *BASE + *OFFSET refers to the
+   cprestore slot.  */
+
+static void
+mips_get_cprestore_base_and_offset (rtx *base, HOST_WIDE_INT *offset)
+{
+  const struct mips_frame_info *frame;
+
+  frame = &cfun->machine->frame;
+  if (frame_pointer_needed)
+    {
+      *base = hard_frame_pointer_rtx;
+      *offset = frame->args_size - frame->hard_frame_pointer_offset;
+    }
+  else
+    {
+      *base = stack_pointer_rtx;
+      *offset = frame->args_size;
+    }
+}
+
+/* Return true if X is the stack-pointer-based address of the cprestore
+   slot.  */
+
+bool
+mips_cprestore_sp_address_p (rtx x)
+{
+  rtx base;
+  HOST_WIDE_INT offset;
+
+  mips_split_plus (x, &base, &offset);
+  return base == stack_pointer_rtx && offset == cfun->machine->frame.args_size;
+}
+
+/* Return true if the offset of the cprestore slot from the appropriate
+   base pointer (stack or frame) is outside the range of a load.  */
+
+bool
+mips_large_cprestore_offset_p (void)
+{
+  rtx base;
+  HOST_WIDE_INT offset;
+
+  mips_get_cprestore_base_and_offset (&base, &offset);
+  return !SMALL_OPERAND (offset);
+}
+
+/* Return a MEM rtx for the cprestore slot.  Use TEMP as a temporary
    register if need be.  */
 
 static rtx
 mips_cprestore_slot (rtx temp)
 {
-  const struct mips_frame_info *frame;
   rtx base;
   HOST_WIDE_INT offset;
 
-  frame = &cfun->machine->frame;
-  if (frame_pointer_needed)
+  mips_get_cprestore_base_and_offset (&base, &offset);
+  return gen_frame_mem (Pmode, mips_add_offset (temp, base, offset));
+}
+
+/* Emit instructions to save global pointer value GP into cprestore
+   slot MEM.  MEM is the stack-pointer-based address of the slot;
+   the stack offset is OFFSET.
+
+   MEM may not be a legitimate address.  If it isn't, TEMP is a
+   temporary register that can be used, otherwise it is a SCRATCH.  */
+
+void
+mips_save_gp_to_cprestore_slot (rtx mem, rtx offset, rtx gp, rtx temp)
+{
+  rtx addr;
+
+  if (TARGET_CPRESTORE_DIRECTIVE)
     {
-      base = hard_frame_pointer_rtx;
-      offset = frame->args_size - frame->hard_frame_pointer_offset;
+      gcc_assert (gp == pic_offset_table_rtx);
+      emit_insn (gen_cprestore (mem, offset));
     }
   else
     {
-      base = stack_pointer_rtx;
-      offset = frame->args_size;
+      if (!memory_operand (mem, SImode))
+	{
+	  addr = mips_add_offset (temp, stack_pointer_rtx, INTVAL (offset));
+	  mem = replace_equiv_address (mem, addr);
+	}
+      mips_emit_move (mem, gp);
     }
-  return gen_frame_mem (Pmode, mips_add_offset (temp, base, offset));
 }
 
 /* Restore $gp from its save slot, using TEMP as a temporary base register
-   if need be.  This function is for o32 and o64 abicalls only.  */
+   if need be.  This function is for o32 and o64 abicalls only.
+
+   See mips_chosen_to_use_gp_p for details about how we manage the
+   global pointer.  */
 
 void
-mips_restore_gp (rtx temp)
+mips_restore_gp_from_cprestore_slot (rtx temp)
 {
-  gcc_assert (TARGET_ABICALLS && TARGET_OLDABI);
+  gcc_assert (TARGET_ABICALLS && TARGET_OLDABI && epilogue_completed);
 
-  if (cfun->machine->global_pointer == INVALID_REGNUM)
+  if (!mips_chosen_to_use_gp_p ())
     return;
 
   if (TARGET_MIPS16)
@@ -9327,6 +9560,89 @@ mips_for_each_saved_gpr_and_fpr (HOST_WI
 	offset -= GET_MODE_SIZE (fpr_mode);
       }
 }
+
+/* Return true if a move between register REGNO and its save slot (MEM)
+   can be done in a single move.  LOAD_P is true if we are loading
+   from the slot, false if we are storing to it.  */
+
+static bool
+mips_direct_save_slot_move_p (unsigned int regno, rtx mem, bool load_p)
+{
+  /* There is a specific MIPS16 instruction for saving $31 to the stack.  */
+  if (TARGET_MIPS16 && !load_p && regno == GP_REG_FIRST + 31)
+    return false;
+
+  return mips_secondary_reload_class (REGNO_REG_CLASS (regno),
+				      GET_MODE (mem), mem, load_p) == NO_REGS;
+}
+
+/* Emit a move from SRC to DEST, given that one of them is a register
+   save slot and that the other is a register.  TEMP is a temporary
+   GPR of the same mode that is available if need be.  */
+
+void
+mips_emit_save_slot_move (rtx dest, rtx src, rtx temp)
+{
+  unsigned int regno;
+  rtx mem;
+
+  if (REG_P (src))
+    {
+      regno = REGNO (src);
+      mem = dest;
+    }
+  else
+    {
+      regno = REGNO (dest);
+      mem = src;
+    }
+
+  if (regno == cfun->machine->global_pointer && !mips_chosen_to_use_gp_p ())
+    {
+      /* We don't yet know whether we'll need this instruction or not.
+	 Postpone the decision by emitting a ghost move.  This move
+	 is specifically not frame-related: only the split version is.  */
+      if (TARGET_64BIT)
+	emit_insn (gen_move_gpdi (dest, src));
+      else
+	emit_insn (gen_move_gpsi (dest, src));
+      return;
+    }
+
+  if (regno == HI_REGNUM)
+    {
+      if (REG_P (dest))
+	{
+	  mips_emit_move (temp, src);
+	  if (TARGET_64BIT)
+	    emit_insn (gen_mthisi_di (gen_rtx_REG (TImode, MD_REG_FIRST),
+				      temp, gen_rtx_REG (DImode, LO_REGNUM)));
+	  else
+	    emit_insn (gen_mthisi_di (gen_rtx_REG (DImode, MD_REG_FIRST),
+				      temp, gen_rtx_REG (SImode, LO_REGNUM)));
+	}
+      else
+	{
+	  if (TARGET_64BIT)
+	    emit_insn (gen_mfhidi_ti (temp,
+				      gen_rtx_REG (TImode, MD_REG_FIRST)));
+	  else
+	    emit_insn (gen_mfhisi_di (temp,
+				      gen_rtx_REG (DImode, MD_REG_FIRST)));
+	  mips_emit_move (dest, temp);
+	}
+    }
+  else if (mips_direct_save_slot_move_p (regno, mem, mem == src))
+    mips_emit_move (dest, src);
+  else
+    {
+      gcc_assert (!reg_overlap_mentioned_p (dest, temp));
+      mips_emit_move (temp, src);
+      mips_emit_move (dest, temp);
+    }
+  if (MEM_P (dest))
+    mips_set_frame_expr (mips_frame_set (dest, src));
+}
 
 /* If we're generating n32 or n64 abicalls, and the current function
    does not use $28 as its global pointer, emit a cplocal directive.
@@ -9336,7 +9652,7 @@ mips_for_each_saved_gpr_and_fpr (HOST_WI
 mips_output_cplocal (void)
 {
   if (!TARGET_EXPLICIT_RELOCS
-      && cfun->machine->global_pointer != INVALID_REGNUM
+      && mips_chosen_to_use_gp_p ()
       && cfun->machine->global_pointer != GLOBAL_POINTER_REGNUM)
     output_asm_insn (".cplocal %+", 0);
 }
@@ -9408,7 +9724,8 @@ mips_output_function_prologue (FILE *fil
   /* Handle the initialization of $gp for SVR4 PIC, if applicable.
      Also emit the ".set noreorder; .set nomacro" sequence for functions
      that need it.  */
-  if (mips_current_loadgp_style () == LOADGP_OLDABI)
+  if (mips_chosen_to_use_gp_p ()
+      && mips_current_loadgp_style () == LOADGP_OLDABI)
     {
       if (TARGET_MIPS16)
 	{
@@ -9490,33 +9807,7 @@ mips_save_reg (rtx reg, rtx mem)
       mips_set_frame_expr (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, x1, x2)));
     }
   else
-    {
-      if (REGNO (reg) == HI_REGNUM)
-	{
-	  if (TARGET_64BIT)
-	    emit_insn (gen_mfhidi_ti (MIPS_PROLOGUE_TEMP (DImode),
-				      gen_rtx_REG (TImode, MD_REG_FIRST)));
-	  else
-	    emit_insn (gen_mfhisi_di (MIPS_PROLOGUE_TEMP (SImode),
-				      gen_rtx_REG (DImode, MD_REG_FIRST)));
-	  mips_emit_move (mem, MIPS_PROLOGUE_TEMP (GET_MODE (reg)));
-	}
-      else if ((TARGET_MIPS16
-		&& REGNO (reg) != GP_REG_FIRST + 31
-		&& !M16_REG_P (REGNO (reg)))
-	       || ACC_REG_P (REGNO (reg)))
-	{
-	  /* If the register has no direct store instruction, move it
-	     through a temporary.  Note that there's a special MIPS16
-	     instruction to save $31.  */
-	  mips_emit_move (MIPS_PROLOGUE_TEMP (GET_MODE (reg)), reg);
-	  mips_emit_move (mem, MIPS_PROLOGUE_TEMP (GET_MODE (reg)));
-	}
-      else
-	mips_emit_move (mem, reg);
-
-      mips_set_frame_expr (mips_frame_set (mem, reg));
-    }
+    mips_emit_save_slot_move (mem, reg, MIPS_PROLOGUE_TEMP (GET_MODE (reg)));
 }
 
 /* The __gnu_local_gp symbol.  */
@@ -9599,7 +9890,16 @@ mips_expand_prologue (void)
   rtx insn;
 
   if (cfun->machine->global_pointer != INVALID_REGNUM)
-    SET_REGNO (pic_offset_table_rtx, cfun->machine->global_pointer);
+    {
+      /* Check whether an insn uses pic_offset_table_rtx, either explicitly
+	 or implicitly.  If so, we can commit to using a global pointer
+	 straight away, otherwise we need to defer the decision.  */
+      if (mips_cfun_has_inflexible_gp_ref_p ()
+	  || mips_cfun_has_flexible_gp_ref_p ())
+	cfun->machine->chosen_to_use_gp_p = true;
+
+      SET_REGNO (pic_offset_table_rtx, cfun->machine->global_pointer);
+    }
 
   frame = &cfun->machine->frame;
   size = frame->total_size;
@@ -9801,17 +10101,20 @@ mips_expand_prologue (void)
   mips_emit_loadgp ();
 
   /* Initialize the $gp save slot.  */
-  if (frame->cprestore_size > 0
-      && cfun->machine->global_pointer != INVALID_REGNUM)
+  if (mips_cfun_has_cprestore_slot_p ())
     {
-      if (TARGET_MIPS16)
-	mips_emit_move (mips_cprestore_slot (MIPS_PROLOGUE_TEMP (Pmode)),
-			MIPS16_PIC_TEMP);
-      else if (TARGET_ABICALLS_PIC2)
-	emit_insn (gen_cprestore (GEN_INT (frame->args_size)));
-      else
-	emit_move_insn (mips_cprestore_slot (MIPS_PROLOGUE_TEMP (Pmode)),
-			pic_offset_table_rtx);
+      rtx offset, addr, mem, gp, temp;
+
+      addr = plus_constant (stack_pointer_rtx, frame->args_size);
+      mem = gen_frame_mem (Pmode, addr);
+      offset = GEN_INT (frame->args_size);
+      gp = TARGET_MIPS16 ? MIPS16_PIC_TEMP : pic_offset_table_rtx;
+      temp = (SMALL_OPERAND (frame->args_size)
+	      ? gen_rtx_SCRATCH (Pmode)
+	      : MIPS_PROLOGUE_TEMP (Pmode));
+
+      emit_insn (gen_potential_cprestore (mem, offset, gp, temp));
+      emit_insn (gen_use_cprestore (copy_rtx (mem)));
     }
 
   /* We need to search back to the last use of K0 or K1.  */
@@ -9844,27 +10147,7 @@ mips_restore_reg (rtx reg, rtx mem)
   if (TARGET_MIPS16 && REGNO (reg) == GP_REG_FIRST + 31)
     reg = gen_rtx_REG (GET_MODE (reg), GP_REG_FIRST + 7);
 
-  if (REGNO (reg) == HI_REGNUM)
-    {
-      mips_emit_move (MIPS_EPILOGUE_TEMP (GET_MODE (reg)), mem);
-      if (TARGET_64BIT)
-	emit_insn (gen_mthisi_di (gen_rtx_REG (TImode, MD_REG_FIRST),
-				  MIPS_EPILOGUE_TEMP (DImode),
-				  gen_rtx_REG (DImode, LO_REGNUM)));
-      else
-	emit_insn (gen_mthisi_di (gen_rtx_REG (DImode, MD_REG_FIRST),
-				  MIPS_EPILOGUE_TEMP (SImode),
-				  gen_rtx_REG (SImode, LO_REGNUM)));
-    }
-  else if ((TARGET_MIPS16 && !M16_REG_P (REGNO (reg)))
-	   || ACC_REG_P (REGNO (reg)))
-    {
-      /* Can't restore directly; move through a temporary.  */
-      mips_emit_move (MIPS_EPILOGUE_TEMP (GET_MODE (reg)), mem);
-      mips_emit_move (reg, MIPS_EPILOGUE_TEMP (GET_MODE (reg)));
-    }
-  else
-    mips_emit_move (reg, mem);
+  mips_emit_save_slot_move (reg, mem, MIPS_EPILOGUE_TEMP (GET_MODE (reg)));
 }
 
 /* Emit any instructions needed before a return.  */
@@ -10732,12 +11015,112 @@ mips_init_libfuncs (void)
     synchronize_libfunc = init_one_libfunc ("__sync_synchronize");
 }
 
+/* Build up a multi-insn sequence that loads label TARGET into $AT.  */
+
+static void
+mips_process_load_label (rtx target)
+{
+  rtx base, gp, intop;
+  HOST_WIDE_INT offset;
+
+  mips_multi_start ();
+  switch (mips_abi)
+    {
+    case ABI_N32:
+      mips_multi_add_insn ("lw\t%@,%%got_page(%0)(%+)", target, 0);
+      mips_multi_add_insn ("addiu\t%@,%@,%%got_ofst(%0)", target, 0);
+      break;
+
+    case ABI_64:
+      mips_multi_add_insn ("ld\t%@,%%got_page(%0)(%+)", target, 0);
+      mips_multi_add_insn ("daddiu\t%@,%@,%%got_ofst(%0)", target, 0);
+      break;
+
+    default:
+      gp = pic_offset_table_rtx;
+      if (mips_cfun_has_cprestore_slot_p ())
+	{
+	  gp = gen_rtx_REG (Pmode, AT_REGNUM);
+	  mips_get_cprestore_base_and_offset (&base, &offset);
+	  if (!SMALL_OPERAND (offset))
+	    {
+	      intop = GEN_INT (CONST_HIGH_PART (offset));
+	      mips_multi_add_insn ("lui\t%0,%1", gp, intop, 0);
+	      mips_multi_add_insn ("addu\t%0,%0,%1", gp, base, 0);
+
+	      base = gp;
+	      offset = CONST_LOW_PART (offset);
+	    }
+	  intop = GEN_INT (offset);
+	  if (ISA_HAS_LOAD_DELAY)
+	    mips_multi_add_insn ("lw\t%0,%1(%2)%#", gp, intop, base, 0);
+	  else
+	    mips_multi_add_insn ("lw\t%0,%1(%2)", gp, intop, base, 0);
+	}
+      if (ISA_HAS_LOAD_DELAY)
+	mips_multi_add_insn ("lw\t%@,%%got(%0)(%1)%#", target, gp, 0);
+      else
+	mips_multi_add_insn ("lw\t%@,%%got(%0)(%1)", target, gp, 0);
+      mips_multi_add_insn ("addiu\t%@,%@,%%lo(%0)", target, 0);
+      break;
+    }
+}
+
+/* Return the number of instructions needed to load a label into $AT.  */
+
+static unsigned int
+mips_load_label_length (void)
+{
+  if (cfun->machine->load_label_length == 0)
+    {
+      mips_process_load_label (pc_rtx);
+      cfun->machine->load_label_length = mips_multi_num_insns;
+    }
+  return cfun->machine->load_label_length;
+}
+
+/* Emit an asm sequence to start a noat block and load the address
+   of a label into $1.  */
+
+void
+mips_output_load_label (rtx target)
+{
+  mips_push_asm_switch (&mips_noat);
+  if (TARGET_EXPLICIT_RELOCS)
+    {
+      mips_process_load_label (target);
+      mips_multi_write ();
+    }
+  else
+    {
+      if (Pmode == DImode)
+	output_asm_insn ("dla\t%@,%0", &target);
+      else
+	output_asm_insn ("la\t%@,%0", &target);
+    }
+}
+
 /* Return the length of INSN.  LENGTH is the initial length computed by
    attributes in the machine-description file.  */
 
 int
 mips_adjust_insn_length (rtx insn, int length)
 {
+  /* mips.md uses MAX_PIC_BRANCH_LENGTH as a placeholder for the length
+     of a PIC long-branch sequence.  Substitute the correct value.  */
+  if (length == MAX_PIC_BRANCH_LENGTH
+      && INSN_CODE (insn) >= 0
+      && get_attr_type (insn) == TYPE_BRANCH)
+    {
+      /* Add the branch-over instruction and its delay slot, if this
+	 is a conditional branch.  */
+      length = simplejump_p (insn) ? 0 : 8;
+
+      /* Load the label into $AT and jump to it.  Ignore the delay
+	 slot of the jump.  */
+      length += mips_load_label_length () + 4;
+    }
+
   /* A unconditional jump has an unfilled delay slot if it is not part
      of a sequence.  A conditional jump normally has a delay slot, but
      does not on MIPS16.  */
@@ -10769,38 +11152,9 @@ mips_adjust_insn_length (rtx insn, int l
   return length;
 }
 
-/* Return an asm sequence to start a noat block and load the address
-   of a label into $1.  */
-
-const char *
-mips_output_load_label (void)
-{
-  if (TARGET_EXPLICIT_RELOCS)
-    switch (mips_abi)
-      {
-      case ABI_N32:
-	return "%[lw\t%@,%%got_page(%0)(%+)\n\taddiu\t%@,%@,%%got_ofst(%0)";
-
-      case ABI_64:
-	return "%[ld\t%@,%%got_page(%0)(%+)\n\tdaddiu\t%@,%@,%%got_ofst(%0)";
-
-      default:
-	if (ISA_HAS_LOAD_DELAY)
-	  return "%[lw\t%@,%%got(%0)(%+)%#\n\taddiu\t%@,%@,%%lo(%0)";
-	return "%[lw\t%@,%%got(%0)(%+)\n\taddiu\t%@,%@,%%lo(%0)";
-      }
-  else
-    {
-      if (Pmode == DImode)
-	return "%[dla\t%@,%0";
-      else
-	return "%[la\t%@,%0";
-    }
-}
-
 /* Return the assembly code for INSN, which has the operands given by
-   OPERANDS, and which branches to OPERANDS[1] if some condition is true.
-   BRANCH_IF_TRUE is the asm template that should be used if OPERANDS[1]
+   OPERANDS, and which branches to OPERANDS[0] if some condition is true.
+   BRANCH_IF_TRUE is the asm template that should be used if OPERANDS[0]
    is in range of a direct branch.  BRANCH_IF_FALSE is an inverted
    version of BRANCH_IF_TRUE.  */
 
@@ -10812,7 +11166,7 @@ mips_output_conditional_branch (rtx insn
   unsigned int length;
   rtx taken, not_taken;
 
-  gcc_assert (LABEL_P (operands[1]));  
+  gcc_assert (LABEL_P (operands[0]));
 
   length = get_attr_length (insn);
   if (length <= 8)
@@ -10826,10 +11180,10 @@ mips_output_conditional_branch (rtx insn
      not use branch-likely instructions.  */
   mips_branch_likely = false;
   not_taken = gen_label_rtx ();
-  taken = operands[1];
+  taken = operands[0];
 
   /* Generate the reversed branch to NOT_TAKEN.  */
-  operands[1] = not_taken;
+  operands[0] = not_taken;
   output_asm_insn (branch_if_false, operands);
 
   /* If INSN has a delay slot, we must provide delay slots for both the
@@ -10851,11 +11205,11 @@ mips_output_conditional_branch (rtx insn
     }
 
   /* Output the unconditional branch to TAKEN.  */
-  if (length <= 16)
-    output_asm_insn ("j\t%0%/", &taken);
+  if (TARGET_ABSOLUTE_JUMPS)
+    output_asm_insn (MIPS_ABSOLUTE_JUMP ("j\t%0%/"), &taken);
   else
     {
-      output_asm_insn (mips_output_load_label (), &taken);
+      mips_output_load_label (taken);
       output_asm_insn ("jr\t%@%]%/", 0);
     }
 
@@ -10881,10 +11235,10 @@ mips_output_conditional_branch (rtx insn
   return "";
 }
 
-/* Return the assembly code for INSN, which branches to OPERANDS[1]
+/* Return the assembly code for INSN, which branches to OPERANDS[0]
    if some ordering condition is true.  The condition is given by
-   OPERANDS[0] if !INVERTED_P, otherwise it is the inverse of
-   OPERANDS[0].  OPERANDS[2] is the comparison's first operand;
+   OPERANDS[1] if !INVERTED_P, otherwise it is the inverse of
+   OPERANDS[1].  OPERANDS[2] is the comparison's first operand;
    its second is always zero.  */
 
 const char *
@@ -10892,17 +11246,17 @@ mips_output_order_conditional_branch (rt
 {
   const char *branch[2];
 
-  /* Make BRANCH[1] branch to OPERANDS[1] when the condition is true.
+  /* Make BRANCH[1] branch to OPERANDS[0] when the condition is true.
      Make BRANCH[0] branch on the inverse condition.  */
-  switch (GET_CODE (operands[0]))
+  switch (GET_CODE (operands[1]))
     {
       /* These cases are equivalent to comparisons against zero.  */
     case LEU:
       inverted_p = !inverted_p;
       /* Fall through.  */
     case GTU:
-      branch[!inverted_p] = MIPS_BRANCH ("bne", "%2,%.,%1");
-      branch[inverted_p] = MIPS_BRANCH ("beq", "%2,%.,%1");
+      branch[!inverted_p] = MIPS_BRANCH ("bne", "%2,%.,%0");
+      branch[inverted_p] = MIPS_BRANCH ("beq", "%2,%.,%0");
       break;
 
       /* These cases are always true or always false.  */
@@ -10910,13 +11264,13 @@ mips_output_order_conditional_branch (rt
       inverted_p = !inverted_p;
       /* Fall through.  */
     case GEU:
-      branch[!inverted_p] = MIPS_BRANCH ("beq", "%.,%.,%1");
-      branch[inverted_p] = MIPS_BRANCH ("bne", "%.,%.,%1");
+      branch[!inverted_p] = MIPS_BRANCH ("beq", "%.,%.,%0");
+      branch[inverted_p] = MIPS_BRANCH ("bne", "%.,%.,%0");
       break;
 
     default:
-      branch[!inverted_p] = MIPS_BRANCH ("b%C0z", "%2,%1");
-      branch[inverted_p] = MIPS_BRANCH ("b%N0z", "%2,%1");
+      branch[!inverted_p] = MIPS_BRANCH ("b%C1z", "%2,%0");
+      branch[inverted_p] = MIPS_BRANCH ("b%N1z", "%2,%0");
       break;
     }
   return mips_output_conditional_branch (insn, operands, branch[1], branch[0]);
@@ -11915,7 +12269,8 @@ mips_variable_issue (FILE *file ATTRIBUT
   /* Ignore USEs and CLOBBERs; don't count them against the issue rate.  */
   if (USEFUL_INSN_P (insn))
     {
-      more--;
+      if (get_attr_type (insn) != TYPE_GHOST)
+	more--;
       if (!reload_completed && TUNE_MACC_CHAINS)
 	mips_macc_chains_record (insn);
       vr4130_last_insn = insn;
@@ -14173,6 +14528,46 @@ mips_reorg_process_insns (void)
   htab_delete (htab);
 }
 
+/* If we are using a GOT, but have not decided to use a global pointer yet,
+   see whether we need one to implement long branches.  Convert the ghost
+   global-pointer instructions into real ones if so.  */
+
+static bool
+mips_expand_ghost_gp_insns (void)
+{
+  rtx insn;
+  int normal_length;
+
+  /* Quick exit if we already know that we will or won't need a
+     global pointer.  */
+  if (!TARGET_USE_GOT
+      || cfun->machine->global_pointer == INVALID_REGNUM
+      || mips_chosen_to_use_gp_p ())
+    return false;
+
+  shorten_branches (get_insns ());
+
+  /* Look for a branch that is longer than normal.  The normal length for
+     non-MIPS16 branches is 8, because the length includes the delay slot.
+     It is 4 for MIPS16, because MIPS16 branches are extended instructions,
+     but they have no delay slot.  */
+  normal_length = (TARGET_MIPS16 ? 4 : 8);
+  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
+    if (JUMP_P (insn)
+	&& USEFUL_INSN_P (insn)
+	&& get_attr_length (insn) > normal_length)
+      break;
+
+  if (insn == NULL_RTX)
+    return false;
+
+  /* We've now established that we need $gp.  */
+  cfun->machine->chosen_to_use_gp_p = true;
+  split_all_insns_noflow ();
+
+  return true;
+}
+
 /* Implement TARGET_MACHINE_DEPENDENT_REORG.  */
 
 static void
@@ -14189,6 +14584,10 @@ mips_reorg (void)
       && TUNE_MIPS4130
       && TARGET_VR4130_ALIGN)
     vr4130_align_insns ();
+  if (mips_expand_ghost_gp_insns ())
+    /* The expansion could invalidate some of the VR4130 alignment
+       optimizations, but this should be an extremely rare case anyhow.  */
+    mips_reorg_process_insns ();
 }
 
 /* Implement TARGET_ASM_OUTPUT_MI_THUNK.  Generate rtl rather than asm text
@@ -14222,6 +14621,7 @@ mips_output_mi_thunk (FILE *file, tree t
 	 TARGET_CALL_SAVED_GP.  */
       cfun->machine->global_pointer
 	= TARGET_CALL_SAVED_GP ? 15 : GLOBAL_POINTER_REGNUM;
+      cfun->machine->chosen_to_use_gp_p = true;
       SET_REGNO (pic_offset_table_rtx, cfun->machine->global_pointer);
 
       /* Set up the global pointer for n32 or n64 abicalls.  */
@@ -15137,6 +15537,30 @@ mips_order_regs_for_local_alloc (void)
     }
 }
 
+/* Implement EH_USES.  */
+
+bool
+mips_eh_uses (unsigned int regno)
+{
+  if (reload_completed && !TARGET_ABSOLUTE_JUMPS)
+    {
+      /* We need to force certain registers to be live in order to handle PIC
+	 long branches correctly.  See mips_chosen_to_use_gp_p for details.  */
+      if (mips_cfun_has_cprestore_slot_p ())
+	{
+	  if (regno == CPRESTORE_SLOT_REGNUM)
+	    return true;
+	}
+      else
+	{
+	  if (cfun->machine->global_pointer == regno)
+	    return true;
+	}
+    }
+
+  return false;
+}
+
 /* Implement EPILOGUE_USES.  */
 
 bool
Index: gcc/config/mips/predicates.md
===================================================================
--- gcc/config/mips/predicates.md	2009-08-31 08:38:27.000000000 +0100
+++ gcc/config/mips/predicates.md	2009-08-31 08:39:25.000000000 +0100
@@ -244,6 +244,10 @@ (define_predicate "move_operand"
     }
 })
 
+(define_predicate "cprestore_sp_slot_operand"
+  (and (match_code "mem")
+       (match_test "mips_cprestore_sp_address_p (XEXP (op, 0))")))
+
 (define_predicate "consttable_operand"
   (match_test "CONSTANT_P (op)"))
 
Index: gcc/config/mips/mips.md
===================================================================
--- gcc/config/mips/mips.md	2009-08-31 08:38:27.000000000 +0100
+++ gcc/config/mips/mips.md	2009-08-31 08:50:06.000000000 +0100
@@ -29,11 +29,13 @@ (define_constants
    (UNSPEC_STORE_WORD		 2)
    (UNSPEC_GET_FNADDR		 3)
    (UNSPEC_BLOCKAGE		 4)
-   (UNSPEC_CPRESTORE		 5)
-   (UNSPEC_RESTORE_GP		 6)
-   (UNSPEC_EH_RETURN		 7)
-   (UNSPEC_CONSTTABLE_INT	 8)
-   (UNSPEC_CONSTTABLE_FLOAT	 9)
+   (UNSPEC_POTENTIAL_CPRESTORE	 5)
+   (UNSPEC_CPRESTORE		 6)
+   (UNSPEC_RESTORE_GP		 7)
+   (UNSPEC_MOVE_GP		 8)
+   (UNSPEC_EH_RETURN		 9)
+   (UNSPEC_CONSTTABLE_INT	10)
+   (UNSPEC_CONSTTABLE_FLOAT	11)
    (UNSPEC_ALIGN		14)
    (UNSPEC_HIGH			17)
    (UNSPEC_LOAD_LEFT		18)
@@ -77,6 +79,7 @@ (define_constants
    (UNSPEC_ADDRESS_FIRST	100)
 
    (TLS_GET_TP_REGNUM		3)
+   (CPRESTORE_SLOT_REGNUM	76)
    (GOT_VERSION_REGNUM		79)
 
    ;; For MIPS Paired-Singled Floating Point Instructions.
@@ -256,6 +259,9 @@ (define_constants
 
    (UNSPEC_MIPS_CACHE		600)
    (UNSPEC_R10K_CACHE_BARRIER	601)
+
+   ;; PIC long branch sequences are never longer than 100 bytes.
+   (MAX_PIC_BRANCH_LENGTH	100)
   ]
 )
 
@@ -281,12 +287,11 @@ (define_attr "jal" "unset,direct,indirec
 ;;
 ;; jal is always a macro for TARGET_CALL_CLOBBERED_GP because it includes
 ;; an instruction to restore $gp.  Direct jals are also macros for
-;; flag_pic && !TARGET_ABSOLUTE_ABICALLS because they first load
-;; the target address into a register.
+;; !TARGET_ABSOLUTE_JUMPS because they first load the target address
+;; into a register.
 (define_attr "jal_macro" "no,yes"
   (cond [(eq_attr "jal" "direct")
-	 (symbol_ref "((TARGET_CALL_CLOBBERED_GP
-			|| (flag_pic && !TARGET_ABSOLUTE_ABICALLS))
+	 (symbol_ref "(TARGET_CALL_CLOBBERED_GP || !TARGET_ABSOLUTE_JUMPS
 		       ? JAL_MACRO_YES : JAL_MACRO_NO)")
 	 (eq_attr "jal" "indirect")
 	 (symbol_ref "(TARGET_CALL_CLOBBERED_GP
@@ -498,9 +503,10 @@ (define_attr "length" ""
 	       (ne (symbol_ref "TARGET_MIPS16") (const_int 0)))
 	  (const_int 8)
 
-	  ;; Direct branch instructions have a range of [-0x40000,0x3fffc].
-	  ;; If a branch is outside this range, we have a choice of two
-	  ;; sequences.  For PIC, an out-of-range branch like:
+	  ;; Direct branch instructions have a range of [-0x20000,0x1fffc],
+	  ;; relative to the address of the delay slot.  If a branch is
+	  ;; outside this range, we have a choice of two sequences.
+	  ;; For PIC, an out-of-range branch like:
 	  ;;
 	  ;;	bne	r1,r2,target
 	  ;;	dslot
@@ -514,9 +520,6 @@ (define_attr "length" ""
 	  ;;	nop
 	  ;; 1:
 	  ;;
-	  ;; where the load address can be up to three instructions long
-	  ;; (lw, nop, addiu).
-	  ;;
 	  ;; The non-PIC case is similar except that we use a direct
 	  ;; jump instead of an la/jr pair.  Since the target of this
 	  ;; jump is an absolute 28-bit bit address (the other bits
@@ -531,12 +534,21 @@ (define_attr "length" ""
 	  ;; will add the length of the implicit nop.  The values for
 	  ;; forward and backward branches will be different as well.
 	  (eq_attr "type" "branch")
-	  (cond [(and (le (minus (match_dup 1) (pc)) (const_int 131064))
-                      (le (minus (pc) (match_dup 1)) (const_int 131068)))
-                  (const_int 4)
-		 (ne (symbol_ref "flag_pic") (const_int 0))
-		 (const_int 24)
-		 ] (const_int 12))
+	  (cond [(and (le (minus (match_dup 0) (pc)) (const_int 131064))
+			  (le (minus (pc) (match_dup 0)) (const_int 131068)))
+		   (const_int 4)
+
+		 ;; The non-PIC case: branch, first delay slot, and J.
+		 (ne (symbol_ref "TARGET_ABSOLUTE_JUMPS") (const_int 0))
+		   (const_int 12)]
+
+		 ;; Use MAX_PIC_BRANCH_LENGTH as a (gross) overestimate.
+		 ;; mips_adjust_insn_length substitutes the correct length.
+		 ;;
+		 ;; Note that we can't simply use (symbol_ref ...) here
+		 ;; because genattrtab needs to know the maximum length
+		 ;; of an insn.
+		 (const_int MAX_PIC_BRANCH_LENGTH))
 
 	  ;; "Ghost" instructions occupy no space.
 	  (eq_attr "type" "ghost")
@@ -4754,12 +4766,12 @@ (define_expand "load_const_gp_<mode>"
 ;; function address.
 (define_insn_and_split "loadgp_newabi_<mode>"
   [(set (match_operand:P 0 "register_operand" "=d")
-	(unspec_volatile:P [(match_operand:P 1)
-			    (match_operand:P 2 "register_operand" "d")]
-			   UNSPEC_LOADGP))]
+	(unspec:P [(match_operand:P 1)
+		   (match_operand:P 2 "register_operand" "d")]
+		  UNSPEC_LOADGP))]
   "mips_current_loadgp_style () == LOADGP_NEWABI"
-  "#"
-  ""
+  { return mips_chosen_to_use_gp_p () ? "#" : ""; }
+  "&& mips_chosen_to_use_gp_p ()"
   [(set (match_dup 0) (match_dup 3))
    (set (match_dup 0) (match_dup 4))
    (set (match_dup 0) (match_dup 5))]
@@ -4768,21 +4780,21 @@ (define_insn_and_split "loadgp_newabi_<m
   operands[4] = gen_rtx_PLUS (Pmode, operands[0], operands[2]);
   operands[5] = gen_rtx_LO_SUM (Pmode, operands[0], operands[1]);
 }
-  [(set_attr "length" "12")])
+  [(set_attr "type" "ghost")])
 
 ;; Likewise, for -mno-shared code.  Operand 0 is the __gnu_local_gp symbol.
 (define_insn_and_split "loadgp_absolute_<mode>"
   [(set (match_operand:P 0 "register_operand" "=d")
-	(unspec_volatile:P [(match_operand:P 1)] UNSPEC_LOADGP))]
+	(unspec:P [(match_operand:P 1)] UNSPEC_LOADGP))]
   "mips_current_loadgp_style () == LOADGP_ABSOLUTE"
-  "#"
-  ""
+  { return mips_chosen_to_use_gp_p () ? "#" : ""; }
+  "&& mips_chosen_to_use_gp_p ()"
   [(const_int 0)]
 {
   mips_emit_move (operands[0], operands[1]);
   DONE;
 }
-  [(set_attr "length" "8")])
+  [(set_attr "type" "ghost")])
 
 ;; This blockage instruction prevents the gp load from being
 ;; scheduled after an implicit use of gp.  It also prevents
@@ -4791,19 +4803,18 @@ (define_insn "loadgp_blockage"
   [(unspec_volatile [(reg:SI 28)] UNSPEC_BLOCKAGE)]
   ""
   ""
-  [(set_attr "type" "ghost")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "ghost")])
 
 ;; Initialize $gp for RTP PIC.  Operand 0 is the __GOTT_BASE__ symbol
 ;; and operand 1 is the __GOTT_INDEX__ symbol.
 (define_insn_and_split "loadgp_rtp_<mode>"
   [(set (match_operand:P 0 "register_operand" "=d")
-	(unspec_volatile:P [(match_operand:P 1 "symbol_ref_operand")
-			    (match_operand:P 2 "symbol_ref_operand")]
-			   UNSPEC_LOADGP))]
+	(unspec:P [(match_operand:P 1 "symbol_ref_operand")
+		   (match_operand:P 2 "symbol_ref_operand")]
+		  UNSPEC_LOADGP))]
   "mips_current_loadgp_style () == LOADGP_RTP"
-  "#"
-  ""
+  { return mips_chosen_to_use_gp_p () ? "#" : ""; }
+  "&& mips_chosen_to_use_gp_p ()"
   [(set (match_dup 0) (high:P (match_dup 3)))
    (set (match_dup 0) (unspec:P [(match_dup 0)
 				 (match_dup 3)] UNSPEC_LOAD_GOT))
@@ -4813,37 +4824,72 @@ (define_insn_and_split "loadgp_rtp_<mode
   operands[3] = mips_unspec_address (operands[1], SYMBOL_ABSOLUTE);
   operands[4] = mips_unspec_address (operands[2], SYMBOL_HALF);
 }
-  [(set_attr "length" "12")])
+  [(set_attr "type" "ghost")])
 
 ;; Initialize the global pointer for MIPS16 code.  Operand 0 is the
 ;; global pointer and operand 1 is the MIPS16 register that holds
 ;; the required value.
 (define_insn_and_split "copygp_mips16"
   [(set (match_operand:SI 0 "register_operand" "=y")
-	(unspec_volatile:SI [(match_operand:SI 1 "register_operand" "d")]
-			    UNSPEC_COPYGP))]
+	(unspec:SI [(match_operand:SI 1 "register_operand" "d")]
+		   UNSPEC_COPYGP))]
   "TARGET_MIPS16"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 0) (match_dup 1))])
+  { return mips_chosen_to_use_gp_p () ? "#" : ""; }
+  "&& mips_chosen_to_use_gp_p ()"
+  [(set (match_dup 0) (match_dup 1))]
+  ""
+  [(set_attr "type" "ghost")])
+
+;; A placeholder for where the cprestore instruction should go,
+;; if we decide we need one.  Operand 0 and operand 1 are as for
+;; "cprestore".  Operand 2 is a register that holds the gp value.
+;;
+;; The "cprestore" pattern requires operand 2 to be pic_offset_table_rtx,
+;; otherwise any register that holds the correct value will do.
+(define_insn_and_split "potential_cprestore"
+  [(set (match_operand:SI 0 "cprestore_sp_slot_operand" "=X,X")
+	(unspec:SI [(match_operand:SI 1 "const_int_operand" "I,i")
+		    (match_operand:SI 2 "register_operand" "d,d")]
+		   UNSPEC_POTENTIAL_CPRESTORE))
+   (clobber (match_operand:SI 3 "scratch_operand" "=X,&d"))]
+  "!TARGET_CPRESTORE_DIRECTIVE || operands[2] == pic_offset_table_rtx"
+  { return mips_chosen_to_use_gp_p () ? "#" : ""; }
+  "mips_chosen_to_use_gp_p ()"
+  [(const_int 0)]
+{
+  mips_save_gp_to_cprestore_slot (operands[0], operands[1],
+				  operands[2], operands[3]);
+  DONE;
+}
+  [(set_attr "type" "ghost")])
 
 ;; Emit a .cprestore directive, which normally expands to a single store
-;; instruction.  Note that we continue to use .cprestore for explicit reloc
-;; code so that jals inside inline asms will work correctly.
+;; instruction.  Operand 0 is a (possibly illegitimate) sp-based MEM
+;; for the cprestore slot.  Operand 1 is the offset of the slot from
+;; the stack pointer.  (This is redundant with operand 0, but it makes
+;; things a little simpler.)
 (define_insn "cprestore"
-  [(unspec_volatile [(match_operand 0 "const_int_operand" "I,i")
-                     (use (reg:SI 28))]
-		    UNSPEC_CPRESTORE)]
-  ""
+  [(set (match_operand:SI 0 "cprestore_sp_slot_operand" "=X,X")
+	(unspec:SI [(match_operand:SI 1 "const_int_operand" "I,i")
+		    (reg:SI 28)]
+		   UNSPEC_CPRESTORE))]
+  "TARGET_CPRESTORE_DIRECTIVE"
 {
   if (mips_nomacro.nesting_level > 0 && which_alternative == 1)
-    return ".set\tmacro\;.cprestore\t%0\;.set\tnomacro";
+    return ".set\tmacro\;.cprestore\t%1\;.set\tnomacro";
   else
-    return ".cprestore\t%0";
+    return ".cprestore\t%1";
 }
   [(set_attr "type" "store")
    (set_attr "length" "4,12")])
 
+(define_insn "use_cprestore"
+  [(set (reg:SI CPRESTORE_SLOT_REGNUM)
+	(match_operand:SI 0 "cprestore_sp_slot_operand"))]
+  ""
+  ""
+  [(set_attr "type" "ghost")])
+
 ;; Expand in-line code to clear the instruction cache between operand[0] and
 ;; operand[1].
 (define_expand "clear_cache"
@@ -5149,100 +5195,94 @@ (define_insn "rotr<mode>3"
 (define_insn "*branch_fp"
   [(set (pc)
         (if_then_else
-         (match_operator 0 "equality_operator"
+         (match_operator 1 "equality_operator"
                          [(match_operand:CC 2 "register_operand" "z")
 			  (const_int 0)])
-         (label_ref (match_operand 1 "" ""))
+         (label_ref (match_operand 0 "" ""))
          (pc)))]
   "TARGET_HARD_FLOAT"
 {
   return mips_output_conditional_branch (insn, operands,
-					 MIPS_BRANCH ("b%F0", "%Z2%1"),
-					 MIPS_BRANCH ("b%W0", "%Z2%1"));
+					 MIPS_BRANCH ("b%F1", "%Z2%0"),
+					 MIPS_BRANCH ("b%W1", "%Z2%0"));
 }
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "branch")])
 
 (define_insn "*branch_fp_inverted"
   [(set (pc)
         (if_then_else
-         (match_operator 0 "equality_operator"
+         (match_operator 1 "equality_operator"
                          [(match_operand:CC 2 "register_operand" "z")
 			  (const_int 0)])
          (pc)
-         (label_ref (match_operand 1 "" ""))))]
+         (label_ref (match_operand 0 "" ""))))]
   "TARGET_HARD_FLOAT"
 {
   return mips_output_conditional_branch (insn, operands,
-					 MIPS_BRANCH ("b%W0", "%Z2%1"),
-					 MIPS_BRANCH ("b%F0", "%Z2%1"));
+					 MIPS_BRANCH ("b%W1", "%Z2%0"),
+					 MIPS_BRANCH ("b%F1", "%Z2%0"));
 }
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "branch")])
 
 ;; Conditional branches on ordered comparisons with zero.
 
 (define_insn "*branch_order<mode>"
   [(set (pc)
 	(if_then_else
-	 (match_operator 0 "order_operator"
+	 (match_operator 1 "order_operator"
 			 [(match_operand:GPR 2 "register_operand" "d")
 			  (const_int 0)])
-	 (label_ref (match_operand 1 "" ""))
+	 (label_ref (match_operand 0 "" ""))
 	 (pc)))]
   "!TARGET_MIPS16"
   { return mips_output_order_conditional_branch (insn, operands, false); }
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "branch")])
 
 (define_insn "*branch_order<mode>_inverted"
   [(set (pc)
 	(if_then_else
-	 (match_operator 0 "order_operator"
+	 (match_operator 1 "order_operator"
 			 [(match_operand:GPR 2 "register_operand" "d")
 			  (const_int 0)])
 	 (pc)
-	 (label_ref (match_operand 1 "" ""))))]
+	 (label_ref (match_operand 0 "" ""))))]
   "!TARGET_MIPS16"
   { return mips_output_order_conditional_branch (insn, operands, true); }
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "branch")])
 
 ;; Conditional branch on equality comparison.
 
 (define_insn "*branch_equality<mode>"
   [(set (pc)
 	(if_then_else
-	 (match_operator 0 "equality_operator"
+	 (match_operator 1 "equality_operator"
 			 [(match_operand:GPR 2 "register_operand" "d")
 			  (match_operand:GPR 3 "reg_or_0_operand" "dJ")])
-	 (label_ref (match_operand 1 "" ""))
+	 (label_ref (match_operand 0 "" ""))
 	 (pc)))]
   "!TARGET_MIPS16"
 {
   return mips_output_conditional_branch (insn, operands,
-					 MIPS_BRANCH ("b%C0", "%2,%z3,%1"),
-					 MIPS_BRANCH ("b%N0", "%2,%z3,%1"));
+					 MIPS_BRANCH ("b%C1", "%2,%z3,%0"),
+					 MIPS_BRANCH ("b%N1", "%2,%z3,%0"));
 }
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "branch")])
 
 (define_insn "*branch_equality<mode>_inverted"
   [(set (pc)
 	(if_then_else
-	 (match_operator 0 "equality_operator"
+	 (match_operator 1 "equality_operator"
 			 [(match_operand:GPR 2 "register_operand" "d")
 			  (match_operand:GPR 3 "reg_or_0_operand" "dJ")])
 	 (pc)
-	 (label_ref (match_operand 1 "" ""))))]
+	 (label_ref (match_operand 0 "" ""))))]
   "!TARGET_MIPS16"
 {
   return mips_output_conditional_branch (insn, operands,
-					 MIPS_BRANCH ("b%N0", "%2,%z3,%1"),
-					 MIPS_BRANCH ("b%C0", "%2,%z3,%1"));
+					 MIPS_BRANCH ("b%N1", "%2,%z3,%0"),
+					 MIPS_BRANCH ("b%C1", "%2,%z3,%0"));
 }
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "branch")])
 
 ;; MIPS16 branches
 
@@ -5271,8 +5311,7 @@ (define_insn "*branch_equality<mode>_mip
 	return "bt%N0z\t%3";
     }
 }
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "branch")])
 
 (define_expand "cbranch<mode>4"
   [(set (pc)
@@ -5313,42 +5352,40 @@ (define_insn "*branch_bit<bbv><mode>"
   [(set (pc)
 	(if_then_else
 	 (equality_op (zero_extract:GPR
-		       (match_operand:GPR 0 "register_operand" "d")
+		       (match_operand:GPR 1 "register_operand" "d")
 		       (const_int 1)
 		       (match_operand 2 "const_int_operand" ""))
 		      (const_int 0))
-	 (label_ref (match_operand 1 ""))
+	 (label_ref (match_operand 0 ""))
 	 (pc)))]
   "ISA_HAS_BBIT && UINTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode)"
 {
   return
     mips_output_conditional_branch (insn, operands,
-				    MIPS_BRANCH ("bbit<bbv>", "%0,%2,%1"),
-				    MIPS_BRANCH ("bbit<bbinv>", "%0,%2,%1"));
+				    MIPS_BRANCH ("bbit<bbv>", "%1,%2,%0"),
+				    MIPS_BRANCH ("bbit<bbinv>", "%1,%2,%0"));
 }
   [(set_attr "type"	     "branch")
-   (set_attr "mode"	     "none")
    (set_attr "branch_likely" "no")])
 
 (define_insn "*branch_bit<bbv><mode>_inverted"
   [(set (pc)
 	(if_then_else
 	 (equality_op (zero_extract:GPR
-		       (match_operand:GPR 0 "register_operand" "d")
+		       (match_operand:GPR 1 "register_operand" "d")
 		       (const_int 1)
 		       (match_operand 2 "const_int_operand" ""))
 		      (const_int 0))
 	 (pc)
-	 (label_ref (match_operand 1 ""))))]
+	 (label_ref (match_operand 0 ""))))]
   "ISA_HAS_BBIT && UINTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode)"
 {
   return
     mips_output_conditional_branch (insn, operands,
-				    MIPS_BRANCH ("bbit<bbinv>", "%0,%2,%1"),
-				    MIPS_BRANCH ("bbit<bbv>", "%0,%2,%1"));
+				    MIPS_BRANCH ("bbit<bbinv>", "%1,%2,%0"),
+				    MIPS_BRANCH ("bbit<bbv>", "%1,%2,%0"));
 }
   [(set_attr "type"	     "branch")
-   (set_attr "mode"	     "none")
    (set_attr "branch_likely" "no")])
 
 ;;
@@ -5535,47 +5572,41 @@ (define_insn "s<code>_<mode>"
 
 ;; Unconditional branches.
 
-(define_insn "jump"
+(define_expand "jump"
   [(set (pc)
-	(label_ref (match_operand 0 "" "")))]
-  "!TARGET_MIPS16"
+	(label_ref (match_operand 0)))])
+
+(define_insn "*jump_absolute"
+  [(set (pc)
+	(label_ref (match_operand 0)))]
+  "!TARGET_MIPS16 && TARGET_ABSOLUTE_JUMPS"
+  { return MIPS_ABSOLUTE_JUMP ("%*j\t%l0%/"); }
+  [(set_attr "type" "jump")])
+
+(define_insn "*jump_pic"
+  [(set (pc)
+	(label_ref (match_operand 0)))]
+  "!TARGET_MIPS16 && !TARGET_ABSOLUTE_JUMPS"
 {
-  if (flag_pic)
+  if (get_attr_length (insn) <= 8)
+    return "%*b\t%l0%/";
+  else
     {
-      if (get_attr_length (insn) <= 8)
-	return "%*b\t%l0%/";
-      else
-	{
-	  output_asm_insn (mips_output_load_label (), operands);
-	  return "%*jr\t%@%/%]";
-	}
+      mips_output_load_label (operands[0]);
+      return "%*jr\t%@%/%]";
     }
-  else
-    return "%*j\t%l0%/";
 }
-  [(set_attr "type"	"jump")
-   (set_attr "mode"	"none")
-   (set (attr "length")
-	;; We can't use `j' when emitting PIC.  Emit a branch if it's
-	;; in range, otherwise load the address of the branch target into
-	;; $at and then jump to it.
-	(if_then_else
-	 (ior (eq (symbol_ref "flag_pic") (const_int 0))
-	      (lt (abs (minus (match_dup 0)
-			      (plus (pc) (const_int 4))))
-		  (const_int 131072)))
-	 (const_int 4) (const_int 16)))])
+  [(set_attr "type" "branch")])
 
 ;; We need a different insn for the mips16, because a mips16 branch
 ;; does not have a delay slot.
 
-(define_insn ""
+(define_insn "*jump_mips16"
   [(set (pc)
 	(label_ref (match_operand 0 "" "")))]
   "TARGET_MIPS16"
   "b\t%l0"
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "branch")])
 
 (define_expand "indirect_jump"
   [(set (pc) (match_operand 0 "register_operand"))]
@@ -5876,14 +5907,28 @@ (define_insn_and_split "restore_gp"
    (clobber (match_scratch:SI 0 "=&d"))]
   "TARGET_CALL_CLOBBERED_GP"
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
 {
-  mips_restore_gp (operands[0]);
+  mips_restore_gp_from_cprestore_slot (operands[0]);
   DONE;
 }
-  [(set_attr "type" "load")
-   (set_attr "length" "12")])
+  [(set_attr "type" "ghost")])
+
+;; Move between $gp and its register save slot.
+(define_insn_and_split "move_gp<mode>"
+  [(set (match_operand:GPR 0 "nonimmediate_operand" "=d,m")
+  	(unspec:GPR [(match_operand:GPR 1 "move_operand" "m,d")]
+		    UNSPEC_MOVE_GP))]
+  ""
+  { return mips_chosen_to_use_gp_p () ? "#" : ""; }
+  "mips_chosen_to_use_gp_p ()"
+  [(const_int 0)]
+{
+  mips_emit_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "type" "ghost")])
 
 ;;
 ;;  ....................
Index: gcc/config/mips/mips-dsp.md
===================================================================
--- gcc/config/mips/mips-dsp.md	2009-08-31 08:38:27.000000000 +0100
+++ gcc/config/mips/mips-dsp.md	2009-08-31 08:39:25.000000000 +0100
@@ -1102,11 +1102,10 @@ (define_insn "mips_lwx_<mode>"
 (define_insn "mips_bposge"
   [(set (pc)
 	(if_then_else (ge (reg:CCDSP CCDSP_PO_REGNUM)
-			  (match_operand:SI 0 "immediate_operand" "I"))
-		      (label_ref (match_operand 1 "" ""))
+			  (match_operand:SI 1 "immediate_operand" "I"))
+		      (label_ref (match_operand 0 "" ""))
 		      (pc)))]
   "ISA_HAS_DSP"
-  "%*bposge%0\t%1%/"
-  [(set_attr "type"	"branch")
-   (set_attr "mode"	"none")])
+  "%*bposge%1\t%0%/"
+  [(set_attr "type"	"branch")])
 
Index: gcc/config/mips/mips-ps-3d.md
===================================================================
--- gcc/config/mips/mips-ps-3d.md	2009-08-31 08:38:27.000000000 +0100
+++ gcc/config/mips/mips-ps-3d.md	2009-08-31 08:39:25.000000000 +0100
@@ -439,50 +439,46 @@ (define_insn "s<code>_ps"
 ; Branch on Any of Four Floating Point Condition Codes True
 (define_insn "bc1any4t"
   [(set (pc)
-	(if_then_else (ne (match_operand:CCV4 0 "register_operand" "z")
+	(if_then_else (ne (match_operand:CCV4 1 "register_operand" "z")
 			  (const_int 0))
-		      (label_ref (match_operand 1 "" ""))
+		      (label_ref (match_operand 0 "" ""))
 		      (pc)))]
   "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
-  "%*bc1any4t\t%0,%1%/"
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  "%*bc1any4t\t%1,%0%/"
+  [(set_attr "type" "branch")])
 
 ; Branch on Any of Four Floating Point Condition Codes False
 (define_insn "bc1any4f"
   [(set (pc)
-	(if_then_else (ne (match_operand:CCV4 0 "register_operand" "z")
+	(if_then_else (ne (match_operand:CCV4 1 "register_operand" "z")
 			  (const_int -1))
-		      (label_ref (match_operand 1 "" ""))
+		      (label_ref (match_operand 0 "" ""))
 		      (pc)))]
   "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
-  "%*bc1any4f\t%0,%1%/"
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  "%*bc1any4f\t%1,%0%/"
+  [(set_attr "type" "branch")])
 
 ; Branch on Any of Two Floating Point Condition Codes True
 (define_insn "bc1any2t"
   [(set (pc)
-	(if_then_else (ne (match_operand:CCV2 0 "register_operand" "z")
+	(if_then_else (ne (match_operand:CCV2 1 "register_operand" "z")
 			  (const_int 0))
-		      (label_ref (match_operand 1 "" ""))
+		      (label_ref (match_operand 0 "" ""))
 		      (pc)))]
   "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
-  "%*bc1any2t\t%0,%1%/"
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  "%*bc1any2t\t%1,%0%/"
+  [(set_attr "type" "branch")])
 
 ; Branch on Any of Two Floating Point Condition Codes False
 (define_insn "bc1any2f"
   [(set (pc)
-	(if_then_else (ne (match_operand:CCV2 0 "register_operand" "z")
+	(if_then_else (ne (match_operand:CCV2 1 "register_operand" "z")
 			  (const_int -1))
-		      (label_ref (match_operand 1 "" ""))
+		      (label_ref (match_operand 0 "" ""))
 		      (pc)))]
   "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
-  "%*bc1any2f\t%0,%1%/"
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  "%*bc1any2f\t%1,%0%/"
+  [(set_attr "type" "branch")])
 
 ; Used to access one register in a CCV2 pair.  Operand 0 is the register
 ; pair and operand 1 is the index of the register we want (a CONST_INT).
@@ -497,45 +493,43 @@ (define_expand "single_cc"
 (define_insn "*branch_upper_lower"
   [(set (pc)
         (if_then_else
-	 (match_operator 0 "equality_operator"
+	 (match_operator 1 "equality_operator"
 	    [(unspec:CC [(match_operand:CCV2 2 "register_operand" "z")
 			 (match_operand 3 "const_int_operand")]
 			UNSPEC_SINGLE_CC)
 	     (const_int 0)])
-	 (label_ref (match_operand 1 "" ""))
+	 (label_ref (match_operand 0 "" ""))
 	 (pc)))]
   "TARGET_HARD_FLOAT"
 {
   operands[2]
     = gen_rtx_REG (CCmode, REGNO (operands[2]) + INTVAL (operands[3]));
   return mips_output_conditional_branch (insn, operands,
-					 MIPS_BRANCH ("b%F0", "%2,%1"),
-					 MIPS_BRANCH ("b%W0", "%2,%1"));
+					 MIPS_BRANCH ("b%F1", "%2,%0"),
+					 MIPS_BRANCH ("b%W1", "%2,%0"));
 }
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "branch")])
 
 ; As above, but with the sense of the condition reversed.
 (define_insn "*branch_upper_lower_inverted"
   [(set (pc)
         (if_then_else
-	 (match_operator 0 "equality_operator"
+	 (match_operator 1 "equality_operator"
 	    [(unspec:CC [(match_operand:CCV2 2 "register_operand" "z")
 			 (match_operand 3 "const_int_operand")]
 			UNSPEC_SINGLE_CC)
 	     (const_int 0)])
 	 (pc)
-	 (label_ref (match_operand 1 "" ""))))]
+	 (label_ref (match_operand 0 "" ""))))]
   "TARGET_HARD_FLOAT"
 {
   operands[2]
     = gen_rtx_REG (CCmode, REGNO (operands[2]) + INTVAL (operands[3]));
   return mips_output_conditional_branch (insn, operands,
-					 MIPS_BRANCH ("b%W0", "%2,%1"),
-					 MIPS_BRANCH ("b%F0", "%2,%1"));
+					 MIPS_BRANCH ("b%W1", "%2,%0"),
+					 MIPS_BRANCH ("b%F1", "%2,%0"));
 }
-  [(set_attr "type" "branch")
-   (set_attr "mode" "none")])
+  [(set_attr "type" "branch")])
 
 ;----------------------------------------------------------------------------
 ; Floating Point Reduced Precision Reciprocal Square Root Instructions.
Index: gcc/testsuite/gcc.target/mips/branch-helper.h
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-helper.h	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,37 @@
+/* DN(X) generates 2**N copies of asm instruction X.  */
+#define D0(X) X
+#define D1(X) X "\n\t" X
+#define D2(X) D1 (D1 (X))
+#define D3(X) D2 (D1 (X))
+#define D4(X) D2 (D2 (X))
+#define D5(X) D4 (D1 (X))
+#define D6(X) D4 (D2 (X))
+#define D7(X) D4 (D2 (D1 (X)))
+#define D8(X) D4 (D4 (X))
+#define D9(X) D8 (D1 (X))
+#define D10(X) D8 (D2 (X))
+#define D11(X) D8 (D2 (D1 (X)))
+#define D12(X) D8 (D4 (X))
+#define D13(X) D8 (D4 (D1 (X)))
+#define D14(X) D8 (D4 (D2 (X)))
+
+/* Emit something that is 0x1fff8 bytes long, which is the largest
+   permissible range for non-MIPS16 forward branches.  */
+#define OCCUPY_0x1fff8 \
+  asm (D14 ("nop") "\n\t" \
+       D13 ("nop") "\n\t" \
+       D12 ("nop") "\n\t" \
+       D11 ("nop") "\n\t" \
+       D10 ("nop") "\n\t" \
+       D9 ("nop") "\n\t" \
+       D8 ("nop") "\n\t" \
+       D7 ("nop") "\n\t" \
+       D6 ("nop") "\n\t" \
+       D5 ("nop") "\n\t" \
+       D4 ("nop") "\n\t" \
+       D3 ("nop") "\n\t" \
+       D2 ("nop") "\n\t" \
+       D1 ("nop"))
+
+/* Likewise emit something that is 0x1fffc bytes long.  */
+#define OCCUPY_0x1fffc do { asm ("nop"); OCCUPY_0x1fff8; } while (0)
Index: gcc/testsuite/gcc.target/mips/branch-2.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-2.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,13 @@
+/* { dg-options "-mabicalls -mshared -mabi=32" } */
+/* { dg-final { scan-assembler-not "(\\\$25|\\\$28|cpload)" } } */
+/* { dg-final { scan-assembler-not "\tjr\t\\\$1\n" } } */
+/* { dg-final { scan-assembler-not "cprestore" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (volatile int *x)
+{
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fff8;
+}
Index: gcc/testsuite/gcc.target/mips/branch-3.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-3.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,13 @@
+/* { dg-options "-mabicalls -mshared -mabi=32" } */
+/* { dg-final { scan-assembler "\t\\.cpload\t\\\$25\n" } } */
+/* { dg-final { scan-assembler "\tjr\t\\\$1\n" } } */
+/* { dg-final { scan-assembler-not "cprestore" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (volatile int *x)
+{
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fffc;
+}
Index: gcc/testsuite/gcc.target/mips/branch-4.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-4.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,12 @@
+/* { dg-options "-mabicalls -mshared -mabi=n32" } */
+/* { dg-final { scan-assembler-not "(\\\$25|\\\$28|%gp_rel|%got)" } } */
+/* { dg-final { scan-assembler-not "\tjr\t\\\$1\n" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (volatile int *x)
+{
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fff8;
+}
Index: gcc/testsuite/gcc.target/mips/branch-5.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-5.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,14 @@
+/* { dg-options "-mabicalls -mshared -mabi=n32" } */
+/* { dg-final { scan-assembler "\taddiu\t\\\$3,\\\$3,%lo\\(%neg\\(%gp_rel\\(foo\\)\\)\\)\n" } } */
+/* { dg-final { scan-assembler "\tlw\t\\\$1,%got_page\\(\[^)\]*\\)\\(\\\$3\\)\\n" } } */
+/* { dg-final { scan-assembler "\tjr\t\\\$1\n" } } */
+/* { dg-final { scan-assembler-not "\\\$28" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (volatile int *x)
+{
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fffc;
+}
Index: gcc/testsuite/gcc.target/mips/branch-6.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-6.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,12 @@
+/* { dg-options "-mabicalls -mshared -mabi=64" } */
+/* { dg-final { scan-assembler-not "(\\\$25|\\\$28|%gp_rel|%got)" } } */
+/* { dg-final { scan-assembler-not "\tjr\t\\\$1\n" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (volatile int *x)
+{
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fff8;
+}
Index: gcc/testsuite/gcc.target/mips/branch-7.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-7.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,14 @@
+/* { dg-options "-mabicalls -mshared -mabi=64" } */
+/* { dg-final { scan-assembler "\tdaddiu\t\\\$3,\\\$3,%lo\\(%neg\\(%gp_rel\\(foo\\)\\)\\)\n" } } */
+/* { dg-final { scan-assembler "\tld\t\\\$1,%got_page\\(\[^)\]*\\)\\(\\\$3\\)\\n" } } */
+/* { dg-final { scan-assembler "\tjr\t\\\$1\n" } } */
+/* { dg-final { scan-assembler-not "\\\$28" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (volatile int *x)
+{
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fffc;
+}
Index: gcc/testsuite/gcc.target/mips/branch-8.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-8.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,13 @@
+/* { dg-options "-mabicalls -mshared -mabi=32" } */
+/* { dg-final { scan-assembler-not "(\\\$28|cpload|cprestore)" } } */
+/* { dg-final { scan-assembler-not "\tjr\t\\\$1\n" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (void (*bar) (void), volatile int *x)
+{
+  bar ();
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fff8;
+}
Index: gcc/testsuite/gcc.target/mips/branch-9.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-9.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,18 @@
+/* { dg-options "-mabicalls -mshared -mabi=32" } */
+/* { dg-final { scan-assembler "\t\\.cpload\t\\\$25\n" } } */
+/* { dg-final { scan-assembler "\t\\.cprestore\t16\n" } } */
+/* { dg-final { scan-assembler "\tlw\t\\\$1,16\\(\\\$fp\\)\n" } } */
+/* { dg-final { scan-assembler "\tlw\t\\\$1,%got\\(\[^)\]*\\)\\(\\\$1\\)\n" } } */
+/* { dg-final { scan-assembler "\taddiu\t\\\$1,\\\$1,%lo\\(\[^)\]*\\)\n" } } */
+/* { dg-final { scan-assembler "\tjr\t\\\$1\n" } } */
+/* { dg-final { scan-assembler-not "\tlw\t\\\$28,16\\(\\\$sp\\)\n" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (void (*bar) (void), volatile int *x)
+{
+  bar ();
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fffc;
+}
Index: gcc/testsuite/gcc.target/mips/branch-10.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-10.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,13 @@
+/* { dg-options "-mabicalls -mshared -mabi=n32" } */
+/* { dg-final { scan-assembler-not "(\\\$28|%gp_rel|%got)" } } */
+/* { dg-final { scan-assembler-not "\tjr\t\\\$1\n" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (void (*bar) (void), volatile int *x)
+{
+  bar ();
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fff8;
+}
Index: gcc/testsuite/gcc.target/mips/branch-11.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-11.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,17 @@
+/* { dg-options "-mabicalls -mshared -mabi=n32" } */
+/* { dg-final { scan-assembler "\tsd\t\\\$28," } } */
+/* { dg-final { scan-assembler "\tld\t\\\$28," } } */
+/* { dg-final { scan-assembler "\taddiu\t\\\$28,\\\$28,%lo\\(%neg\\(%gp_rel\\(foo\\)\\)\\)\n" } } */
+/* { dg-final { scan-assembler "\tlw\t\\\$1,%got_page\\(\[^)\]*\\)\\(\\\$28\\)\n" } } */
+/* { dg-final { scan-assembler "\taddiu\t\\\$1,\\\$1,%got_ofst\\(\[^)\]*\\)\n" } } */
+/* { dg-final { scan-assembler "\tjr\t\\\$1\n" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (void (*bar) (void), volatile int *x)
+{
+  bar ();
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fffc;
+}
Index: gcc/testsuite/gcc.target/mips/branch-12.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-12.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,13 @@
+/* { dg-options "-mabicalls -mshared -mabi=64" } */
+/* { dg-final { scan-assembler-not "(\\\$28|%gp_rel|%got)" } } */
+/* { dg-final { scan-assembler-not "\tjr\t\\\$1\n" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (void (*bar) (void), volatile int *x)
+{
+  bar ();
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fff8;
+}
Index: gcc/testsuite/gcc.target/mips/branch-13.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-13.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,17 @@
+/* { dg-options "-mabicalls -mshared -mabi=64" } */
+/* { dg-final { scan-assembler "\tsd\t\\\$28," } } */
+/* { dg-final { scan-assembler "\tld\t\\\$28," } } */
+/* { dg-final { scan-assembler "\tdaddiu\t\\\$28,\\\$28,%lo\\(%neg\\(%gp_rel\\(foo\\)\\)\\)\n" } } */
+/* { dg-final { scan-assembler "\tld\t\\\$1,%got_page\\(\[^)\]*\\)\\(\\\$28\\)\n" } } */
+/* { dg-final { scan-assembler "\tdaddiu\t\\\$1,\\\$1,%got_ofst\\(\[^)\]*\\)\n" } } */
+/* { dg-final { scan-assembler "\tjr\t\\\$1\n" } } */
+
+#include "branch-helper.h"
+
+NOMIPS16 void
+foo (void (*bar) (void), volatile int *x)
+{
+  bar ();
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fffc;
+}
Index: gcc/testsuite/gcc.target/mips/branch-14.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-14.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,23 @@
+/* An executable version of branch-2.c.  */
+/* { dg-do run } */
+
+#include "branch-helper.h"
+
+void __attribute__((noinline))
+foo (volatile int *x)
+{
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fff8;
+}
+
+int
+main (void)
+{
+  int x = 0;
+  int y = 1;
+
+  foo (&x);
+  foo (&y);
+
+  return 0;
+}
Index: gcc/testsuite/gcc.target/mips/branch-15.c
===================================================================
--- /dev/null	2009-08-31 08:37:02.501840154 +0100
+++ gcc/testsuite/gcc.target/mips/branch-15.c	2009-08-31 08:39:25.000000000 +0100
@@ -0,0 +1,23 @@
+/* An executable version of branch-3.c.  */
+/* { dg-do run } */
+
+#include "branch-helper.h"
+
+void
+foo (volatile int *x)
+{
+  if (__builtin_expect (*x == 0, 1))
+    OCCUPY_0x1fffc;
+}
+
+int
+main (void)
+{
+  int x = 0;
+  int y = 1;
+
+  foo (&x);
+  foo (&y);
+
+  return 0;
+}



More information about the Gcc-patches mailing list