[power7-meissner, ibm-gcc-4_3-branch] Power7 tweaks

Michael Meissner meissner@linux.vnet.ibm.com
Thu Apr 16 15:37:00 GMT 2009


These tweaks prevent some of the bad code that is generated when the compiler
wants to use the branch and decrement instructions in code that has a switch
statement.
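
For illustration, the kind of source that runs into this is a counted loop
whose body contains a switch.  The function below is a made-up reduced example
(classify_sum is a hypothetical name, not taken from any real source file).
Whether a particular compiler version actually turns the switch into a jump
table, or the loop into a decrement-and-branch loop, depends on flags and
heuristics.

```c
#include <stddef.h>

/* Hypothetical reduced example: a counted loop (a candidate for the
   decrement ctr and branch instructions) whose body contains a dense
   switch (a candidate for a jump table dispatched through CTR).  */
int
classify_sum (const unsigned char *buf, size_t n)
{
  int sum = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      switch (buf[i] & 7)
	{
	case 0: sum += 1; break;
	case 1: sum += 2; break;
	case 2: sum += 4; break;
	case 3: sum += 8; break;
	case 4: sum += 16; break;
	default: break;
	}
    }

  return sum;
}
```

Both the loop counter and the switch dispatch compete for the same register in
code of this shape.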

The indirect jump and switch table insns prefer to only use the CTR register
and not the LR register, presumably so we don't have to save the LR in a leaf
function.

The doloop_end insn really would prefer to use the CTR register, but grudgingly
will allow use of other registers, memory, or the LR/CTR registers.

The 4.5 compiler, in a moment of brilliance (not), decides to use the LR
register to spill the count to, and has to do a move from LR, decrement,
move back to LR, and test.  The 4.3 compiler with the recent power backend
decides to move the index to a floating point register, and then bails
out when there is no reload problem.

It would be nice if we had some way of determining, for the loop in question,
whether there were any indirect jumps, switches, or calls, so that we knew we
could use the CTR register there while still allowing other regions of the
code to use the count register.  Without that, I have restricted the code so
that it does not generate the countdown loops if the function uses a switch
statement or indirect jump.  The libcpp/exec.c module shows this up.
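
The restriction can be modelled outside the compiler as a sticky per-function
flag.  The sketch below is a standalone simplification, not the GCC code:
struct func_state, cur_func, begin_function, note_indirect_jump and
can_use_ctr_loop are hypothetical names standing in for the machine_function
field, cfun->machine, and the checks made in the doloop_end expander.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-function target state.  */
struct func_state
{
  bool indirect_jump_p;	/* set once an indirect or table jump is seen */
};

/* Hypothetical stand-in for cfun->machine.  */
static struct func_state *cur_func;

/* Start compiling a new function with a clean flag.  */
void
begin_function (struct func_state *f)
{
  f->indirect_jump_p = false;
  cur_func = f;
}

/* Called whenever an indirect jump or table jump is generated.  */
void
note_indirect_jump (void)
{
  assert (cur_func != NULL);
  cur_func->indirect_jump_p = true;
}

/* Return true if the current function has an indirect or table jump.  */
bool
has_indirect_jump_p (void)
{
  assert (cur_func != NULL);
  return cur_func->indirect_jump_p;
}

/* Decide whether a loop may become a count-register loop.  LOOP_LEVEL is
   1 for an innermost loop.  */
bool
can_use_ctr_loop (int loop_level)
{
  if (loop_level > 1)
    return false;		/* only innermost loops */
  if (has_indirect_jump_p ())
    return false;		/* CTR is wanted for the jump instead */
  return true;
}
```

Because the flag is never cleared within a function, one switch anywhere in the
function disables countdown loops for all of its loops, which is the coarse
per-function behaviour described above rather than a per-loop analysis.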

I also changed the secondary reload to allow pseudos in address checking, since
at secondary reload time, not all of the pseudos are nailed down yet.
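
As a rough model of the difference between strict and non-strict checking,
consider the sketch below.  It is a simplification under stated assumptions:
the register numbering (FIRST_PSEUDO = 128, GPRs as hard registers 0..31, r0
not usable as a base) is chosen for illustration, the function names are
hypothetical, and real strict checking also accepts a pseudo that has already
been assigned a suitable hard register.

```c
#include <stdbool.h>

/* Assumed numbering for this model only: hard registers are below
   FIRST_PSEUDO, pseudo registers are at or above it.  */
enum { FIRST_PSEUDO = 128 };

/* In this model, GPRs are hard registers 0..31 and r0 is not a valid
   base register.  */
static bool
hard_reg_ok_for_base (int regno)
{
  return regno >= 1 && regno <= 31;
}

/* Return true if REGNO may be used as a base register.  With STRICT
   set, pseudos are rejected; without it, they are accepted on the
   assumption that they will get a suitable hard register later.  */
bool
reg_ok_for_base (int regno, bool strict)
{
  if (regno >= FIRST_PSEUDO)
    return !strict;
  return hard_reg_ok_for_base (regno);
}
```

With strict checking, an address based on a not-yet-allocated pseudo looks
invalid and would be needlessly copied to the scratch register.  Non-strict
checking lets it through.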

Finally, I fixed some thinkos in power7 costs, and added more debug output for
-mdebug=cost.
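
The debug output follows a simple pattern: compute the cost into a local
variable, trace it if the debug flag is set, and return from a single exit.
The sketch below shows that pattern in isolation.  The names (debug_cost,
enum reg_kind, memory_move_cost) and the cost values are illustrative
assumptions, not the values used by the backend.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the -mdebug=cost flag.  */
bool debug_cost = false;

enum reg_kind { KIND_GPR, KIND_FPR, KIND_OTHER };

static const char *const kind_names[] = { "GPR", "FPR", "OTHER" };

/* Return an illustrative cost of moving NREGS registers of class KIND
   to or from memory, tracing the result when debug_cost is set.  */
int
memory_move_cost (enum reg_kind kind, int nregs)
{
  int ret;

  if (kind == KIND_GPR || kind == KIND_FPR)
    ret = 4 * nregs;
  else
    ret = 4 + 2 * nregs;	/* go through a GPR first */

  if (debug_cost)
    fprintf (stderr, "memory_move_cost: ret=%d, kind=%s, nregs=%d\n",
	     ret, kind_names[kind], nregs);

  return ret;
}
```

Routing every path through one return statement means the trace line cannot be
skipped by an early return.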

-- 
Michael Meissner, IBM
4 Technology Place Drive, MS 2203A, Westford, MA, 01886, USA
meissner@linux.vnet.ibm.com
-------------- next part --------------
2009-04-16  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-protos.h (rs6000_has_indirect_jump_p): New
	declaration.
	(rs6000_set_indirect_jump): Ditto.

	* config/rs6000/rs6000.c (struct machine_function): Add
	indirect_jump_p field.
	(rs6000_override_options): Wrap warning messages in N_().  If
	-mvsx was implicitly set, don't give a warning for -msoft-float,
	just silently turn off vsx.
	(rs6000_secondary_reload_inner): Don't use strict register
	checking, since pseudos may still be present.
	(register_move_cost): If -mdebug=cost, print out cost information.
	(rs6000_memory_move_cost): Ditto.
	(rs6000_has_indirect_jump_p): New function, return true if
	current function has an indirect jump.
	(rs6000_set_indirect_jump): New function, note that an indirect
	jump has been generated.

	* config/rs6000/rs6000.md (indirect_jump): Note that we've
	generated an indirect jump.
	(tablejump): Ditto.
	(doloop_end): Do not generate decrement ctr and branch
	instructions if an indirect jump has been generated.

Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 146116)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -176,6 +176,8 @@ extern int rs6000_register_move_cost (en
 				      enum reg_class, enum reg_class);
 extern int rs6000_memory_move_cost (enum machine_mode, enum reg_class, int);
 extern bool rs6000_tls_referenced_p (rtx);
+extern bool rs6000_has_indirect_jump_p (void);
+extern void rs6000_set_indirect_jump (void);
 extern void rs6000_conditional_register_usage (void);
 
 /* Declare functions in rs6000-c.c */
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 146119)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -130,6 +130,8 @@ typedef struct machine_function GTY(())
      64-bits wide and is allocated early enough so that the offset
      does not overflow the 16-bit load/store offset field.  */
   rtx sdmode_stack_slot;
+  /* Whether an indirect jump or table jump was generated.  */
+  bool indirect_jump_p;
 } machine_function;
 
 /* Target cpu type */
@@ -2132,23 +2134,29 @@ rs6000_override_options (const char *def
       const char *msg = NULL;
       if (!TARGET_HARD_FLOAT || !TARGET_FPRS
 	  || !TARGET_SINGLE_FLOAT || !TARGET_DOUBLE_FLOAT)
-	msg = "-mvsx requires hardware floating point";
+	{
+	  if (target_flags_explicit & MASK_VSX)
+	    msg = N_("-mvsx requires hardware floating point");
+	  else
+	    target_flags &= ~ MASK_VSX;
+	}
       else if (TARGET_PAIRED_FLOAT)
-	msg = "-mvsx and -mpaired are incompatible";
+	msg = N_("-mvsx and -mpaired are incompatible");
       /* The hardware will allow VSX and little endian, but until we make sure
 	 things like vector select, etc. work don't allow VSX on little endian
 	 systems at this point.  */
       else if (!BYTES_BIG_ENDIAN)
-	msg = "-mvsx used with little endian code";
+	msg = N_("-mvsx used with little endian code");
       else if (TARGET_AVOID_XFORM > 0)
-	msg = "-mvsx needs indexed addressing";
+	msg = N_("-mvsx needs indexed addressing");
 
       if (msg)
 	{
 	  warning (0, msg);
-	  target_flags &= MASK_VSX;
+	  target_flags &= ~ MASK_VSX;
 	}
-      else if (!TARGET_ALTIVEC && (target_flags_explicit & MASK_ALTIVEC) == 0)
+      else if (TARGET_VSX && !TARGET_ALTIVEC
+	       && (target_flags_explicit & MASK_ALTIVEC) == 0)
 	target_flags |= MASK_ALTIVEC;
     }
 
@@ -12618,12 +12626,12 @@ rs6000_secondary_reload_inner (rtx reg, 
 	}
 
       if (GET_CODE (addr) == PLUS
-	  && (!rs6000_legitimate_offset_address_p (TImode, addr, true)
+	  && (!rs6000_legitimate_offset_address_p (TImode, addr, false)
 	      || and_op2 != NULL_RTX))
 	{
 	  addr_op1 = XEXP (addr, 0);
 	  addr_op2 = XEXP (addr, 1);
-	  gcc_assert (legitimate_indirect_address_p (addr_op1, true));
+	  gcc_assert (legitimate_indirect_address_p (addr_op1, false));
 
 	  if (!REG_P (addr_op2)
 	      && (GET_CODE (addr_op2) != CONST_INT
@@ -12642,8 +12650,8 @@ rs6000_secondary_reload_inner (rtx reg, 
 	  addr = scratch_or_premodify;
 	  scratch_or_premodify = scratch;
 	}
-      else if (!legitimate_indirect_address_p (addr, true)
-	       && !rs6000_legitimate_offset_address_p (TImode, addr, true))
+      else if (!legitimate_indirect_address_p (addr, false)
+	       && !rs6000_legitimate_offset_address_p (TImode, addr, false))
 	{
 	  rs6000_emit_move (scratch_or_premodify, addr, Pmode);
 	  addr = scratch_or_premodify;
@@ -12672,24 +12680,24 @@ rs6000_secondary_reload_inner (rtx reg, 
       if (GET_CODE (addr) == PRE_MODIFY
 	  && (!VECTOR_MEM_VSX_P (mode)
 	      || and_op2 != NULL_RTX
-	      || !legitimate_indexed_address_p (XEXP (addr, 1), true)))
+	      || !legitimate_indexed_address_p (XEXP (addr, 1), false)))
 	{
 	  scratch_or_premodify = XEXP (addr, 0);
 	  gcc_assert (legitimate_indirect_address_p (scratch_or_premodify,
-						     true));
+						     false));
 	  gcc_assert (GET_CODE (XEXP (addr, 1)) == PLUS);
 	  addr = XEXP (addr, 1);
 	}
 
-      if (legitimate_indirect_address_p (addr, true)	/* reg */
-	  || legitimate_indexed_address_p (addr, true)	/* reg+reg */
+      if (legitimate_indirect_address_p (addr, false)	/* reg */
+	  || legitimate_indexed_address_p (addr, false)	/* reg+reg */
 	  || GET_CODE (addr) == PRE_MODIFY		/* VSX pre-modify */
 	  || GET_CODE (addr) == AND			/* Altivec memory */
 	  || (rclass == FLOAT_REGS			/* legacy float mem */
 	      && GET_MODE_SIZE (mode) == 8
 	      && and_op2 == NULL_RTX
 	      && scratch_or_premodify == scratch
-	      && rs6000_legitimate_offset_address_p (mode, addr, true)))
+	      && rs6000_legitimate_offset_address_p (mode, addr, false)))
 	;
 
       else if (GET_CODE (addr) == PLUS)
@@ -12709,7 +12717,7 @@ rs6000_secondary_reload_inner (rtx reg, 
 	}
 
       else if (GET_CODE (addr) == SYMBOL_REF || GET_CODE (addr) == CONST
-	       || GET_CODE (addr) == CONST_INT)
+	       || GET_CODE (addr) == CONST_INT || REG_P (addr))
 	{
 	  rs6000_emit_move (scratch_or_premodify, addr, Pmode);
 	  addr = scratch_or_premodify;
@@ -12741,7 +12749,7 @@ rs6000_secondary_reload_inner (rtx reg, 
      andi. instruction.  */
   if (and_op2 != NULL_RTX)
     {
-      if (! legitimate_indirect_address_p (addr, true))
+      if (! legitimate_indirect_address_p (addr, false))
 	{
 	  emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
 	  addr = scratch;
@@ -23647,6 +23655,8 @@ int
 rs6000_register_move_cost (enum machine_mode mode,
 			   enum reg_class from, enum reg_class to)
 {
+  int ret;
+
   /*  Moves from/to GENERAL_REGS.  */
   if (reg_classes_intersect_p (to, GENERAL_REGS)
       || reg_classes_intersect_p (from, GENERAL_REGS))
@@ -23655,39 +23665,47 @@ rs6000_register_move_cost (enum machine_
 	from = to;
 
       if (from == FLOAT_REGS || from == ALTIVEC_REGS || from == VSX_REGS)
-	return (rs6000_memory_move_cost (mode, from, 0)
-		+ rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
+	ret = (rs6000_memory_move_cost (mode, from, 0)
+	       + rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
 
       /* It's more expensive to move CR_REGS than CR0_REGS because of the
 	 shift.  */
       else if (from == CR_REGS)
-	return 4;
+	ret = 4;
 
       /* Power6 has slower LR/CTR moves so make them more expensive than
 	 memory in order to bias spills to memory .*/
       else if (rs6000_cpu == PROCESSOR_POWER6
 	       && reg_classes_intersect_p (from, LINK_OR_CTR_REGS))
-        return 6 * hard_regno_nregs[0][mode];
+        ret = 6 * hard_regno_nregs[0][mode];
 
       else
 	/* A move will cost one instruction per GPR moved.  */
-	return 2 * hard_regno_nregs[0][mode];
+	ret = 2 * hard_regno_nregs[0][mode];
     }
 
   /* If we have VSX, we can easily move between FPR or Altivec registers.  */
-  else if (TARGET_VSX
-	   && ((from == VSX_REGS || from == FLOAT_REGS || from == ALTIVEC_REGS)
-	       || (to == VSX_REGS || to == FLOAT_REGS || to == ALTIVEC_REGS)))
-    return 2;
+  else if (VECTOR_UNIT_VSX_P (mode)
+	   && reg_classes_intersect_p (to, VSX_REGS)
+	   && reg_classes_intersect_p (from, VSX_REGS))
+    ret = 2 * hard_regno_nregs[32][mode];
 
   /* Moving between two similar registers is just one instruction.  */
   else if (reg_classes_intersect_p (to, from))
-    return (mode == TFmode || mode == TDmode) ? 4 : 2;
+    ret = (mode == TFmode || mode == TDmode) ? 4 : 2;
 
   /* Everything else has to go through GENERAL_REGS.  */
   else
-    return (rs6000_register_move_cost (mode, GENERAL_REGS, to)
-	    + rs6000_register_move_cost (mode, from, GENERAL_REGS));
+    ret = (rs6000_register_move_cost (mode, GENERAL_REGS, to)
+	   + rs6000_register_move_cost (mode, from, GENERAL_REGS));
+
+  if (TARGET_DEBUG_COST)
+    fprintf (stderr,
+	     "rs6000_register_move_cost: ret=%d, mode=%s, from=%s, to=%s\n",
+	     ret, GET_MODE_NAME (mode), reg_class_names[from],
+	     reg_class_names[to]);
+
+  return ret;
 }
 
 /* A C expressions returning the cost of moving data of MODE from a register to
@@ -23697,14 +23715,23 @@ int
 rs6000_memory_move_cost (enum machine_mode mode, enum reg_class rclass,
 			 int in ATTRIBUTE_UNUSED)
 {
+  int ret;
+
   if (reg_classes_intersect_p (rclass, GENERAL_REGS))
-    return 4 * hard_regno_nregs[0][mode];
+    ret = 4 * hard_regno_nregs[0][mode];
   else if (reg_classes_intersect_p (rclass, FLOAT_REGS))
-    return 4 * hard_regno_nregs[32][mode];
+    ret = 4 * hard_regno_nregs[32][mode];
   else if (reg_classes_intersect_p (rclass, ALTIVEC_REGS))
-    return 4 * hard_regno_nregs[FIRST_ALTIVEC_REGNO][mode];
+    ret = 4 * hard_regno_nregs[FIRST_ALTIVEC_REGNO][mode];
   else
-    return 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS);
+    ret = 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS);
+
+  if (TARGET_DEBUG_COST)
+    fprintf (stderr,
+	     "rs6000_memory_move_cost: ret=%d, mode=%s, rclass=%s, in=%d\n",
+	     ret, GET_MODE_NAME (mode), reg_class_names[rclass], in);
+
+  return ret;
 }
 
 /* Returns a code for a target-specific builtin that implements
@@ -24424,4 +24451,24 @@ rs6000_final_prescan_insn (rtx insn, rtx
     }
 }
 
+/* Return true if the function has an indirect jump or a table jump.  The
+   compiler prefers the ctr register for such jumps, which interferes with
+   using the decrement ctr register and branch.  */
+
+bool
+rs6000_has_indirect_jump_p (void)
+{
+  gcc_assert (cfun && cfun->machine);
+  return cfun->machine->indirect_jump_p;
+}
+
+/* Remember when we've generated an indirect jump.  */
+
+void
+rs6000_set_indirect_jump (void)
+{
+  gcc_assert (cfun && cfun->machine);
+  cfun->machine->indirect_jump_p = true;
+}
+
 #include "gt-rs6000.h"
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 146116)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -14667,7 +14667,11 @@
   [(set_attr "type" "jmpreg")])
 
 (define_expand "indirect_jump"
-  [(set (pc) (match_operand 0 "register_operand" ""))])
+  [(set (pc) (match_operand 0 "register_operand" ""))]
+  ""
+{
+  rs6000_set_indirect_jump ();
+})
 
 (define_insn "*indirect_jump<mode>"
   [(set (pc) (match_operand:P 0 "register_operand" "c,*l"))]
@@ -14682,14 +14686,14 @@
   [(use (match_operand 0 "" ""))
    (use (label_ref (match_operand 1 "" "")))]
   ""
-  "
 {
+  rs6000_set_indirect_jump ();
   if (TARGET_32BIT)
     emit_jump_insn (gen_tablejumpsi (operands[0], operands[1]));
   else
     emit_jump_insn (gen_tablejumpdi (operands[0], operands[1]));
   DONE;
-}")
+})
 
 (define_expand "tablejumpsi"
   [(set (match_dup 3)
@@ -14749,6 +14753,11 @@
   /* Only use this on innermost loops.  */
   if (INTVAL (operands[3]) > 1)
     FAIL;
+  /* Do not try to use decrement and count on code that has an indirect
+     jump or a table jump, because the ctr register is preferred over the
+     lr register.  */
+  if (rs6000_has_indirect_jump_p ())
+    FAIL;
   if (TARGET_64BIT)
     {
       if (GET_MODE (operands[0]) != DImode)
-------------- next part --------------
2009-04-16  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-protos.h (rs6000_has_indirect_jump_p): New
	declaration.
	(rs6000_set_indirect_jump): Ditto.

	* config/rs6000/rs6000.c (struct machine_function): Add
	indirect_jump_p field.
	(rs6000_override_options): Wrap warning messages in N_().  If
	-mvsx was implicitly set, don't give a warning for -msoft-float,
	just silently turn off vsx.
	(rs6000_secondary_reload_inner): Handle more possible combinations
	of addresses.  Don't use strict register checking, since pseudos
	may still be present.
	(register_move_cost): If -mdebug=cost, print out cost information.
	(rs6000_memory_move_cost): Ditto.
	(rs6000_has_indirect_jump_p): New function, return true if
	current function has an indirect jump.
	(rs6000_set_indirect_jump): New function, note that an indirect
	jump has been generated.

	* config/rs6000/rs6000.md (indirect_jump): Note that we've
	generated an indirect jump.
	(tablejump): Ditto.
	(doloop_end): Do not generate decrement ctr and branch
	instructions if an indirect jump has been generated.

	* config/rs6000/vector.md (vec_reload_and_plus_<mptrsize>): Allow
	register+small constant in addition to register+register, and
	restrict the insn to only match during reload and afterwards.
	(vec_reload_and_reg_<mptrsize>): Allow for and of register
	indirect to not generate insn not found message.

2009-04-15  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR driver/39293
	* gcc.c (save_temps_flag): Add support for -save-temps=obj.
	(cpp_options): Ditto.
	(default_compilers): Ditto.
	(display_help): Ditto.
	(process_command): Ditto.
	(do_spec_1): Ditto.
	(set_input): Use lbasename instead of duplicate code.
	(save_temps_prefix): New static for -save-temps=obj.
	(save_temps_length): Ditto.
	
	* doc/invoke.texi (-save-temps=obj): Document new variant to
	-save-temps switch.

Index: gcc/config/rs6000/vector.md
===================================================================
--- gcc/config/rs6000/vector.md	(revision 146007)
+++ gcc/config/rs6000/vector.md	(working copy)
@@ -129,14 +129,15 @@
 })
 
 ;; Reload sometimes tries to move the address to a GPR, and can generate
-;; invalid RTL for addresses involving AND -16.
+;; invalid RTL for addresses involving AND -16.  Allow addresses involving
+;; reg+reg, reg+small constant, or just reg, all wrapped in an AND -16.
 
 (define_insn_and_split "*vec_reload_and_plus_<mptrsize>"
   [(set (match_operand:P 0 "gpc_reg_operand" "=b")
 	(and:P (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
-		       (match_operand:P 2 "gpc_reg_operand" "r"))
+		       (match_operand:P 2 "reg_or_cint_operand" "rI"))
 	       (const_int -16)))]
-  "TARGET_ALTIVEC || TARGET_VSX"
+  "(TARGET_ALTIVEC || TARGET_VSX) && (reload_in_progress || reload_completed)"
   "#"
   "&& reload_completed"
   [(set (match_dup 0)
@@ -146,6 +147,21 @@
 		   (and:P (match_dup 0)
 			  (const_int -16)))
 	      (clobber:CC (scratch:CC))])])
+
+;; The normal ANDSI3/ANDDI3 won't match if reload decides to move an AND -16
+;; address to a register because there is no clobber of a (scratch), so we add
+;; it here.
+(define_insn_and_split "*vec_reload_and_reg_<mptrsize>"
+  [(set (match_operand:P 0 "gpc_reg_operand" "=b")
+	(and:P (match_operand:P 1 "gpc_reg_operand" "r")
+	       (const_int -16)))]
+  "(TARGET_ALTIVEC || TARGET_VSX) && (reload_in_progress || reload_completed)"
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 0)
+		   (and:P (match_dup 1)
+			  (const_int -16)))
+	      (clobber:CC (scratch:CC))])])
 
 ;; Generic floating point vector arithmetic support
 (define_expand "add<mode>3"
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 145942)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -176,6 +176,8 @@ extern int rs6000_register_move_cost (en
 				      enum reg_class, enum reg_class);
 extern int rs6000_memory_move_cost (enum machine_mode, enum reg_class, int);
 extern bool rs6000_tls_referenced_p (rtx);
+extern bool rs6000_has_indirect_jump_p (void);
+extern void rs6000_set_indirect_jump (void);
 extern void rs6000_conditional_register_usage (void);
 
 /* Declare functions in rs6000-c.c */
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 146007)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -130,6 +130,8 @@ typedef struct machine_function GTY(())
      64-bits wide and is allocated early enough so that the offset
      does not overflow the 16-bit load/store offset field.  */
   rtx sdmode_stack_slot;
+  /* Whether an indirect jump or table jump was generated.  */
+  bool indirect_jump_p;
 } machine_function;
 
 /* Target cpu type */
@@ -2116,23 +2118,29 @@ rs6000_override_options (const char *def
       const char *msg = NULL;
       if (!TARGET_HARD_FLOAT || !TARGET_FPRS
 	  || !TARGET_SINGLE_FLOAT || !TARGET_DOUBLE_FLOAT)
-	msg = "-mvsx requires hardware floating point";
+	{
+	  if (target_flags_explicit & MASK_VSX)
+	    msg = N_("-mvsx requires hardware floating point");
+	  else
+	    target_flags &= ~ MASK_VSX;
+	}
       else if (TARGET_PAIRED_FLOAT)
-	msg = "-mvsx and -mpaired are incompatible";
+	msg = N_("-mvsx and -mpaired are incompatible");
       /* The hardware will allow VSX and little endian, but until we make sure
 	 things like vector select, etc. work don't allow VSX on little endian
 	 systems at this point.  */
       else if (!BYTES_BIG_ENDIAN)
-	msg = "-mvsx used with little endian code";
+	msg = N_("-mvsx used with little endian code");
       else if (TARGET_AVOID_XFORM > 0)
-	msg = "-mvsx needs indexed addressing";
+	msg = N_("-mvsx needs indexed addressing");
 
       if (msg)
 	{
 	  warning (0, msg);
-	  target_flags &= MASK_VSX;
+	  target_flags &= ~ MASK_VSX;
 	}
-      else if (!TARGET_ALTIVEC && (target_flags_explicit & MASK_ALTIVEC) == 0)
+      else if (TARGET_VSX && !TARGET_ALTIVEC
+	       && (target_flags_explicit & MASK_ALTIVEC) == 0)
 	target_flags |= MASK_ALTIVEC;
     }
 
@@ -12520,6 +12528,11 @@ rs6000_secondary_reload_inner (rtx reg, 
   enum reg_class rclass;
   rtx addr;
   rtx and_op2 = NULL_RTX;
+  rtx addr_op1;
+  rtx addr_op2;
+  rtx scratch_or_premodify = scratch;
+  rtx and_rtx;
+  rtx cc_clobber;
 
   if (TARGET_DEBUG_ADDR)
     {
@@ -12541,7 +12554,8 @@ rs6000_secondary_reload_inner (rtx reg, 
 
   switch (rclass)
     {
-      /* Move reg+reg addresses into a scratch register for GPRs.  */
+      /* GPRs can handle reg + small constant, all other addresses need to use
+	 the scratch register.  */
     case GENERAL_REGS:
     case BASE_REGS:
       if (GET_CODE (addr) == AND)
@@ -12549,70 +12563,152 @@ rs6000_secondary_reload_inner (rtx reg, 
 	  and_op2 = XEXP (addr, 1);
 	  addr = XEXP (addr, 0);
 	}
+
+      if (GET_CODE (addr) == PRE_MODIFY)
+	{
+	  scratch_or_premodify = XEXP (addr, 0);
+	  gcc_assert (REG_P (scratch_or_premodify));
+	  gcc_assert (GET_CODE (XEXP (addr, 1)) == PLUS);
+	  addr = XEXP (addr, 1);
+	}
+
       if (GET_CODE (addr) == PLUS
-	  && (!rs6000_legitimate_offset_address_p (TImode, addr, true)
+	  && (!rs6000_legitimate_offset_address_p (TImode, addr, false)
 	      || and_op2 != NULL_RTX))
 	{
-	  if (GET_CODE (addr) == SYMBOL_REF || GET_CODE (addr) == CONST
-	      || GET_CODE (addr) == CONST_INT)
-	    rs6000_emit_move (scratch, addr, GET_MODE (addr));
-	  else
-	    emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
-	  addr = scratch;
+	  addr_op1 = XEXP (addr, 0);
+	  addr_op2 = XEXP (addr, 1);
+	  gcc_assert (legitimate_indirect_address_p (addr_op1, false));
+
+	  if (!REG_P (addr_op2)
+	      && (GET_CODE (addr_op2) != CONST_INT
+		  || !satisfies_constraint_I (addr_op2)))
+	    {
+	      rs6000_emit_move (scratch, addr_op2, Pmode);
+	      addr_op2 = scratch;
+	    }
+
+	  emit_insn (gen_rtx_SET (VOIDmode,
+				  scratch_or_premodify,
+				  gen_rtx_PLUS (Pmode,
+						addr_op1,
+						addr_op2)));
+
+	  addr = scratch_or_premodify;
+	  scratch_or_premodify = scratch;
 	}
-      else if (GET_CODE (addr) == PRE_MODIFY
-	       && REG_P (XEXP (addr, 0))
-	       && GET_CODE (XEXP (addr, 1)) == PLUS)
+      else if (!legitimate_indirect_address_p (addr, false)
+	       && !rs6000_legitimate_offset_address_p (TImode, addr, false))
 	{
-	  emit_insn (gen_rtx_SET (VOIDmode, XEXP (addr, 0), XEXP (addr, 1)));
-	  addr = XEXP (addr, 0);
+	  rs6000_emit_move (scratch_or_premodify, addr, Pmode);
+	  addr = scratch_or_premodify;
+	  scratch_or_premodify = scratch;
 	}
       break;
 
+      /* Float/Altivec registers can only handle reg+reg addressing.  Move
+	 other addresses into a scratch register.  */
+    case FLOAT_REGS:
+    case VSX_REGS:
+    case ALTIVEC_REGS:
+
       /* With float regs, we need to handle the AND ourselves, since we can't
 	 use the Altivec instruction with an implicit AND -16.  Allow scalar
 	 loads to float registers to use reg+offset even if VSX.  */
-    case FLOAT_REGS:
-    case VSX_REGS:
-      if (GET_CODE (addr) == AND)
+      if (GET_CODE (addr) == AND
+	  && (rclass != ALTIVEC_REGS || GET_MODE_SIZE (mode) != 16))
 	{
 	  and_op2 = XEXP (addr, 1);
 	  addr = XEXP (addr, 0);
 	}
-      /* fall through */
 
-      /* Move reg+offset addresses into a scratch register.  */
-    case ALTIVEC_REGS:
-      if (!legitimate_indirect_address_p (addr, true)
-	  && !legitimate_indexed_address_p (addr, true)
-	  && (GET_CODE (addr) != PRE_MODIFY
-	      || !legitimate_indexed_address_p (XEXP (addr, 1), true))
-	  && (rclass != FLOAT_REGS
-	      || GET_MODE_SIZE (mode) != 8
+      /* If we aren't using a VSX load, save the PRE_MODIFY register and use it
+	 as the address later.  */
+      if (GET_CODE (addr) == PRE_MODIFY
+	  && (!VECTOR_MEM_VSX_P (mode)
 	      || and_op2 != NULL_RTX
-	      || !rs6000_legitimate_offset_address_p (mode, addr, true)))
+	      || !legitimate_indexed_address_p (XEXP (addr, 1), false)))
 	{
-	  if (GET_CODE (addr) == SYMBOL_REF || GET_CODE (addr) == CONST
-	      || GET_CODE (addr) == CONST_INT)
-	    rs6000_emit_move (scratch, addr, GET_MODE (addr));
-	  else
-	    emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
-	  addr = scratch;
+	  scratch_or_premodify = XEXP (addr, 0);
+	  gcc_assert (legitimate_indirect_address_p (scratch_or_premodify,
+						     false));
+	  gcc_assert (GET_CODE (XEXP (addr, 1)) == PLUS);
+	  addr = XEXP (addr, 1);
+	}
+
+      if (legitimate_indirect_address_p (addr, false)	/* reg */
+	  || legitimate_indexed_address_p (addr, false)	/* reg+reg */
+	  || GET_CODE (addr) == PRE_MODIFY		/* VSX pre-modify */
+	  || GET_CODE (addr) == AND			/* Altivec memory */
+	  || (rclass == FLOAT_REGS			/* legacy float mem */
+	      && GET_MODE_SIZE (mode) == 8
+	      && and_op2 == NULL_RTX
+	      && scratch_or_premodify == scratch
+	      && rs6000_legitimate_offset_address_p (mode, addr, false)))
+	;
+
+      else if (GET_CODE (addr) == PLUS)
+	{
+	  addr_op1 = XEXP (addr, 0);
+	  addr_op2 = XEXP (addr, 1);
+	  gcc_assert (REG_P (addr_op1));
+
+	  rs6000_emit_move (scratch, addr_op2, Pmode);
+	  emit_insn (gen_rtx_SET (VOIDmode,
+				  scratch_or_premodify,
+				  gen_rtx_PLUS (Pmode,
+						addr_op1,
+						scratch)));
+	  addr = scratch_or_premodify;
+	  scratch_or_premodify = scratch;
+	}
+
+      else if (GET_CODE (addr) == SYMBOL_REF || GET_CODE (addr) == CONST
+	       || GET_CODE (addr) == CONST_INT || REG_P (addr))
+	{
+	  rs6000_emit_move (scratch_or_premodify, addr, Pmode);
+	  addr = scratch_or_premodify;
+	  scratch_or_premodify = scratch;
 	}
+
+      else
+	gcc_unreachable ();
+
       break;
 
     default:
       gcc_unreachable ();
     }
 
-  /* If the original address involved an AND -16 that is part of the Altivec
-     addresses, recreate the and now.  */
+  /* If the original address involved a pre-modify that we couldn't use the VSX
+     memory instruction with update, and we haven't taken care of already,
+     store the address in the pre-modify register and use that as the
+     address.  */
+  if (scratch_or_premodify != scratch && scratch_or_premodify != addr)
+    {
+      emit_insn (gen_rtx_SET (VOIDmode, scratch_or_premodify, addr));
+      addr = scratch_or_premodify;
+    }
+
+  /* If the original address involved an AND -16 and we couldn't use an ALTIVEC
+     memory instruction, recreate the AND now, including the clobber which is
+     generated by the general ANDSI3/ANDDI3 patterns for the
+     andi. instruction.  */
   if (and_op2 != NULL_RTX)
     {
-      rtx and_rtx = gen_rtx_SET (VOIDmode,
-				 scratch,
-				 gen_rtx_AND (Pmode, addr, and_op2));
-      rtx cc_clobber = gen_rtx_CLOBBER (CCmode, gen_rtx_SCRATCH (CCmode));
+      if (! legitimate_indirect_address_p (addr, false))
+	{
+	  emit_insn (gen_rtx_SET (VOIDmode, scratch, addr));
+	  addr = scratch;
+	}
+
+      and_rtx = gen_rtx_SET (VOIDmode,
+			     scratch,
+			     gen_rtx_AND (Pmode,
+					  addr,
+					  and_op2));
+
+      cc_clobber = gen_rtx_CLOBBER (CCmode, gen_rtx_SCRATCH (CCmode));
       emit_insn (gen_rtx_PARALLEL (VOIDmode,
 				   gen_rtvec (2, and_rtx, cc_clobber)));
       addr = scratch;
@@ -23442,6 +23538,8 @@ int
 rs6000_register_move_cost (enum machine_mode mode,
 			   enum reg_class from, enum reg_class to)
 {
+  int ret;
+
   /*  Moves from/to GENERAL_REGS.  */
   if (reg_classes_intersect_p (to, GENERAL_REGS)
       || reg_classes_intersect_p (from, GENERAL_REGS))
@@ -23450,39 +23548,47 @@ rs6000_register_move_cost (enum machine_
 	from = to;
 
       if (from == FLOAT_REGS || from == ALTIVEC_REGS || from == VSX_REGS)
-	return (rs6000_memory_move_cost (mode, from, 0)
-		+ rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
+	ret = (rs6000_memory_move_cost (mode, from, 0)
+	       + rs6000_memory_move_cost (mode, GENERAL_REGS, 0));
 
       /* It's more expensive to move CR_REGS than CR0_REGS because of the
 	 shift.  */
       else if (from == CR_REGS)
-	return 4;
+	ret = 4;
 
       /* Power6 has slower LR/CTR moves so make them more expensive than
 	 memory in order to bias spills to memory .*/
       else if (rs6000_cpu == PROCESSOR_POWER6
 	       && reg_classes_intersect_p (from, LINK_OR_CTR_REGS))
-        return 6 * hard_regno_nregs[0][mode];
+        ret = 6 * hard_regno_nregs[0][mode];
 
       else
 	/* A move will cost one instruction per GPR moved.  */
-	return 2 * hard_regno_nregs[0][mode];
+	ret = 2 * hard_regno_nregs[0][mode];
     }
 
   /* If we have VSX, we can easily move between FPR or Altivec registers.  */
-  else if (TARGET_VSX
-	   && ((from == VSX_REGS || from == FLOAT_REGS || from == ALTIVEC_REGS)
-	       || (to == VSX_REGS || to == FLOAT_REGS || to == ALTIVEC_REGS)))
-    return 2;
+  else if (VECTOR_UNIT_VSX_P (mode)
+	   && reg_classes_intersect_p (to, VSX_REGS)
+	   && reg_classes_intersect_p (from, VSX_REGS))
+    ret = 2 * hard_regno_nregs[32][mode];
 
   /* Moving between two similar registers is just one instruction.  */
   else if (reg_classes_intersect_p (to, from))
-    return (mode == TFmode || mode == TDmode) ? 4 : 2;
+    ret = (mode == TFmode || mode == TDmode) ? 4 : 2;
 
   /* Everything else has to go through GENERAL_REGS.  */
   else
-    return (rs6000_register_move_cost (mode, GENERAL_REGS, to)
-	    + rs6000_register_move_cost (mode, from, GENERAL_REGS));
+    ret = (rs6000_register_move_cost (mode, GENERAL_REGS, to)
+	   + rs6000_register_move_cost (mode, from, GENERAL_REGS));
+
+  if (TARGET_DEBUG_COST)
+    fprintf (stderr,
+	     "rs6000_register_move_cost: ret=%d, mode=%s, from=%s, to=%s\n",
+	     ret, GET_MODE_NAME (mode), reg_class_names[from],
+	     reg_class_names[to]);
+
+  return ret;
 }
 
 /* A C expressions returning the cost of moving data of MODE from a register to
@@ -23492,14 +23598,23 @@ int
 rs6000_memory_move_cost (enum machine_mode mode, enum reg_class rclass,
 			 int in ATTRIBUTE_UNUSED)
 {
+  int ret;
+
   if (reg_classes_intersect_p (rclass, GENERAL_REGS))
-    return 4 * hard_regno_nregs[0][mode];
+    ret = 4 * hard_regno_nregs[0][mode];
   else if (reg_classes_intersect_p (rclass, FLOAT_REGS))
-    return 4 * hard_regno_nregs[32][mode];
+    ret = 4 * hard_regno_nregs[32][mode];
   else if (reg_classes_intersect_p (rclass, ALTIVEC_REGS))
-    return 4 * hard_regno_nregs[FIRST_ALTIVEC_REGNO][mode];
+    ret = 4 * hard_regno_nregs[FIRST_ALTIVEC_REGNO][mode];
   else
-    return 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS);
+    ret = 4 + rs6000_register_move_cost (mode, rclass, GENERAL_REGS);
+
+  if (TARGET_DEBUG_COST)
+    fprintf (stderr,
+	     "rs6000_memory_move_cost: ret=%d, mode=%s, rclass=%s, in=%d\n",
+	     ret, GET_MODE_NAME (mode), reg_class_names[rclass], in);
+
+  return ret;
 }
 
 /* Returns a code for a target-specific builtin that implements
@@ -24206,4 +24321,24 @@ rs6000_final_prescan_insn (rtx insn, rtx
     }
 }
 
+/* Return true if the function has an indirect jump or a table jump.  The
+   compiler prefers the ctr register for such jumps, which interferes with
+   using the decrement ctr register and branch.  */
+
+bool
+rs6000_has_indirect_jump_p (void)
+{
+  gcc_assert (cfun && cfun->machine);
+  return cfun->machine->indirect_jump_p;
+}
+
+/* Remember when we've generated an indirect jump.  */
+
+void
+rs6000_set_indirect_jump (void)
+{
+  gcc_assert (cfun && cfun->machine);
+  cfun->machine->indirect_jump_p = true;
+}
+
 #include "gt-rs6000.h"
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 146007)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -5532,6 +5532,12 @@
         && !HONOR_SIGNED_ZEROS (DFmode))
        || VECTOR_UNIT_VSX_P (DFmode))"
   {
+     if (VECTOR_UNIT_VSX_P (DFmode))
+       {
+	 emit_insn (gen_vsx_copysigndf3 (operands[0], operands[1],
+					 operands[2]));
+	 DONE;
+       }
      operands[3] = gen_reg_rtx (DFmode);
      operands[4] = gen_reg_rtx (DFmode);
      operands[5] = CONST0_RTX (DFmode);
@@ -14661,7 +14667,11 @@
   [(set_attr "type" "jmpreg")])
 
 (define_expand "indirect_jump"
-  [(set (pc) (match_operand 0 "register_operand" ""))])
+  [(set (pc) (match_operand 0 "register_operand" ""))]
+  ""
+{
+  rs6000_set_indirect_jump ();
+})
 
 (define_insn "*indirect_jump<mode>"
   [(set (pc) (match_operand:P 0 "register_operand" "c,*l"))]
@@ -14676,14 +14686,14 @@
   [(use (match_operand 0 "" ""))
    (use (label_ref (match_operand 1 "" "")))]
   ""
-  "
 {
+  rs6000_set_indirect_jump ();
   if (TARGET_32BIT)
     emit_jump_insn (gen_tablejumpsi (operands[0], operands[1]));
   else
     emit_jump_insn (gen_tablejumpdi (operands[0], operands[1]));
   DONE;
-}")
+})
 
 (define_expand "tablejumpsi"
   [(set (match_dup 3)
@@ -14743,6 +14753,11 @@
   /* Only use this on innermost loops.  */
   if (INTVAL (operands[3]) > 1)
     FAIL;
+  /* Do not use the decrement and count instructions in a function that has
+     an indirect jump or a table jump, because those jumps prefer the ctr
+     register over the lr register and would compete for it.  */
+  if (rs6000_has_indirect_jump_p ())
+    FAIL;
   if (TARGET_64BIT)
     {
       if (GET_MODE (operands[0]) != DImode)
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 146007)
+++ gcc/doc/invoke.texi	(working copy)
@@ -310,7 +310,7 @@ Objective-C and Objective-C++ Dialects}.
 -print-multi-directory  -print-multi-lib @gol
 -print-prog-name=@var{program}  -print-search-dirs  -Q @gol
 -print-sysroot-headers-suffix @gol
--save-temps  -time}
+-save-temps  -save-temps=cwd  -save-temps=obj  -time}
 
 @item Optimization Options
 @xref{Optimize Options,,Options that Control Optimization}.
@@ -4980,6 +4980,7 @@ at abort point, control-flow and regions
 four, @option{-fsched-verbose} also includes dependence info.
 
 @item -save-temps
+@itemx -save-temps=cwd
 @opindex save-temps
 Store the usual ``temporary'' intermediate files permanently; place them
 in the current directory and name them based on the source file.  Thus,
@@ -4994,6 +4995,39 @@ input source file with the same extensio
 The corresponding intermediate file may be obtained by renaming the
 source file before using @option{-save-temps}.
 
+If you invoke GCC in parallel, compiling several source files that
+share a common base name in different subdirectories, or compiling the
+same source file for multiple output destinations, the parallel
+compilers are likely to interfere with each other and overwrite the
+temporary files.  For instance:
+
+@smallexample
+gcc -save-temps -o outdir1/foo.o indir1/foo.c&
+gcc -save-temps -o outdir2/foo.o indir2/foo.c&
+@end smallexample
+
+may result in @file{foo.i} and @file{foo.o} being written to
+simultaneously by both compilers.
+
+@item -save-temps=obj
+@opindex save-temps=obj
+Store the usual ``temporary'' intermediate files permanently.  If the
+@option{-o} option is used, the temporary files are named after the
+output file and placed in its directory.  If the @option{-o} option is
+not used, @option{-save-temps=obj} behaves like @option{-save-temps}.
+
+For example:
+
+@smallexample
+gcc -save-temps=obj -c foo.c
+gcc -save-temps=obj -c bar.c -o dir/xbar.o
+gcc -save-temps=obj foobar.c -o dir2/yfoobar
+@end smallexample
+
+would create @file{foo.i}, @file{foo.s}, @file{dir/xbar.i},
+@file{dir/xbar.s}, @file{dir2/yfoobar.i}, @file{dir2/yfoobar.s}, and
+@file{dir2/yfoobar.o}.
+
 @item -time
 @opindex time
 Report the CPU time taken by each subprocess in the compilation
Index: gcc/gcc.c
===================================================================
--- gcc/gcc.c	(revision 145841)
+++ gcc/gcc.c	(working copy)
@@ -219,7 +219,15 @@ static const char *target_sysroot_hdrs_s
 /* Nonzero means write "temp" files in source directory
    and use the source file's name in them, and don't delete them.  */
 
-static int save_temps_flag;
+static enum save_temps {
+  SAVE_TEMPS_NONE,		/* no -save-temps */
+  SAVE_TEMPS_CWD,		/* -save-temps in current directory */
+  SAVE_TEMPS_OBJ		/* -save-temps in object directory */
+} save_temps_flag;
+
+/* Output file to use to get the object directory for -save-temps=obj.  */
+static char *save_temps_prefix = 0;
+static size_t save_temps_length = 0;
 
 /* Nonzero means pass multiple source files to the compiler at one time.  */
 
@@ -385,7 +393,8 @@ or with constant text in a single argume
  %i     substitute the name of the input file being processed.
  %b     substitute the basename of the input file being processed.
 	This is the substring up to (and not including) the last period
-	and not including the directory.
+	and not including the directory, unless -save-temps was specified
+	to put temporaries in a different location.
  %B	same as %b, but include the file suffix (text after the last period).
  %gSUFFIX
 	substitute a file name that has suffix SUFFIX and is chosen
@@ -809,7 +818,7 @@ static const char *cpp_unique_options =
 static const char *cpp_options =
 "%(cpp_unique_options) %1 %{m*} %{std*&ansi&trigraphs} %{W*&pedantic*} %{w}\
  %{f*} %{g*:%{!g0:%{!fno-working-directory:-fworking-directory}}} %{O*}\
- %{undef} %{save-temps:-fpch-preprocess}";
+ %{undef} %{save-temps*:-fpch-preprocess}";
 
 /* This contains cpp options which are not passed when the preprocessor
    output will be used by another program.  */
@@ -985,17 +994,17 @@ static const struct compiler default_com
           %{traditional|ftraditional:\
 %eGNU C no longer supports -traditional without -E}\
        %{!combine:\
-	  %{save-temps|traditional-cpp|no-integrated-cpp:%(trad_capable_cpp) \
-		%(cpp_options) -o %{save-temps:%b.i} %{!save-temps:%g.i} \n\
-		    cc1 -fpreprocessed %{save-temps:%b.i} %{!save-temps:%g.i} \
+	  %{save-temps*|traditional-cpp|no-integrated-cpp:%(trad_capable_cpp) \
+		%(cpp_options) -o %{save-temps*:%b.i} %{!save-temps*:%g.i} \n\
+		    cc1 -fpreprocessed %{save-temps*:%b.i} %{!save-temps*:%g.i} \
 			%(cc1_options)}\
-	  %{!save-temps:%{!traditional-cpp:%{!no-integrated-cpp:\
+	  %{!save-temps*:%{!traditional-cpp:%{!no-integrated-cpp:\
 		cc1 %(cpp_unique_options) %(cc1_options)}}}\
           %{!fsyntax-only:%(invoke_as)}} \
       %{combine:\
-	  %{save-temps|traditional-cpp|no-integrated-cpp:%(trad_capable_cpp) \
-		%(cpp_options) -o %{save-temps:%b.i} %{!save-temps:%g.i}}\
-	  %{!save-temps:%{!traditional-cpp:%{!no-integrated-cpp:\
+	  %{save-temps*|traditional-cpp|no-integrated-cpp:%(trad_capable_cpp) \
+		%(cpp_options) -o %{save-temps*:%b.i} %{!save-temps*:%g.i}}\
+	  %{!save-temps*:%{!traditional-cpp:%{!no-integrated-cpp:\
 		cc1 %(cpp_unique_options) %(cc1_options)}}\
                 %{!fsyntax-only:%(invoke_as)}}}}}}", 0, 1, 1},
   {"-",
@@ -1007,13 +1016,13 @@ static const struct compiler default_com
       external preprocessor if -save-temps is given.  */
      "%{E|M|MM:%(trad_capable_cpp) %(cpp_options) %(cpp_debug_options)}\
       %{!E:%{!M:%{!MM:\
-	  %{save-temps|traditional-cpp|no-integrated-cpp:%(trad_capable_cpp) \
-		%(cpp_options) -o %{save-temps:%b.i} %{!save-temps:%g.i} \n\
-		    cc1 -fpreprocessed %{save-temps:%b.i} %{!save-temps:%g.i} \
+	  %{save-temps*|traditional-cpp|no-integrated-cpp:%(trad_capable_cpp) \
+		%(cpp_options) -o %{save-temps*:%b.i} %{!save-temps*:%g.i} \n\
+		    cc1 -fpreprocessed %{save-temps*:%b.i} %{!save-temps*:%g.i} \
 			%(cc1_options)\
                         -o %g.s %{!o*:--output-pch=%i.gch}\
                         %W{o*:--output-pch=%*}%V}\
-	  %{!save-temps:%{!traditional-cpp:%{!no-integrated-cpp:\
+	  %{!save-temps*:%{!traditional-cpp:%{!no-integrated-cpp:\
 		cc1 %(cpp_unique_options) %(cc1_options)\
                     -o %g.s %{!o*:--output-pch=%i.gch}\
                     %W{o*:--output-pch=%*}%V}}}}}}", 0, 0, 0},
@@ -3233,6 +3242,7 @@ display_help (void)
   fputs (_("  -Xlinker <arg>           Pass <arg> on to the linker\n"), stdout);
   fputs (_("  -combine                 Pass multiple source files to compiler at once\n"), stdout);
   fputs (_("  -save-temps              Do not delete intermediate files\n"), stdout);
+  fputs (_("  -save-temps=<arg>        Do not delete intermediate files\n"), stdout);
   fputs (_("  -pipe                    Use pipes rather than intermediate files\n"), stdout);
   fputs (_("  -time                    Time the execution of each subprocess\n"), stdout);
   fputs (_("  -specs=<file>            Override built-in specs with the contents of <file>\n"), stdout);
@@ -3750,9 +3760,20 @@ warranty; not even for MERCHANTABILITY o
 	n_infiles++;
       else if (strcmp (argv[i], "-save-temps") == 0)
 	{
-	  save_temps_flag = 1;
+	  save_temps_flag = SAVE_TEMPS_CWD;
 	  n_switches++;
 	}
+      else if (strncmp (argv[i], "-save-temps=", 12) == 0)
+	{
+	  n_switches++;
+	  if (strcmp (argv[i]+12, "cwd") == 0)
+	    save_temps_flag = SAVE_TEMPS_CWD;
+	  else if (strcmp (argv[i]+12, "obj") == 0
+		   || strcmp (argv[i]+12, "object") == 0)
+	    save_temps_flag = SAVE_TEMPS_OBJ;
+	  else
+	    fatal ("'%s' is an unknown -save-temps option", argv[i]);
+	}
       else if (strcmp (argv[i], "-combine") == 0)
 	{
 	  combine_flag = 1;
@@ -3917,6 +3938,8 @@ warranty; not even for MERCHANTABILITY o
 	      else
 		argv[i] = convert_filename (argv[i], ! have_c, 0);
 #endif
+	      /* Save the output name in case -save-temps=obj was used.  */
+	      save_temps_prefix = xstrdup ((p[1] == 0) ? argv[i + 1] : argv[i] + 2);
 	      goto normal_switch;
 
 	    default:
@@ -3974,6 +3997,25 @@ warranty; not even for MERCHANTABILITY o
 	}
     }
 
+  /* If -save-temps=obj and -o name, create the prefix to use for %b.
+     Otherwise just make -save-temps=obj the same as -save-temps=cwd.  */
+  if (save_temps_flag == SAVE_TEMPS_OBJ && save_temps_prefix != NULL)
+    {
+      save_temps_length = strlen (save_temps_prefix);
+      temp = strrchr (lbasename (save_temps_prefix), '.');
+      if (temp)
+	{
+	  save_temps_length -= strlen (temp);
+	  save_temps_prefix[save_temps_length] = '\0';
+	}
+
+    }
+  else if (save_temps_prefix != NULL)
+    {
+      free (save_temps_prefix);
+      save_temps_prefix = NULL;
+    }
+
   if (save_temps_flag && use_pipes)
     {
       /* -save-temps overrides -pipe, so that temp files are produced */
@@ -4679,12 +4721,18 @@ do_spec_1 (const char *spec, int inswitc
 	    fatal ("spec '%s' invalid", spec);
 
 	  case 'b':
-	    obstack_grow (&obstack, input_basename, basename_length);
+	    if (save_temps_length)
+	      obstack_grow (&obstack, save_temps_prefix, save_temps_length);
+	    else
+	      obstack_grow (&obstack, input_basename, basename_length);
 	    arg_going = 1;
 	    break;
 
 	  case 'B':
-	    obstack_grow (&obstack, input_basename, suffixed_basename_length);
+	    if (save_temps_length)
+	      obstack_grow (&obstack, save_temps_prefix, save_temps_length);
+	    else
+	      obstack_grow (&obstack, input_basename, suffixed_basename_length);
 	    arg_going = 1;
 	    break;
 
@@ -4830,6 +4878,26 @@ do_spec_1 (const char *spec, int inswitc
 		    suffix_length += strlen (TARGET_OBJECT_SUFFIX);
 		  }
 
+		/* If -save-temps=obj and -o were specified, use that for the
+		   temp file.  */
+		if (save_temps_length)
+		  {
+		    char *tmp;
+		    temp_filename_length
+		      = save_temps_length + suffix_length;
+		    tmp = (char *) alloca (temp_filename_length + 1);
+		    memcpy (tmp, save_temps_prefix, save_temps_length);
+		    memcpy (tmp + save_temps_length, suffix, suffix_length);
+		    tmp[temp_filename_length] = '\0';
+		    temp_filename = save_string (tmp,
+						 temp_filename_length + 1);
+		    obstack_grow (&obstack, temp_filename,
+				  temp_filename_length);
+		    arg_going = 1;
+		    delete_this_arg = 0;
+		    break;
+		  }
+
 		/* If the input_filename has the same suffix specified
 		   for the %g, %u, or %U, and -save-temps is specified,
 		   we could end up using that file as an intermediate
@@ -4841,13 +4909,14 @@ do_spec_1 (const char *spec, int inswitc
 		if (save_temps_flag)
 		  {
 		    char *tmp;
-		    
-		    temp_filename_length = basename_length + suffix_length;
-		    tmp = alloca (temp_filename_length + 1);
-		    strncpy (tmp, input_basename, basename_length);
-		    strncpy (tmp + basename_length, suffix, suffix_length);
-		    tmp[temp_filename_length] = '\0';
+
+		    temp_filename_length = basename_length + suffix_length;
+		    tmp = (char *) alloca (temp_filename_length + 1);
+		    memcpy (tmp, input_basename, basename_length);
+		    memcpy (tmp + basename_length, suffix, suffix_length);
+		    tmp[temp_filename_length] = '\0';
 		    temp_filename = tmp;
+
 		    if (strcmp (temp_filename, input_filename) != 0)
 		      {
 #ifndef HOST_LACKS_INODE_NUMBERS
@@ -6068,16 +6137,7 @@ set_input (const char *filename)
 
   input_filename = filename;
   input_filename_length = strlen (input_filename);
-
-  input_basename = input_filename;
-#ifdef HAVE_DOS_BASED_FILE_SYSTEM
-  /* Skip drive name so 'x:foo' is handled properly.  */
-  if (input_basename[1] == ':')
-    input_basename += 2;
-#endif
-  for (p = input_basename; *p; p++)
-    if (IS_DIR_SEPARATOR (*p))
-      input_basename = p + 1;
+  input_basename = lbasename (input_filename);
 
   /* Find a suffix starting with the last period,
      and set basename_length to exclude that suffix.  */
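
For reviewers, here is a minimal standalone sketch of the prefix computation that -save-temps=obj applies to the -o name: drop the last suffix of the final path component and keep the directory.  It is not part of the patch.  The name derive_prefix is hypothetical, and the sketch assumes '/' is the only directory separator, whereas the driver itself uses lbasename.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: return a malloc'd copy of
   OUTPUT_NAME with the last suffix of its final path component removed.
   Returns NULL if memory cannot be allocated.  */
static char *
derive_prefix (const char *output_name)
{
  size_t len = strlen (output_name);
  char *prefix = malloc (len + 1);
  char *base;
  char *dot;

  if (prefix == NULL)
    return NULL;
  memcpy (prefix, output_name, len + 1);

  /* Assumption: only '/' separates directories.  Looking for the period
     in the final component only keeps a name such as "out.d/foo" intact.  */
  base = strrchr (prefix, '/');
  base = base ? base + 1 : prefix;

  dot = strrchr (base, '.');
  if (dot != NULL)
    *dot = '\0';
  return prefix;
}
```

With this sketch, "dir/xbar.o" yields the prefix "dir/xbar", so the preprocessed file would be named "dir/xbar.i", matching the invoke.texi example above.  The caller frees the returned string.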

