[Patch ARM] Fix PR50313 and handle PIC addresses properly.

Ramana Radhakrishnan ramana.radhakrishnan@linaro.org
Wed Jan 18 15:04:00 GMT 2012


Hi ,

PR50313 is a case where having the 2 patterns "pic_load_addr_*" and
"pic_add_dot_plus_eight/four" from expand time becomes a problem as
discussed by Jakub in his comment in the audit trail for PR48308.
There is a separate problem in combine as explained by my comment in
the audit trail for PR48308 and my RFC patch for the same sometime
last week.

What I've done in this case is to take the existing patterns merged
them into a UNIFIED form which means that we can hoist the entire
computation out with any of the loop optimizers rather than piece-meal
hoping to have each one being hoisted correctly. I've had to adjust
the cost of this to be equivalent to that of a memory load ( it's in
fact the cost of a memory load + the cost of an add instruction but
it's good enough for the moment). It's reasonably close enough for
arm_size_rtx_costs, and thumb1_rtx_costs pushes this high enough that
it will warrant a hoist anyway. I've also been conservative in
adjusting cannot_copy_insn_p to disallow copying such insns to prevent
any cases where you might have duplicate labels being generated.

Once this happens the original bug report in PR50313 appears to be
fixed and I've had this survive a bootstrap and regression test on an
A9 board and this has successfully cross-built a version of eglibc
with an ARM v7-a configuration (which is currently undergoing
testing).

There are a few ways by which we can get better code - one is to
replace the load from the constant pool with a movw / movt pair of the
actual differences in which case we'd be able to get performant code
in libraries for this case, the other which as RichardE suggested in a
conversation was to allow the splitter to actually generate the
internal labels in the pic_load_addr and pic_add_dot_plus_eight which
could then mean that the label number is really not reserved until
this complex pattern has been split. I don't think these enhancements
are worth doing so late in stage4 and I do want to backport this patch
as it stands into the 4.6 branch.

OK (and to backport to 4.6 branch ?) if no regressions ?

cheers
Ramana


2012-01-18  Ramana Radhakrishnan  <ramana.radhakrishnan@linaro.org>

	PR target/50313
	* config/arm/arm.c (arm_load_pic_register): Use
	gen_pic_load_addr_unified. Delete calls to gen_pic_load_addr_32bit
	, gen_pic_add_dot_plus_eight and gen_pic_add_dot_plus_four.
	(arm_pic_static_addr): Likewise.
	(arm_rtx_costs_1): Adjust cost for UNSPEC_PIC_UNIFIED.
	(arm_note_pic_base): Handle UNSPEC_PIC_UNIFIED.
	* config/arm/arm.md (UNSPEC_PIC_UNIFIED): Define.
	(pic_load_addr_unified): New.
-------------- next part --------------
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 4c310d4..240434f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -5578,11 +5578,7 @@ arm_load_pic_register (unsigned long saved_regs ATTRIBUTE_UNUSED)
 
       if (TARGET_32BIT)
 	{
-	  emit_insn (gen_pic_load_addr_32bit (pic_reg, pic_rtx));
-	  if (TARGET_ARM)
-	    emit_insn (gen_pic_add_dot_plus_eight (pic_reg, pic_reg, labelno));
-	  else
-	    emit_insn (gen_pic_add_dot_plus_four (pic_reg, pic_reg, labelno));
+	  emit_insn (gen_pic_load_addr_unified (pic_reg, pic_rtx, labelno));
 	}
       else /* TARGET_THUMB1 */
 	{
@@ -5595,10 +5591,10 @@ arm_load_pic_register (unsigned long saved_regs ATTRIBUTE_UNUSED)
 				     thumb_find_work_register (saved_regs));
 	      emit_insn (gen_pic_load_addr_thumb1 (pic_tmp, pic_rtx));
 	      emit_insn (gen_movsi (pic_offset_table_rtx, pic_tmp));
+	      emit_insn (gen_pic_add_dot_plus_four (pic_reg, pic_reg, labelno));
 	    }
 	  else
-	    emit_insn (gen_pic_load_addr_thumb1 (pic_reg, pic_rtx));
-	  emit_insn (gen_pic_add_dot_plus_four (pic_reg, pic_reg, labelno));
+	    emit_insn (gen_pic_load_addr_unified (pic_reg, pic_rtx, labelno));
 	}
     }
 
@@ -5628,20 +5624,7 @@ arm_pic_static_addr (rtx orig, rtx reg)
                                UNSPEC_SYMBOL_OFFSET);
   offset_rtx = gen_rtx_CONST (Pmode, offset_rtx);
 
-  if (TARGET_32BIT)
-    {
-      emit_insn (gen_pic_load_addr_32bit (reg, offset_rtx));
-      if (TARGET_ARM)
-        insn = emit_insn (gen_pic_add_dot_plus_eight (reg, reg, labelno));
-      else
-        insn = emit_insn (gen_pic_add_dot_plus_four (reg, reg, labelno));
-    }
-  else /* TARGET_THUMB1 */
-    {
-      emit_insn (gen_pic_load_addr_thumb1 (reg, offset_rtx));
-      insn = emit_insn (gen_pic_add_dot_plus_four (reg, reg, labelno));
-    }
-
+  insn = emit_insn (gen_pic_load_addr_unified (reg, offset_rtx, labelno));
   return insn;
 }
 
@@ -7648,6 +7631,15 @@ arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, bool speed)
 
     case SET:
       return false;
+      
+    case UNSPEC:
+      /* We cost this as high as our memory costs to allow this to
+	 be hoisted from loops.  */
+      if (XINT (x, 1) == UNSPEC_PIC_UNIFIED)
+	{
+	  *total = COSTS_N_INSNS (2 + ARM_NUM_REGS (mode));
+	}
+      return true;
 
     default:
       *total = COSTS_N_INSNS (4);
@@ -10008,7 +10000,8 @@ static int
 arm_note_pic_base (rtx *x, void *date ATTRIBUTE_UNUSED)
 {
   if (GET_CODE (*x) == UNSPEC
-      && XINT (*x, 1) == UNSPEC_PIC_BASE)
+      && (XINT (*x, 1) == UNSPEC_PIC_BASE
+	  || XINT (*x, 1) == UNSPEC_PIC_UNIFIED))
     return 1;
   return 0;
 }
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 5620d7d..8842846 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -116,6 +116,7 @@
 			; unaligned locations, on architectures which support
 			; that.
   UNSPEC_UNALIGNED_STORE ; Same for str/strh.
+  UNSPEC_PIC_UNIFIED    ; Create a common pic addressing form.
 ])
 
 ;; UNSPEC_VOLATILE Usage:
@@ -5601,7 +5602,7 @@
   "flag_pic"
 )
 
-;; Split calculate_pic_address into pic_load_addr_* and a move.
+;; Split calculate_pic_address into pic_load_addr_32bit/thumb1 and a move.
 (define_split
   [(set (match_operand:SI 0 "register_operand" "")
 	(mem:SI (plus:SI (match_operand:SI 1 "register_operand" "")
@@ -5613,6 +5614,29 @@
   "operands[3] = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];"
 )
 
+;; operand1 is the memory address to go into pic_load_addr_32bit.
+;; operand2 is the PIC label to be emitted from pic_add_dot_plus_*.
+;; We do this to allow hoisting of the entire c
+(define_insn_and_split "pic_load_addr_unified"
+  [(set (match_operand:SI 0 "s_register_operand" "=r,r,l")
+	(unspec:SI [(match_operand:SI 1 "" "mX,mX,mX")
+		    (match_operand:SI 2 "" "")]
+		    UNSPEC_PIC_UNIFIED))]
+ "flag_pic"
+ "#"
+ "flag_pic"
+ [(set (match_dup 3) (unspec:SI [(match_dup 1)] UNSPEC_PIC_SYM))
+  (set (match_dup 0) (unspec:SI [(match_dup 3) (match_dup 4)
+       		     		 (match_dup 2)] UNSPEC_PIC_BASE))]
+ "operands[3] = can_create_pseudo_p () ? gen_reg_rtx (SImode) : operands[0];
+  operands[4] = TARGET_THUMB ? GEN_INT (4) : GEN_INT (8);"
+ [(set_attr "type" "load1,load1,load1")
+  (set_attr "pool_range" "4096,4096,1024")
+  (set_attr "neg_pool_range" "0,4084,0")
+  (set_attr "arch"  "a,t2,t1")
+  (set_attr "length" "8,6,4")]
+)
+
 ;; The rather odd constraints on the following are to force reload to leave
 ;; the insn alone, and to force the minipool generation pass to then move
 ;; the GOT symbol to memory.


More information about the Gcc-patches mailing list