[PATCH, i386]: Avoid partial reg stall with arith insn + setCC + movzbl sequence

Uros Bizjak ubizjak@gmail.com
Mon Mar 12 08:52:00 GMT 2012


Hello!

The attached patch extends the setCC + movzbl to xor + setcc peephole2 so
that it also handles CC-setting arithmetic instructions.
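
For illustration only, here is a minimal C sketch of the kind of source that
could give rise to such a sequence. The function name sub_is_zero and the
assembly shown in the comments are assumptions made for this example, not
taken from the patch or its testsuite. The code actually generated depends on
the GCC version, the options used, register allocation and scheduling.

```c
/* Hypothetical example, not part of the patch.

   The subtraction is an arithmetic insn that also sets the flags, and
   the comparison against zero can reuse those flags through setcc.
   Schematically (registers are illustrative only), the old sequence

	subl	%esi, %edi
	sete	%al
	movzbl	%al, %eax

   could become, when the peephole2 matches,

	xorl	%eax, %eax
	subl	%esi, %edi
	sete	%al

   The clear is emitted before the arithmetic insn because xor itself
   clobbers the flags.  This is why the peephole2 condition rejects the
   case where the destination register is mentioned in the arithmetic
   operation.  */

int
sub_is_zero (unsigned int a, unsigned int b, unsigned int *diff)
{
  unsigned int d = a - b;	/* arithmetic insn, also sets the flags */

  *diff = d;			/* keep the arithmetic result live */
  return d == 0;		/* candidate for setcc on those flags */
}
```

Comparing the output of "gcc -O2 -S" before and after the patch on a function
like this is one way to check whether the peephole2 fires. Unsigned operands
are used here only to keep the example free of signed overflow.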

2012-03-12  Uros Bizjak  <ubizjak@gmail.com>

	* config/i386/i386.md (setcc + movzbl to xor + setcc peephole2):
	Also convert sequences with CC setting arithmetic instruction.

Tested on x86_64-pc-linux-gnu {, -m32}, committed to mainline SVN.

Uros.
-------------- next part --------------
Index: i386.md
===================================================================
--- i386.md	(revision 185201)
+++ i386.md	(working copy)
@@ -11170,6 +11170,27 @@
   ix86_expand_clear (operands[3]);
 })
 
+(define_peephole2
+  [(parallel [(set (reg FLAGS_REG) (match_operand 0 "" ""))
+	      (match_operand 4 "" "")])
+   (set (match_operand:QI 1 "register_operand" "")
+	(match_operator:QI 2 "ix86_comparison_operator"
+	  [(reg FLAGS_REG) (const_int 0)]))
+   (set (match_operand 3 "q_regs_operand" "")
+	(zero_extend (match_dup 1)))]
+  "(peep2_reg_dead_p (3, operands[1])
+    || operands_match_p (operands[1], operands[3]))
+   && ! reg_overlap_mentioned_p (operands[3], operands[0])"
+  [(parallel [(set (match_dup 5) (match_dup 0))
+	      (match_dup 4)])
+   (set (strict_low_part (match_dup 6))
+	(match_dup 2))]
+{
+  operands[5] = gen_rtx_REG (GET_MODE (operands[0]), FLAGS_REG);
+  operands[6] = gen_lowpart (QImode, operands[3]);
+  ix86_expand_clear (operands[3]);
+})
+
 ;; Similar, but match zero extend with andsi3.
 
 (define_peephole2
@@ -11190,6 +11211,28 @@
   operands[5] = gen_lowpart (QImode, operands[3]);
   ix86_expand_clear (operands[3]);
 })
+
+(define_peephole2
+  [(parallel [(set (reg FLAGS_REG) (match_operand 0 "" ""))
+	      (match_operand 4 "" "")])
+   (set (match_operand:QI 1 "register_operand" "")
+	(match_operator:QI 2 "ix86_comparison_operator"
+	  [(reg FLAGS_REG) (const_int 0)]))
+   (parallel [(set (match_operand 3 "q_regs_operand" "")
+		   (zero_extend (match_dup 1)))
+	      (clobber (reg:CC FLAGS_REG))])]
+  "(peep2_reg_dead_p (3, operands[1])
+    || operands_match_p (operands[1], operands[3]))
+   && ! reg_overlap_mentioned_p (operands[3], operands[0])"
+  [(parallel [(set (match_dup 5) (match_dup 0))
+	      (match_dup 4)])
+   (set (strict_low_part (match_dup 6))
+	(match_dup 2))]
+{
+  operands[5] = gen_rtx_REG (GET_MODE (operands[0]), FLAGS_REG);
+  operands[6] = gen_lowpart (QImode, operands[3]);
+  ix86_expand_clear (operands[3]);
+})
 

 ;; Call instructions.
 

