[PATCH, i386]: Avoid partial reg stall with arith insn + setCC + movzbl sequence
Uros Bizjak
ubizjak@gmail.com
Mon Mar 12 08:52:00 GMT 2012
Hello!
Attached patch improves setCC + movzbl to xor + setcc peephole2 to
also handle CC setting arithmetic instructions.
2012-03-12 Uros Bizjak <ubizjak@gmail.com>
* config/i386/i386.md (setcc + movzbl to xor + setcc peephole2):
Also convert sequences with CC setting arithmetic instruction.
Tested on x86_64-pc-linux-gnu {, -m32}, committed to mainline SVN.
Uros.
-------------- next part --------------
Index: i386.md
===================================================================
--- i386.md (revision 185201)
+++ i386.md (working copy)
@@ -11170,6 +11170,27 @@
ix86_expand_clear (operands[3]);
})
+(define_peephole2
+ [(parallel [(set (reg FLAGS_REG) (match_operand 0 "" ""))
+ (match_operand 4 "" "")])
+ (set (match_operand:QI 1 "register_operand" "")
+ (match_operator:QI 2 "ix86_comparison_operator"
+ [(reg FLAGS_REG) (const_int 0)]))
+ (set (match_operand 3 "q_regs_operand" "")
+ (zero_extend (match_dup 1)))]
+ "(peep2_reg_dead_p (3, operands[1])
+ || operands_match_p (operands[1], operands[3]))
+ && ! reg_overlap_mentioned_p (operands[3], operands[0])"
+ [(parallel [(set (match_dup 5) (match_dup 0))
+ (match_dup 4)])
+ (set (strict_low_part (match_dup 6))
+ (match_dup 2))]
+{
+ operands[5] = gen_rtx_REG (GET_MODE (operands[0]), FLAGS_REG);
+ operands[6] = gen_lowpart (QImode, operands[3]);
+ ix86_expand_clear (operands[3]);
+})
+
;; Similar, but match zero extend with andsi3.
(define_peephole2
@@ -11190,6 +11211,28 @@
operands[5] = gen_lowpart (QImode, operands[3]);
ix86_expand_clear (operands[3]);
})
+
+(define_peephole2
+ [(parallel [(set (reg FLAGS_REG) (match_operand 0 "" ""))
+ (match_operand 4 "" "")])
+ (set (match_operand:QI 1 "register_operand" "")
+ (match_operator:QI 2 "ix86_comparison_operator"
+ [(reg FLAGS_REG) (const_int 0)]))
+ (parallel [(set (match_operand 3 "q_regs_operand" "")
+ (zero_extend (match_dup 1)))
+ (clobber (reg:CC FLAGS_REG))])]
+ "(peep2_reg_dead_p (3, operands[1])
+ || operands_match_p (operands[1], operands[3]))
+ && ! reg_overlap_mentioned_p (operands[3], operands[0])"
+ [(parallel [(set (match_dup 5) (match_dup 0))
+ (match_dup 4)])
+ (set (strict_low_part (match_dup 6))
+ (match_dup 2))]
+{
+ operands[5] = gen_rtx_REG (GET_MODE (operands[0]), FLAGS_REG);
+ operands[6] = gen_lowpart (QImode, operands[3]);
+ ix86_expand_clear (operands[3]);
+})
;; Call instructions.
More information about the Gcc-patches
mailing list