Hi All,
This patch fixes a bug in one of the ARM peephole2 optimizations. The
peephole2 optimization in question was changed to use the CC-updating
form for all of the instructions produced by the peephole so that the
encoding will be smaller when compiling for Thumb [1]. However, that is
not always safe.
For example, the CC register might still be live *after* the peephole
window, as it is below, where conditional stores follow the window. The
current peephole will transform:
(insn:TI 7 49 18 2 (set (reg:CC 24 cc)
(compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
(const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn}
(nil))
(insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc)
(const_int 0 [0]))
(set (reg:SI 3 r3 [140])
(const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp}
(expr_list:REG_EQUIV (const_int 0 [0])
(nil)))
(insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc)
(const_int 0 [0]))
(set (reg:SI 3 r3 [138])
(const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp}
(expr_list:REG_EQUIV (const_int 1 [0x1])
(nil)))
(insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
(reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
(expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
(nil)))
(insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
(reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
(nil))
(insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
(reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
(expr_list:REG_DEAD (reg:CC 24 cc)
(expr_list:REG_DEAD (reg:QI 3 r3 [140])
(expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
(nil)))))
into the following:
(insn 59 49 60 2 (parallel [
(set (reg:CC 24 cc)
(compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
(const_int 0 [0])))
(set (reg:SI 1 r1)
(minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
(const_int 0 [0])))
]) repro.c:6 -1
(nil))
(insn 60 59 61 2 (parallel [
(set (reg:CC 24 cc)
(compare:CC (const_int 0 [0])
(reg:SI 1 r1)))
(set (reg:SI 3 r3 [140])
(minus:SI (const_int 0 [0])
(reg:SI 1 r1)))
]) repro.c:6 -1
(nil))
(insn 61 60 19 2 (parallel [
(set (reg:SI 3 r3 [140])
(plus:SI (plus:SI (reg:SI 3 r3 [140])
(reg:SI 1 r1))
(geu:SI (reg:CC 24 cc)
(const_int 0 [0]))))
(clobber (reg:CC 24 cc))
]) repro.c:6 -1
(nil))
(insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
(reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
(nil))
(insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
(reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
(expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
(nil)))
(insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
(reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
(expr_list:REG_DEAD (reg:CC 24 cc)
(expr_list:REG_DEAD (reg:QI 3 r3 [140])
(expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
(nil)))))
This gets compiled into the incorrect sequence:
ldrb r3, [r0, #0]
ldr r2, .L4
subs r1, r3, #0
rsbs r3, r1, #0
adcs r3, r3, r1
strne r3, [r2, #0]
streq r3, [r2, #0]
strneb r3, [r0, #0]
The conditional stores now execute on the flags set by the adcs rather than
on those of the original comparison, so they see an incorrect condition state.
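To make the failure concrete, here is a small Python model of the two
sequences. It is only a sketch: it tracks just the Z and C flags (N and V
are left out), and the helper names subs, adcs, original_sequence and
peephole_sequence are made up for this illustration. Inputs are assumed to
be unsigned 32-bit values.

```python
MASK = 0xFFFFFFFF  # registers are modeled as unsigned 32-bit values


def subs(a, b):
    """Model ARM SUBS/RSBS/CMP: return (a - b, flags). C set means no borrow."""
    result = (a - b) & MASK
    return result, {"Z": result == 0, "C": a >= b}


def adcs(a, b, carry):
    """Model ARM ADCS: return (a + b + carry, flags)."""
    total = a + b + int(carry)
    result = total & MASK
    return result, {"Z": result == 0, "C": total > MASK}


def original_sequence(r3):
    """cmp r3, #0 ; movne r3, #0 ; moveq r3, #1 (flags stay as cmp left them)."""
    _, flags = subs(r3, 0)
    r3 = 1 if flags["Z"] else 0
    return r3, flags


def peephole_sequence(r3):
    """subs r1, r3, #0 ; rsbs r3, r1, #0 ; adcs r3, r3, r1."""
    r1, flags = subs(r3, 0)
    r3, flags = subs(0, r1)                # rsbs computes 0 - r1
    r3, flags = adcs(r3, r1, flags["C"])   # result is 1 iff r1 == 0
    return r3, flags


for value in (0, 5, MASK):
    old_r3, old_flags = original_sequence(value)
    new_r3, new_flags = peephole_sequence(value)
    print(f"r3={value:#x}: result {old_r3} vs {new_r3}, "
          f"Z {old_flags['Z']} vs {new_flags['Z']}")
```

In this model both sequences leave the same value in r3, but the Z flag
left behind is the opposite of what the cmp produced, which is why the
strne/streq/strneb that follow pick the wrong arm.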
This patch fixes the problem by applying the current peephole definition
only when the CC register is dead after the peephole window, and by falling
back on the original pre-PR46975 peephole when it is live. Unfortunately, I
had trouble coming up with a reproduction case against mainline; I only
noticed the bug while working with some local changes that exposed it.
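The liveness condition the fix relies on can be pictured with a toy model.
This is a sketch of the idea only, not how peep2_reg_dead_p is implemented
in GCC: the (uses, defs) instruction encoding and the helper name
reg_dead_after are hypothetical, and conditional definitions are treated
like unconditional ones for simplicity.

```python
def reg_dead_after(insns, index, reg, live_out=frozenset()):
    """Return True if `reg` is dead after insns[index].

    Each instruction is a (uses, defs) pair of register-name sets.
    `live_out` lists registers still needed after the last instruction.
    """
    for uses, defs in insns[index + 1:]:
        if reg in uses:
            return False   # read before being rewritten: still live
        if reg in defs:
            return True    # rewritten before any read: dead
    return reg not in live_out


# Rough shape of the sequence from the dump above (hypothetical encoding).
block = [
    ({"r3"}, {"cc"}),              # cmp   r3, #0
    ({"cc"}, {"r3"}),              # movne r3, #0
    ({"cc"}, {"r3"}),              # moveq r3, #1   <- end of peephole window
    ({"cc", "r3", "r2"}, set()),   # strne r3, [r2]
    ({"cc", "r3", "r2"}, set()),   # streq r3, [r2]
    ({"cc", "r3", "r0"}, set()),   # strneb r3, [r0]
]

# The conditional stores read cc, so the CC-clobbering form must not be used.
print(reg_dead_after(block, 2, "cc"))
# With nothing after the window reading cc, the clobbering form is fine.
print(reg_dead_after(block[:3], 2, "cc"))
```

With the stores present, the model reports cc as live after the window, so
the flag-clobbering replacement would be rejected; without them it reports
cc as dead.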
Built and tested a full ARM GNU/Linux toolchain. No regressions in the GCC
test suite.
OK?
gcc/
2013-05-29 Meador Inge <meadori@codesourcery.com>
* config/arm/arm.md (conditional move peephole2): Only clobber CC
register when it is dead after the peephole window.
[1] http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01336.html
Index: gcc/config/arm/arm.md
===================================================================
--- gcc/config/arm/arm.md (revision 199414)
+++ gcc/config/arm/arm.md (working copy)
@@ -9978,29 +9978,48 @@
;; Attempt to improve the sequence generated by the compare_scc splitters
;; not to use conditional execution.
(define_peephole2
- [(set (reg:CC CC_REGNUM)
+ [(set (match_operand 0 "cc_register" "")
(compare:CC (match_operand:SI 1 "register_operand" "")
(match_operand:SI 2 "arm_rhs_operand" "")))
(cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
- (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
+ (set (match_operand:SI 3 "register_operand" "") (const_int 0)))
(cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
- (set (match_dup 0) (const_int 1)))
- (match_scratch:SI 3 "r")]
- "TARGET_32BIT"
+ (set (match_dup 3) (const_int 1)))
+ (match_scratch:SI 4 "r")]
+ "TARGET_32BIT && peep2_reg_dead_p (3, operands[0])"
[(parallel
[(set (reg:CC CC_REGNUM)
(compare:CC (match_dup 1) (match_dup 2)))
- (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))])
+ (set (match_dup 4) (minus:SI (match_dup 1) (match_dup 2)))])
(parallel
[(set (reg:CC CC_REGNUM)
- (compare:CC (const_int 0) (match_dup 3)))
- (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))])
+ (compare:CC (const_int 0) (match_dup 4)))
+ (set (match_dup 3) (minus:SI (const_int 0) (match_dup 4)))])
(parallel
- [(set (match_dup 0)
- (plus:SI (plus:SI (match_dup 0) (match_dup 3))
+ [(set (match_dup 3)
+ (plus:SI (plus:SI (match_dup 3) (match_dup 4))
(geu:SI (reg:CC CC_REGNUM) (const_int 0))))
(clobber (reg:CC CC_REGNUM))])])