[PATCH] ARM: Don't clobber CC reg when it is live after the peephole window
Meador Inge
meadori@codesourcery.com
Wed Jun 5 20:36:00 GMT 2013
Ping.
On 05/29/2013 12:15 PM, Meador Inge wrote:
> Hi All,
>
> This patch fixes a bug in one of the ARM peephole2 optimizations. The
> peephole2 optimization in question was changed to use the CC-updating
> form for all of the instructions produced by the peephole so that the
> encoding will be smaller when compiling for thumb [1]. However, I don't
> think that is always safe.
>
> For example, the CC register might be used by something *after* the
> peephole window. The current peephole will transform:
>
>
> (insn:TI 7 49 18 2 (set (reg:CC 24 cc)
> (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
> (const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn}
> (nil))
>
> (insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc)
> (const_int 0 [0]))
> (set (reg:SI 3 r3 [140])
> (const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp}
> (expr_list:REG_EQUIV (const_int 0 [0])
> (nil)))
>
> (insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc)
> (const_int 0 [0]))
> (set (reg:SI 3 r3 [138])
> (const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp}
> (expr_list:REG_EQUIV (const_int 1 [0x1])
> (nil)))
>
> (insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc)
> (const_int 0 [0]))
> (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
> (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
> (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
> (nil)))
>
> (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
> (const_int 0 [0]))
> (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
> (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
> (nil))
>
> (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
> (const_int 0 [0]))
> (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
> (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
> (expr_list:REG_DEAD (reg:CC 24 cc)
> (expr_list:REG_DEAD (reg:QI 3 r3 [140])
> (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
> (nil)))))
>
> into the following:
>
>
> (insn 59 49 60 2 (parallel [
> (set (reg:CC 24 cc)
> (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
> (const_int 0 [0])))
> (set (reg:SI 1 r1)
> (minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
> (const_int 0 [0])))
> ]) repro.c:6 -1
> (nil))
>
> (insn 60 59 61 2 (parallel [
> (set (reg:CC 24 cc)
> (compare:CC (const_int 0 [0])
> (reg:SI 1 r1)))
> (set (reg:SI 3 r3 [140])
> (minus:SI (const_int 0 [0])
> (reg:SI 1 r1)))
> ]) repro.c:6 -1
> (nil))
>
> (insn 61 60 19 2 (parallel [
> (set (reg:SI 3 r3 [140])
> (plus:SI (plus:SI (reg:SI 3 r3 [140])
> (reg:SI 1 r1))
> (geu:SI (reg:CC 24 cc)
> (const_int 0 [0]))))
> (clobber (reg:CC 24 cc))
> ]) repro.c:6 -1
> (nil))
>
> (insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc)
> (const_int 0 [0]))
> (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
> (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
> (nil))
>
> (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
> (const_int 0 [0]))
> (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
> (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
> (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
> (nil)))
>
> (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
> (const_int 0 [0]))
> (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
> (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
> (expr_list:REG_DEAD (reg:CC 24 cc)
> (expr_list:REG_DEAD (reg:QI 3 r3 [140])
> (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
> (nil)))))
>
>
> This gets compiled into the incorrect sequence:
>
>
> ldrb r3, [r0, #0]
> ldr r2, .L4
> subs r1, r3, #0
> rsbs r3, r1, #0
> adcs r3, r3, r1
> strne r3, [r2, #0]
> streq r3, [r2, #0]
> strneb r3, [r0, #0]
>
>
> The conditional stores are now dealing with an incorrect condition state.
>
> This patch fixes the problem by ensuring that the CC reg is dead after the
> peephole window for the current peephole definition and falls back on the
> original pre-PR46975 peephole when it is live. Unfortunately I had trouble
> coming up with a reproduction case against mainline. I only noticed the bug
> while working with some local changes that exposed it.
>
> Built and tested a full ARM GNU/Linux toolchain. No regressions in the GCC
> test suite.
>
> OK?
>
> gcc/
>
> 2013-05-29 Meador Inge <meadori@codesourcery.com>
>
> * config/arm/arm.md (conditional move peephole2): Only clobber CC
> register when it is dead after the peephole window.
>
> [1] http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01336.html
>
> Index: gcc/config/arm/arm.md
> ===================================================================
> --- gcc/config/arm/arm.md (revision 199414)
> +++ gcc/config/arm/arm.md (working copy)
> @@ -9978,29 +9978,48 @@
> ;; Attempt to improve the sequence generated by the compare_scc splitters
> ;; not to use conditional execution.
> (define_peephole2
> - [(set (reg:CC CC_REGNUM)
> + [(set (match_operand 0 "cc_register" "")
> (compare:CC (match_operand:SI 1 "register_operand" "")
> (match_operand:SI 2 "arm_rhs_operand" "")))
> (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
> - (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
> + (set (match_operand:SI 3 "register_operand" "") (const_int 0)))
> (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
> - (set (match_dup 0) (const_int 1)))
> - (match_scratch:SI 3 "r")]
> - "TARGET_32BIT"
> + (set (match_dup 3) (const_int 1)))
> + (match_scratch:SI 4 "r")]
> + "TARGET_32BIT && peep2_reg_dead_p (3, operands[0])"
> [(parallel
> [(set (reg:CC CC_REGNUM)
> (compare:CC (match_dup 1) (match_dup 2)))
> - (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))])
> + (set (match_dup 4) (minus:SI (match_dup 1) (match_dup 2)))])
> (parallel
> [(set (reg:CC CC_REGNUM)
> - (compare:CC (const_int 0) (match_dup 3)))
> - (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))])
> + (compare:CC (const_int 0) (match_dup 4)))
> + (set (match_dup 3) (minus:SI (const_int 0) (match_dup 4)))])
> (parallel
> - [(set (match_dup 0)
> - (plus:SI (plus:SI (match_dup 0) (match_dup 3))
> + [(set (match_dup 3)
> + (plus:SI (plus:SI (match_dup 3) (match_dup 4))
> (geu:SI (reg:CC CC_REGNUM) (const_int 0))))
> (clobber (reg:CC CC_REGNUM))])])
>
> +(define_peephole2
> + [(set (reg:CC CC_REGNUM)
> + (compare:CC (match_operand:SI 0 "register_operand" "")
> + (match_operand:SI 1 "arm_rhs_operand" "")))
> + (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
> + (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
> + (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
> + (set (match_dup 2) (const_int 1)))
> + (match_scratch:SI 3 "r")]
> + "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
> + [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
> + (parallel
> + [(set (reg:CC CC_REGNUM)
> + (compare:CC (const_int 0) (match_dup 3)))
> + (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
> + (set (match_dup 2)
> + (plus:SI (plus:SI (match_dup 2) (match_dup 3))
> + (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
> +
> (define_insn "*cond_move"
> [(set (match_operand:SI 0 "s_register_operand" "=r,r,r")
> (if_then_else:SI (match_operator 3 "equality_operator"
>
--
Meador Inge
CodeSourcery / Mentor Embedded
More information about the Gcc-patches
mailing list