This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] ARM: Don't clobber CC reg when it is live after the peephole window


On 29/05/13 18:15, Meador Inge wrote:
Hi All,

This patch fixes a bug in one of the ARM peephole2 optimizations.  The
peephole2 optimization in question was changed to use the CC-updating
form for all of the instructions produced by the peephole so that the
encoding will be smaller when compiling for thumb [1].  However, I don't
think that is always safe.

For example, the CC register might be used by something *after* the
peephole window.  The current peephole will transform:


     (insn:TI 7 49 18 2 (set (reg:CC 24 cc)
             (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
                 (const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn}
          (nil))

     (insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc)
                 (const_int 0 [0]))
             (set (reg:SI 3 r3 [140])
                 (const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp}
          (expr_list:REG_EQUIV (const_int 0 [0])
             (nil)))

     (insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc)
                 (const_int 0 [0]))
             (set (reg:SI 3 r3 [138])
                 (const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp}
          (expr_list:REG_EQUIV (const_int 1 [0x1])
             (nil)))

     (insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc)
                 (const_int 0 [0]))
             (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
                 (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
          (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
             (nil)))

     (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
                 (const_int 0 [0]))
             (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
                 (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
          (nil))

     (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
                 (const_int 0 [0]))
             (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
                 (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
          (expr_list:REG_DEAD (reg:CC 24 cc)
             (expr_list:REG_DEAD (reg:QI 3 r3 [140])
                 (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
                     (nil)))))

into the following:


     (insn 59 49 60 2 (parallel [
                 (set (reg:CC 24 cc)
                     (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
                         (const_int 0 [0])))
                 (set (reg:SI 1 r1)
                     (minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
                         (const_int 0 [0])))
             ]) repro.c:6 -1
          (nil))

     (insn 60 59 61 2 (parallel [
                 (set (reg:CC 24 cc)
                     (compare:CC (const_int 0 [0])
                         (reg:SI 1 r1)))
                 (set (reg:SI 3 r3 [140])
                     (minus:SI (const_int 0 [0])
                         (reg:SI 1 r1)))
             ]) repro.c:6 -1
          (nil))

     (insn 61 60 19 2 (parallel [
                 (set (reg:SI 3 r3 [140])
                     (plus:SI (plus:SI (reg:SI 3 r3 [140])
                             (reg:SI 1 r1))
                         (geu:SI (reg:CC 24 cc)
                             (const_int 0 [0]))))
                 (clobber (reg:CC 24 cc))
             ]) repro.c:6 -1
          (nil))

     (insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc)
                 (const_int 0 [0]))
             (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
                 (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
          (nil))

     (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
                 (const_int 0 [0]))
             (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
                 (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
          (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
             (nil)))

     (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
                 (const_int 0 [0]))
             (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
                 (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
          (expr_list:REG_DEAD (reg:CC 24 cc)
             (expr_list:REG_DEAD (reg:QI 3 r3 [140])
                 (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
                     (nil)))))


This gets compiled into the incorrect sequence:


     ldrb    r3, [r0, #0]
     ldr     r2, .L4
     subs    r1, r3, #0
     rsbs    r3, r1, #0
     adcs    r3, r3, r1
     strne   r3, [r2, #0]
     streq   r3, [r2, #0]
     strneb r3, [r0, #0]


The conditional stores are now dealing with an incorrect condition state.

This patch fixes the problem by ensuring that the CC reg is dead after the
peephole window for the current peephole definition and falls back on the
original pre-PR46975 peephole when it is live.  Unfortunately I had trouble
coming up with a reproduction case against mainline.  I only noticed the bug
while working with some local changes that exposed it.

Built and tested a full ARM GNU/Linux toolchain.  No regressions in the GCC
test suite.

OK?

gcc/

2013-05-29  Meador Inge  <meadori@codesourcery.com>

	* config/arm/arm.md (conditional move peephole2): Only clobber CC
	register when it is dead after the peephole window.

[1] http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01336.html

Index: gcc/config/arm/arm.md
===================================================================
--- gcc/config/arm/arm.md	(revision 199414)
+++ gcc/config/arm/arm.md	(working copy)
@@ -9978,29 +9978,48 @@
  ;; Attempt to improve the sequence generated by the compare_scc splitters
  ;; not to use conditional execution.
  (define_peephole2
-  [(set (reg:CC CC_REGNUM)
+  [(set (match_operand 0 "cc_register" "")
  	(compare:CC (match_operand:SI 1 "register_operand" "")
  		    (match_operand:SI 2 "arm_rhs_operand" "")))
     (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
-	      (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
+	      (set (match_operand:SI 3 "register_operand" "") (const_int 0)))
     (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
-	      (set (match_dup 0) (const_int 1)))
-   (match_scratch:SI 3 "r")]
-  "TARGET_32BIT"
+	      (set (match_dup 3) (const_int 1)))
+   (match_scratch:SI 4 "r")]
+  "TARGET_32BIT && peep2_reg_dead_p (3, operands[0])"
    [(parallel
      [(set (reg:CC CC_REGNUM)
  	  (compare:CC (match_dup 1) (match_dup 2)))
-     (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))])
+     (set (match_dup 4) (minus:SI (match_dup 1) (match_dup 2)))])
     (parallel
      [(set (reg:CC CC_REGNUM)
-	  (compare:CC (const_int 0) (match_dup 3)))
-     (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))])
+	  (compare:CC (const_int 0) (match_dup 4)))
+     (set (match_dup 3) (minus:SI (const_int 0) (match_dup 4)))])
     (parallel
-    [(set (match_dup 0)
-	  (plus:SI (plus:SI (match_dup 0) (match_dup 3))
+    [(set (match_dup 3)
+	  (plus:SI (plus:SI (match_dup 3) (match_dup 4))
  		   (geu:SI (reg:CC CC_REGNUM) (const_int 0))))
       (clobber (reg:CC CC_REGNUM))])])


I understand (and agree with) this bit...

+(define_peephole2
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC (match_operand:SI 0 "register_operand" "")
+		    (match_operand:SI 1 "arm_rhs_operand" "")))
+   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
+	      (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
+   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
+	      (set (match_dup 2) (const_int 1)))
+   (match_scratch:SI 3 "r")]
+  "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
+  [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
+   (parallel
+    [(set (reg:CC CC_REGNUM)
+	  (compare:CC (const_int 0) (match_dup 3)))
+     (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
+   (set (match_dup 2)
+	(plus:SI (plus:SI (match_dup 2) (match_dup 3))
+		 (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
+

... but what's this bit about?

R.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]