This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/54236] New: [SH] Improve addc and subc insn utilization
- From: "olegendo at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 12 Aug 2012 22:25:32 +0000
- Subject: [Bug target/54236] New: [SH] Improve addc and subc insn utilization
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54236
Bug #: 54236
Summary: [SH] Improve addc and subc insn utilization
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
AssignedTo: olegendo@gcc.gnu.org
ReportedBy: olegendo@gcc.gnu.org
Target: sh*-*-*
There are currently a couple of cases where it would be better if addc or subc
insns were used. For example:
int test00 (int a, int b)
{
  return a + b + 1;
}
gets compiled to:
	mov	r4,r0	! MT
	add	r5,r0	! EX
	rts
	add	#1,r0	! EX
which could be better as:
	mov	r4,r0	! MT
	sett		! MT (SH4)
	rts
	addc	r5,r0	! EX
As a proof of concept, I've applied the following to handle the above case:
Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md (revision 190326)
+++ gcc/config/sh/sh.md (working copy)
@@ -1465,7 +1465,7 @@
(define_insn "addc"
[(set (match_operand:SI 0 "arith_reg_dest" "=r")
- (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "0")
+ (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "%0")
(match_operand:SI 2 "arith_reg_operand" "r"))
(reg:SI T_REG)))
(set (reg:SI T_REG)
@@ -1516,6 +1516,24 @@
"add %2,%0"
[(set_attr "type" "arith")])
+(define_insn_and_split "*addsi3_compact"
+ [(set (match_operand:SI 0 "arith_reg_dest" "")
+ (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "")
+ (match_operand:SI 2 "arith_reg_operand" ""))
+ (const_int 1)))
+ (clobber (reg:SI T_REG))]
+ "TARGET_SH1"
+ "#"
+ "&& 1"
+ [(set (reg:SI T_REG) (const_int 1))
+ (parallel [(set (match_dup 0)
+ (plus:SI (plus:SI (match_dup 1)
+ (match_dup 2))
+ (reg:SI T_REG)))
+ (set (reg:SI T_REG)
+ (ltu:SI (plus:SI (match_dup 1) (match_dup 2))
+ (match_dup 1)))])])
+
;; -------------------------------------------------------------------------
;; Subtraction instructions
;; -------------------------------------------------------------------------
... and observed some code from the CSiBE set compiled with -O2 -m4-single -ml
-mpretend-cmove. It doesn't affect code size that much (a few incs/decs here
and there), but more importantly it does this (libmpeg2/motion_comp.c):
_MC_avg_o_16_c:                  -->
	mov.b	@r5,r1                   	mov.b	@r5,r2
.L16:                            .L16:
	mov.b	@r4,r2                   	sett
	extu.b	r1,r1                    	mov.b	@r4,r1
	extu.b	r2,r2                    	extu.b	r2,r2
	add	r2,r1                    	extu.b	r1,r1
	add	#1,r1                    	addc	r2,r1
	shar	r1                       	shar	r1
	mov.b	r1,@r4                   	mov.b	r1,@r4
	mov.b	@(1,r5),r0               	sett
	extu.b	r0,r1                    	mov.b	@(1,r5),r0
	mov.b	@(1,r4),r0               	extu.b	r0,r1
	extu.b	r0,r0                    	mov.b	@(1,r4),r0
	add	r0,r1                    	extu.b	r0,r0
	add	#1,r1                    	addc	r1,r0
	shar	r1                       	shar	r0
	mov	r1,r0                    	mov.b	r0,@(1,r4)
	mov.b	r0,@(1,r4)
In such cases the sett,addc sequence can be scheduled much better, and in most
cases the sett insn can be executed in parallel with some other insn.
Unfortunately, on SH4A the sett insn has been moved from the MT group to the EX
group, but the transformation still seems beneficial. I've also seen a couple
of places where sett-subc sequences would be better.