This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/54236] New: [SH] Improve addc and subc insn utilization
- From: "olegendo at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 12 Aug 2012 22:25:32 +0000
- Subject: [Bug target/54236] New: [SH] Improve addc and subc insn utilization
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54236
Bug #: 54236
Summary: [SH] Improve addc and subc insn utilization
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
AssignedTo: olegendo@gcc.gnu.org
ReportedBy: olegendo@gcc.gnu.org
Target: sh*-*-*
There are currently a couple of cases where it would be better if addc or subc
insns were used. For example:
int test00 (int a, int b)
{
  return a + b + 1;
}
gets compiled to:
	mov	r4,r0	! MT
	add	r5,r0	! EX
	rts
	add	#1,r0	! EX
which could be better as:
	mov	r4,r0	! MT
	sett		! MT (SH4)
	rts
	addc	r5,r0	! EX
As a proof of concept, I've applied the following to handle the above case:
Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md (revision 190326)
+++ gcc/config/sh/sh.md (working copy)
@@ -1465,7 +1465,7 @@
(define_insn "addc"
[(set (match_operand:SI 0 "arith_reg_dest" "=r")
- (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "0")
+ (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "%0")
(match_operand:SI 2 "arith_reg_operand" "r"))
(reg:SI T_REG)))
(set (reg:SI T_REG)
@@ -1516,6 +1516,24 @@
"add %2,%0"
[(set_attr "type" "arith")])
+(define_insn_and_split "*addsi3_compact"
+ [(set (match_operand:SI 0 "arith_reg_dest" "")
+ (plus:SI (plus:SI (match_operand:SI 1 "arith_reg_operand" "")
+ (match_operand:SI 2 "arith_reg_operand" ""))
+ (const_int 1)))
+ (clobber (reg:SI T_REG))]
+ "TARGET_SH1"
+ "#"
+ "&& 1"
+ [(set (reg:SI T_REG) (const_int 1))
+ (parallel [(set (match_dup 0)
+ (plus:SI (plus:SI (match_dup 1)
+ (match_dup 2))
+ (reg:SI T_REG)))
+ (set (reg:SI T_REG)
+ (ltu:SI (plus:SI (match_dup 1) (match_dup 2))
+ (match_dup 1)))])])
+
;; -------------------------------------------------------------------------
;; Subtraction instructions
;; -------------------------------------------------------------------------
... and observed some code from the CSiBE set compiled with -O2 -m4-single -ml
-mpretend-cmove. It doesn't affect code size that much (a few incs/decs here
and there), but more importantly it does this (libmpeg2/motion_comp.c):
_MC_avg_o_16_c:                  -->
	mov.b	@r5,r1                   	mov.b	@r5,r2
.L16:                            .L16:
	mov.b	@r4,r2                   	sett
	extu.b	r1,r1                    	mov.b	@r4,r1
	extu.b	r2,r2                    	extu.b	r2,r2
	add	r2,r1                    	extu.b	r1,r1
	add	#1,r1                    	addc	r2,r1
	shar	r1                       	shar	r1
	mov.b	r1,@r4                   	mov.b	r1,@r4
	mov.b	@(1,r5),r0               	sett
	extu.b	r0,r1                    	mov.b	@(1,r5),r0
	mov.b	@(1,r4),r0               	extu.b	r0,r1
	extu.b	r0,r0                    	mov.b	@(1,r4),r0
	add	r0,r1                    	extu.b	r0,r0
	add	#1,r1                    	addc	r1,r0
	shar	r1                       	shar	r0
	mov	r1,r0                    	mov.b	r0,@(1,r4)
	mov.b	r0,@(1,r4)
In such cases the sett,addc sequence can be scheduled much better, and in most
cases the sett insn can be executed in parallel with some other insn.
Unfortunately, on SH4A the sett insn has been moved from the MT group to the EX
group, but the transformation still seems beneficial. I've also seen a couple
of places where sett-subc sequences would be better.