This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/76535] New: [SH] Replace shll addc sequence with cmp/pz subc
- From: "olegendo at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 14 Aug 2016 13:14:34 +0000
- Subject: [Bug target/76535] New: [SH] Replace shll addc sequence with cmp/pz subc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=76535
Bug ID: 76535
Summary: [SH] Replace shll addc sequence with cmp/pz subc
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: olegendo at gcc dot gnu.org
Target Milestone: ---
Target: sh*-*-*
In CSiBE jikespg-1.3/src/tabutil, function itoc:
static const char digits[] = "0123456789";
extern char *output_ptr;

void itoc(int num)
{
    int val;
    char *p;
    char tmp[12];

    val = (((num) < 0) ? -(num) : (num));
    tmp[11] = '\0';
    p = &tmp[11];
    do
    {
        p--;
        *p = digits[val % 10];
        val /= 10;
    } while(val > 0);
    if (num < 0)
    {
        p--;
        *p = '-';
    }
    while (*p != '\0')
        *(output_ptr++) = *(p++);
}
Compiling with -O2 produces the following sequence:
        mov.l   .L14,r1
        mov     r6,r2       <<<
        mov     #0,r0
        mov     r7,r3
        dmuls.l r1,r6
        sts     mach,r5
        shar    r5
        shar    r5
        shll    r2          <<<
        addc    r0,r5       <<< r5 = 0 + r5 + (r2 < 0)
Because shll overwrites its operand, r6 must first be copied into r2, which
costs an extra register and a mov. The sequence can be improved by using
cmp/pz and subc instead:
        mov.l   .L14,r1
        mov     #-1,r0
        mov     r7,r3
        dmuls.l r1,r6
        sts     mach,r5
        shar    r5
        shar    r5
        cmp/pz  r6
        subc    r0,r5       <<< r5 = r5 - (-1) - (r6 >= 0)
                                   = r5 + 1 - (r6 >= 0)
                                   = r5 + (r6 < 0)
Unfortunately, the addc pattern is captured as:
(insn 54 53 56 4 (parallel [
            (set (reg:SI 205)
                (plus:SI (gt:SI (reg:SI 211)
                        (reg/v:SI 187 [ val ]))
                    (reg:SI 209)))
            (clobber (reg:SI 147 t))
        ]) sh_tmp.cpp:16 41 {*addc_t_r}
     (expr_list:REG_UNUSED (reg:SI 147 t)
        (expr_list:REG_DEAD (reg:SI 209)
            (nil))))
where reg 211 is initialized to zero in another basic block:
(insn 51 45 119 2 (set (reg:SI 211)
        (const_int 0 [0])) sh_tmp.cpp:16 180 {movsi_ie}
     (nil))
This makes it difficult to add a special case to the main *addc insn_and_split
pattern: the constant load is CSE'ed before combine, so the pattern contains a
comparison against a register rather than against const_int 0, and combine
does not get a chance to canonicalize the comparison.
The initial expansion of the comparison looks strange:
(insn 69 68 70 (set (reg:SI 223)
        (const_int 0 [0])) sh_tmp.cpp:17 -1
     (nil))
(insn 70 69 71 (set (reg:SI 147 t)
        (gt:SI (reg:SI 223)
            (reg/v:SI 187 [ val ]))) sh_tmp.cpp:17 -1
     (nil))
(insn 71 70 72 (set (reg:SI 222)
        (neg:SI (reg:SI 147 t))) sh_tmp.cpp:17 -1
     (expr_list:REG_EQUAL (ashiftrt:SI (reg/v:SI 187 [ val ])
            (const_int 31 [0x1f]))
        (nil)))
... and it doesn't go through the cstoresi4 expander, so it has no chance of
being canonicalized there either. It looks like this comes from the division /
modulo optimization...