Summary: | [4.7 Regression] IRA generates extra register move | ||
---|---|---|---|
Product: | gcc | Reporter: | H.J. Lu <hjl.tools> |
Component: | rtl-optimization | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | areg.melikadamyan, ebotcazou, gcc-bugs, vmakarov |
Priority: | P2 | Keywords: | missed-optimization, ra |
Version: | 4.6.0 | ||
Target Milestone: | 4.8.0 | ||
Host: | Target: | ||
Build: | Known to work: | 4.8.0 | |
Known to fail: | 4.7.4 | Last reconfirmed: | 2010-11-24 15:14:39 |
Description
H.J. Lu
2010-05-22 23:40:42 UTC
There isn't just an extra move; the code is also different. Are you sure that it results in inferior performance?

Reload creates an additional insn for this insn:

(insn 9 7 11 2 (parallel [
            (set (reg:DI 71)
                (lshiftrt:DI (reg/v:DI 60 [ tag ])
                    (const_int 4 [0x4])))
            (clobber (reg:CC 17 flags))
        ]) b.i:5 533 {*lshrdi3_1}
     (expr_list:REG_DEAD (reg/v:DI 60 [ tag ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

That is because r60 and r71 got different hard registers (0 and 1), even though there is a copy between r71 and r60 which should have given r71 the same hard register 0 as r60. It does not happen because r68 already got 0 and it conflicts with r71:

    r71: preferred GENERAL_REGS, alternative NO_REGS, cover GENERAL_REGS
    r68: preferred AREG, alternative GENERAL_REGS, cover GENERAL_REGS
    r60: preferred GENERAL_REGS, alternative NO_REGS, cover GENERAL_REGS

;; a0(r68,l0) conflicts: a1(r71,l0)
;; a4(r67,l0) conflicts:
  cp0:a1(r71)<->a3(r60)@1000:constraint

      Popping a0(r68,l0)  -- assign reg 0
      Popping a3(r60,l0)  -- assign reg 0
      Popping a1(r71,l0)  -- assign reg 1

The analogous insn for gcc-4.3 looks like:

(insn:HI 9 7 11 2 b.i:4 (parallel [
            (set (reg/v:DI 58 [ tag ])
                (lshiftrt:DI (reg/v:DI 58 [ tag ])
                    (const_int 4 [0x4])))
            (clobber (reg:CC 17 flags))
        ]) 514 {*lshrdi3_1_rex64}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))

This means gcc-4.3 does not have the problem that gcc4.4+ has. Insn 9 for gcc-4.3 is the result of a regmove transformation. I have no idea why regmove (which is present in gcc4.4+) does not do the same for gcc4.4+ (probably because of some changes since 4.3).

The problem could be fixed in regmove or in IRA (which is probably harder). But I don't know whether it is worth doing, because such transformations result in longer live ranges of pseudos and might result in worse code for other programs.

4.4 branch is being closed, moving to 4.5.4 target.

The 4.5 branch is being closed, adjusting target milestone.
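The allocation order reported in the IRA dump above can be replayed with a toy model. The sketch below is not IRA code: it is a minimal greedy assigner, written only for illustration, that takes the single conflict (r68 with r71) and the single copy (r71 with r60) from the dump. The names `toy_assign`, `needs_copy_move`, the two-register file, and the "lowest free register" fallback are all assumptions of this sketch.

```c
#include <assert.h>
#include <stdio.h>

#define N_ALLOCNOS  3
#define N_HARD_REGS 2    /* assumption: two hard registers are enough here */
#define NONE        (-1)

/* Toy allocnos named after the pseudos in the dump. */
enum { R68, R60, R71 };

static const char *const allocno_name[N_ALLOCNOS] = { "r68", "r60", "r71" };

/* conflict[a][b] != 0: a and b may not share a hard register.
   Only the r68/r71 conflict from the dump is modelled. */
static const int conflict[N_ALLOCNOS][N_ALLOCNOS] = {
    [R68] = { [R71] = 1 },
    [R71] = { [R68] = 1 },
};

/* copy_partner[a]: allocno that a would like to share a register with. */
static const int copy_partner[N_ALLOCNOS] = {
    [R68] = NONE, [R60] = R71, [R71] = R60,
};

/* A register is usable for a if no conflicting allocno already holds it. */
static int reg_is_free(const int *hard_reg, int a, int reg)
{
    for (int b = 0; b < N_ALLOCNOS; b++)
        if (conflict[a][b] && hard_reg[b] == reg)
            return 0;
    return 1;
}

/* Assign hard registers in the given order (hypothetical greedy policy):
   take the copy partner's register if it is usable, otherwise the
   lowest-numbered usable register. */
void toy_assign(const int *order, int *hard_reg)
{
    for (int a = 0; a < N_ALLOCNOS; a++)
        hard_reg[a] = NONE;

    for (int i = 0; i < N_ALLOCNOS; i++) {
        int a = order[i];
        int partner = copy_partner[a];
        int wanted = (partner != NONE) ? hard_reg[partner] : NONE;

        if (wanted != NONE && reg_is_free(hard_reg, a, wanted)) {
            hard_reg[a] = wanted;
        } else {
            for (int reg = 0; reg < N_HARD_REGS; reg++) {
                if (reg_is_free(hard_reg, a, reg)) {
                    hard_reg[a] = reg;
                    break;
                }
            }
        }
        assert(hard_reg[a] != NONE);
    }
}

/* The copy r71 <-> r60 costs a move when the two end up in different regs. */
int needs_copy_move(const int *hard_reg)
{
    return hard_reg[R60] != hard_reg[R71];
}

/* Print the assignment in a form loosely resembling the dump. */
void toy_print(const int *hard_reg)
{
    for (int a = 0; a < N_ALLOCNOS; a++)
        printf("%s -- assign reg %d\n", allocno_name[a], hard_reg[a]);
}
```

Calling `toy_assign` with the order from the dump (r68, r60, r71) gives r68 and r60 register 0 and pushes r71 to register 1, so the copy is not satisfied. In this toy model, processing r71 and r60 before r68 lets both share register 0, at the price of r68 losing its preferred register.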
No extra move with trunk today:

$ cat t.c
extern unsigned long table[];

unsigned long foo(unsigned char *p)
{
  unsigned long tag = *p;
  return table[tag >> 4] + table[tag & 0xf];
}
$ cat t.s
	.file	"t.c"
	.text
	.p2align 4,,15
	.globl	foo
	.type	foo, @function
foo:
.LFB0:
	.cfi_startproc
	movzbl	(%rdi), %edx
	movq	%rdx, %rax
	shrq	$4, %rdx
	andl	$15, %eax
	movq	table(,%rax,8), %rax
	addq	table(,%rdx,8), %rax
	ret
	.cfi_endproc
.LFE0:
	.size	foo, .-foo
	.ident	"GCC: (GNU) 4.8.0 20121008 (experimental) [trunk revision 192219]"
	.section	.note.GNU-stack,"",@progbits

GCC 4.6.4 has been released and the branch has been closed.

Fixed with LRA.
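For anyone who wants to run the test case rather than only inspect its assembly, here is a small sketch that supplies a definition for `table`. The report only declares the table as `extern`, so the 16 entries below (squares of the index) and the helper `foo_reference` are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical contents: the report never defines the table. */
unsigned long table[16] = {
    0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225
};

/* Test case from the report, unchanged. */
unsigned long foo(unsigned char *p)
{
    unsigned long tag = *p;
    return table[tag >> 4] + table[tag & 0xf];
}

/* Reference version with the two nibbles spelled out (illustrative helper). */
unsigned long foo_reference(unsigned char byte)
{
    unsigned int high = (byte >> 4) & 0xfu;  /* index for table[tag >> 4]  */
    unsigned int low = byte & 0xfu;          /* index for table[tag & 0xf] */

    assert(high < 16 && low < 16);
    return table[high] + table[low];
}
```

With this table, the byte 0xA7 selects entries 10 and 7, so `foo` returns 100 + 49 = 149. Because the input is a single byte, both indices stay below 16 and a 16-entry table is sufficient.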