This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/79012] New: basic block reordering causes suboptimal code
- From: "saaadhu at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 06 Jan 2017 06:47:57 +0000
- Subject: [Bug middle-end/79012] New: basic block reordering causes suboptimal code
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79012
Bug ID: 79012
Summary: basic block reordering causes suboptimal code
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: saaadhu at gcc dot gnu.org
CC: segher at kernel dot crashing.org
Target Milestone: ---
For this C code (slightly modified from PR 30908)
void wait(int i)
{
while (i-- > 0)
asm volatile("nop" ::: "memory");
}
gcc 4.8 at -Os produces
jmp .L2
.L3:
nop
decl %edi
.L2:
testl %edi, %edi
jg .L3
ret
whereas gcc trunk (and 4.9 onwards, from a quick check) produces
.L2:
testl %edi, %edi
jle .L5
nop
decl %edi
jmp .L2
.L5:
ret
The code size is identical, but the trunk version executes one more
instruction everytime the loop runs (explicit jump to .L5 with trunk vs
fallthrough with 4.8) - it's faster only if the loop never runs. This
happens irrespective of the memory clobber inline assembler statement.
Digging into the dump files, I found that the transformation occurs in
the bb reorder pass, when it calls cfg_layout_initialize, which
eventually calls try_redirect_by_replacing_jump with in_cfglayout set to
true. That function then removes the jump and causes the RTL
transformation that eventually results in slower code.
RTL before and after bbro.
Before:
(jump_insn 24 6 25 2 (set (pc)
(label_ref 15)) "pr30908.c":3 678 {jump}
(nil)
-> 15)
(barrier 25 24 17)
(code_label 17 25 12 3 3 "" [1 uses])
(note 12 17 13 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 13 12 14 3 (parallel [
(asm_operands/v ("nop") ("") 0 []
[]
[] pr30908.c:4)
(clobber (mem:BLK (scratch) [0 A8]))
(clobber (reg:CCFP 18 fpsr))
(clobber (reg:CC 17 flags))
]) "pr30908.c":4 -1
(expr_list:REG_UNUSED (reg:CCFP 18 fpsr)
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil))))
(insn 14 13 15 3 (parallel [
(set (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
(plus:SI (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
(const_int -1 [0xffffffffffffffff])))
(clobber (reg:CC 17 flags))
]) 210 {*addsi_1}
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(code_label 15 14 16 4 2 "" [1 uses])
(note 16 15 18 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 18 16 19 4 (set (reg:CCNO 17 flags)
(compare:CCNO (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
(const_int 0 [0]))) "pr30908.c":3 3 {*cmpsi_ccno_1}
(nil))
(jump_insn 19 18 30 4 (set (pc)
(if_then_else (gt (reg:CCNO 17 flags)
(const_int 0 [0]))
(label_ref 17)
(pc))) "pr30908.c":3 646 {*jcc_1}
(expr_list:REG_DEAD (reg:CCNO 17 flags)
(int_list:REG_BR_PROB 8500 (nil)))
-> 17)
(note 30 19 28 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(note 28 30 29 5 NOTE_INSN_EPILOGUE_BEG)
(jump_insn 29 28 31 5 (simple_return) "pr30908.c":5 708
{simple_return_internal}
(nil)
-> simple_return)
After:
<snip>
(code_label 15 6 16 3 2 "" [1 uses])
(note 16 15 18 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 18 16 19 3 (set (reg:CCNO 17 flags)
(compare:CCNO (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
(const_int 0 [0]))) "pr30908.c":3 3 {*cmpsi_ccno_1}
(nil))
(jump_insn 19 18 12 3 (set (pc)
(if_then_else (le (reg:CCNO 17 flags)
(const_int 0 [0]))
(label_ref:DI 34)
(pc))) "pr30908.c":3 646 {*jcc_1}
(expr_list:REG_DEAD (reg:CCNO 17 flags)
(int_list:REG_BR_PROB 1500 (nil)))
-> 34)
(note 12 19 13 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 13 12 14 4 (parallel [
(asm_operands/v ("nop") ("") 0 []
[]
[] pr30908.c:4)
(clobber (mem:BLK (scratch) [0 A8]))
(clobber (reg:CCFP 18 fpsr))
(clobber (reg:CC 17 flags))
]) "pr30908.c":4 -1
(expr_list:REG_UNUSED (reg:CCFP 18 fpsr)
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil))))
(insn 14 13 35 4 (parallel [
(set (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
(plus:SI (reg:SI 5 di [orig:90 ivtmp.9 ] [90])
(const_int -1 [0xffffffffffffffff])))
(clobber (reg:CC 17 flags))
]) 210 {*addsi_1}
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(jump_insn 35 14 36 4 (set (pc)
(label_ref 15)) -1
(nil)
-> 15)
(barrier 36 35 34)
(code_label 34 36 30 5 5 "" [1 uses])
(note 30 34 28 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(note 28 30 29 5 NOTE_INSN_EPILOGUE_BEG)
(jump_insn 29 28 31 5 (simple_return) "pr30908.c":5 708
{simple_return_internal}
(nil)
-> simple_return)