This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
basic block reordering question.
- From: Andrew MacLeod <amacleod at redhat dot com>
- To: gcc mailing list <gcc at gcc dot gnu dot org>
- Date: 14 Nov 2003 10:59:17 -0500
- Subject: basic block reordering question.
Any quick suggestions on what might be going on here?
I have a testcase where I've hacked the compiler up such that I have 2
versions. One version has one less stmt than the other. The difference
is the following:
(code_label 86 85 111 254 "" [0 uses])
(call_insn 111 86 112 (call_placeholder 101 88 0 0 (call_insn 109 108 110 (set (reg:SI %eax)
(call (mem:QI (symbol_ref:SI ("sprintf") [flags 0x41] <function_decl 0x4008e32c sprintf>) [0 S1 A8])
(const_int 20 [0x14]))) -1 (nil)
(expr_list:REG_EH_REGION (const_int 0 [0x0])
(nil))
(nil))) -1 (nil)
(nil)
(nil))
and the other has:
(code_label 86 85 88 254 "" [0 uses])
(insn 88 86 110 (set (reg:HI 74 [ T.266 ])
(mem/s:HI (plus:SI (reg/v/f:SI 59 [ PartTkn ])
(const_int 4 [0x4])) [13 <variable>.DbId+0 S2 A32])) -1 (nil)
(nil))
(call_insn 110 88 111 (call_placeholder 101 89 0 0 (call_insn 108 107 109 (set (reg:SI %eax)
(call (mem:QI (symbol_ref:SI ("sprintf") [flags 0x41] <function_decl 0x4008e32c sprintf>) [0 S1 A8])
(const_int 20 [0x14]))) -1 (nil)
(expr_list:REG_EH_REGION (const_int 0 [0x0])
(nil))
(nil))) -1 (nil)
(nil)
(nil))
The original tree-ssa stmt passed to the expanders is:
<Uaa6c>:;
T.268 = sprintf (&Msg, (const char *)" Traverse Part [%3u:%8u] Level = %2u.\n", (int)PartTkn->DbId, PartTkn->Handle, Level);
vs.
<Uaa6c>:;
T.266 = PartTkn->DbId;
T.268 = sprintf (&Msg, (const char *)" Traverse Part [%3u:%8u] Level = %2u.\n", (int)T.266, PartTkn->Handle, Level);
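For readers less used to tree-ssa dumps, the two statement shapes above correspond roughly to the C below. This is only a sketch under stated assumptions: `struct token`, `format_direct`, `format_with_temp` and `variants_agree` are hypothetical names, and the field types are guessed from the RTL (a 16-bit DbId loaded from offset 4). It is not the VORTEX source. It uses snprintf and casts to unsigned int to keep the sketch well defined, whereas the dump shows sprintf with an (int) cast.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the VORTEX token type. Only the field names
   DbId and Handle appear in the dump; the types and layout are assumed. */
struct token {
    unsigned int   Handle;  /* assumed to sit at offset 0 */
    unsigned short DbId;    /* the RTL shows an HImode load at offset 4 */
};

/* Shape 1: the field is read inside the call's argument list. */
int format_direct(char *msg, size_t n, const struct token *PartTkn,
                  unsigned int Level)
{
    return snprintf(msg, n, " Traverse Part [%3u:%8u] Level = %2u.\n",
                    (unsigned int)PartTkn->DbId, PartTkn->Handle, Level);
}

/* Shape 2: the field is first copied into a temporary (T.266 in the dump),
   which adds one statement before the call. */
int format_with_temp(char *msg, size_t n, const struct token *PartTkn,
                     unsigned int Level)
{
    unsigned short t = PartTkn->DbId;
    return snprintf(msg, n, " Traverse Part [%3u:%8u] Level = %2u.\n",
                    (unsigned int)t, PartTkn->Handle, Level);
}

/* Returns 1 when both shapes produce identical text, 0 otherwise. */
int variants_agree(const struct token *tkn, unsigned int level)
{
    char a[64], b[64];
    int na = format_direct(a, sizeof a, tkn, level);
    int nb = format_with_temp(b, sizeof b, tkn, level);
    assert(na > 0 && nb > 0);
    return na == nb && strcmp(a, b) == 0;
}
```

The two functions are semantically identical, so any run-time difference has to come from how the later passes treat the extra statement, not from the work the program does.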
This code is from VORTEX. The first program runs in 120 seconds on my
machine. The second program runs in 113 seconds. The basic block
reordering algorithm orders the blocks differently in these 2
cases, and I see almost a 6% regression in VORTEX run times, even though
I would fully expect the first example to be easier for RTL to sort
through, and it uses one less variable.
I can consistently reproduce this. It's quite a dramatic difference.
Suggestions?
Andrew