This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Tree tail merging breaks __builtin_unreachable optimization
- From: "Ulrich Weigand" <uweigand at de dot ibm dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: tom at codesourcery dot com
- Date: Wed, 4 Jul 2012 19:02:16 +0200 (CEST)
- Subject: Tree tail merging breaks __builtin_unreachable optimization
Hello,

Starting with GCC 4.7, if multiple __builtin_unreachable statements occur in
a single function, they are no longer optimized away as they were before.
For example,
  int foo(int a)
  {
    if (a <= 0)
      __builtin_unreachable();
    if (a > 2)
      __builtin_unreachable();
    return a > 0;
  }
results in the following (ARM) code:
  foo:
          cmp     r0, #0
          ble     .L3
          cmp     r0, #2
          bgt     .L3
          mov     r0, #1
          bx      lr
  .L3:
with the label .L3 hanging off after the end of the function.
With 4.6, we instead get the expected:
  foo:
          mov     r0, #1
          bx      lr
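
To spell out why a constant 1 is the expected result: the two hints together promise that a is either 1 or 2, so "a > 0" is always true and both comparisons can be dropped. The sketch below repeats the test case next to a hand-folded version. The names foo_folded and check_foo are made up for illustration and are not part of the original report; the checks only use inputs that respect the hints, since any other input is undefined behaviour.

```c
#include <assert.h>

/* Test case from the report; __builtin_unreachable is a GCC builtin. */
int foo(int a)
{
  if (a <= 0)
    __builtin_unreachable();   /* promises a >= 1 */
  if (a > 2)
    __builtin_unreachable();   /* promises a <= 2 */
  return a > 0;                /* true for every permitted a */
}

/* Hypothetical hand-folded equivalent: what "mov r0, #1" computes. */
int foo_folded(int a)
{
  (void) a;
  return 1;
}

/* Hypothetical helper: compares both versions on the permitted inputs. */
void check_foo(void)
{
  for (int a = 1; a <= 2; a++)
    assert(foo(a) == foo_folded(a));
}
```

Calling check_foo() passes at any optimization level, because only the values 1 and 2 are ever passed in.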
The problem seems to be an unfortunate interaction between tree and
RTL optimization passes. In 4.6, we had something like:

  <bb 2>:
    if (a_1(D) <= 0)
      goto <bb 3>;
    else
      goto <bb 4>;

  <bb 3>:
    __builtin_unreachable ();

  <bb 4>:
    if (a_1(D) > 2)
      goto <bb 5>;
    else
      goto <bb 6>;

  <bb 5>:
    __builtin_unreachable ();

  <bb 6>:
    return 1;

at the tree level; during RTL expansion, __builtin_unreachable expands to just a
barrier, and subsequent CFG optimization detects basic blocks containing just a
barrier and optimizes the predecessor blocks.
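
As a rough illustration of that last step, here is a toy model in C. It is only a sketch under stated assumptions: the types block_kind and struct block, and the functions prune_barrier_edges and prune_46_shape, are hypothetical and do not correspond to GCC's actual RTL or CFG data structures. It shows the idea of turning a conditional branch into a plain jump when one of its targets contains nothing but a barrier, applied to the 4.6 shape shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; not GCC's real CFG representation. */
enum block_kind { BLOCK_COND, BLOCK_JUMP, BLOCK_BARRIER, BLOCK_RETURN };

struct block {
  enum block_kind kind;
  int taken;     /* BLOCK_COND: target if condition holds; BLOCK_JUMP: target */
  int fallthru;  /* BLOCK_COND: target if condition fails; otherwise -1 */
};

/* Turn each conditional branch with exactly one edge into a barrier-only
   block into an unconditional jump to the other successor.
   Returns the number of conditional branches removed. */
int prune_barrier_edges(struct block *cfg, size_t n)
{
  int removed = 0;
  for (size_t i = 0; i < n; i++) {
    struct block *b = &cfg[i];
    if (b->kind != BLOCK_COND)
      continue;
    assert(b->taken >= 0 && (size_t) b->taken < n);
    assert(b->fallthru >= 0 && (size_t) b->fallthru < n);
    int taken_dead = cfg[b->taken].kind == BLOCK_BARRIER;
    int fall_dead = cfg[b->fallthru].kind == BLOCK_BARRIER;
    if (taken_dead == fall_dead)
      continue;  /* no dead edge, or both dead: leave the branch alone */
    b->kind = BLOCK_JUMP;
    b->taken = taken_dead ? b->fallthru : b->taken;
    b->fallthru = -1;
    removed++;
  }
  return removed;
}

/* The 4.6 shape: indices 0..4 stand for <bb 2>..<bb 6>. */
int prune_46_shape(void)
{
  struct block cfg[] = {
    { BLOCK_COND, 1, 2 },       /* bb 2: a <= 0 ? bb 3 : bb 4 */
    { BLOCK_BARRIER, -1, -1 },  /* bb 3: only a barrier */
    { BLOCK_COND, 3, 4 },       /* bb 4: a > 2 ? bb 5 : bb 6 */
    { BLOCK_BARRIER, -1, -1 },  /* bb 5: only a barrier */
    { BLOCK_RETURN, -1, -1 },   /* bb 6: return 1 */
  };
  return prune_barrier_edges(cfg, sizeof cfg / sizeof cfg[0]);
}
```

In this model both conditional branches disappear, leaving straight-line code that reaches the return, which matches the two-instruction 4.6 output. The model says nothing about how the real passes are implemented.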
With 4.7, we instead get:

  <bb 2>:
    if (a_1(D) <= 0)
      goto <bb 3>;
    else
      goto <bb 4>;

  <bb 3>:
    __builtin_unreachable ();

  <bb 4>:
    if (a_1(D) > 2)
      goto <bb 3>;
    else
      goto <bb 5>;

  <bb 5>:
    return 1;

where there is just a single basic block containing __builtin_unreachable,
with multiple predecessors branching to it. Unfortunately, the RTL
optimizers that detect unreachable blocks appear to have difficulty when
such a block has multiple predecessors, and fail to optimize those predecessors.
The tree pass that merged the two blocks is a new pass called "tail merging",
which was added in the 4.7 cycle. In fact, using -fno-tree-tail-merge gets
the expected result back.
Any suggestions on how to fix this? Should tail merging detect
__builtin_unreachable and not merge such blocks? Or should
the CFG optimizer be extended (how?) to better handle unreachable blocks
with multiple predecessors?
Bye,
Ulrich
--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
Ulrich.Weigand@de.ibm.com