This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug rtl-optimization/58033] New: counterproductive bb-reorder
- From: "olegendo at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 30 Jul 2013 21:00:01 +0000
- Subject: [Bug rtl-optimization/58033] New: counterproductive bb-reorder
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58033
Bug ID: 58033
Summary: counterproductive bb-reorder
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: olegendo at gcc dot gnu.org
CC: steven at gcc dot gnu.org, tejohnson at google dot com
Target: sh*-*-*
On SH, compiling the following code with -O2
#include <bitset>

std::bitset<32> make_bits (void)
{
  std::bitset<32> r;
  for (auto&& i : { 4, 5, 6, 10 })
    if (i < r.size ())
      r.set (i);
  return r;
}
results in the following code:
        mov.l   .L8,r1
        mov     #0,r0
        mov     #31,r7
        mov     #1,r6
        mov     #4,r2
.L2:
        mov.l   @r1,r3
        cmp/hi  r7,r3
        bf/s    .L7        // branch if value not > 31, i.e. in each iteration
        mov     r6,r5
.L3:
        dt      r2
        bf/s    .L2
        add     #4,r1
        rts
        nop
        .align 1
.L7:
        shld    r3,r5
        bra     .L3
        or      r5,r0
.L9:
        .align 2
.L8:
        .long   _._45+0
_._45:
        .long   4
        .long   5
        .long   6
        .long   10
Disabling the bb-reorder pass or using -Os results in more compact and faster
code:
        mov.l   .L7,r1
        mov     #0,r0
        mov     #31,r7
        mov     #1,r6
        mov     #4,r2
.L2:
        mov.l   @r1,r3
        cmp/hi  r7,r3
        bt/s    .L3        // branch if value > 31, i.e. never.
        mov     r6,r5
        shld    r3,r5
        or      r5,r0
.L3:
        dt      r2
        bf/s    .L2
        add     #4,r1
        rts
        nop
Of course, the bb-reorder pass cannot know that the values in this case are
always in range. Still, maybe it could be improved by not splitting out a BB
that consists of only a few insns? I tried increasing the branch cost, but
that had no effect.
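One source-level experiment that was not tried in this report is to mark the
range check as likely with __builtin_expect, so that the compiler sees the
"set" block as the expected path. The sketch below is only an illustration:
the name make_bits_hinted is hypothetical, and whether the hint changes the
block layout on SH is an assumption that has not been verified.

```cpp
#include <bitset>
#include <cstddef>
#include <initializer_list>

// Hypothetical variant of make_bits from the report (needs -std=c++11 or
// later).  The only change is the __builtin_expect hint, a GCC builtin,
// wrapped around the range check.
std::bitset<32> make_bits_hinted (void)
{
  std::bitset<32> r;
  for (auto&& i : { 4, 5, 6, 10 })
    // Mark the in-range case as the expected outcome.
    // Assumption: the hint affects basic block layout; untested on SH.
    // The cast spells out the int -> size_t conversion that the original
    // comparison performs implicitly.
    if (__builtin_expect (static_cast<std::size_t> (i) < r.size (), 1))
      r.set (i);
  return r;
}
```

The function returns the same value as make_bits; only the generated code
might differ, which would have to be checked by comparing the assembly output
of both variants at -O2.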
Teresa, Steven,