Serious code size regression from 3.0.2 to now part two
tm
tm@mail.kloo.net
Fri Jul 26 13:02:00 GMT 2002
On Fri, 26 Jul 2002, Joern Rennecke wrote:
> tm wrote:
> >
> > Okay, I've started using -fno-reorder-blocks on my testcase map_fog.i, and
> > the code size is still about 10% worse than 3.0.x.
> >
> > I think I've tracked this down to really bad branches being generated by
> > gcc. Take a look at this code sequence:
> >
> > 5124 .L320:
> > 5125 2a02 4011 cmp/pz r0
> > 5126 2a04 8F0C bf/s .L322
> > 5127 2a06 6813 mov r1,r8
> > 5128 2a08 9226 mov.w .L683,r2
> > 5129 2a0a 3027 cmp/gt r2,r0
> > 5130 2a0c 8F09 bf/s .L323
> > 5131 2a0e 6103 mov r0,r1
> > 5132 2a10 A007 bra .L323
> > 5133 2a12 6123 mov r2,r1
> > 5134 2a14 00090009 .align 5
> > 5134 00090009
> > 5134 00090009
> > 5135 .L322:
> > 5136 2a20 E100 mov #0,r1
> > 5137 .L323:
> > 5138 2a22 6013 mov r1,r0
> > 5139 2a24 4818 shll8 r8
> > ..
> >
> > This is really twisted branch logic.
>
> It appears the code is geared towards the r0 < 0 case. Assuming both
> r1 and r0 need to contain the result, optimized code would be:
> cmp/pz r0
> mov r1,r8
> bf/s .L322
> mov #0,r1
> mov.w .L683,r2
> mov r0,r1
> cmp/gt r2,r0
> bf .L322
> mov r2,r1
> .L323:
> mov r1,r0
> .L322:
Yes. That's very similar to what 3.0.4 generates:
4334 .L719:
4335 2004 4A11 cmp/pz r10
4336 2006 8F05 bf/s .L323
4337 2008 E100 mov #0,r1
4338 200a 61A3 mov r10,r1
4339 200c 31C7 cmp/gt r12,r1
4340 200e 8F02 bf/s .L315
4341 2010 6A13 mov r1,r10
4342 2012 911C mov.w .L648,r1
4343 .L323:
4344 2014 6A13 mov r1,r10
4345 .L315:
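Read back as C, both listings appear to clamp a signed value to the range [0, limit]. The source of map_fog.i is not shown in this thread, so the following is only a sketch of what the branches seem to compute. The function name clamp_to_limit and its parameter names are hypothetical, and the actual constant loaded from .L683 / .L648 is not known here.

```c
/* Hypothetical reconstruction of the branch logic in the listings above.
 * Not taken from map_fog.i; names and types are assumptions. */
static int clamp_to_limit(int value, int limit)
{
    /* cmp/pz sets T when value >= 0; bf/s branches when T is clear,
     * i.e. for negative values, and the result becomes 0. */
    if (value < 0)
        return 0;

    /* cmp/gt is a signed greater-than compare against the limit. */
    if (value > limit)
        return limit;

    /* Otherwise the value passes through unchanged. */
    return value;
}
```

With an illustrative limit of 255, clamp_to_limit(-5, 255) gives 0, clamp_to_limit(100, 255) gives 100, and clamp_to_limit(1000, 255) gives 255.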
I'll try to investigate some more to determine the culprit.
Toshi