powerpc-linux cross GCC 4.2 vs GCC 4.0.0: -Os code size regression?

Andrew Haley aph@redhat.com
Wed Jan 16 17:10:00 GMT 2008


Sergei Poselenov writes:
 > Hello,
 > 
 > I've just noted an error in my calculations: not 40%, but 10%
 > regression (used gdb to do the calculations and forgot to convert
 > inputs to float). Sorry.
 > 
 > But the problem still persists for me - I'm building an embedded
 > firmware (U-Boot) and it doesn't fit into the reserved space
 > anymore.
 > 
 > Andrew Haley wrote:
 > > Sergei Poselenov writes:
 > >  > Hello all,
 > >  > 
 > >  > I'm using the ppc-linux gcc-4.2.2 compiler and noted that the
 > >  > code size has increased significantly (about 40%!) compared
 > >  > with the old 4.0.0 when using the -Os option.  Same code, same
 > >  > compile- and configuration-time options.  The binutils versions
 > >  > differ (2.16.1 vs 2.17.50), though.
 > >  > 
 > >  > I've looked at the CSiBE testing results for ppc-elf with -Os,
 > >  > comparing gcc_4_0_0 with mainline, and found that the mainline
 > >  > actually optimizes better, at least for the CSiBE test environment.
 > >  > After some analysis I've come to the following results:
 > >  >   Number of packages in the CSiBE test environment:        863
 > >  >   Number of packages where mainline GCC optimizes better:  290
 > >  >   Number of packages where mainline GCC optimizes worse:   436
 > >  > 
 > >  > And the regression in code size is up to 40%, like in my case.
 > > 
 > > 40% seems severe, but it may be an outlier.  What is the average
 > > increase in code size, including the packages where it got better?
 > > 
 > 
 > 
 > Specifically, in my case the numbers are as follows (as reported by
 > 'size'):
 > gcc 4.2.2:
 >     text    data     bss     dec     hex filename
 >     2696      60    1536    4292    10c4 interrupts.o
 > 
 > gcc 4.0.0:
 >     text    data     bss     dec     hex filename
 >     2424      88    1536    4048     fd0 interrupts.o
 > 
 > (about 10% regression)
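(The percentage above can be double-checked without the integer-division
pitfall mentioned earlier; a minimal sketch, using the text-segment byte
counts from the `size` output quoted above:)

```shell
# Text-segment sizes from the `size` output above.
old=2424   # gcc 4.0.0 interrupts.o
new=2696   # gcc 4.2.2 interrupts.o
# awk does the arithmetic in floating point, so nothing is truncated:
# regression = (new - old) / old * 100
awk -v o="$old" -v n="$new" 'BEGIN { printf "%.1f%%\n", (n - o) / o * 100 }'
```

This prints 11.2%, so "about 10%" is a fair summary.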

Sure, but this is a tiny sample.

 > As for the CSiBE results - the average regression is 3%, including
 > the three worst offenders:
 > 100% (32768 vs 16384 for "linux-2.4.23-pre3-testplatform - 
 > arch/testplatform/kernel/init_task")
 > 35% (1440 vs 1064 for "teem-1.6.0-src - src/air/enum")
 > 34% (1712 vs 1280 for "teem-1.6.0-src - src/nrrd/encodingHex")
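(As a quick sanity check on those three figures; a sketch, where each
pair is the new vs old text size in bytes as quoted above:)

```shell
# Each line: <size with gcc 4.2.x> <size with gcc 4.0.0>
# Regression = (new - old) / old * 100, rounded to whole percent.
printf '%s\n' "32768 16384" "1440 1064" "1712 1280" |
awk '{ printf "%.0f%%\n", ($1 - $2) / $2 * 100 }'
```

This reproduces the quoted 100%, 35% and 34%.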

I've just re-read what you wrote, and noticed your comment above: "the
mainline actually optimizes better, at least for the CSiBE test
environment."

Quite so:

http://www.inf.u-szeged.hu/csibe/ocomp.php?branchid_a=gcc_4_0_0_release&branchid_b=mainline&targetid_a=arm-elf&targetid_b=arm-elf&timestamp_a=2003-01-01%2012:00:00&timestamp_b=2008-01-14%2012:00:00&flags_a=-Os&flags_b=-Os&csibever_a=2.x.x&csibever_b=2.x.x&dataview=Code%20size&viewmode=Summarized%20bar%20chart&finish_button=Finish

So we're actually doing better now than we were in 4.0.0.

Now, I sympathize that in your particular case you have a code size
regression.  This happens: when we do optimization in gcc, some code
bases will lose out.  All that we can promise is that we try not to
make it worse for most users.

What we can do is take the code of yours that got much worse and try
to figure out why.

Andrew.

-- 
Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, UK
Registered in England and Wales No. 3798903


