This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
RE: Odd performance regression with -Os
- From: "Weddington, Eric" <eweddington at cso dot atmel dot com>
- To: "Mark Mitchell" <mark at codesourcery dot com>, "Andrew Haley" <aph at redhat dot com>, "Eric Botcazou" <ebotcazou at adacore dot com>
- Cc: <gcc at gcc dot gnu dot org>, "Georg-Johann Lay" <avr at gjlay dot de>
- Date: Tue, 30 Dec 2008 15:38:39 -0700
- Subject: RE: Odd performance regression with -Os
- References: <49591C0F.4030006@codesourcery.com>
> -----Original Message-----
> From: Mark Mitchell [mailto:mark@codesourcery.com]
> Sent: Monday, December 29, 2008 11:51 AM
> To: Andrew Haley
> Cc: Eric Botcazou; gcc@gcc.gnu.org; Georg-Johann Lay
> Subject: Re: Odd performance regression with -Os
>
> Andrew Haley wrote:
> > Eric Botcazou wrote:
> >>> Thanks. Are you holding this because we're in Stage 3?
> >> The patch was written very recently so I wanted to let it go through a good
> >> deal of internal testing. Moreover, I haven't measured its impact on anything
> >> else than Ada benchmarks (and on a patched 4.3 branch). If people think that
> >> it would be worth having for the 4.4 release, I can port it and conduct basic
> >> testing with it on the mainline, but that's pretty much it.
> >
> > Well, it's a fairly nasty regression on embedded targets with no
> > multiplier, where people are likely to use -Os. Sounds to me like
> > it qualifies for 4.4.
>
> I agree.
I just tried Eric's patch <http://gcc.gnu.org/ml/gcc/2008-12/msg00330.html> on the AVR with 4.3.2 (slightly modified so that it applies to the 4.3.2 release) and tested it with Andrew's original test case. The AVR is a perfect target for testing this: it is an 8-bit embedded processor, a number of its variants have no multiply instructions, and almost all applications are compiled with -Os.
I compiled the original test case using -mmcu=at90usb82 -Os. Without the patch, it compiled to 15 instructions (32 bytes), including a call to __mulhi3. With the patch, it compiled to 10 instructions (20 bytes), with no call into libgcc.
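To illustrate the kind of code affected, here is a minimal sketch in C. It is not Andrew's test case, which is not reproduced in this message; the function names and the constant 10 are assumptions chosen only for illustration. On a target without a hardware multiplier, a 16-bit multiply by a constant can either be emitted as a call to the libgcc routine __mulhi3 or be expanded inline, for example as shifts and adds. The message does not show the generated code, so which expansion the patched compiler actually chooses is not confirmed here.

```c
#include <stdint.h>

/* Hypothetical example, NOT the test case from the thread:
   multiply a 16-bit value by a small constant. On a target with no
   multiply instruction the compiler may emit a libgcc call for this. */
uint16_t scale_mul(uint16_t x)
{
    return (uint16_t)(x * 10u);
}

/* The same result written out by hand as shifts and adds:
   10 * x == 8 * x + 2 * x. Both functions wrap modulo 65536. */
uint16_t scale_shift_add(uint16_t x)
{
    return (uint16_t)((x << 3) + (x << 1));
}
```

Compiling a file like this with something along the lines of `avr-gcc -mmcu=at90usb82 -Os -S` (the file name being whatever you saved it as) and looking for __mulhi3 in the assembly output is one way to check whether the multiply was turned into a library call.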
I haven't regtested the patch yet, but so far I like what I see.
Thanks,
Eric Weddington