This is the mail archive of the
mailing list for the GCC project.
Re: Loop unrolling-related SPEC regressions?
- From: Jan Hubicka <jh at suse dot cz>
- To: Dale Johannesen <dalej at apple dot com>
- Cc: Laurent Guerby <guerby at acm dot org>, Andreas Jaeger <aj at suse dot de>, Paolo Carlini <pcarlini at unitus dot it>, gcc at gcc dot gnu dot org, rth at redhat dot com
- Date: Thu, 7 Feb 2002 12:46:45 +0100
- Subject: Re: Loop unrolling-related SPEC regressions?
- References: <3C61A96C.email@example.com> <E2995D70-1B52-11D6-826C-003065C86F94@apple.com>
> I looked at Spec quite a bit for my last job and I can
> suggest some things that are important.
> Intelligent use of profiling info from the first pass is
> important. You'll see the published numbers do this.
> Last time I looked gcc used this only for branch
> straightening; it can also be used effectively to
> drive inlining and register allocation.
I have been doing quite active development on this. In 3.1, gcc can do
some optimizations based on profile info, such as register allocation.
On the cfg-branch we are still focusing on this path for 3.2.
> crafty is heavily dependent on efficiency of "long long".
> It's a chess program, full of 64-bit bitmasks.
This is actually a big problem for gcc. It may be possible to work
around it by using SSE/MMX arithmetic when available.
> eon is the only one in C++. If there are any problems
> in exception handling they will show up here. The program
> does not actually throw any exceptions, so turning off
> the handling for peak may help (SPEC won't let you turn
> it off for base). Good inlining decisions are also important.
Yes, eon appears to be very large, so anything
that shrinks the footprint is useful.
> the two most heavily executed functions in perl are big;
> IME register allocation & scheduling don't always work
> well for big functions. They also both call setjmp; if
> this disables any substantial amount of optimization it
> will hurt.
Our setjmp handling should be aggressive enough. We represent
it as an abnormal edge in the CFG and thus optimize the rest of
the function without much degradation.