This is the mail archive of the mailing list for the GCC project.

Re: [rtl-optimization] Improve Data Prefetch for IA-64

On Tuesday 05 April 2005 09:24, Canqun Yang wrote:
> >On Mon, 28 Mar 2005, James E Wilson wrote:
> >> Steven Bosscher wrote:
> >>> OK, so I know this is not a popular subject, but can we *please* stop
> >>> working on loop.c and focus on getting the new RTL and tree loop passes
> >>> to do what we want?
> >>
> >> I don't think anyone is objecting to this. [...]
> >> I would however make a distinction here between new development work and
> >> maintenance.  It would be better if new development work happened in the
> >> new loop optimizer.  However, we still need to do maintenance work in
> >> loop.c. 

This is *not* maintenance work.  This is clearly new development,
and very counterproductive development at that: it works against
years of effort to get to the point where the old loop pass could
be removed.

This was in fact already proposed a few times.

It was held off for 4.0 because it was expected that there would
be little objection in stage1 of GCC 4.1: on every target I test
on, the old loop optimizer contributes almost nothing.  If this
patch is allowed in, we can start over again figuring out what
needs to be changed in the new loop optimizers to turn loop.c
into a no-op again.  Canqun could instead have done this work on
the new loop optimizers, and we would not have had this problem.

Just last week Zdenek formally proposed to disable the old loop
optimizer, see
Zdenek mentions just one important reason to remove it: "it would
enable some pretty significant cleanups of the compiler, like removal
of loop notes, removal of rtl level branch prediction pass, maybe
removal of libcall notes, etc.".  I have already given many other
reasons as well.

> >...and since Canqun reported 2.5% improvement on SPEC CFP2000 on ia64 with
> >his current patch, I really think we should consider it.

As I told you in a private mail, you can get those same 2.5% and
much more by *removing* the old loop optimizer.  For example,
because you remove the major obstacle to profile-guided inlining
with it.

> Besides this, I've got another patch for improving
> general induction variable optimizations defined in
> loop.c.

GIV optimizations should be done at the tree level in IVopts.  If
you think you can win something there for ia64, you should look at
why IVopts is not doing its job for you.  You can probably tune it
to do what you want.  IVopts is tuned for x86* and ppc now; nobody
has even looked at ia64.

> With these two patches and properly setting 
> the loop unrolling parameters, the tests of both NAS
> and SPEC CPU2000 benchmarks on IA-64 1GHz system show
> a good result.

What happens if you use the memory address unrolling patch, turn on
-fweb, and set the unrolling parameters properly?

> >We all know how hard it is to get this kind of improvement on any of the
> >SPECs -- and in fact improving the current optimizers will raise
> > the bar for the new ones. ;-)

That is exactly the problem.  Improving the *old* current loop pass
raises the bar for the people who are working to get rid of it.  It
is uncooperative.

> >Question is: who is going to review/potentially approve this patch?

Sorry for Canqun, but I still hope no-one will.

