This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug rtl-optimization/46920] suboptimal register allocation with local register variables


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46920

--- Comment #3 from Vladimir Makarov <vmakarov at redhat dot com> 2010-12-14 16:02:09 UTC ---
(In reply to comment #2)
> > To generate the proposed code, we should assign r12 to p63.  IRA marks p63
> > conflicting with r12 because DF-infrastructure reports r12 having intersected
> > live ranges with p63.
> >
> > It is possible to solve the problem if we have conflicts based on values (not
> > live ranges).  I'd not recommend to do that, because it will slow down RA
> > without visible improvement on majority benchmarks (I did such experiment about
> > 7 years ago and reported about the results on GCC summit in 2004).
> 
> One alternative is to rematerialize values that have been copied to a
> hard register before their uses (by inserting an r12:DI=r63:DI before
> the use of r63).  This breaks the live ranges of the pseudos and
> facilitates coalescing.
> 

I'd not call it rematerialization.  I think it is more like live range
shrinking (LRS) of hard registers through additional copies.  It is an
interesting idea (I partially investigated LRS about 6 years ago).  Probably I
should think about this again.  Thanks, Paolo.

> > By the way, usage of implicit hard registers in RTL (when it can be avoided.
> > Example when hard registers can be avoided is their usage as call arguments) is
> > very bad idea for RA.  I see it a lot such code in x86-64 code.  I'd recommend
> > to prevent optimizations before RA to abuse hard register usage.
> 
> As I said, the improvement from hard register variable here is 25% on
> x86-64 and probably more (I can collect data) on i386.  This testcase
> is distilled from a bytecode interpreter.

Paolo, I did not mean that you should avoid using a hard register in this
particular case.  I just wrote that I saw a lot of x86-64 code where hard
registers were propagated, and that is bad for RA.  I never had an opportunity
to investigate which optimization does it.

Again, by the way :).  My experience with implementing interpreters shows me
that computed gotos do not work well (especially when there are a lot of such
labels) on modern OOO processors because of worse branch prediction.  I found
that a switch statement works better.  But I guess it is not your goal to
rewrite the interpreter.
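
For readers unfamiliar with the two dispatch styles compared above, here is a
minimal sketch on a toy stack machine.  The opcodes, function names and demo
program are hypothetical and are not taken from the testcase in this bug.  The
computed-goto variant relies on the GNU C labels-as-values extension.  The
sketch only shows the shape of each form; it does not measure or imply which
one predicts better on a given processor.

```c
#include <assert.h>

/* Hypothetical opcodes for a toy stack machine (illustration only). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Demo program: (2 + 3) * 4. */
static const int demo_program[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT
};

/* Switch dispatch: every opcode goes through the single switch. */
static long run_switch(const int *code)
{
    long stack[16];
    int sp = 0;

    for (;;) {
        switch (*code++) {
        case OP_PUSH: stack[sp++] = *code++; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}

/* Computed-goto dispatch (GNU C extension): each handler ends with
   its own indirect jump through the label table. */
static long run_goto(const int *code)
{
    static void *const labels[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    long stack[16];
    int sp = 0;

#define NEXT() goto *labels[*code++]
    NEXT();
do_push: stack[sp++] = *code++; NEXT();
do_add:  sp--; stack[sp - 1] += stack[sp]; NEXT();
do_mul:  sp--; stack[sp - 1] *= stack[sp]; NEXT();
do_halt: return stack[sp - 1];
#undef NEXT
}

/* Both dispatch styles must compute the same result. */
static void check_demo(void)
{
    assert(run_switch(demo_program) == run_goto(demo_program));
}
```

Calling run_switch(demo_program) or run_goto(demo_program) evaluates the same
bytecode, so either function can be swapped in when timing the two styles on a
particular machine.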

