This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug rtl-optimization/46920] suboptimal register allocation with local register variables
- From: "vmakarov at redhat dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 14 Dec 2010 16:02:25 +0000
- Subject: [Bug rtl-optimization/46920] suboptimal register allocation with local register variables
- Auto-submitted: auto-generated
- References: <bug-46920-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46920
--- Comment #3 from Vladimir Makarov <vmakarov at redhat dot com> 2010-12-14 16:02:09 UTC ---
(In reply to comment #2)
> > To generate the proposed code, we should assign r12 to p63. IRA marks p63
> > conflicting with r12 because DF-infrastructure reports r12 having intersected
> > live ranges with p63.
> >
> > It is possible to solve the problem if we have conflicts based on values (not
> > live ranges). I'd not recommend to do that, because it will slow down RA
> > without visible improvement on majority benchmarks (I did such experiment about
> > 7 years ago and reported about the results on GCC summit in 2004).
>
> One alternative is to rematerialize values that have been copied to a
> hard register before their uses (by inserting an r12:DI=r63:DI before
> the use of r63). This breaks the live ranges of the pseudos and
> facilitates coalescing.
>
I would not call it rematerialization. I think it is closer to live range
shrinking (LRS) of the hard register through additional copies. It is an
interesting idea (I partially investigated LRS about 6 years ago). I should
probably think about this again. Thanks, Paolo.
> > By the way, usage of implicit hard registers in RTL (when it can be avoided.
> > Example when hard registers can be avoided is their usage as call arguments) is
> > very bad idea for RA. I see it a lot such code in x86-64 code. I'd recommend
> > to prevent optimizations before RA to abuse hard register usage.
>
> As I said, the improvement from hard register variable here is 25% on
> x86-64 and probably more (I can collect data) on i386. This testcase
> is distilled from a bytecode interpreter.
Paolo, I did not mean that you should avoid using a hard register in this
particular case. I just wrote that I have seen a lot of x86-64 code where hard
registers were propagated, and that is bad for RA. I have never had an
opportunity to investigate which optimization does it.
Again, by the way :). My experience implementing interpreters shows that
computed gotos do not work well on modern out-of-order (OOO) processors
(especially when there are a lot of such labels) because of worse branch
prediction. I found that a switch statement works better. But I guess it is
not your goal to rewrite the interpreter.
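
For illustration, the two dispatch styles compared above can be sketched in C
roughly as follows. This is a minimal sketch with an invented toy bytecode:
the opcodes, run_switch, run_goto, and demo_program are hypothetical names and
are not taken from the interpreter behind the testcase. It shows only the
shape of the dispatch code and says nothing by itself about which variant
predicts better on a given processor. The computed-goto version relies on the
GNU C "labels as values" extension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy bytecode, invented for this illustration. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_MAX 16

/* Sample program computing (2 + 3) * 4. */
const int demo_program[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT
};

/* Dispatch through a switch statement inside a single loop. */
long run_switch(const int *code)
{
    long stack[STACK_MAX];
    size_t sp = 0;

    for (;;) {
        switch (*code++) {
        case OP_PUSH:
            assert(sp < STACK_MAX);
            stack[sp++] = *code++;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
        default:
            /* Result is the top of the stack, or 0 if it is empty. */
            return sp ? stack[sp - 1] : 0;
        }
    }
}

/* Dispatch through computed gotos (GNU C extension): each handler
   ends with its own indirect jump to the next handler. */
long run_goto(const int *code)
{
    /* Table order must match the opcode enum above. */
    static void *const handlers[] = {
        &&do_push, &&do_add, &&do_mul, &&do_halt
    };
    long stack[STACK_MAX];
    size_t sp = 0;

#define NEXT() goto *handlers[*code++]
    NEXT();
do_push:
    assert(sp < STACK_MAX);
    stack[sp++] = *code++;
    NEXT();
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
do_mul:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] *= stack[sp];
    NEXT();
do_halt:
    return sp ? stack[sp - 1] : 0;
#undef NEXT
}
```

Calling run_switch(demo_program) and run_goto(demo_program) should give the
same result, so the two variants can be timed against each other on the same
bytecode.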