This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: RFC: Code size improvement for global alloc

From: Vladimir Makarov <vmakarov at redhat dot com>
To: Daniel Jacobowitz <drow at false dot org>
Cc: gcc-patches at gcc dot gnu dot org, Richard Earnshaw <rearnsha at gcc dot gnu dot org>, Mark Mitchell <mark at codesourcery dot com>
Date: Tue, 06 Jul 2004 12:55:44 -0400
Subject: Re: RFC: Code size improvement for global alloc
References: <20040706150716.GA10639@nevyn.them.org>

Daniel Jacobowitz wrote:

Testing: I tested this patch with bootstrap / make check (all languages
except Ada) on x86_64-pc-linux-gnu.  I also tested an equivalent patch for
arm-none-elf using csl-arm-branch.  There were no regressions in either
case.  Bootstrap got 5.5 seconds faster out of 81 minutes, which is in the
noise.  On x86_64 there was a size improvement of 269 bytes out of 590K in some
random files from cc1-i-files; on ARM there was a size improvement of about
0.05% in CSiBE.

OK? Comments?

I think that, at the least, you need to introduce a new flag for this. You cannot make it the default until you show that there is a performance improvement on a credible benchmark (preferably SPEC95 or SPEC2000) on the major platforms (x86, x86-64, ppc). Although the optimization removes move insns, the final result might be worse after reload, because reload might evict all of the coalesced pseudo-registers from a hard register, whereas it would evict only one pseudo-register if coalescing had not happened.

I also have code for a more general form of coalescing in global (please see the GCC Summit proceedings). It coalesces all registers, not only global and local ones (as in your patch), and it tries both hard registers (not only the one assigned to the global). It even coalesces two registers when one or both of them got memory, if that is profitable. It also coalesces pseudo-registers according to the frequencies of the move insns to get better results.
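For illustration only, frequency-ordered coalescing after assignment can be sketched on a toy model. This is not code from either patch and does not use GCC's data structures: `struct pseudo`, `struct move`, `can_use`, and `coalesce_by_freq` are hypothetical names, live ranges are reduced to bitmasks of program points, and register classes, memory operands, and reload are ignored. Moves are visited from most to least frequent, and for each one both hard registers are tried.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model (an assumption, not GCC's representation): every pseudo
   already has a hard register, and its live range is a bitmask. */
struct pseudo {
    int hard_reg;   /* hard register chosen by the allocator */
    uint64_t live;  /* bit i set => live at program point i */
    int pinned;     /* set once the pseudo has been coalesced */
};

struct move {
    int src, dst;   /* indices into the pseudo array */
    int freq;       /* estimated execution frequency of the move */
};

/* qsort comparator: higher frequency first. */
static int cmp_freq_desc(const void *a, const void *b)
{
    const struct move *ma = a, *mb = b;
    return (mb->freq > ma->freq) - (mb->freq < ma->freq);
}

/* Nonzero if pseudo P can live in hard register HR without overlapping
   any other pseudo currently assigned to HR. */
static int can_use(const struct pseudo *ps, int n, int p, int hr)
{
    for (int i = 0; i < n; i++)
        if (i != p && ps[i].hard_reg == hr && (ps[i].live & ps[p].live))
            return 0;
    return 1;
}

/* Greedily coalesce move operands, most frequent moves first.
   Returns the total frequency of the moves that became no-ops. */
int coalesce_by_freq(struct pseudo *ps, int n, struct move *mv, int n_moves)
{
    int removed = 0;

    assert(n >= 0 && n_moves >= 0);
    qsort(mv, (size_t)n_moves, sizeof *mv, cmp_freq_desc);

    for (int i = 0; i < n_moves; i++) {
        struct pseudo *s = &ps[mv[i].src], *d = &ps[mv[i].dst];

        if (s->hard_reg != d->hard_reg) {
            /* Try the source's hard register first, then the destination's.
               Pinned pseudos stay put so earlier coalescing is not undone. */
            if (!d->pinned && can_use(ps, n, mv[i].dst, s->hard_reg))
                d->hard_reg = s->hard_reg;
            else if (!s->pinned && can_use(ps, n, mv[i].src, d->hard_reg))
                s->hard_reg = d->hard_reg;
        }
        if (s->hard_reg == d->hard_reg) {
            s->pinned = d->pinned = 1;
            removed += mv[i].freq;
        }
    }
    return removed;
}
```

In this model the ordering matters: a cheap move that is handled first can pin a pseudo in a register that blocks a much more frequent move, which is why the moves are sorted before the greedy pass.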

Besides the features mentioned above, I see the following possible improvements to your patch:

 1. Trying hard registers from the alternative register class too.
 2. Trying call-used hard registers if it is profitable.

Still, my patch (like yours) is a constrained form of coalescing because it is done after register allocation (assignment). A more general form of coalescing would require an iterative approach to register allocation (and implementing register live range spilling to undo coalescing when necessary). That is one more way to improve register allocation.

I am going to submit my patch in a few weeks. It would be interesting to compare your patch with mine. I will probably do that.

Vlad
