This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Checked in - Out Of SSA Rewrite - The New Coalescer.

-------- Forwarded Message --------
> From: Andrew MacLeod <>
> To: gcc-patches <>
> Subject: Out Of SSA Rewrite - The New Coalescer patch 5/5
> Date: Tue, 21 Nov 2006 14:09:15 -0500
> This is the big one. This patch restructures the out of ssa pass and
> implements the slower parts faster. 
> The original plan was to abandon the conflict graph completely, but with
> the change to maintaining BB based live-on-entry data, that became more
> difficult. Instead, the amount of things kept in the conflict graph has
> been reduced to a minimum, and the only remaining slowdown I have seen
> from large conflict graphs (390,000 live variables) will eventually be
> addressed by the ssa_name pressure reduction code. Even in this case,
> there is significant reduction in time.
> The motivation for the rewrite was twofold. I needed some restructuring
> in order to move forward with my longer term plan of combining out of
> ssa and expand, as well as the shorter term ssa-name pressure reduction
> optimizations.  The other factor was to address some of the performance
> issues on the larger testcase.  While doing this, I removed a lot of the
> extraneous crud that was leftover from previous work which was no longer
> needed.  The reslut should be much more readable.
> So what exactly is different?  The coalescing code has been split out to
> tree-ssa-coalesce.c.  the original abstract implementation was based on
> a TPA data structure which stood for Tree Partition Associator. It
> allowed the coalescing code to be shared, and both ssa-name could be
> coalesced, as well as non-ssa names when -fcombine-temps was being used.
> THat flag has been removed, and so has all the TPA_ data structure.  
> There also use ot be a root_var structure which was used to determine
> what ssa versions were associated with any given VAR_DECL.  This data
> structure has been removed, and any required equivalent functionality
> placed directly in the var_map structure. 
> I have also removed the -ftree-no-lrs option. This option disabled the
> live range splitting out of ssa does.  It was actually more work to not
> do the splitting, and the required additional code that polluted the
> code base. That also helps clean things up somewhat.
> In vary large functions, the sorted list of coalesceable pairs was still
> causing some time issues. It turns out that something like 50% of all
> coalescable pairs have a cost of 1. So there is a minor tweak in that
> any "cost one" pair is simply stuck in a "cost one" list, and pulled
> when needed. This made a noticeable difference on a couple of large
> functions.
> The conflict graph in conflict.c that I was reusing had some issues when
> very large graphs were created. it is hash based, and when I had 390,000
> interferences, no small amount of time was spent in the hash functions.
> I implemented a custom interference graph for out of ssa using bitmaps
> which runs faster.
> Building the interference graph has been streamlined quite a bit as
> well, and a special data structure for tracking what versions of base
> variables are live (called live_track) help add conflicts more
> efficiently without nearly as many bitmap examinations/traversals.
> The mechanism for dealing with abnormal coalesces was combined with
> normal coalesces by introducing a MUST_COALESCE_COST.  These are
> max-cost pairs which if they don't coalesce, trigger an error.  This
> helped clean code up as well.
> The old var_map compaction mechanism was a little obscure, so it has
> been replaced with a clearer partition_view set of routines. It amounts
> to the same thing, but easily allows different views of the same data.
> Thats most of it. This has been bootstrapped and regression tested on
> i686-pc-linux-gnu.

I just checked in this patch.  I've re-bootstrapped and verified no new
regressions on i686-pc-linux-gnu.  

I've attached the actual check-in patch.

This resolves most of the speed issues with out of ssa, to the best of
my knowledge.  The remaining issue I am aware of is all.i, which  is
much improved, but needs more work (from 300 to 40 seconds on my
machine). The remaining time is due to 39,000 basic block with 1000
variable live on entry chewing up a lot of conflict graph building time.
This issue should be resolved with the first batch of RABLET changes in
(hopefully) early 2007.


Attachment: PP
Description: Text document

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]