[dataflow][patch] Convert conflict graph to lower triangular bit matrix

Mon Feb 6 16:48:00 GMT 2006

Peter Bergner wrote:

>  I wasn't planning on implementing a hashed based conflict graph, as they
>
>can be fairly expensive compared to a bit matrix when determining that
>two pseudos do not in fact conflict, since you need to scan down the
>hash chain all the way to the end before you can say for certain they
>don't conflict (which for sparse matrices, would be the common case
>for two arbitrary pseudos).  You could try and sort the hash chain, but
>that just makes creating it more expensive and I'm guessing you'd still
>on average have to scan half way down the hash chain.
>  
>
You can count the total number of conflicts before you create the hash 
table, so
that you have a choice of short hash chains, or using one big array with 
no chains
at all - just a collision strategy that uses another element in the array.

>My plan for the future is to implement an idea I came up with a while
>back (but never tried) for minimizing the size of the bit matrix to
>something even smaller than a lower triangular bit matrix, hopefully
>a lot smaller.  Conceptually, my idea is to create another mapping
>similar to the pseudo regno -> allocno mapping (conflictno?) that
>allows us to determine for a (large?) class(es) of pseudos, that two
>pseudos do not conflict simply by comparing their conflictnos,
>For this class(es) of pseudos, since we can tell they don't conflict
>simply by looking at their conflictnos, there's no need to allocate
>space in the bit matrix for them.
>  
>
You could easily find out for lots of registers that they don't conflict if
you have a bitmap for each register that says in which basic blocks the
register is live at all.  If these bitmaps don't intersect for two 
registers,
the registers don't conflict.
However, that dosn't actually allow you to represent the sparse matrix in
a more compact form unless you have an efficient way to map two potentially
conflicting registers to the data that tells if they actually conflict.
I know bison / byacc have some clever algorithms to compress parsing tables,
but I suspect they will be too costly to make sense to use in global_alloc.

Something more feasible (i.e. linear time for the optimization of bitvector
allocation) might be:

- Scan all insns, and linearily assign new numbers to pseudos when they are
  first mentioned.  Also, whenever a register is mentioned, keep track what
  the next-to-be-assigned number is.
- All conflicts of a register with another register that has a higher number
  can be represented with a bits vector with as many bits as the difference
  between the number of the register and the next-to-be assigned number
  when the register was last mentioned.  Sum up these differences to 
calculate
  the total size of the bitvector required to represent all conflicts.  
While
  doing this summation, also note for each pseudo register at which offset
  into the total vector its conflict information is stored.

Registers local to a basic block are handled efficently, just as are 
ones that
are local to an extended basic block or a block with preheader, as long 
as the
preheader and loop blocks are adjacent.
The degenerate case here is again a triangular matrix, with little overhead;
however, iterating over all conflicts for a particular register is awkward
without an adjacency list, even a bit worse than with your plain 
triangular matrix.

If this compressed bit vector compares well in size to an adjacency list, we
could likewise calculate the lowest numbered register which is last 
mentioned,
and use a conflict bit vector for each register which is valid for any 
register where
the range of insns it is mentioned in overlaps.
The degenrate case is a here is a full matrix (minus the diagonal :-) , 
but we can do
a word-at-a-time scan for all conflicts pertaining to one register, 
without visiting
the ranges where we know that the range the registers are mentioned in 
don't overlap.

>What are these class(es) of pseudos you ask?  The simplest form, and
>the one I'll attack first, are pseudos that are local to different basic
>blocks (local in the sense that their entire live range/web is limited
>to a single basic block). 
>
Strength reduction will create registers that are live in the loop 
preheader as
well as in the loop itself.  Partial redundency elimination will also create
pseudos that are live in multiple basic blocks.  I don't think the number of
pseudos live in multiple basic blocks is low enough to avoid the quadratic
component from them dominating the cost.  You'll need some significant gain
to offset the cost of the added code complexity.

> 
>A related form that might come in handy if for a particular program
>where the number of global pseudos is fairly large with respect to the
>total number of pseudos, would be to use region information to split
>the global pseudos into multiple non-conflicting groups.  This would
>be reflected in their conflictnos allowing us to easily determine that
>two pseudos from different non-overlapping regions do not conflict.
>  
>
I don't know what kind of conflictnos you have in mind for 'global' 
registers -
assuming, that is, that you actually have some idea how to generate and use
these conflictnos.