This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

RE: [rfc] proposal for compilation unit wide alias analyis

From: Chris Lattner <sabre at nondot dot org>
To: Kenneth Zadeck <zadeck at naturalbridge dot com>
Cc: gcc at gcc dot gnu dot org
Date: Mon, 28 Jun 2004 02:37:11 -0500 (CDT)
Subject: RE: [rfc] proposal for compilation unit wide alias analyis

Kenneth Zadeck wrote:
> 1) Analyze the call graph to discover which functions do not
> transitively call uncontained functions.  The call sites to these
> functions need only account for the side effects contained in the
> function bodies that may be transitively called from those sites.

> 2) Analyze the contained variables to see which ones escape the
> compilation unit.  By escape, we mean which ones have their address
> taken and that address passed outside of the compilation unit.  If
> no such operations exist for that variable (and we expect that
> taking the address of a contained variable is rare), these variables
> can be explicitly accounted for at calls (knowing the call graph)
> rather than taking pessimistic assumptions about all of them as we
> currently do.

Hi Kenneth/gcc listers,

I haven't really seen a discussion of the algorithm itself, so I spent a
few hours and hacked together a simple prototype of the flow-insensitive
version in the LLVM system. (the analysis starts in the run method):
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20040628/015541.html

It takes a very simple approach, first identifying "escaping" variables,
then tracking the mod/ref sets of these variables bottom-up on the call
graph.  I've tested this simple algorithm, and it allows the elimination
of hundreds of load instructions in common programs like the SPEC
benchmarks.

This is just a prototype for several reasons.  First, 175.vpr is
miscompiled (and I don't have time right now to look into why).  Second,
it uses really dead-simple but inefficient data structures (std::maps of
pointers instead of bitvectors).  Third, it is extremely pessimistic about
calls to external functions.  In particular, say there is a static
function that accesses non-escaping global "G".  If the address of this
function can escape, then we must know that calls to external functions
could call this function, and, as such, access G.

The correct way to implement this is the keep track of whether or not
functions transitively escape.  For now, the code is just extremely
pessimistic, assuming that anything that can transitively call an external
function can modify all globals.

This analysis seems to be pretty effective, though the current
implementation is slow.  For example, when I used it to drive a simple
GCSE algorithm at link-time on 176.gcc, it is able to eliminate 485 more
loads (which made 658 other instructions redundant) than an aggressive
local alias analysis (about twice as many).  I think that the analysis
could be sped up by an order of magnitude (if currently takes ~4.9s on a
1.8Ghz AMD) by using bitvectors and not storing obvious duplicates of data
for each function.

Does this sound like a sane approach?  The analysis seems to strike a
reasonably good balance between precision and speed, as it is basically a
simple context-sensitive alias/modref analysis for global variables.  I
don't know if this is useful at all as a prototype in your quest for
implementing this in GCC, but could be a good way to play around with
different ideas or something.  *shrug*  If nothing else, perhaps this will
spawn a discussion of the algorithm itself instead.

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]