This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
[rfc] proposal for compilation unit wide alias analysis
- From: Kenneth Zadeck <zadeck at naturalbridge dot com>
- To: gcc at gcc dot gnu dot org
- Date: Thu, 24 Jun 2004 17:41:51 -0400
- Subject: [rfc] proposal for compilation unit wide alias analysis
In GCC, alias analysis currently starts with overly pessimistic
assumptions for static variables (variables that are allocated at link
time). Whole compilation unit analysis can provide significant
improvement of this information.
The existence of aliases for a variable significantly inhibits the set
of transformations that can be applied to that variable. The improved
analysis described here should discover a large set of variables that
are unaliased, but that currently GCC believes are aliased.
This proposal is written in a manner that is independent of the
language and the number of files that are in a compilation unit.
However, it is understood that the mechanisms for describing the scope
of a variable currently in GCC are very C centric and the other front
ends are forced to fit into this. Other languages have much more
general mechanisms than C for describing the scope of a variable and,
in some cases, these mechanisms limit a variable's ability to escape
the compilation unit. This information is lost when translated into
what GCC currently expects.
Names can be divided into two groups: names that do not escape the
compilation unit are "contained names" and names that escape the
compilation unit are called "uncontained names". Names can be either
variables or functions. Names that are allocated within the scope
of a function are not relevant to this proposal (since they appear to
be done in a reasonable manner).
In C, if the file is the compilation unit, "contained names" are the
names that are declared "static" but are not allocated within a
function. In Java, if a single class is the compilation unit, these
same names are the ones declared "private static". I do not know the
corresponding mechanisms for C++, Ada, or the other front ends.
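As a concrete illustration, the partitioning can be sketched as follows in Python. The tuple layout and sample names are invented for illustration; GCC's real symbol representation is of course different.

```python
def partition_names(decls):
    """decls: (name, is_static, is_function_local) triples."""
    contained, uncontained = set(), set()
    for name, is_static, is_function_local in decls:
        if is_function_local:
            continue   # function-scope names are outside this proposal
        # A file-scope "static" name cannot be referenced by name from
        # another compilation unit, so it starts out contained.
        (contained if is_static else uncontained).add(name)
    return contained, uncontained

decls = [
    ("counter", True, False),    # static int counter;        -> contained
    ("table", False, False),     # int table[10];             -> uncontained
    ("tmp", True, True),         # static local in a function -> ignored
]
contained, uncontained = partition_names(decls)
```

This is only the starting partition; the escape analysis described below can then demote contained names to uncontained.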
If this analysis is profitable, we should strongly consider adding
more language hooks to support the representation of contained names
for other languages as well as for compilation units that are larger
than a single file. I am not planning to do that part since it will be
mainly a front end task.
The term "compilation unit" generally means whatever is passed to
GCC in a single invocation. In C this is generally a single file, but
it may be larger as GCC evolves towards supporting whole-program
compilation.
Currently, GCC does minimal compilation unit level analysis before
starting the detailed compilation of each function within the
compilation unit. Because of this, pessimistic assumptions are made
about contained variables and functions within the compilation unit.
For contained names, compilation unit level analysis could determine
if the value associated with that name ever escapes the compilation
unit. If the value never escapes, more aggressive techniques can be
used when compiling references to that name.
1) Currently all contained and uncontained variables are call
clobbered at each function call. Call clobbered variables cannot be
kept in registers across the call.
2) Currently contained variables are never allocated to registers.
3) Currently contained variables may not have their representation
changed. Contained variables that are structures cannot have their
fields scalarized. Contained variables that do not escape can have
their representation changed assuming that certain other conditions
apply. The analysis proposed here, but done with the compilation
unit being the entire user program, is a necessary condition for
the kind of restructuring proposed in the "art hack" (see
We can improve on this significantly by preprocessing the compilation
unit before the actual compilation of each function begins.
Specifically we can:
1) Analyze the call graph to discover which functions do not
transitively call uncontained functions. The call sites to these
functions need only account for the side effects contained in the
function bodies that may be transitively called from those sites.
There is an existing call graph analyzer in cgraphunit.c. This will
be my starting point for this part of the analysis.
2) Analyze the contained variables to see which ones escape the
compilation unit. By escape, we mean which ones have their address
taken and that address passed outside of the compilation unit. If
no such operations exist for that variable (and we expect that
taking the address of a contained variable is rare), these variables
can be explicitly accounted for at calls (knowing the call graph)
rather than making pessimistic assumptions about all of them as we
do now.
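The call graph part of step 1 can be sketched as a simple fixed-point computation. The graph encoding and function names below are invented for illustration; the real work would build on the data structures in cgraphunit.c.

```python
def calls_external(callgraph, external):
    """Return every function that transitively calls an external one."""
    tainted = set(external)
    changed = True
    while changed:                 # simple fixed point; a worklist also works
        changed = False
        for caller, callees in callgraph.items():
            if caller not in tainted and tainted & set(callees):
                tainted.add(caller)
                changed = True
    return tainted

# "printf" is defined outside the compilation unit; "d" calls nothing.
cg = {"a": ["b"], "b": ["printf"], "c": ["a"], "d": []}
tainted = calls_external(cg, {"printf"})
```

Call sites in the complement of the tainted set ("d" here) need only account for the side effects of the function bodies that may actually be reached from them.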
There are two independent engineering issues that need to be
addressed:
1) Where should the analysis be performed?
Currently, GCC processes a compilation unit by first parsing the
entire unit into a language dependent parse tree form and then
transforms those parse trees into gimple one function at a time.
There is some whole compilation unit analysis that is performed over
these parse trees in order to support the inlining infrastructure but
there is no time in which there existed a gimplified representation of
the entire compilation unit.
However, Stuart Hastings of Apple appears to be close to having a
working branch of the compiler that gimplifies an entire compilation
unit. This code is on the tree-profile-branch and was done in order
to support better inlining.
While we could make do with the parse trees, it seems to make more
sense to start the development of this on the profile branch or a
branch of that branch. Stuart has provided me with a road map as to
where to make the changes in this branch. Unless someone points out a
good reason not to build on this branch, this is where the changes
will be made.
2) What type of analysis should be performed?
The big question here is "is there anything to be gained by doing
entire compilation unit flow sensitive analysis?" In flow sensitive
analysis, the control structure of the program is taken into account
to achieve better information. In a flow insensitive algorithm, the
statements are processed as a single unordered list and all of the
control structure is thrown away.
The big gain is that most contained variables and at least some
contained functions do not ever have their address taken and thus
trivially do not escape the compilation unit. For a contained
variable to escape the compilation unit, the address of the variable
must be taken and that address either passed to an uncontained
function or stored into an uncontained variable. For a function to
escape, either its address is taken and passed to an uncontained
function or stored in an uncontained variable, or the function
calls some other external function.
In the simplest version of this analysis, the act of taking the
address of a contained name makes it an uncontained name.
Additional refinement is possible. An iterative algorithm may be
performed. First, the initial partitioning is used to build a
representation of the program which is then analyzed to produce a
better (fewer of the names are made uncontained) solution. The new
partitioning is then used to produce a better representation of the
program that can then be analyzed again and so on for either some
fixed number of iterations or until the partitioning does not change.
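The iterative refinement can be sketched as a fixed point over an address-flow graph. The edge encoding here is an assumption made for illustration: an edge (src, dst) means that the address of src, or a value holding that address, is stored into or passed to dst.

```python
def refine(flows, uncontained):
    """Iterate until the contained/uncontained partition stops changing:
    a name escapes only if its address can reach an uncontained sink."""
    escaped = set(uncontained)
    while True:
        grown = {src for src, dst in flows if dst in escaped} - escaped
        if not grown:
            return escaped
        escaped |= grown

# &a is stored into contained t, and t reaches uncontained g: a escapes.
# &b is stored into contained u, which never reaches an uncontained name.
flows = [("a", "t"), ("t", "g"), ("b", "u")]
escaped = refine(flows, {"g"})
```

Note that b stays contained here even though its address is taken, which is exactly the improvement over the simplest version of the analysis.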
The analysis consists of examining the statements in the program to
see whether the act of taking an address really is harmful enough to
cause the name to be marked as uncontained. The problem is that
uncontained names are effectively a black hole. Once a value is
assigned to an uncontained name or passed to an uncontained function,
it is lost. As long as the address of some name only passes through
contained names, there is hope that the analysis can discover that it
is used in a safe way.
If the compilation unit is small, like a single file, it may be
possible to do several iterations of this analysis process. If the
compilation unit is large, the space required to represent the
program may be so large that doing much compilation unit wide analysis
may be too expensive.
It is the view of Zadeck and Novillo that while there is certainly
some benefit to doing flow sensitive versions of these algorithms,
it will take a lot of engineering and the gain will not be large.
So as a first cut, we propose doing flow insensitive analysis but over
the entire compilation unit.
Call clobber analysis.
Currently GCC assumes that all uncontained and contained variables
must remain in memory and cannot be allocated to registers.
Furthermore GCC assumes that all calls may both read and write every
uncontained and contained variable (call clobbering). The goal of
this analysis is to remove many of these restrictions so that at least
the contained variables can live in registers across most calls and
the uncontained variables can live in registers across a few calls.
To do better call clobber analysis we first need to analyze the call
graph to determine the set of contained functions that may be
transitively called. For the purposes of this analysis, whenever a
call is made to
a function defined outside the compilation unit, it must be assumed
that that call may both read and write all uncontained variables as
well as call all uncontained functions. Thus, the larger the
compilation unit, the smaller the percentage of calls made to these
external functions.
When a function call is made, what is the effect of that call on the
static variables of a program? For calls to contained functions that
only make calls to contained functions, the set of contained and
uncontained variables that may be affected by the call can be
enumerated. For calls to functions defined outside of the compilation
unit, or for calls to contained or uncontained functions that may
make calls to functions outside the compilation unit, the set of
contained variables that can be affected can also be enumerated.
However, this set contains all of the variables clobbered by ALL of
the uncontained functions in the compilation unit. Furthermore, it
must be assumed that all of the uncontained variables are clobbered
by the call.
We need to make a conservative estimate of which calls may either read
or write which contained variables. Calls that only read the contained
variable only need to assure that the variable in memory has the same
value as the register before the call. Calls that write the contained
variable need to assure that the value is not kept in the register
across the call. However, calls that neither read nor write the contained
variable can keep the variable in a register across a call.
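These three cases can be sketched as a small classifier over the may-read and may-write sets the analysis computes. The set and function names are invented for illustration.

```python
def register_action(var, may_read, may_write):
    """Decide what a call site must do for a contained variable that is
    currently being kept in a register."""
    if var in may_write:
        return "spill-and-reload"    # the call may change the value
    if var in may_read:
        return "store-before-call"   # memory must match the register
    return "keep-in-register"        # the call never touches it

# Hypothetical may-read / may-write sets for one call site:
may_read, may_write = {"x", "y"}, {"y"}
```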
Once the compilation unit has been entirely gimplified, the processing
is relatively cheap (at least if you are not doing the flow sensitive
analysis described above).
The first step to computing this is to look at the body of each
function and compute three sets, the set of contained variables read in
this function, the set of contained variables written in this function and
the compilation wide set of contained variables whose address is taken.
Any contained variable whose address is taken in any function is marked as
uncontained and is not considered further by the analysis.
Then the transitive closure of the call graph is computed. Any call
that is made to a function that is external to the compilation unit is
considered as a call to the set of all functions that have exposed
names beyond the compilation unit and the set of all functions whose
address is taken.
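Putting the pieces together, the two steps above can be sketched end to end: local read and write sets per function, a global address-taken set, and clobber sets folded over the call graph to a fixed point. All data structures here are invented for illustration.

```python
def clobber_sets(local_reads, local_writes, addr_taken, callgraph):
    # Step 1: contained variables whose address is taken drop out of
    # the analysis entirely (they are treated as uncontained).
    reads = {f: s - addr_taken for f, s in local_reads.items()}
    writes = {f: s - addr_taken for f, s in local_writes.items()}
    # Step 2: fold each callee's effects into its callers until the
    # sets stop changing (transitive closure over the call graph).
    changed = True
    while changed:
        changed = False
        for caller, callees in callgraph.items():
            for callee in callees:
                for sets in (reads, writes):
                    before = len(sets[caller])
                    sets[caller] |= sets.get(callee, set())
                    changed = changed or len(sets[caller]) != before
    return reads, writes

local_reads = {"f": {"a"}, "g": {"b"}, "h": set()}
local_writes = {"f": set(), "g": {"b"}, "h": {"c"}}
addr_taken = {"c"}                    # c's address is taken: excluded
callgraph = {"f": ["g"], "g": [], "h": ["f"]}
reads, writes = clobber_sets(local_reads, local_writes, addr_taken, callgraph)
```

A real implementation would also map calls to external functions onto the worst-case sets described above; that is omitted here to keep the sketch short.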
Currently, GCC has two engines for analyzing a function and computing
the points to sets for each variable in the program. In addition to
this, GCC also has a type based analyzer that compares two alias sets
to see if they conflict. The analysis described here can improve all
three engines. Variables marked as contained cannot appear in any
points-to set since their addresses are never taken. Without
this analysis, only the type system keeps all of these variables from
pointing to each other after every call.
In addition to the analysis described above, it should be profitable
to look for statically allocated structures that are pointed to by
contained variables that never have their address taken. The obvious
case is a contained variable that points to a structure that is
allocated only once and assigned only to that variable: if neither
the structure nor any of its fields ever has its address taken, the
fields of the structure can be treated as if they were contained
variables themselves. This is a specific form of scalarization that
is likely to be very profitable.
Once the mechanisms described here are in place, we can experiment
with adding dynamically allocated structures that are allocated only
once, or only allocated within the scope of the compilation unit and
do not have their address taken except to free them, and scalarize
them from there.