Single Static Assignment.
This is a representation of the program where each variable is assigned exactly once. Multiple assignments to the same variable in the source code are rewritten as assignments to subscripted independent variables, eg:
a = 2; a = b * a;
becomes:
a.0 = 2; a.1 = b * a.0;
Where there are control flow merges that require multiple versions of a variable to be merged,a PHI (<?plugin TeX2png text="$\phi$"?>) node is used to join together the appropriate versions:
if (x) a.0 = 2; else a.1 = 3; a.2 = PHI(a.0, a.1)~; a.3 = b * a.2
There are many reasons for using an SSA form, but the three most important ones are:
- In SSA form every USE only has a single reaching DEF, so in SSA form you have fewer
- USE-DEF chains.
- Because every variable is assigned only once, the USE-DEF chain is simply a pointer
- to a single statement defining the temporary that is being used. No data flow equations have to be solved to find the reaching DEFs!
- Because different versions of the same variable may overlap, SSA form effectively
- splits live ranges.
Real programs never are in SSA form. SSA form is just a convenient abstraction for many kinds of compiler analysis and code transformation algorithms. At some point in the compilation, the compiler has to rewrite the program from SSA form back into normal form. To convert SSA form back to standard code, a PHI node with n arguments is basically converted to n assignments, each in the basic block with the (only) definition of the argument.
if (x) a.0 = 2; a.2 = a.0; else a.1 = 3; a.2 = a.1; a.3 = b * a.2
and each subscripted variable becomes a different variable ("pseudo" in GCC parlance) -- there are more sophisticated algorithms, but they are only optimization of this basic idea.
In practice, each argument of the PHI function also refers to a predecessor edge of the block holding the PHI, which allows to perform constant propagation trivially like this:
if (x) goto BB3; <edge 1> else goto BB3; <edge 2> BB3: a.2 = PHI (2 <from edge 1>, 3 <from edge 2>) a.3 = b * a.2
This already looks very much like GIMPLE! When going out of SSA, this becomes
if (x) a.2 = 2; goto BB3; else a.2 = 3; goto BB3; BB3: a.3 = b * a.2
In reality, going out-of-SSA is a little more complicated:
- Overlapping live ranges of the same variable may exist, which means you can not just
- insert assignments to variables in basic blocks.
- The compiler wants to minimize the number of assignments that have to be inserted.
- The compiler wants to place the assignments it must insert on cheap execution paths
- or in ways that do not create situations where an extra basic block is necessary.