Contributors:
- Dan Berlin
Delivery date:
- Now
Benefits:
- Superior code generation.
Risks:
- Compile-time performance.
Description:
- This is part of structure aliasing support (distinguishing a.b from a.c). This work enables us to differentiate between different fields of structures in tree-ssa's virtual form, which lets stores to and loads from non-overlapping structure fields not conflict. The upshot is that we can generate the same code for both loops at the tree level (though for the example below, we couldn't even do it at the RTL level) in things like:
int x; int y;
struct { int x; int y; } global;

int foo() {
  int i;
  for (i = 0; i < 10; i++)
    y += x * x;
  for (i = 0; i < 10; i++)
    global.y += global.x * global.x;
}
- In addition, it allows greater code movement of existing structure references at the tree level, where previously only the RTL level could perform this. This should help us remove some more code from the RTL optimizers. Note that SRA doesn't help cases like the above, because it can't scalarize something that lives in memory. The structure aliasing code can work on all structures (and, for that matter, fixed-size arrays, though this is not currently implemented), and is a complement to SRA.
- The only portions of the compiler that were modified are tree-ssa-alias.c, tree-ssa-operands.c, the associated .h files, and the two passes that currently attempt to create alias tags for variables (ivopts and the vectorizer).
- The work involves creating a list of the structure fields in use, and then translating each of these fields into a fake variable for aliasing purposes (much like we create fake variables to represent all the things in a given alias set that alias each other). These fake variables work just like other alias-set fake variables, and are used in the virtual form: each structure field that is used has a fake variable associated with it. This form was chosen after much discussion with Andrew Macleod, Diego, and rth over what to do, and after first trying a different approach (partial defs/uses). Thus, the example above would be represented by the following pseudo-code form:
int x; int y;
struct { int x; int y; } global;

int foo() {
  int i;
  for (i = 0; i < 10; i++)
    y += x * x;
  for (i = 0; i < 10; i++)
    {
      VUSE (<fake variable named globalfield1:0:32>)
      t = global.x * global.x;
      V_MAY_DEF (<fake variable named globalfield2:32:32>)
      global.y += t;
    }
}
- where the load from global.x and the store to global.y don't interfere (though the load from global.y and the store to global.y do interfere: V_MAY_DEF implies a VUSE), whereas previously it would be:
int x; int y;
struct { int x; int y; } global;

int foo() {
  int i;
  for (i = 0; i < 10; i++)
    y += x * x;
  for (i = 0; i < 10; i++)
    {
      VUSE (<fake variable named global>)
      t = global.x * global.x;
      V_MAY_DEF (<fake variable named global>)
      global.y += t;
    }
}
- and all the loads and stores would falsely interfere.
- The main risk of this code is compile-time performance. To be clear, it currently has compile-time performance comparable to mainline (within 1%), and it passes all regression tests. Regressions are also rather easy to fix if they occur, because the code is simple and it is easy to tell whether the structure overlaps were created correctly (i.e., the dumps make it clear which fields the code thinks a given structure reference accesses, so it is easy to check whether that is correct).
- However, the pruning heuristic currently used when creating the "fake field variables" creates them only for fields between the minimum and maximum used parts of the structure. This works very well for most structures. However, one can write a testcase with a structure that is very large (in terms of number of fields), use only the first and last fields, and cause our compile time to increase. This is easily fixed either by limiting which structures we create the fake variables for, or by tracking at a more granular level which fields are really used. One really needs to see actual code where people do this in order to decide which approach to take. One place this could probably occur is gfortran code with lots of common variables (since they go through a structure access, and are in no particular order).
- As I said, the risks are known and easily fixable; we just need to see bad code in order to determine the best thing to do about them.