This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Proposal: changing representation of memory references
- From: Zdenek Dvorak <rakdver at atrey dot karlin dot mff dot cuni dot cz>
- To: gcc at gcc dot gnu dot org
- Date: Wed, 4 Apr 2007 16:35:08 +0200
- Subject: Proposal: changing representation of memory references
Hello,
at the moment, any pass that needs to process memory references are
complicated (or restricted to handling just a limited set of cases) by
the need to interpret the quite complex representation of memory
references that we have in gimple. For example, there are about 1000 of
lines of quite convoluted code in tree-data-ref.c and about 500 lines in
tree-ssa-loop-ivopts.c dealing with parsing and analysing memory
references.
I would like to propose (and possibly also work on implementing) using a
more uniform representation, described in more details below. The
proposal is based on the previous discussions
(http://gcc.gnu.org/ml/gcc/2006-06/msg00295.html) and on what I learned
about the way memory references are represented in ICC.
It also subsumes the patches of Daniel to make p[i] (where p is pointer)
use ARRAY_REFs/MEM_REFs.
I am not sure whether someone does not already work on something
similar, e.g. with respect to LTO (where the mentioned discussion
started), or gimple-tuples branch?
Proposal:
For each memory reference, we remember the following information:
-- base of the reference
-- constant offset
-- vector of indices
-- type of the accessed location
-- original tree of the memory reference (or another summary of the
structure of the access, for aliasing purposes)
-- flags
for each index, we remeber
-- lower and upper bound
-- step
-- value of the index
The address of the reference can be computed as
base + offset + sum_{idx} offsetof(idx)
where offsetof(idx) = (value - lower) * step
For example, a.x[i][j].y would be represented as
base = &a
offset = offsetof (x) + offsetof (y);
indices:
lower = 0 upper = ? step = sizeof (a.x[i]) value = i
lower = 0 upper = ? step = sizeof (a.x[j]) value = j
p->x would be represented as
base = p;
offset = offsetof (x);
indices: empty
p[i] as
base = p;
offset = 0
indices:
lower = 0 upper = ? step = sizeof (p[i]) value = i
Remarks:
-- it would be guaranteed that the indices of each memory reference are
independent, i.e., that &ref[idx1][idx2] == &ref[idx1'][idx2'] only
if idx1 == idx1' and idx2 = idx2'; this is important for dependency
analysis (and for this reason we also need to remember the list of
indices, rather than trying to reconstruct them from the address).
-- it would be guaranteed that the computations of the address do not
overflow.
-- possibly there could be a canonicalization pass that, given
for (p = q; p < q + 100; p++)
p->x = ... {represented the way described above}
would transform it to
for (p = q, i = 0; p < q + 100; p++, i++)
{base = q, offset = offsetof(x), indices: lower = 0 upper = ? step = sizeof (*p) value = i}
so that dependence analysis and friends do not have to distinguish
between accesses through pointers and arrays
Zdenek