This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: [ tree-ssa] PRE problem
- From: Daniel Berlin <dberlin at dberlin dot org>
- To: law at redhat dot com
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>,Diego Novillo <dnovillo at redhat dot com>
- Date: Fri, 20 Feb 2004 21:44:32 -0500
- Subject: Re: [ tree-ssa] PRE problem
- References: <200402210136.i1L1a9nR007226@speedy.slc.redhat.com>
Actually, I've been thinking about this some more -- let's hold off on
any patches for a little while.
Okey dokey.
I've reordered your message so the longest reply is last.
Figured it might be easier:
1. I don't think we need/want may_propagate_copy to be used in this
code.
Honestly, I don't either, i just consider it a workaround.
2. Nor do we want to use propagate_value/propagate_copy.
Right, we shouldn't have to
3. There is no need for this code to signal a failure to its
callers. A failure for this code is worthy of an abort
given the clarifications we've done for copies appearing
in the IL.
For insertion, yes. For verification, i'm not so sure.
4. We need a function which updates the annotations and only aborts
if there is an inconsistency in the memory tag info.
Right. That's what insertion wants. verification just wants the
versions to be correct. :)
What's going on in this code really isn't copy propagation or reverse
copy propagation. I've always been aware of that, but used those
terms since what we were doing has a lot of similarities to copy
propagation. Using those terms probably led Diego and myself down
the wrong path.
This code is doing something that is fundamentally different and
probably
needs a different set of APIs to interact with.
Probably, but you said it was blocking major work, so i was more
concerned with working around it and figuring out a better API later.
Ideally it just wants something that updates annotations for insertion,
and we don't care about annotations for verification (only versions
matter).
This all gets trickier with SSAPRE strength reduction, but at this
point i'm willing to wait for someone to invent an easier algorithm if
it means our code is nicer looking.
The problem is, if we are going to modify variable annotations, i still
have to differentiate between inserts and non-inserts for this
function.
See, what it really does is this:
In ESSA renaming (which is just like SSA renaming, except it's a bit
more complex because there can be multiple variables that affect your
version),when we hit an EPHI operand to rename (IE need something to
use as the EPHI_ARG_DEF for an edge), we optimistically assume that
whatever is at the top of the renaming stack will still be valid as
that EPHI operand.
This may or may not be true. We later verify that assumption (which is
what this code is generating expressions for), and if the generated
expression doesn't match the versions in the real/ephi statement we
assumed would be correct above, we correct the ephi operand to be NULL.
In order to verify the versions, we need the vuses and ops to be
correct for that block. So we back-substitute the phi arguments[1]
into the expression so it has the right versions for that ephi operand.
Later, if we perform an elimination of a partial redundancy, we have
something that looks like this
EPHI (NULL, <EREF of a real occurrence>).
In order to fill in the NULL operand, we need the same expression we
generated before, so we just call the same function.
I just realized someone else's explanation of delayed renaming might be
easier to understand.
So here it is from
http://www.google.com/url?sa=U&start=3&q=http://llvm.cs.uiuc.edu/
ProjectsWithLLVM/2002-Fall-CS426-SSAPRE.ps&e=7933
[I changed the terminology to match ours. If you like their wording
better, i can ask them if they mind if I use their description in the
comments for the code.]
Because performing rename this way requires the examination of many
versions of variables that may not appear in any PRE candidate
expression, the algorithm is not sparse. Thus, Kennedy et al. presents
an alternative algorithm called Delayed Renaming[1].
For pure redundancy class/e-version assignment, Delayed Renaming uses
a redundancy class/e-version stack for the expression being analyzed.
Delayed Renaming maintains the invariant that, at any point during
analysis, the top of the stack represents the current class/e-version
and the representative occurrence node for the expression. Each
class/e-version has a representative occurrence, which means that we
can safely replace other occurrences with the same class/eversion and
still maintain the original program semantics. This is due to the
property expressed above that two occurrences of the same
class/e-version have the same value. A representative occurrence is
always a real occurrence or an EPHI Occurrence, and EPHI Occurrences
always get a new class/e-version (since they represent a merge of
expression computations), so there are only four situations that might
arise when attempting to assign a e-class/e-version to an Occurrence:
1. The top of the stack is a Real Occurrence and
(a) Our current occurrence is a Real
(b) Our current occurrence is a EPHI Operand
2. The top of the stack is a EPHI Occurrence and
(a) Our current occurrence is a Real
(b) Our current occurrence is a EPHI Operand
Delayed Renaming is performed in two steps. The first, Rename1,
processes each Occurrence separately, pushing items onto the stack when
they are assigned a new class/e-version, and popping items if they do
not dominate the current occurrence. If the top of the stack is a Real
Occurrence, we have the current version of the variables available and
assigning a new class/e-version is as simple as comparing those
versions to the current occurrence. If the top of the stack is a EPHI
Occurrence, the versions of variables are not provided. To resolve this
issue, Rename1 uses dominance information to determine which
class/e-version is appropriate. This dominance relation is that if all
variable definitions of the current occurrence dominate the EPHI
Occurrence at the top of the stack, then the versions are identical[1].
However, there is one small detail overlooked in Rename1. When the
current Occurrence is a EPHI Operand, there exists no Real Occurrence
which can provide us with the current versions of the variables. In
these cases, Rename1 makes an optimistic assumption and assumes that
the top of the redundancy class stack provides its variables versions
and therefore is given the same class/e-version. This assumption is
either correct or the EPHI operand will have no representative
occurrence, (IE EPHI_ARG_DEF == NULL). Having no representative
occurrence means that the EPHI Operand is not partially redundant.
Rename1 keeps track of each Real Occurrence that is defined by an EPHI
and places them into a set to be processed. This set is processed by
Rename2 which corrects the optimistic assumption regarding EPHI
operands if necessary.
Rename2 processes each item in the set constructed by Rename1. Each of
these Real Occurrences are defined by an EPHI and provides the versions
of the variables at that EPHI that defines it. Rename2 first obtains
the EPHI for the Real Occurrence and notes what basic block it resides
in. If there exists a PHI for any of the variables of that Real
Occurrence in the basic block of its defining EPHI , we must double
check our optimistic assumption made to the EPHI operands.
For each EPHI Operand, a Real Occurrence is manufactured with the
correct versions of the variables at that point. The PHI for the
variable provides us with the version to use when manufacturing this
real occurrence. The manufactured Real Occurrence is compared to the
representative occurrence for the EPHI Operands. If the representative
occurrence is a Real Occurrence then versions are compared. If it is a
EPHI Occurrence, we check if all the definitions of the variables in
the manufactured occurrence dominate that EPHI . If not in either case,
the optimistic assumption was indeed wrong and the EPHI Operand is set
to NULL. If the e-class/e-version is determined to be correct and the
representative occurrence is an EPHI , the manufactured occurrence
needs to be added to the set for further processing in order to ensure
that the operands of that EPHI are also correct.
[This is my favorite part, and why this part of the algorithm has
always been a source of trouble]
vvvvvv
It is important to note that the paper did not explain how to create
this manufactured real occurrence, nor how its def edge ought to be
set. Initially it seemed as simple as cloning the Real Occurrence, but
later proved to be more complicated because a critical detail was
simply left out in the algorithm. The EREF_STMT from this manufactured
occurrence must not be an exact copy, but should be to the
representative occurrence for the EPHI Operand being examined. It is
critical to recursively check EPHI Occurrences and their operands as
mentioned above.
Upon completion, Delayed Renaming will have assigned class/e-versions,
and created EREF_STMTS for each redundancy class of the variable. This
pass is crucial to the success of the algorithm as a whole and during
our implementation and testing, several bugs have been linked back to
this pass due to its complexity.