This is the mail archive of the
mailing list for the GCC project.
Re: [tree-ssa] AST optimizer in C++?
- From: Chris Lattner <sabre at nondot dot org>
- To: Daniel Berlin <dberlin at dberlin dot org>
- Cc: Diego Novillo <dnovillo at redhat dot com>,Pop Sébastian <pop at gauvain dot u-strasbg dot fr>,<gcc at gcc dot gnu dot org>
- Date: Mon, 26 Aug 2002 20:14:53 -0500 (CDT)
- Subject: Re: [tree-ssa] AST optimizer in C++?
On Mon, 26 Aug 2002, Daniel Berlin wrote:
> On Mon, 26 Aug 2002, Chris Lattner wrote:
> > On Sun, 25 Aug 2002, Daniel Berlin wrote:
> > > > Is there a reason that you'd like to have a SIMPLE form that is not in
> > > > SSA?
> > >
> > > Um, yes.
> > > It artificially limits us to that representation.
> > > Say later, someone wants to use some other form of SSA, or non-SSA
> > > (dependence flow graph, etc).
> > > If you make it actually part of the trees, they now have to care about
> > > it.
> > > If it's just annotations, only what wants to look at those annotations
> > > look at it.
> > Actually, that's not true at all. Using SSA as the base representation is
> > just like using a conventional register based underlying representation:
> > you can still layer other representations ON TOP of SSA, you just need to
> > keep the SSA up-to-date as you transform the layered representation (just
> > as you must do with the register based representation).
> No kidding, which is the problem.
> Just to be clear, i mean the fact that we have to layer things on top of
> it is the problem. not the updating.
I understand exactly what you're saying. My point is that if you want to
do a non-SSA representation TODAY, that you have to deal with layering.
Changing the base representation to SSA form doesn't change this fact.
> > In practice, there are several simple extensions to SSA that are
> > representable as direct extensions to the representation [eta nodes, pi
> > nodes from the ABCE paper, etc], which don't require a layered
> > representation. Other representations are really major extensions of SSA
> > which are enhanced by having SSA as the base representation.
> Not necessarily true at all.
> But i'm also not saying SSA is a bad base, i just don't think we need to
> make it the *actual* IR.
I understand what you're saying. :)
> > Of course, I'm sure there are others I'm not familiar with, but worst
> > case, you just have to keep the base representation up to date with the
> > layered representation, which is nothing different than the current
> > situation with SSA.
> This has nothing to do with keeping it up to date. It has to do with the
> fact that other things would need to know about PHI nodes, etc.
> We can easily just rebuild the SSA form if we want to run a non-SSA pass
> in between two SSA ones (or whatever). But it's quicker to be able to just
> *RUN* that pass without having to drop PHI nodes, insert copies, etc, just
> so that non-SSA pass doesn't get confused.
This has everything to do with keeping it up to date. This is the
situation as it exists today with a layered SSA reprsentation:
1. All SSA transformations are penalized by having to update both SSA form
and standard register form. This makes SSA transformations harder and
less likely to be written.
2. Alternative representations can be represented just like SSA is now,
3. Non-SSA transformations don't have to deal with SSA at all, they can
just ignore it and have it garbage collected away.
4. Non-SSA transformations cause a huge compile-time performance hit,
because they invalidate the SSA information, requiring it to be
rebuilt. SSA generation isn't SLOW, but regenerating it frequently can
certainly add noticably to compile time. If dominator information is
also out of date then things get really slow for big codes.
5. Transformations on alternative representations can either be aware of
SSA or not. If they aren't, then they suffer the same problems as #4.
If they are, then they have to maintain and manipulate _3_ different
representations, keeping them up to date.
... and this is the situation with SSA built into the representation:
1. SSA transformations just update the one true representation. There is
nothing to get out of date, and they are simple to write.
2. Transformation in alternative representations are implemented as layers
on top of the SSA representation. There isn't a huge difference
between this case and a register based representation.
3. You don't have to allocate as much to maintain two reprentations. You
don't have to track whether or not things are up to date, because SSA
is always available. Thus, the optimizer runs faster.
4. Non-SSA transformations are harder to write. See below.
5. Transformations on alternative representations have to maintain _at
most_ two representations. The register based representation doesn't
factor into the picture at all.
About Non-SSA transformations:
1. I have yet to be convinced that there is a need for such a thing at
all. Entire optimizers (Pro64, LLVM) have been written in SSA, and I
haven't come across an example of a transformation that is awkward to
implement in SSA. If you have a good example, please let me know.
2. Even if you do come up with a transformation, you can use a (generally
useful) pass like the LLVM mem2reg pass. This transformation basically
promotes stack allocated values accessed with load & store
instructions into SSA registers (memory is not in SSA form).
If you have a transformation that really doesn't want to maintain and
update SSA form, you simply have it so that it converts the stuff it's
working on into loads and stores of a stack allocated value. After the
pass you run the mem2reg pass, which cleans everything up for you
(basically constructs SSA just for the values you're interested in)
Either way, non-ssa transformations are not a problem, and again, I'm not
aware of anything that doesn't like to be in SSA.
> > The key difference with using SSA as the base representation, however, is
> > that almost ALL transformations benefit from it. SSA will probably be the
> > _most_ commonly used representation when the tree-ssa branch is finally
> > mature (hence the name), beating out the register based representation by
> > a lot. I'm not really sure when and if other representations would
> > actually be useful, and if their usefulness is worth the cost of creating
> > and maintaining the representation: whether layered on top of registers or
> > SSA.
> DFG, PDG, etc are generally worth the cost of creating.
These representations are also easily built "on the side" on top of SSA,
so they aren't very good for showing how SSA gets in the way...
> > > Once again, forcing SSA on the representation itself means they have to
> > > care about it (or you've limited when they can be run), while annotations
> > > have no such limitations.
> > Sure, of course. If, however, they don't understand the annotations, they
> > will be corrupted by the pass. This means that you have to destroy and
> > regenerate the SSA annotations.
> The other option is to not be able to run the pass at all.
> Why limit ourselves by making sure we can't run that other pass, when we
> get no actual *benefit* from doing so?
That's not true at all, please see the remarks above. You seem very
concerned about these transformations... can you give me some examples?
> > This is greatly expensive, which means
> > that the transformation will be updated to use SSA anyway (often making it
> > more efficient). While forcing SSA from the start makes it harder to
> > transfer legacy code,
> It's not necessarily legacy code.
> All the world is not SSA.
> Even if it was, it may not stay that way forever.
That's true, but even though SSA is not perfect, it's a HUGE step up over
just using a register based representation. Why limit ourselves to the
lowest common denominator?
> > > There is no good reason to actually make it part of the form, other than
> > > having to insert copies yourself. But all algorithms i've seen that are
> > > written to keep SSA form up to date do this anyway, or make it easy to do.
> > I'm not sure if I understand what you're saying here... it seems as though
> > you are contradicting yourself by saying that SSA _is_ easy to keep
> > up-to-date (which I believe it is, of course :) What am I
> > misinterpreting?
> You keep missing the actual point.
> 1. There is no benefit to putting SSA in the trees, as opposed to
> 2. There is a cost to putting SSA in the trees, regardless of whether
> you feel the things that aren't SSA, are worth squat.
> Thus, given this, *why* do it?
> What does it buy us?
> That's what i want to know.
Obviously because maintaining the representation on the side has costs,
costs that you seem to be ignoring. If you want me to enumerate some of
them, I can.
The whole point is that SSA is a very nice representation for performing
optimizations on, and therefore you want to make writing these
optimizations easier. If it makes the less common case a bit harder, then
that's ok, because it's not common. Sorta like profile guided
optimization I suppose, were we are optimizing initial development and
ongoing maintenance instead of execution time.