
Re: [tree-ssa] AST optimizer in C++?


On Sun, Aug 25, 2002 at 04:36:59PM -0500, Chris Lattner wrote:
> 
> > OTOH this information is not ready to be used directly by optimizers, since IRs
> > can produce different representations for the same semantically equivalent program.
> 
> I'm not really sure what you mean by this.  In my view, it is always best
> to reduce the number of redundancies and the number of cases that must be
> handled in the representation (ie, by lowering ++X to use primitives, you
> don't have to handle the ++ case anymore).  I'm not sure if this is what
> you're getting at, but if so, I certainly agree.
> 
Exactly.

  In fact, by reducing the number of cases to be handled (as in reducing syntactic 
sugar and control sugar) we simplify the optimizer.  Reducing "control sugar" is 
something that LLVM does well, since all loop constructs are lowered to the 
same common denominator: branch instructions.
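
  For illustration, here is a minimal C++ sketch of that lowering (the 
function names are made up for this example, not taken from LLVM or GCC): 
a structured loop and the branch-only form it reduces to, spelled with 
goto so it stays valid C++.

  // "Control sugar": a structured loop.
  int sum_structured(const int *a, int n) {
      int s = 0;
      for (int i = 0; i < n; ++i)
          s += a[i];
      return s;
  }

  // The same function after lowering every loop construct to the
  // common denominator: conditional and unconditional branches.
  int sum_lowered(const int *a, int n) {
      int s = 0;
      int i = 0;
  loop_header:
      if (!(i < n)) goto loop_exit;   // conditional branch ("br")
      s += a[i];
      ++i;
      goto loop_header;               // unconditional branch
  loop_exit:
      return s;
  }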

  However, after the discussion about the SWITCH_STMT elimination I saw things 
differently: in a first step we reduce control structures to a common 
representative; then, since some architectures support high-level control 
constructs, a promotion pass has to be applied later. 
  Thus the underlying question is: is it worth lowering statements only to 
promote them again later?  

  My opinion is that we win by having less complex optimizers.

[...]
> 
> Note that this representation is equivalent to how RTL stores the Mode
> information in all of the "values" it uses.  Here, however, the type
> system is much more powerful (and useful).  My key point is that this
> information is kept IN the representation itself, not alongside in a
> table 

The same thing happens in the tree representation, where the symbol table is 
stored in DECL_STMTs directly in the tree.

> that is:
> 
>  A. Slow - in the best case, accessing it is a hash table lookup vs a
>     simple pointer indirection
>  B. Large - again, hash tables have overhead
>  C. Painful - The symbol table must always be kept up-to-date.
> 

You're right... sorry for the previous mail.
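
  As a toy C++ sketch of that trade-off (all names here are hypothetical, 
not the actual LLVM or GCC types): keeping the type inside each value makes 
the query a single pointer indirection, while a side symbol table costs a 
hash lookup, extra memory, and constant maintenance.

  #include <string>
  #include <unordered_map>

  struct Type { std::string name; };

  // Style 1: the type lives IN the representation itself.
  struct Value {
      Type *type;   // querying it is one pointer indirection
  };

  // Style 2: the type lives in a side table keyed by the value.
  struct SymbolTable {
      std::unordered_map<const Value *, Type *> types;  // hash lookup,
                                                        // hash-table overhead,
                                                        // must be kept in sync
  };

  Type *type_of_fast(const Value *v) { return v->type; }

  Type *type_of_slow(const SymbolTable &st, const Value *v) {
      auto it = st.types.find(v);
      return it == st.types.end() ? nullptr : it->second;
  }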

> 
> This is great, but it doesn't really answer how we get a CFG out of this.
> The simplest CFG interface can tell you all of the incoming and outgoing
> edges of a block.  To do this, we have iterators that wrap the following
> [trivial] analyses:
> 
[...]
>   the succ_iterator.
[...]
>   the pred_iterator basically loops over the users of the basic
>   block, looking for terminator instructions.
> 

Yep, that's a nice interface for handling CFG nodes.
I saw something like this in Open64 as well.
It's so simple to just say: iterate over all BBs in topological order,
and write "for (topological_CFG::iterator i = ..."  
:-)
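
  A minimal C++ sketch of such an interface (hypothetical names, not the 
actual LLVM classes): successor edges fall out of the block's terminator, 
and predecessor edges come from walking the block's users.

  #include <vector>

  struct BasicBlock;

  // The terminator instruction of a block names its successors, so the
  // outgoing edges live in the representation itself.
  struct TerminatorInst {
      std::vector<BasicBlock *> targets;   // branch destinations
  };

  struct BasicBlock {
      TerminatorInst terminator;
      std::vector<BasicBlock *> users;     // blocks whose terminator
                                           // branches to this block

      // succ_iterator: walk the terminator's targets.
      auto succ_begin() { return terminator.targets.begin(); }
      auto succ_end()   { return terminator.targets.end(); }

      // pred_iterator: walk the users that branch here.
      auto pred_begin() { return users.begin(); }
      auto pred_end()   { return users.end(); }
  };

  // Usage: visit all successors of a block.
  void visit_succs(BasicBlock *bb) {
      for (auto it = bb->succ_begin(); it != bb->succ_end(); ++it) {
          BasicBlock *succ = *it;
          (void)succ;   // ... process the successor ...
      }
  }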

> This just leaves one feature lacking from the CFG support:
> 
> >   However LLVM does not provide support for high-level loop constructs
> > (nested loops, for example).  Instead it defines branch "br" instructions.
> > You then need extra information for loop nest optimizers.
> 
> That is very true.  In fact we intentionally do not provide these
> facilities as part of LLVM itself, for several reasons (below).  To
> provide LoopInfo [http://llvm.cs.uiuc.edu/doxygen/classLoopInfo.html], we
> use the LLVM Pass infrastructure (described in detail in:
> http://llvm.cs.uiuc.edu/docs/WritingAnLLVMPass.html) to automate the
> construction and deconstruction of loop information as necessary.  Our
> "loop information" simply identifies Natural loops and their nesting
> properties.
> 
> There are several important aspects of loop information that make it
> different from other parts of the representation, warranting different
> treatment [instead of being part of the representation itself, it gets
> treated like alias analysis information or dominator information]:
> 
> 1. Updating loop information is not trivial.  
[...]
> 2. Most transformations don't need it.  
[...]
> 
Right, and right.
It seems that the same thing happens again at this level: 
we can store information in side structures and deal with a reduced 
set of loop constructs.
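
  As a sketch of the "information on the side" idea (hypothetical C++ 
types, loosely modeled on what LoopInfo provides): loop nesting is a 
separate analysis result, built on demand and thrown away when a 
transformation invalidates it, rather than part of the IR.

  #include <map>
  #include <vector>

  struct BasicBlock;

  // Side structure: natural loops and their nesting, kept OUTSIDE the IR.
  struct Loop {
      BasicBlock *header = nullptr;        // entry block of the natural loop
      Loop *parent = nullptr;              // enclosing loop, if nested
      std::vector<BasicBlock *> blocks;    // blocks belonging to the loop
  };

  struct LoopInfo {
      // Innermost loop containing each block; blocks outside any
      // loop are simply absent from the map.
      std::map<const BasicBlock *, Loop *> innermost;

      Loop *loop_for(const BasicBlock *bb) const {
          auto it = innermost.find(bb);
          return it == innermost.end() ? nullptr : it->second;
      }

      unsigned depth(const BasicBlock *bb) const {
          unsigned d = 0;
          for (Loop *l = loop_for(bb); l; l = l->parent) ++d;
          return d;
      }
  };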

> 
> Do you have any idea what kinds of things are possible in SIMPLE that you
> don't think would be possible (or pleasant) in LLVM?  I'd like to get an
> idea if there is something that would be hard to do in LLVM, but is easy
> in SIMPLE (to figure out how much overlap there is).
> 
Hmmm, I have to think more about this...

For the moment I see that LLVM can express high-level loop structures by 
handling them as side information.  This trick can also be used to store 
information about loop induction variables, so Fortran's DO-loops can 
be handled by optimizers as effectively as on tree structures.
And since you handle only the lowest-level loop constructs, you win by 
having less complex optimizers.

[flat trees]
LLVM could be obtained from SIMPLE by goto introduction:
  break, continue -> br
  for, while, do -> br
  switch -> switch ("an extended br")

But then you have to construct loop information before you flatten statements, 
as sketched below.
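
  A minimal C++ sketch of that ordering constraint (every name here is 
hypothetical): facts that only the structured form knows, such as the 
induction variable of a FOR_STMT, are harvested into a side record first, 
and only then are the statements flattened into branches.

  #include <vector>

  struct Stmt { /* structured statement, e.g. a SIMPLE FOR_STMT */ };
  struct Variable { /* ... */ };

  // Side record of what a structured loop tells us, captured BEFORE
  // the loop is flattened into plain branches.
  struct InductionInfo {
      Stmt *loop = nullptr;       // the FOR_STMT itself
      Variable *iv = nullptr;     // the induction variable
      Stmt *init = nullptr;       // initialization, bound and step
      Stmt *bound = nullptr;
      Stmt *step = nullptr;
  };

  // Stubs standing in for the two passes; only their ordering matters here.
  std::vector<InductionInfo> record_induction_info(std::vector<Stmt *> &) {
      return {};
  }
  void flatten_to_branches(std::vector<Stmt *> &) {}

  void lower_function(std::vector<Stmt *> &stmts,
                      std::vector<InductionInfo> &side_info) {
      side_info = record_induction_info(stmts);  // harvest loop facts first
      flatten_to_branches(stmts);                // then introduce branches
  }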

> I'm really sorry for the length of this mail, but I think it helps explain
> a bit of the philosophy behind LLVM...
> 
thanks a lot for contributing ideas :-)


