This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

[RFC] New intermediate languages

I am working on some infrastructure changes to introduce multiple
intermediate languages in the compiler. The primary goal is to
have a gradual IL lowering scheme to allow the implementation of
analysis and optimizing transformations at the appropriate level
of abstraction.

Essentially, I want to remove the tree<->RTL interface and
introduce one or more intermediate representations in between the
two. In addition, transformations that are currently done at the
RTL level should be moved (or replicated) to higher level ILs
(eg, LCM, constant propagation, control flow transformations,
loop transformations).

An important design consideration is to minimize the impact these
changes are going to make to the rest of the compiler. This is
not something that will happen overnight, obviously. These
changes are very pervasive and introducing them in one monstrous
patch would be impossible.

As a first step, I am implementing an SSA representation for
trees. Although trees may not be the ideal IL for an SSA
representation, they do have sufficient information to get the
job done: we can build a flowgraph and find def/uses of program
variables. The implementation does the following (so far I've
only dealt with the C front-end):

1- In c_expand_body() we insert a call to the tree optimizer
   before calling expand_stmt().

2- The tree optimizer walks the tree representation of the
   function looking for defs/uses of scalar variables (I'm
   avoiding array/structure references for the time being).
   Defs/uses are stored in a separate data structure with
   pointers into the trees.

3- Use find_basic_blocks to build a flowgraph. Here we need to
   make a few changes. Currently, flow.c assumes that basic
   blocks contain RTL. 
   I am making changes so that basic blocks contain a union of
   trees and RTL, and the generic flow routines use callbacks for
   IL specific things like counting basic blocks and finding
   block boundaries. Although I'm not done yet, it doesn't seem
   that I'll need to change too many things. Once I have this I
   can re-use all the CFG related functions with impunity.

4- Building the SSA form can be done without modifying the trees.
   I don't think I want to try and re-use the current SSA code, I
   do not believe in SSA re-writing the program. It's better to
   overlay the SSA information as a web of pointers on top of the

Things I see happening at this level are control flow clean-ups,
constant propagation (SCC), PRE, and things like that. Using a
similar approach we can build a loop oriented IL to do array and
loop transformations. Trees already seem to contain much of what
we need.

I realize that we are gearing toward GCC3.0 so I don't expect to
be introducing these changes anytime soon. However, I should have
patches ready in the next few days if anyone is interested.
Nothing is set in stone, new ideas and patches are most welcome.

Another idea is to start implementing a mid-level RTL
representation (MRTL). The key idea is to reduce target ties by
using canonical RTL patterns for everything needed by the
tree->RTL transformation pass. This is more of a long-term thing.
Currently the tree<->RTL interface is a bit too intertwined.

These are some of the general properties of MRTL.

- Reduced target dependencies. The RTL patterns generated at this
  stage are fixed and have well-known semantics. Some
  architectural features will likely creep in, like word size,
  number of physical registers and whatnot.

- No information loss in the tree->MRTL transition. After the
  MRTL form is transformed it will not be possible to go back
  to the tree information.

- Control flow representation is similar to RTL. No structured
  control-flow, everything is represented with jumps and

- Arrays are lowered to pointer arithmetic and structures
  flattened. No libcalls are generated for floating point and/or
  integer arithmetic.

These patterns would be generated directly from stmt.c/expr.c.
The plan is to write two passes tree->MRTL and MRTL->RTL which
should not change existing behaviour. Once this is done, I would
start building a CFG and an SSA form for MRTL (this can be mostly
ripped off the current implementation). At that point we can
start porting scalar transformations to the new form.

Data types

Use the same modes defined in RTL.

Control Statements

(mcall (addr) (arg1) (arg2) ... (argN))
    This is a new pattern for function calls. Alternatively we
    could define an abstract stack machine and expand calls using
    the existing code. I'd rather see a more compact
    representation, though.

(jump_insn (set (pc) (expr)))
    Standard jump. expr can be either a label_ref or an
    if_then_else for unconditional and conditional jumps.
(return X)
    Standard return with and without a value.

(note NOTE_*)


  (const_int i)
  (const_double:m addr i0 i1)
  (const_string str)
  (symbol_ref:m symbol)
  (label_ref label)

  (mem:m addr alias)
  (reg:m n)
  (set lval x)

  (plus:m x y)
  (minus:m x y)
  (compare:m x y)
  (neg:m x)
  (mult:m x y)
  (div:m x y)
  (udiv:m x y)
  (mod:m x y)
  (umod:m x y)
  (smin:m x y)
  (smax:m x y)
  (umin:m x y)
  (umax:m x y)
  (not:m x)
  (and:m x y)
  (ior:m x y)
  (xor:m x y)
  (ashift:m x c)
  (lshiftrt:m x c)
  (ashiftrt:m x c)
  (rotate:m x c)
  (rotatert:m x c)
  (abs:m x)

  (eq:m x y)
  (ne:m x y)
  (gt:m x y)
  (gtu:m x y)
  (lt:m x y)
  (ltu:m x y)
  (ge:m x y)
  (geu:m x y)
  (le:m x y)
  (leu:m x y)
  (unord:m x y)
  (if_then_else cond then else)

  (sign_extend:m x)
  (zero_extend:m x)
  (float_extend:m x)
  (truncate:m x)
  (float_truncate:m x)
  (float:m x)
  (unsigned_float:m x)
  (fix:m x)
  (unsigned_fix:m x)
  (fix:m x)

  (asm_input s)

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]