This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

SIMPLE: A language-independent tree IR

A few days ago, Sebastian Pop pointed me to the McCAT project
from McGill University.  They defined a tree IR based on GCC's
trees which looks like a good candidate for our
language-independent trees.

The IR, called SIMPLE, is a very simple C-like three-address
language that looks pretty straightforward to analyze and keeps
all the high-level attributes of data types.

The most visible paper describing SIMPLE is

"Designing the McCAT Compiler Based on a Family of Structured
Intermediate Representations,"
L. Hendren, C. Donawa, M. Emami, G. Gao, Justiani, and B. Sridharan,
Proceedings of the 5th International Workshop on Languages and
Compilers for Parallel Computing, no. 757 in Lecture Notes in
Computer Science, New Haven, Connecticut, pp. 406-420,
Springer-Verlag, August 3-5, 1992.

The online material describes all the gory details:

Essentially, the IR defines 14 basic statements (x and y are
variables.  a, b and c are values):

     1	 x = a OP b
     2	*a = b OP c
     3	 x = OP c
     4	*a = x
     5	*a = f(args)
     6	*a = *c
     7	*a = (cast)b
     8	*a = &x
     9	 x = y
    10	 x = &y
    11	 x = *c
    12	 x = f(args)
    13   x = (cast)b
    14	 f(args)

Any tree expression that does not conform to this must be broken
down into a sequence of these 14 expressions.  For reference, I'm
attaching SIMPLE's BNF grammar.  For instance,

a = b * c + (*d) / e;			temp1 = b * c;
					temp2 = *d;
					temp3 = temp2 / e;
					a = temp1 + temp3;

Control flow is represented using all the _STMT trees defined in
c-common.def.  The proposal I'd like to discuss:

- Every front-end rewrites its tree representation into SIMPLE.
  Sebastian has already implemented some of the simplification
  steps for the C front-end.

  The question here is whether the SIMPLE IR is enough to
  be used as a target language for all the front-ends we have.
  I'm particularly interested in Java and Objective-C.  Are there
  nuances in these languages that SIMPLE won't be able to handle?

  The McGill folks are now working on Sable.  An environment for
  optimizing Java.  They have a new IR (Jimple) to represent
  bytecodes.  It looks almost identical to SIMPLE, but I'd like
  to know if we should incorporate things from Jimple as well.

- Once converted to SIMPLE, the trees are analyzed/optimized by
  the tree SSA framework.  We should not need many changes in the
  existing code, as we already handle a superset of SIMPLE.

- The resulting trees are converted to RTL.  This is another
  possible source of problems.  As it stands, the SIMPLE IR can
  be handled by the RTL expander in the C front-end without

  In this scenario, we would remove all the existing RTL
  expanders to be replaced by a single SIMPLE->RTL expander.
  This is where things will most likely break.  The code that now
  expands trees into RTL would be replaced by code that expands
  trees into SIMPLE.  But, is it possible to translate SIMPLE
  into RTL in a language-independent way?

I'm about to start incorporating these things into the AST
branch.  I'd like to know what folks think about this and point
out things that I'm missing and/or would be difficult to do.

Thanks.  Diego.

Attachment: simple-grammar.txt
Description: Text document

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]