This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: Treelang patch


On 28-Jul-2001, Tim Josling <tej@melbpc.org.au> wrote:
> + @cindex GNU Back End (GBE)
> + @cindex GBE
> + @cindex @code{gcc}, back end
> + @cindex back end, gcc
> + @cindex code generator
> + One chunk is the so-called @dfn{GNU Back End}, or GBE,
> + which knows how to generate fast code for a wide variety of processors.
> + The same GBE is used by the C, C++, and Treelang compiler programs @code{cc1},
> + @code{cc1plus}, and @code{tree1}, plus others.
> + Often the GBE is referred to as the ``gcc back end'' or

I suggest s/gcc/GCC/

> + even just ``gcc''---in this manual, the term GBE is used
> + whenever the distinction is important.

Likewise here.

> + @cindex GNU Treelang Front End (FFE)

FFE?  Where did that abbreviation come from?
Wouldn't that be the Fortran Front End?

I suggest s/FFE/TFE/ here and in several places below.

> + @node Interoperating with C and C++,  , Other Languages, Other Languages
> + @section Tools and advice for interoperating with C and C++
> + 
> + The output of treelang programs looks like c program code to the linker
> + and everybody else, so you should be able to freely mix treelang and C
> + (and C++) code.

Does treelang promote function argument types and return types to `int'?
I found that was needed for binary compatibility with C when writing
the Mercury GCC front-end.

> + Makefile in turn is the main instruction to actually build
> + everything. The build instructions are held in the main gcc manual and
> + web site so they are not repeated here. 

s/gcc/GCC/, I think.

> + @cindex lang-options
> + @item
> + lang-options. This file is included into gcc.c, the main gcc driver, and

s/gcc driver/GCC driver/

> + @item
> + lexer. This breaks the input into words and passes these to the
> + parser. This is lex.l in treelang, which is passed through flex, a lex
> + variant, to produce c code lex.c.

s/c code/C code/

> + Note there is a school of thought that
> + says real men hand code their own lexers, however you may prefer to

s/men/programmers/

> + @item
> + parser. This breaks the program into recognizable constructs such as
> + exprerssions, statemente etc. This is parse.y in treelang, which is

Fix typos: "expressions, statements, etc. This ...".

> + @item
> + compiler main file. gcc comes with a program toplev.c which is a
> + perfectly serviceable main program for your compiler. treelang uses
> + toplev.c but other languages have been known to replace it with their
> + own main program. Again this is a matter of taste and how much code you
> + want to write. 

Is this really true?
Which other languages have replaced toplev.c?
Duplicating the code in toplev.c sounds like a maintenance headache.

Did you mean main.c?

> + The driver (gcc.c) will then drive (exec) in turn a preprocessor, the main
> + compiler, the assembler and the link editor. gcc options allow you to

s/gcc options/Options to gcc/

> + override all of this. In the case of treelang programs there is no
> + preprocessor, and mostly these days the C preprocessor is run within the
> + main C compiler apparently for reasons of speed.

I suggest inserting "rather than as a separate process, "
before "apparently".

> + @node treelang main compiler,  , treelang driver, treelang compiler interfaces
> + @subsection treelang main compiler
> + 
> + The main compiler for treelang consists of toplev.c from the main GCC
> + compiler, the parser, lexer and back end interface routines, and the
> + back end routines themselves, of which there are many.
> + 
> + toplev.c does a lot of work for you and you shoudl seriously consider
> + whether you want to reinvent it. It is quite possible to reuse it, as in
> + the case of treelang. 
> + 
> + Writing this code is the hard part of creating a compiler using GCC. The
> + back end interface documentation is incomplete and the interface is
> + complex. 

Even if it is technically possible to replace toplev.c,
I'm not sure if we should mention that in this manual.

> + @node Interfacing to the garbage collection, Interfacing to the code generation code. , Interfacing to toplev.c, treelang main compiler
> + @subsubsection Interfacing to the garbage collection

It might be worth mentioning lang_mark_tree somewhere in this section.

> + Interfacing to the garbage collection. In treelang this is mainly in
> + tree1.c. 
> + 
> + Memory allocation in the compiler should be done using the ggc_alloc and
> + kindred routines in ggc*.*. At the end of every function, toplev.c calls
> + the garbage collection several times. The garbage collection calls mark
> + routines which go through the memory which is still used, telling the
> + garbage collection not to free it. Then all the memory not used is
> + freed.

"At the end of every function" is a bit ambiguous;
e.g. it could be interpreted as referring to functions in toplev.c.

This section should have a pointer to the documentation in

> + @item
> + GDB: the GCC back end works well with gdb. It traps abort() and allows
> + you to trace back what went wrong. 

It's probably worth mentioning that

	Some gdb macro commands that are useful for debugging GCC,
	e.g. commands to display the values of GCC tree nodes,
	are defined in gdbinit.in (which is sourced by the .gdbinit
	file in the gcc directory).

> + @node Projects, Index, Service, Top
> + @chapter Projects
> + @cindex projects
> + 
> + If you want to contribute to @code{treelang} by doing research,
> + design, specification, documentation, coding, or testing,
> + the following information should give you some ideas.
> + 
> + Send a message to @email{@value{email-general}} if you plan to add a
> + feature.
> + 
> + The main requirement for treelang is to add features and to add
> + documentation. Features are things that the GCC back end can do but
> + which are not reflected in treelang. Examples include structures,
> + unions, pointers, arrays.

Also nested functions and exception handling.

> --- newgcc/gcc/treelang/treetree.c	Sat Jul 28 12:41:52 2001
> + tree 
> + tree_code_create_function_prototype (unsigned char* chars,
...
> +   if (lineno > 1000000)
> +     ; /* Probably the line # is rubbish because someone forgot to set
> +     the line number - and unfortunately impossible line #s are used as
> +     magic flags at various times. The longest known COBOL program for
> +     example is about 550,000 lines.  */
> +   DECL_SOURCE_LINE (fn_decl) = lineno;

That looks like it is COBOL-specific.

> +   TREE_PUBLIC (fn_decl) = 0;
> +   DECL_EXTERNAL (fn_decl) = 0; 
> +   TREE_STATIC (fn_decl) = 0; 
> +   switch (storage_class)
> +     {
> +     case STATIC_STORAGE:
> +       TREE_PUBLIC (fn_decl) = 0; 
> +       break;

Do you really want to set TREE_PUBLIC (fn_decl) = 0 twice?
IMHO that's just confusing.

> + tree 
> + tree_code_create_variable (unsigned int storage_class,
> +                                unsigned char* chars,
> +                                unsigned int length,
> +                                unsigned int expression_type,
> +                                tree init,
> +                                unsigned char* filename,
> +                                int lineno)
> + {
...
> +     case EXTERNAL_REFERENCE_STORAGE:
> +       DECL_EXTERNAL (var_decl) = 1;
> +       break;

You also need to set TREE_PUBLIC (var_decl) = 1 here, otherwise you will run
into problems (at least on x86-linux) when compiling with `-fpic'.
I found out the hard way ;-)
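
To make the suggestion concrete, here is a small stand-alone sketch of the
flag handling I have in mind.  It is not GCC code: `struct decl_flags' and
`set_variable_linkage' are hypothetical stand-ins for the VAR_DECL flags and
for the switch in tree_code_create_variable, and the storage-mode values are
copied from treetree.h.  Only the external-reference case is modelled; the
other cases are left out because the quoted patch elides them.

```c
/* Storage modes, with the values used in treetree.h.  */
#define STATIC_STORAGE 0
#define AUTOMATIC_STORAGE 1
#define EXTERNAL_REFERENCE_STORAGE 2
#define EXTERNAL_DEFINITION_STORAGE 3

/* Hypothetical model of the flags GCC keeps on a declaration.  */
struct decl_flags
{
  int is_public;    /* Models TREE_PUBLIC.  */
  int is_external;  /* Models DECL_EXTERNAL.  */
  int is_static;    /* Models TREE_STATIC.  */
};

/* Hypothetical helper: set the linkage flags of D for STORAGE_CLASS.  */
void
set_variable_linkage (struct decl_flags *d, unsigned int storage_class)
{
  /* Start from the same defaults as the patch: everything cleared.  */
  d->is_public = 0;
  d->is_external = 0;
  d->is_static = 0;

  switch (storage_class)
    {
    case EXTERNAL_REFERENCE_STORAGE:
      /* A reference to a variable defined elsewhere is both external
         and public.  */
      d->is_external = 1;
      d->is_public = 1;
      break;

    default:
      /* The remaining storage classes are not modelled in this sketch.  */
      break;
    }
}
```

In the real patch this amounts to one extra line, TREE_PUBLIC (var_decl) = 1,
in the EXTERNAL_REFERENCE_STORAGE case.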

> + /* Return a tree for a constant integer value in the token TOK.  No
> +    size checking is done.  */
> + 
> + tree 
> + tree_code_get_integer_value (unsigned char* chars, unsigned int length)
> + {
> +   long long int val = 0;

That's not portable ISO C89.  You should put a comment here or at least somewhere
in the treelang source explaining why it is OK to use GCC extensions in this file.
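
If you would rather avoid the extension, something like the following
sketch stays within C89.  It is only an illustration under my own
assumptions: `parse_decimal_c89' is a hypothetical helper, it assumes the
token consists of decimal digits only, and unlike the patch (which
documents that no size checking is done) it rejects values that do not fit
in an unsigned long.

```c
#include <limits.h>

/* Hypothetical helper: convert the LENGTH decimal digits at CHARS to an
   unsigned long stored in *RESULT.  Returns 1 on success and 0 if a
   non-digit is seen or the value would overflow.  */
int
parse_decimal_c89 (const unsigned char *chars, unsigned int length,
                   unsigned long *result)
{
  unsigned long val = 0;
  unsigned int i;

  for (i = 0; i < length; i++)
    {
      unsigned long digit;

      if (chars[i] < '0' || chars[i] > '9')
        return 0;
      digit = (unsigned long) (chars[i] - '0');

      /* Test before multiplying, so that val * 10 + digit cannot wrap.  */
      if (val > (ULONG_MAX - digit) / 10)
        return 0;
      val = val * 10 + digit;
    }

  *result = val;
  return 1;
}
```

Whether the extra range check is wanted in treelang is of course your call;
the point is only that `long long' is not required to do the conversion.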

> + void
> + init_decl_processing ()
> + {
...
> +   /* Set standard type names.  */
> + 
> +   ridpointers = (tree *) xcalloc ((int) RID_MAX, sizeof (tree));
> +   ggc_add_tree_root (ridpointers, RID_MAX);
> +     
> +   ridpointers[ (int) RID_INT] = get_identifier ("int");
> +   ridpointers[ (int) RID_CHAR] = get_identifier ("char");
> +   ridpointers[ (int) RID_VOID] = get_identifier ("void");
> +   ridpointers[ (int) RID_FLOAT] = get_identifier ("float");
> +   ridpointers[ (int) RID_DOUBLE] = get_identifier ("double");
> +   ridpointers[ (int) RID_SHORT] = get_identifier ("short");
> +   ridpointers[ (int) RID_LONG] = get_identifier ("long");
> +   ridpointers[ (int) RID_UNSIGNED] = get_identifier ("unsigned");
> +   ridpointers[ (int) RID_SIGNED] = get_identifier ("signed");
> +   ridpointers[ (int) RID_INLINE] = get_identifier ("inline");
> +   ridpointers[ (int) RID_CONST] = get_identifier ("const");
> +   ridpointers[ (int) RID_RESTRICT] = get_identifier ("restrict");
> +   ridpointers[ (int) RID_VOLATILE] = get_identifier ("volatile");
> +   ridpointers[ (int) RID_BOUNDED] = get_identifier ("__bounded");
> +   ridpointers[ (int) RID_UNBOUNDED] = get_identifier ("__unbounded");
> +   ridpointers[ (int) RID_AUTO] = get_identifier ("auto");
> +   ridpointers[ (int) RID_STATIC] = get_identifier ("static");
> +   ridpointers[ (int) RID_EXTERN] = get_identifier ("extern");
> +   ridpointers[ (int) RID_TYPEDEF] = get_identifier ("typedef");
> +   ridpointers[ (int) RID_REGISTER] = get_identifier ("register");
> +   /*  ridpointers[ (int) RID_ITERATOR] = get_identifier ("iterator"); */
> +   ridpointers[ (int) RID_COMPLEX] = get_identifier ("complex");
> +   ridpointers[ (int) RID_ID] = get_identifier ("id");
> +   ridpointers[ (int) RID_IN] = get_identifier ("in");
> +   ridpointers[ (int) RID_OUT] = get_identifier ("out");
> +   ridpointers[ (int) RID_INOUT] = get_identifier ("inout");
> +   ridpointers[ (int) RID_BYCOPY] = get_identifier ("bycopy");
> +   ridpointers[ (int) RID_BYREF] = get_identifier ("byref");
> +   ridpointers[ (int) RID_ONEWAY] = get_identifier ("oneway");

Is that code needed?  If so, why?

> --- newgcc/gcc/treelang/treetree.h	Mon Jun 11 08:45:27 2001
> + /* Storage modes.  */
> + #define STATIC_STORAGE 0
> + #define AUTOMATIC_STORAGE 1
> + #define EXTERNAL_REFERENCE_STORAGE 2
> + #define EXTERNAL_DEFINITION_STORAGE 3

Why use #define rather than enum?
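
A sketch of what I mean, keeping the same names and values as the patch.
The enum tag `storage_mode' and the helper `storage_mode_name' are
hypothetical names introduced here for illustration only.

```c
/* Storage modes, as an enum rather than a set of #defines.  */
enum storage_mode
{
  STATIC_STORAGE = 0,
  AUTOMATIC_STORAGE = 1,
  EXTERNAL_REFERENCE_STORAGE = 2,
  EXTERNAL_DEFINITION_STORAGE = 3
};

/* Hypothetical helper: return a printable name for MODE.  */
const char *
storage_mode_name (enum storage_mode mode)
{
  switch (mode)
    {
    case STATIC_STORAGE:
      return "static";
    case AUTOMATIC_STORAGE:
      return "automatic";
    case EXTERNAL_REFERENCE_STORAGE:
      return "external reference";
    case EXTERNAL_DEFINITION_STORAGE:
      return "external definition";
    }
  return "unknown";
}
```

With an enum, the debugger can show the symbolic names, and GCC's -Wswitch
can warn when a switch over the type misses one of the enumerators.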

> + #define SIGNED_CHAR 1
> + #define UNSIGNED_CHAR 2
> + #define SIGNED_INT 3 
> + #define UNSIGNED_INT 4
> + #define VOID_TYPE 5
> + 
> + 
> + #define EXP_PLUS 0 /* Addition expression.  */
> + #define EXP_REFERENCE 1 /* Variable reference.  */
> + #define EXP_ASSIGN 2 /* Assignment.  */
> + #define EXP_FUNCTION_INVOCATION 3  /* Call function.  */
> + #define EXP_MINUS 4  /* Subtraction.  */
> + #define EXP_EQUALS 5  /* Equality test.  */

Likewise.

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.

