Using and Porting GNU Fortran

Overview of Translation Process

The phases that translate source code to the form accepted by the GBE are, in order:

Stripping punched-card sources (g77stripcard.c)
Lexing (lex.c)
Stand-alone statement identification (sta.c)
INCLUDE handling (sti.c)
Order-dependent statement identification (stq.c)
Parsing (stb.c and expr.c)
Constructing (stc.c)
Collecting (std.c)
Expanding (ste.c)

To get a rough idea of how a particularly twisted Fortran statement gets treated by the passes, consider:

           FORMAT(I2 4H)=(J/
          &   I3)

The job of lex.c is to know enough about Fortran syntax rules to break the statement up into distinct lexemes without requiring any feedback from subsequent phases:

     `FORMAT'
     `('
     `I24H'
     `)'
     `='
     `('
     `J'
     `/'
     `I3'
     `)'
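
As a rough illustration of that lexeme list, here is a minimal Python sketch of fixed-form lexing. It is not the code in lex.c: the function name `lex_fixed_form` and the regular expression are assumptions made for this example, and the sketch ignores character and Hollerith constants (where blanks are significant), comments, and labels.

```python
import re

# Names, digit strings, or any single non-blank character.
_LEXEME = re.compile(r"[A-Za-z][A-Za-z0-9]*|[0-9]+|\S")

def lex_fixed_form(lines):
    """Split one fixed-form statement (initial line plus continuations) into lexemes."""
    text = ""
    for line in lines:
        # Columns 1-5 hold the label and column 6 the continuation mark;
        # the statement text occupies columns 7-72 (string indices 6..71).
        text += line[6:72]
    # Blanks are not significant in fixed-form source, so drop them all.
    text = text.replace(" ", "")
    return _LEXEME.findall(text)

sample = [
    "      FORMAT(I2 4H)=(J/",
    "     &   I3)",
]
print(lex_fixed_form(sample))
# ['FORMAT', '(', 'I24H', ')', '=', '(', 'J', '/', 'I3', ')']
```

Because blanks are removed before the text is split, `I2 4H` becomes the single name-like lexeme `I24H`, matching the list above.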

The job of sta.c is to figure out the kind of statement, or, at least, statement form, that sequence of lexemes represents.

The sooner it can do this (that is, the fewer lexemes it needs, counting from the first lexeme of each statement), the better, because problems beyond the recognition of the statement form are then left for subsequent phases to diagnose, and those phases can usually better describe the nature of the problem.

In this case, the = at "level zero" (not nested within parentheses) tells sta.c that this is an assignment-form, not FORMAT, statement.
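
The "level zero" test can be sketched in Python as a scan that tracks parenthesis depth. This is a minimal sketch, not the logic of sta.c: the names `has_level_zero_equals` and `statement_form` are hypothetical, and real statement identification has to distinguish many more statement forms than the two shown here.

```python
def has_level_zero_equals(lexemes):
    """Return True if an '=' appears outside all parentheses."""
    depth = 0
    for lexeme in lexemes:
        if lexeme == "(":
            depth += 1
        elif lexeme == ")":
            depth -= 1
        elif lexeme == "=" and depth == 0:
            return True
    return False

def statement_form(lexemes):
    """Classify a lexeme sequence as assignment-form, FORMAT, or unknown."""
    # The level-zero '=' check comes first, so a statement that merely
    # starts with the name FORMAT is still treated as assignment-form.
    if has_level_zero_equals(lexemes):
        return "assignment-form"
    if lexemes and lexemes[0] == "FORMAT":
        return "FORMAT"
    return "unknown"

sample = ["FORMAT", "(", "I24H", ")", "=", "(", "J", "/", "I3", ")"]
print(statement_form(sample))  # assignment-form
```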

An assignment-form statement might be a statement-function definition or an executable assignment statement.

To make that determination, sta.c looks at the first two lexemes.

Since the second lexeme is (, the first must represent an array for this to be an assignment statement; otherwise, it is a statement-function definition.
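
A minimal Python sketch of that two-lexeme check follows. The function `classify_assignment_form` and its `array_names` parameter (a set standing in for whatever symbol information is available) are assumptions made for illustration, not names from sta.c.

```python
def classify_assignment_form(lexemes, array_names):
    """Guess what an assignment-form statement is from its first two lexemes."""
    first, second = lexemes[0], lexemes[1]
    # NAME( ... ) = ... is an assignment only if NAME is a known array.
    if second == "(" and first not in array_names:
        return "statement-function definition"
    return "assignment statement"

sample = ["FORMAT", "(", "I24H", ")", "=", "(", "J", "/", "I3", ")"]
print(classify_assignment_form(sample, array_names=set()))
# statement-function definition
print(classify_assignment_form(sample, array_names={"FORMAT"}))
# assignment statement
```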

Either way, sta.c hands off the statement to stq.c (via sti.c, which expands INCLUDE files). stq.c figures out what a statement that is ambiguous on its own must actually be, based on the context established by previous statements.

So, stq.c watches the statement stream for executable statements, END statements, and so on, so it knows whether A(B)=C is (intended as) a statement-function definition or an assignment statement.
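
The idea of watching the statement stream can be sketched as a small piece of state in Python. The class `OrderContext` and its methods are hypothetical, and the rule it applies (a statement-function definition cannot follow an executable statement in the same program unit) is taken from standard Fortran statement ordering rather than from stq.c itself.

```python
class OrderContext:
    """Tracks just enough of the statement stream to resolve A(B)=C."""

    def __init__(self):
        self.seen_executable = False

    def note(self, kind):
        """Record a statement whose kind is already known."""
        if kind == "END":
            # A new program unit starts with a clean slate.
            self.seen_executable = False
        elif kind == "executable":
            self.seen_executable = True

    def resolve(self, candidate):
        """Turn a context-free guess into a context-aware decision."""
        if candidate == "statement-function definition" and self.seen_executable:
            candidate = "assignment statement"
        if candidate == "assignment statement":
            # An assignment is itself an executable statement.
            self.seen_executable = True
        return candidate

ctx = OrderContext()
print(ctx.resolve("statement-function definition"))  # statement-function definition
ctx.note("executable")
print(ctx.resolve("statement-function definition"))  # assignment statement
```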

After establishing the context-aware statement info, stq.c passes the original sample statement on to stb.c (either its statement-function parser or its assignment-statement parser).

stb.c forms a statement-specific record containing the pertinent information. That information includes a source expression and, for an assignment statement, a destination expression. Expressions are parsed by expr.c.
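
To make the shape of such a record concrete, here is a minimal Python sketch for the assignment case. `AssignmentRecord` and `build_assignment_record` are hypothetical names, and the expressions are kept as raw lexeme lists; the real parsing done by expr.c is not modeled.

```python
from dataclasses import dataclass

@dataclass
class AssignmentRecord:
    destination: list  # lexemes to the left of the level-zero '='
    source: list       # lexemes to the right of it

def build_assignment_record(lexemes):
    """Split an assignment-form statement at its level-zero '='."""
    depth = 0
    for i, lexeme in enumerate(lexemes):
        if lexeme == "(":
            depth += 1
        elif lexeme == ")":
            depth -= 1
        elif lexeme == "=" and depth == 0:
            return AssignmentRecord(lexemes[:i], lexemes[i + 1:])
    raise ValueError("no level-zero '=' found")

sample = ["FORMAT", "(", "I24H", ")", "=", "(", "J", "/", "I3", ")"]
record = build_assignment_record(sample)
print(record.destination)  # ['FORMAT', '(', 'I24H', ')']
print(record.source)       # ['(', 'J', '/', 'I3', ')']
```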

This record is passed to stc.c, which copes with the implications of the statement within the context established by previous statements.

For example, if it's the first statement in the file or after an END statement, stc.c recognizes that, first of all, a main program unit is now being lexed (and tells that to std.c before telling it about the current statement).

stc.c attaches whatever information it can, usually derived from the context established by the preceding statements, and passes the information to std.c.
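
The implicit start of a main program unit can be sketched in Python as follows. The class `Constructor`, the `UNIT_HEADERS` set, and the event strings are assumptions made for this example; a plain list stands in for the calls that stc.c makes into std.c.

```python
# Statement kinds that explicitly open a program unit (illustrative set).
UNIT_HEADERS = {"PROGRAM", "SUBROUTINE", "FUNCTION", "BLOCK DATA"}

class Constructor:
    def __init__(self, events):
        self.events = events   # stand-in for notifications sent to std.c
        self.in_unit = False   # False at start of file and after END

    def statement(self, kind):
        if not self.in_unit:
            if kind not in UNIT_HEADERS:
                # No explicit header: announce a main program first.
                self.events.append("begin main program")
            self.in_unit = True
        self.events.append(kind)
        if kind == "END":
            self.in_unit = False

events = []
stc = Constructor(events)
for kind in ["assignment", "END", "SUBROUTINE", "END"]:
    stc.statement(kind)
print(events)
# ['begin main program', 'assignment', 'END', 'SUBROUTINE', 'END']
```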

std.c saves this information away, since the GBE cannot cope with information that might be incomplete at this stage.

For example, I3 might later be determined to be an argument to an alternate ENTRY point.

When std.c is told about the end of an external (top-level) program unit, it passes all the information it has saved away on statements in that program unit to ste.c.

ste.c "expands" each statement, in sequence, by constructing the appropriate GBE information and calling the appropriate GBE routines.
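
The hand-off from collecting to expanding amounts to buffering statements until the program unit ends. A minimal Python sketch, assuming a hypothetical `Collector` class and an `expand` callback standing in for ste.c:

```python
class Collector:
    """Buffers statement records until the program unit is complete."""

    def __init__(self, expand):
        self.expand = expand   # callback standing in for ste.c
        self.pending = []

    def statement(self, record):
        # Information may still be incomplete, so nothing is expanded yet.
        self.pending.append(record)

    def end_program_unit(self):
        # Everything about the unit is now known; expand in source order.
        for record in self.pending:
            self.expand(record)
        self.pending = []

expanded = []
std = Collector(expanded.append)
std.statement("I3 = 1")
std.statement("END")
print(expanded)            # []
std.end_program_unit()
print(expanded)            # ['I3 = 1', 'END']
```

Deferring the calls this way means a later discovery, such as I3 turning out to be an argument to an alternate ENTRY point, can still influence how earlier statements are expanded.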

Details on the transformational phases follow. Keep in mind that Fortran conventions are used for numbering: the first character on a line is column 1, numbers are decimal, and so on.