Using and Porting GNU Fortran

Philosophy of Code Generation

Don't poke the bear.

The g77 front end generates code via the gcc back end.

The gcc back end (GBE) is a large, complex labyrinth of intricate code written in a combination of the C language and specialized languages internal to gcc.

While the code that implements the GBE is written in a combination of languages, the GBE itself is, to the front end for a language like Fortran, best viewed as a compiler that compiles its own, unique, language.

The GBE's "source", then, is written in this language, which consists primarily of a combination of calls to GBE functions and tree nodes (which are, themselves, created by calling GBE functions).

So, g77 generates code by, in effect, translating the Fortran code it reads into a form "written" in the "language" of the gcc back end.

This language will henceforth be referred to as GBEL, for GNU Back End Language.
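As a loose illustration of what it means for a "program" to consist of function calls that create tree nodes, here is a toy sketch in C. Every name in it (toy_node, toy_build_const, toy_build_binary, toy_eval, toy_free) is hypothetical and invented for this example; none of them is a GBE function, and real back-end trees carry far more information than this.

```c
#include <stdlib.h>

/* Hypothetical node kinds; real back-end trees are far richer. */
enum toy_code { TOY_CONST, TOY_PLUS, TOY_MULT };

struct toy_node {
    enum toy_code code;
    double value;                /* meaningful for TOY_CONST only */
    struct toy_node *lhs, *rhs;  /* meaningful for binary nodes only */
};

/* Allocate one node; abort on exhaustion to keep the sketch short. */
static struct toy_node *toy_new(enum toy_code code, double value,
                                struct toy_node *lhs, struct toy_node *rhs)
{
    struct toy_node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->code = code;
    n->value = value;
    n->lhs = lhs;
    n->rhs = rhs;
    return n;
}

/* "Back end" entry points that a front end would call. */
struct toy_node *toy_build_const(double value)
{
    return toy_new(TOY_CONST, value, NULL, NULL);
}

struct toy_node *toy_build_binary(enum toy_code code,
                                  struct toy_node *lhs, struct toy_node *rhs)
{
    return toy_new(code, 0.0, lhs, rhs);
}

/* Stand-in for whatever the back end does with a finished tree. */
double toy_eval(const struct toy_node *n)
{
    switch (n->code) {
    case TOY_CONST: return n->value;
    case TOY_PLUS:  return toy_eval(n->lhs) + toy_eval(n->rhs);
    case TOY_MULT:  return toy_eval(n->lhs) * toy_eval(n->rhs);
    }
    abort();  /* unreachable for well-formed trees */
}

/* Release a tree built by the functions above. */
void toy_free(struct toy_node *n)
{
    if (n == NULL)
        return;
    toy_free(n->lhs);
    toy_free(n->rhs);
    free(n);
}
```

In this toy, a front end translating the Fortran expression 2.0 + 3.0 * 4.0 would call toy_build_const three times and toy_build_binary twice, then hand the resulting tree to toy_eval. The sequence of calls, together with the nodes they return, is the "source" that the toy back end consumes.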

GBEL is an evolving language, not fully specified in any published form as of this writing. It offers many facilities, but its "core" facilities are those that correspond most directly to those needed to support gcc (compiling code written in GNU C).

The g77 Fortran Front End (FFE) is designed and implemented to navigate the currents and eddies of ongoing GBEL and gcc development while also delivering on the potential of an integrated FFE (as compared to using a converter like f2c and feeding the output into gcc).

Goals of the FFE's code-generation strategy include:

High likelihood of generating correct code or, failing that, of producing a fatal diagnostic or crashing.
Generation of highly optimized code, as directed by the user via GBE-specific (versus g77-specific) constructs, such as command-line options.
Fast overall (FFE plus GBE) compilation.
Preservation of source-level debugging information.

The strategies historically, and currently, used by the FFE to achieve these goals include:

Use of GBEL constructs that most faithfully encapsulate the semantics of Fortran.
Avoidance of GBEL constructs that are so rarely used, or limited to use in specialized situations not related to Fortran, that their reliability and performance have not yet been established as sufficient for use by the FFE.
Flexible design, to readily accommodate changes to specific code-generation strategies, perhaps governed by command-line options.

"Don't poke the bear" somewhat summarizes the above strategies. The GBE is the bear. The FFE is designed and implemented to avoid poking it in ways that are likely to just annoy it. The FFE usually either tackles it head-on, or avoids treating it in ways dissimilar to how the gcc front end treats it.

For example, the FFE uses the native array facility in the back end instead of the lower-level pointer-arithmetic facility used by gcc when compiling f2c output. Theoretically, this presents more opportunities for optimization, faster compile times, and the production of more faithful debugging information. These benefits were not, however, immediately realized, mainly because gcc itself makes little or no use of the native array facility.
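The difference between the two styles can be sketched at the C level. This is only an analogy, not GBEL and not actual f2c output: the function names and the extents ROWS and COLS are hypothetical, chosen to stand in for a Fortran declaration such as A(3,4).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extents, standing in for a Fortran declaration A(3,4). */
enum { ROWS = 3, COLS = 4 };

/* Pointer-arithmetic style: the array is just a base address, and each
   1-based, column-major subscript pair is lowered by hand to an offset. */
double sum_flat(const double *a, int rows, int cols)
{
    double sum = 0.0;
    assert(rows > 0 && cols > 0);
    for (int j = 1; j <= cols; j++)
        for (int i = 1; i <= rows; i++)
            sum += a[(size_t)(j - 1) * (size_t)rows + (size_t)(i - 1)];  /* A(I,J) */
    return sum;
}

/* Native-array style: the declared type carries the extents, so the
   compiler sees genuine two-dimensional subscripts.  The C subscripts
   are reversed because Fortran varies the first subscript fastest. */
double sum_native(double a[COLS][ROWS])
{
    double sum = 0.0;
    for (int j = 1; j <= COLS; j++)
        for (int i = 1; i <= ROWS; i++)
            sum += a[j - 1][i - 1];  /* A(I,J) */
    return sum;
}
```

Both functions compute the same result. In the second, the array's shape is part of the type rather than being buried in index arithmetic, which is the kind of information an optimizer or debugger could, in principle, exploit.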

Complex arithmetic is a case study of the evolution of this strategy. When complex arithmetic was originally implemented in the FFE, GBEL had just evolved its own native complex-arithmetic facility, so the FFE took advantage of that.

When porting g77 to 64-bit systems, it was discovered that the GBE didn't really implement its native complex-arithmetic facility properly.

The short-term solution was to rewrite the FFE to use instead the lower-level facilities that gcc-compiled code would use (assuming that code did not itself use the native complex type that gcc provides as an extension). These facilities were known to work and, if shown not to work, would likely be fixed rapidly, since they would likely also fail for vanilla C code in similar circumstances.

However, the rewrite accommodated the original, native approach as well by offering a command-line option to select it over the emulated approach. This allowed users, and especially GBE maintainers, to try out fixes to complex-arithmetic support in the GBE while g77 continued to default to compiling more code correctly, albeit producing (typically) slower executables.
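The two approaches can be contrasted with a small C sketch. It is an analogy only: struct cpair, mul_emulated, and mul_native are hypothetical names, not FFE code, and the C99 type double complex is assumed here as a stand-in for the native complex type that the text describes as a gcc extension.

```c
#include <complex.h>

/* Emulated representation: a plain pair of reals. */
struct cpair {
    double re;
    double im;
};

/* Emulated approach: complex multiplication spelled out with scalar
   arithmetic only, the kind of code ordinary C programs exercise. */
struct cpair mul_emulated(struct cpair x, struct cpair y)
{
    struct cpair r;
    r.re = x.re * y.re - x.im * y.im;
    r.im = x.re * y.im + x.im * y.re;
    return r;
}

/* Native approach: hand the multiplication to the compiler's own
   complex type and let it choose the instruction sequence. */
struct cpair mul_native(struct cpair x, struct cpair y)
{
    double complex a = x.re + x.im * I;
    double complex b = y.re + y.im * I;
    double complex p = a * b;
    struct cpair r = { creal(p), cimag(p) };
    return r;
}
```

For finite operands such as (1,2) and (3,4), both functions yield (-5,10). The emulated version relies only on facilities that vanilla C code exercises constantly, while the native version depends on the compiler's complex support being correct on the target.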

As of April 1999, it appeared that the last few bugs in the GBE's support of its native complex-arithmetic facility had been worked out. The FFE was changed back to default to using that native facility, leaving emulation as an option.

Later during the release cycle (which was called EGCS 1.2, but soon became GCC 2.95), bugs in the native facility were found. Reactions among various people included "the last thing we should do is change the default back", "we must change the default back", and "let's figure out whether we can narrow down the bugs to few enough cases to allow the now-months-long-tested default to remain the same". The last of these viewpoints won that particular time. The bugs exposed other concerns regarding ABI compliance when the ABI specified treatment of complex data as different from treatment of what Fortran and GNU C consider the equivalent aggregation (structure) of real (or float) pairs.

Other Fortran constructs--arrays, character strings, complex division, COMMON and EQUIVALENCE aggregates, and so on--involve issues similar to those pertaining to complex arithmetic.

So, it is possible that the history of how the FFE handled complex arithmetic will be repeated, probably in modified form (and hopefully over shorter timeframes), for some of these other facilities.