Warnings in the C++ Front-End and GCC in General

Craig Burley burley@gnu.org
Fri Sep 11 09:53:00 GMT 1998


>I have a 2nd proposal for Issue 2.  Instead of some sort of warning
>control pragma, how about some general mechanism for passing command
>line options from within the code?

I've long been thinking about how to design this into the future
GNU Fortran dialect I'd like g77 to support.

As with the progressively-more-intricate requirements for code-based
enabling/disabling of warnings, the big problems start with getting a
handle on the requirements phase, and continue well into the design,
implementation, debugging, testing, and maintenance phases.

For example, all pertinent options must be classified into one or
more of the following classes (almost certainly a partial list):

  -  Affects preprocessing token-by-token

  -  Affects preprocessing on a macro-substitution basis

  -  Affects preprocessing on a #include-is-a-token basis

  -  Affects preprocessing on a dive-into-#include basis

  -  Affects lexing token-by-token

  -  Affects lexing on a macro-substitution basis

  -  Affects semantic analysis on a linear basis

  -  Affects semantic analysis on a binding-contour basis

  -  Affects semantic analysis on a namespace basis

  -  Affects semantic analysis on a name-by-name basis

  -  Affects semantic analysis on an end-of-file basis

  -  Affects compiler behavior on a linear basis

  -  Affects compiler behavior on a binding-contour basis

  -  Affects compiler behavior on a namespace basis

  -  Affects compiler behavior on a name-by-name basis

  -  Affects compiler behavior on an end-of-file basis

My guess is that only a few of the above could be proven to be
equivalent for the *current* options, and probably none of them
could be proven equivalent for all of them.

For example, -Wuninitialized pertains to semantic analysis on
a name-by-name basis, the defaults being set (inherited) from
the semantic analyses for binding-contour and then linear bases,
I would think.  It also pertains to compiler behavior on a
binding-contour basis, etc.  Just how to classify it is important,
because that determines just where the source text specifying the
option (or its -Wno- equivalent) must appear in order to control
whether the option applies to a particular variable, all the variables defined by
a binding contour (say, a function definition), all the variables
used by a binding contour (which might include those inherited
from an outer scope, class, template, whatever), and so on.

Another example: g77 has a code-generation option, -fno-f2c, which
turns off trying to generate f2c-compatible code.  Generally, this
means all the code for a program must be compiled using it, but
a clever ("macho") programmer -- precisely the audience for any
facility allowing specification of compiler options in source code --
could use it judiciously to control specific instances of code
generation, and, given any facility to put options in source code,
would expect to be able to do this.

However, just where to require specification of the option is
problematic.  It mostly affects how procedures are called, but,
consider:

	SUBROUTINE X
	EXTERNAL B
	REAL B
	A = B()
	PRINT *, A, B()
	END

For clarity, I've omitted any actual stuff that'd make -fno-f2c
relevant -- pretend there are CHARACTER*(*) arguments everywhere
if you must -- because the questions I have to ask are:

  -  Precisely where must -fno-f2c be in effect, in the source,
     to ensure that X is generated using non-f2c (g77) conventions?

  -  Same question for both calls to B

  -  Same question for either call to B

  -  Are defaults inherited from outer scopes, and at what "slice" --
     does the default for a call to B inherit from B's declaration(s)
     (EXTERNAL and/or REAL), and how does this interact with explicit
     IMPLICIT and with implicit IMPLICIT; and/or does it inherit from
     how X will be code-generated (which means, if X's state can
     be affected by specifying -fno-f2c only just before its END
     statement, that can affect B's code generation, which might be
     in conflict with code to diagnose unsupported or erroneous
     calls to B, since that'd likely test the -fno-f2c status while
     parsing); and/or does it inherit from the "linear text" state
     just before and/or just after the lexemes denoting the binding
     contour?
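
To make the ambiguity in the questions above concrete, here is a toy
model, not anything g77 implements.  It assumes a purely hypothetical
in-source directive spelled "!OPT -fno-f2c" / "!OPT -ff2c" that toggles
the option linearly, and then shows that two equally plausible rules
("the state at the SUBROUTINE statement" versus "the state at END")
give different answers for X, and that the two calls to B can see
different states as well.  The directive spelling, the function names,
and the policy names are all invented for illustration.

```python
# Toy model only: "!OPT ..." is a hypothetical directive, not g77 syntax.
SOURCE = [
    "SUBROUTINE X",
    "EXTERNAL B",
    "REAL B",
    "A = B()",
    "!OPT -fno-f2c",
    "PRINT *, A, B()",
    "END",
]


def linear_states(lines, default=False):
    """For each line, report whether -fno-f2c is in effect once that
    line has been read, treating directives as a linear toggle."""
    states = []
    no_f2c = default
    for line in lines:
        text = line.strip()
        if text == "!OPT -fno-f2c":
            no_f2c = True
        elif text == "!OPT -ff2c":
            no_f2c = False
        states.append(no_f2c)
    return states


def convention_for_x(lines, policy):
    """Decide whether X itself is compiled with -fno-f2c, under one of
    two invented policies: 'at-header' or 'at-end'."""
    states = linear_states(lines)
    for i, line in enumerate(lines):
        words = line.split()
        if policy == "at-header" and words[:1] == ["SUBROUTINE"]:
            return states[i]
        if policy == "at-end" and words == ["END"]:
            return states[i]
    raise ValueError("no matching statement for policy " + policy)


def call_site_states(lines, callee="B"):
    """State seen at each line that textually contains a call to callee."""
    states = linear_states(lines)
    return [states[i] for i, line in enumerate(lines)
            if callee + "(" in line and not line.startswith("!OPT")]
```

With the sample source, convention_for_x(SOURCE, "at-header") and
convention_for_x(SOURCE, "at-end") disagree, and call_site_states(SOURCE)
reports a different state for each call to B.  Any real design would
have to pick one rule per option and document it.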

Another example: -traditional for macros (or whatever), which could
affect how stringizing works (right?).  But can one control how
a specific *macro* does its stringizing, and, if so, how?  By enabling
and disabling -traditional immediately surrounding the macro
definition?  How would that interact with a macro expansion that
has the opposite status, and/or that requires other expansions
apparently to agree/disagree on the state?

All of these examples can be "answered" at the requirements level;
the requirements can be designed, implemented, and maintained.  (But
remember all these emails talking about system header files that
can't be changed, requirements for fine-grained control, etc.)

Having thought about these sorts of issues for about 20 years, however,
it is my offhand guess that doing so would take about as many
man-years, and as much contentiousness, as all the egcs/gcc development
that has gone on to date.  The effort to document the details *alone*
would be vast, and any failure to precisely document how each and
every option interacts with all the (conceptual, if not actual)
compiler phases would presumably lead to valid bug reports (never mind
all the invalid bug reports that'd flood the lists).

Not that there's anything *wrong* with that.  :)

>lots of compilers do this sort of thing.

I really doubt *any* compiler does what I'm talking about, but
that's just the tip of the iceberg of what would be *required*
to incorporate most of the initial statements we've seen as
to what's required of any facility to enable/disable individual
warnings.

As far as supporting some initial text in the source file to
specify options: I think that's a great idea, lots of compilers
apparently do it, and it's also pretty easy to work around
not having it by using a combination of head, sed, and so on.  E.g.
if the first comment block, within the first 3 lines, contains
a marker string like "*gcc Compiler Options: ...", they can
be easily lifted out before most any pertinent processing happens
(though, #include and other interactions need to be thought through,
since they can be affected by options like the -I ones; and, as
comment-string stuff, macro substitution won't be able to yield
options, which would likely bug some people).
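
As a rough illustration of the "lift the options out of the first few
lines" idea, here is a minimal sketch in Python rather than head and
sed.  It assumes the marker string quoted above and a limit of three
lines; the function name and the handling of a trailing "*/" are my own
assumptions, and it does not check that the marker really sits inside a
comment, nor does it address the #include and -I interactions mentioned
above.

```python
MARKER = "*gcc Compiler Options:"


def extract_options(source_text, max_lines=3):
    """Return the options that follow MARKER on one of the first
    max_lines lines of source_text, or an empty list if none."""
    for line in source_text.splitlines()[:max_lines]:
        pos = line.find(MARKER)
        if pos != -1:
            rest = line[pos + len(MARKER):]
            # Drop a C-style comment terminator if one ends the line.
            rest = rest.replace("*/", " ")
            return rest.split()
    return []
```

A wrapper script could then prepend the returned list to the real
compiler command line before invoking it.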

>This also allows for doing things like turning off optimization or
>tuning optimization parameters for a particular section of code for
>which maybe gcc is broken, thus allowing people to get on with their
>life instead of waiting for bug fixes.

Note that this is impossible with the just-some-initial-text
approach used by some compilers to support specification of
arbitrary options for a compilation.  My guess is that any
compiler offering the above-mentioned features does so only
for a very specific, thoughtfully engineered *subset* of compiler
options, and, even then, might still not meet the precise needs
of the user base (e.g. controlling optimization for nested
functions, a la Fortran's statement functions).

>Of course, I'm sure there are lots of command line options which
>couldn't be easily localized for sections of code, so these would have
>to be disabled from embedded setting.

I'm not quite sure what this means, but I think it suggests you
already have at least some understanding of the issues I'm raising here.

As with the individual enabling/disabling of warnings issue, the
problem is not that we can't decide exactly what "we" will and won't
support, document that, and ship it.

The problem is that the unique (open-source) nature of egcs practically
ensures that we either magically get this 100% "right" for all
front ends from the beginning -- which'd take bazillions of man-years,
IMO -- or, having shipped a less-than-perfect effort, try to figure
out how to cope with the inevitable long-term deluge of reports that
"just one tiny fix" will make a particular option work just a little
bit better, and with the resulting increase in complexity of the
entire compiler code base.

And, also as with the warnings issue, there shouldn't be a problem
providing *specific* options for enabling/disabling, as long as
issues, such as those I've highlighted above, are carefully thought
out ahead of time, to avoid having to re-design the facility after
it's "done".

For example, I can't imagine how it'd be worthwhile to provide
a source-based facility to enable/disable the g77 `-ff77' option,
because it's kind of a "macro" option (affects a variety of
things in different compiler phases), whereas the `-fbackslash'
option might get some special source treatment, using some
construction more portable (to other mythical compilers at least)
than using the spelling `-fbackslash', such as 'FOO'C to specify
a C string (meaning C-style backslash), 'FOO'B to specify a
Fortran-with-C-backslash string, 'FOO'F to specify a vanilla
Fortran string, 'FOO' to specify whatever is the default (which is
currently supported), and perhaps a means to change the default
within the source on a line-by-line basis, controlling subsequent
lexemes only.
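
A lexer-level sketch of the suffix idea, under stated assumptions: the
suffix letters C, B, and F come from the paragraph above, but the kind
names, the function name, and the decision to leave the string body
unprocessed are invented here.  It only classifies a literal; what each
kind actually does with backslashes is left open, as it is above.

```python
import re

# A quoted Fortran-style literal ('' is an embedded quote), followed by
# an optional one-letter suffix selecting the string kind.
LITERAL = re.compile(r"'((?:[^']|'')*)'([CBF]?)")

KINDS = {
    "C": "c-string",
    "B": "fortran-with-c-backslash",
    "F": "plain-fortran",
    "": "default",
}


def classify(literal):
    """Return (raw_body, kind) for a literal such as 'FOO'C, or None
    if the text is not a literal of this form."""
    match = LITERAL.fullmatch(literal)
    if match is None:
        return None
    return match.group(1), KINDS[match.group(2)]
```

The point of the sketch is that the kind travels with each lexeme, so
no compiler-wide option state needs to be consulted to interpret it.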

So, I'd suggest thinking less about facilities to control *compiler*
options from source code and more about *linguistic* facilities to
accomplish a *subset* of the effects.

To repeat a few of the warnings issues I raised earlier, doing
the latter is harder, but leads to source code that has more
useful linguistic information, if properly done, and probably leads
to more widely portable, and long-term-maintainable, source code,
both for the end user, and for egcs developers.

The last warning I have about this sort of thing: there is a *strong*
temptation to design new features like this using the assumption that
the way the compiler parses source code will never change, and thus
end up requiring the compiler to never take advantage of new, better
ways to parse code (or make it harder to write analysis/authoring
tools that deal with code) while still supporting the features you
designed.  E.g. I could easily make that mistake with g77, if I didn't
carefully think out how each new feature would interact with
various models for parsing (and reading) code, and I believe I've
at least helped some vendors, and perhaps the Fortran standards
committee, avoid similar mistakes (though they'd already made others
before I, or anyone else, noticed them, of course).

        tq vm, (burley)


