This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: named warnings & individual warning control


> > Some folks argue that the warning text should be in the source, others
> > note the benefits of having a machine-parsable warning catalog
> > (documentation, for example).  Both sides agree that gettext() is the
> > i18n solution of choice.
> 
> I don't see why these are supposed to be mutually exclusive, given that
> gettext illustrates how catalogs can be generated from the source.

The arguments against text-in-source are:

Given you already have to pass some token to identify the message so
that the enable/disable logic can work, also passing the text itself
is redundant.

Passing the text itself precludes alternate texts, such as terse vs
verbose messages.  Consider a setup where the first time a message is
emitted, a verbose message is given, but the next time a terse message
is emitted instead.
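
A minimal sketch of the terse-vs-verbose idea, assuming a catalog entry
that carries both texts and a counter.  The names msg_entry and
msg_next_text are hypothetical and are not existing GCC code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical catalog entry: two texts per message plus an emit counter. */
struct msg_entry {
    const char *verbose_text;   /* shown the first time the message fires */
    const char *terse_text;     /* shown on every later occurrence */
    unsigned    times_emitted;
};

/* Pick the text for this occurrence and record that it was emitted. */
static const char *msg_next_text(struct msg_entry *e)
{
    const char *text;

    assert(e != NULL);
    text = (e->times_emitted == 0) ? e->verbose_text : e->terse_text;
    e->times_emitted++;
    return text;
}
```

The caller passes only the entry; which text comes out is decided by
the catalog side, which is the point of not passing the text itself.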

By "machine parsable catalog" I meant a separate file that contains
not only the message texts, but the matching mnemonics (and thus
command line options and pragmas), optional tags/flags for each
message, and the logic determining its default state and response to
various command line options (like -std=).

Yes, you can suck out the gettext-able strings.  But consider a future
project to programmatically cross-check each message with the
conditions under which it is enabled, or dynamically link to online
documentation.  They require more than just a list of strings.
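
To make "more than just a list of strings" concrete, here is a rough
sketch of what one catalog record might hold once it has been read
into the compiler.  Every name here (catalog_entry, catalog_find, the
mnemonics and texts) is an assumption made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum msg_state { MSG_STATE_OFF, MSG_STATE_WARNING, MSG_STATE_ERROR };

#define MSG_TAG_PEDANTIC 0x1u   /* hypothetical tag bit */

/* One record per message: mnemonic, text, tags, and default state. */
struct catalog_entry {
    const char    *mnemonic;       /* would also name the -W option/pragma */
    const char    *text;           /* the gettext-able message text */
    unsigned       tags;           /* optional tags/flags */
    enum msg_state default_state;  /* state when no option says otherwise */
};

/* Made-up entries, standing in for a generated table. */
static const struct catalog_entry catalog[] = {
    { "example-a", "first example message",  0u,               MSG_STATE_WARNING },
    { "example-b", "second example message", MSG_TAG_PEDANTIC, MSG_STATE_OFF },
};

/* Look a message up by mnemonic; returns NULL if it is not in the catalog. */
static const struct catalog_entry *catalog_find(const char *mnemonic)
{
    size_t i;

    assert(mnemonic != NULL);
    for (i = 0; i < sizeof catalog / sizeof catalog[0]; i++)
        if (strcmp(catalog[i].mnemonic, mnemonic) == 0)
            return &catalog[i];
    return NULL;
}
```

A report or cross-check tool could walk the same table without ever
looking at the call sites.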

The arguments for text-in-source are mostly as you describe.  Note,
however, that i18n'd messages are already out-of-source and indirectly
incur some of the problems you describe.

> Indeed, I've written testcases to ensure that warnings are or are not
> pedwarns in appropriate modes,

My current thought is to encode the logic mapping command line options
such as -pedantic and -std= inside the message catalog itself, so the
programmer issuing the message just says "message goes here" and the
catalog decides if it's a warning, error, pedwarn, or ignored.  That
also means that the folks responsible for standards compliance only
need review the message catalog; and perhaps a genmessages program
could provide standards compliance reports.
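
As a sketch of "the catalog decides", assume each entry carries a small
rule and the driver hands over the parsed options.  The types msg_rule
and options and the function resolve_kind are hypothetical; the rule
shown models only a pedwarn that applies to C90.

```c
#include <assert.h>
#include <stddef.h>

enum msg_kind { KIND_IGNORED, KIND_WARNING, KIND_ERROR };
enum c_std    { STD_C90, STD_C99 };

/* Hypothetical snapshot of the relevant command line options. */
struct options {
    enum c_std std;
    int        pedantic;         /* pedantic diagnostics requested */
    int        pedantic_errors;  /* pedantic diagnostics become errors */
};

/* Hypothetical per-message rule stored in the catalog. */
struct msg_rule {
    int c90_only;        /* construct is fine in C99 but not in C90 */
    int needs_pedantic;  /* only diagnosed in pedantic mode */
};

/* The call site never sees this logic; it only names the message. */
static enum msg_kind resolve_kind(const struct msg_rule *r,
                                  const struct options *o)
{
    int pedantic;

    assert(r != NULL && o != NULL);
    pedantic = o->pedantic || o->pedantic_errors;
    if (r->c90_only && o->std != STD_C90)
        return KIND_IGNORED;
    if (r->needs_pedantic && !pedantic)
        return KIND_IGNORED;
    return o->pedantic_errors ? KIND_ERROR : KIND_WARNING;
}
```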

> but a warning control system may be able to do things better so (a)
> it is still visible at the point the warning is emitted that it is
> for something in C99 but not C90

That was an argument for keeping such logic in the source, but the
argument against is that (1) such policies already don't work well,
and (2) they're not machine parsable for statistical reports and
audits.

> and (b) you get a -Wc90 option out of this for free.

Agreed :-)

Bonus: -std=c90 enables some things as errors and some as warnings,
but a -Wc90 would enable them all as just warnings (and -Ec90 would
enable them all as errors).
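
One way to picture the difference, as a sketch only: the standard mode
gives each message its own kind, while a group option overrides that
kind uniformly.  The enum and function names are hypothetical, and how
-std=c90 and -Wc90 would combine is left open here.

```c
#include <assert.h>

enum msg_kind       { KIND_IGNORED, KIND_WARNING, KIND_ERROR };
enum group_override { GROUP_DEFAULT, GROUP_AS_WARNINGS, GROUP_AS_ERRORS };

/* std_kind is what -std=c90 alone would give this message; the
   override models a -Wc90 or -Ec90 style option for the whole group. */
static enum msg_kind apply_group_override(enum msg_kind std_kind,
                                          enum group_override ov)
{
    switch (ov) {
    case GROUP_AS_WARNINGS:
        return KIND_WARNING;   /* everything in the group is a warning */
    case GROUP_AS_ERRORS:
        return KIND_ERROR;     /* everything in the group is an error */
    default:
        return std_kind;       /* keep the per-message decision */
    }
}
```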

> Maybe specify in the source both the default state for the message
> (nothing, warning, error, with macros making cases such as pedwarns
> and pedwarn for C90 / nothing for C99 more convenient) and its
> identifier for individual control?

I, personally, would argue against spreading the deciding logic
around.  Use a single "give message" function, and let a central
database decide what to do with it.

> Naturally every warning should (ideally) have tests that its default
> nature is right in all standard modes, and its own control mnemonic does
> indeed control it.  (The latter, testing all mnemonics, helps ensure the
> stability Mark asked for.)

This is an argument for machine-parsable state logic (a genmessages
program, for example) rather than ad-hoc in-source logic.

> -Wwrite-strings for C changes the type of string constants rather than
> enabling specific warnings.

The message_p(MSG_writable_strings) API would be suitable.  If the
message is enabled, the compiler could do additional things to ensure
it's able to detect the case where the message is appropriate.
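
A sketch of how a front end might use such a query.  Only the names
message_p and MSG_writable_strings come from the proposal; the storage,
the setter, and string_literal_is_const are assumptions for the sake of
the example.

```c
#include <assert.h>

enum message_id { MSG_writable_strings, MSG_COUNT };

/* Hypothetical storage: one enabled flag per message, all off at start. */
static int message_enabled[MSG_COUNT];

/* Hypothetical setter, called while processing command line options. */
static void message_set_enabled(enum message_id id, int on)
{
    assert((int) id >= 0 && (int) id < MSG_COUNT);
    message_enabled[id] = (on != 0);
}

/* Ask whether a message is currently enabled. */
static int message_p(enum message_id id)
{
    assert((int) id >= 0 && (int) id < MSG_COUNT);
    return message_enabled[id];
}

/* Stand-in for the front end's choice of string constant type: only do
   the extra work (const-qualified strings) if the message can fire. */
static int string_literal_is_const(void)
{
    return message_p(MSG_writable_strings);
}
```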

> The devil is in the detail.

Yes!  At this point, though, I'd be happy to just get enough of the
general concepts agreed on so we can start worrying about the details.

I prefer to discuss details by presenting code, as it's an ideal
language for documenting such details.

> * Errors (not generally disablable, we don't try to generate
> sensible code in their presence).

Split these into errors that gcc can recover from, and those it can't.
Recoverable errors should be controllable.  IMHO it's OK to have no
control over unrecoverable errors.

> * Mandatory warnings.

I assume "mandatory" means "if the current -std= specified it".

> * Mandatory pedwarns (these are warnings by default for C but errors for
> C++).

Hmmm... language specific defaults.  I suppose we can infer some
imaginary command line option "-Wlang-c++" to prime the defaults.
That would hook it into logic we already need.

> * Warnings controlled by some option.
> 
> * Pedwarns if pedantic.
> 
> * Pedwarns if pedantic C90 (with a -Wc90 option, these should be pedwarns 
> if pedantic C90, otherwise warnings if -Wc90).
> 
> * Mandatory pedwarns in C99 mode that are off by default, but enabled as 
> warnings rather than pedwarns, for certain -W options in C90 mode.
> 
> (See pedwarn_c90 and pedwarn_c99, which centralise for special cases the
> choice of pedwarn/warning but leave repetitive logic at the call sites to
> decide when the warning happens at all.)

With a good design, there would be no differences amongst those, aside
from the selection logic and defaults.  They do show that the
selection logic has to be fairly expressive.

> Also, what does or does not count as an individual warning that gets its
> own control facility?

IMHO this is a deferrable decision.  However, a message catalog can
still be used to control a message group, using the
message_p(MSG_group) API to ask if the group is enabled, even if the
individual messages aren't in the catalog.  That's a policy decision,
not a design detail.

But it's easy enough to have a group "foo" and messages "foo-1"
"foo-2" etc, which allows for (but discourages) individual control,
without the difficulty of coming up with meaningful mnemonics.
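
The "foo" / "foo-1" naming can be checked mechanically.  A small
sketch, assuming group membership is implied purely by the name prefix;
the function name in_group is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A message belongs to a group if its name is the group name itself,
   or the group name followed by '-' and a suffix ("foo-1", "foo-2"). */
static int in_group(const char *message, const char *group)
{
    size_t n;

    assert(message != NULL && group != NULL);
    n = strlen(group);
    return strncmp(message, group, n) == 0
           && (message[n] == '\0' || message[n] == '-');
}
```

Turning group "foo" on or off then covers "foo-1" and "foo-2" without
anyone having to invent a mnemonic for each of them.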

> Look at c-format.c, storing as much information as possible about
> permitted formats in datastructures rather than code so that the
> code is maintainable.  (This will later need duplicating phrases in
> the tables with distinguishing prefixes so that translations can
> decline them differently in different messages.)

Managing such groups should be easy; we already need that functionality
to group other messages together (like -Wall).

> I had hoped general warning control would allow e.g. people to disable
> those warnings for %m only (see bug 15338 and linked discussions).  This
> would be possible - adding lots of individual warning identifiers to the
> tables, with the complication that many warnings arise from arbitrary
> pairs of an entry in one table and an entry in another - should we do
> that?

It's up to the source to decide which messages are emitted,
but I see no reason why huge lists of message variants can't be
generated from the *printf format tables.  I'm wondering if
we should support some kind of wildcard in the catalog, such
as message "-Wformat-%" where % matches like Makefile rules?
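
A sketch of the Makefile-style matching, assuming a single % that
stands for any non-empty string (as in make pattern rules).  The
function name pattern_match and the sample names in the usage note are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Match NAME against PATTERN, where at most one '%' in PATTERN stands
   for any non-empty run of characters.  Returns 1 on a match, else 0. */
static int pattern_match(const char *pattern, const char *name)
{
    const char *pct;
    size_t prefix_len, suffix_len, name_len;

    assert(pattern != NULL && name != NULL);
    pct = strchr(pattern, '%');
    if (pct == NULL)
        return strcmp(pattern, name) == 0;   /* no wildcard: exact match */

    prefix_len = (size_t) (pct - pattern);
    suffix_len = strlen(pct + 1);
    name_len   = strlen(name);

    /* The stem matched by '%' must be at least one character long. */
    if (name_len < prefix_len + suffix_len + 1)
        return 0;

    return strncmp(name, pattern, prefix_len) == 0
           && strcmp(name + name_len - suffix_len, pct + 1) == 0;
}
```

With this, a catalog entry written as "format-%" would cover a
generated message such as "format-m" but not a bare "format-".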

Alternatively, we could find some way of matching -W* options with the
*parameters* passed to the message API, rather than the message itself.
We could peek into the format and look for a few select %'s and allow
the user to append matching strings to the warning option.

What do we do now?  As long as we don't regress on flexibility, we can
always expand on it later.

