This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Special handling of "%H" (Re: Support for %d$c format specifier in diagnostics.c)


Ishikawa <ishikawa@yk.rim.or.jp> writes:

> I was testing my proposed support for positional format specifiers
> when a custom format decoder is invoked, and I noticed
> a somewhat surprising special handling of %H at the
> beginning of a warning message.
>
> Is the %H at the beginning of an error/warning message handled in a
> special manner by routines in diagnostic.c?

Yes, it is.  It gets eaten by text_specifies_location (called from
diagnostic_set_info) before pp_format_text ever sees it.
pp_format_text does something sensible with %H encountered in the
middle of a string, but we do not appear to use that capability; I
regenerated gcc.pot and grepped it for '[^"%]%H' and got nothing.

I agree with Gabriel that for the short term it's okay to recognize
"%1$H" as well as "%H" at the beginning of a string in
text_specifies_location, and require that translators not move the
%H.  They shouldn't be moving it anyway.
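A minimal sketch of that short-term check, in C.  The helper name
leading_location_spec_len is hypothetical; this is not the actual body
of text_specifies_location, only the string test it would need.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not GCC's text_specifies_location): return the
   number of characters taken up by a leading "%H" or "%1$H", or 0 if
   the message starts with neither.  */
static size_t
leading_location_spec_len (const char *msgid)
{
  if (strncmp (msgid, "%H", 2) == 0)
    return 2;
  if (strncmp (msgid, "%1$H", 4) == 0)
    return 4;
  return 0;
}
```

The caller would skip that many characters and take the location from
the first variable argument.  A %H anywhere else in the string is left
alone by this check.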

To digress a bit, take a look at locate_error in cp/error.c.  This
implements a capability used by the cp_{error,warning,pedwarn}_at
functions: in the C++ front end, one may write

  cp_error_at ("too many arguments to %s `%+#D'", called_thing, fndecl);

which has an effect identical to

  error ("%Htoo many arguments to %s `%#D'",
         &DECL_SOURCE_LOCATION (fndecl), called_thing, fndecl);

Notationally, the former is quite a bit nicer -- the C front end,
which does not have this capability, has a ton of places where
location_t variables are created just for the sake of not having to
type DECL_SOURCE_LOCATION over and over again.  However, the
implementation of the C++ feature is not particularly clean and I
suspect Gabriel wants to get rid of it.

It would be nice if we could find a general solution which would avoid
special handling of leading %H in text_specifies_location, *and* allow
implementing C++'s %+ notation in a clean manner.  Positional format
specifiers require us to prescan the argument list anyway, so I think
this could be dealt with at the same time.  Here is one way to do it.
I have not had the chance to read your code closely, so I don't know
how well the algorithm sketched below matches it.

Phase 1: scan the string, build up an array of format specifiers
indexed by their %n$ number (if given) or position (if not).  This can
be done without communication with the language front end. For example:

    "too many arguments to %s `%+#D'"
    "too many arguments to %2$+#D the %1$s"

both produce the same array: { "s", "+#D" }.  For efficiency the array
elements should probably be { pointer, length } pairs with the pointers
pointing into the original string.  Ill-formed format strings -- mixed
explicit and implicit position, referencing the same argument twice
with different format codes, picking the wrong argument for the field
width or precision, leaving gaps in the list of referenced arguments --
can and should be diagnosed at this point.
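Phase 1 could be sketched roughly as follows.  The names scan_specs,
struct spec and MAX_ARGS are made up for illustration, only the flags
'+' and '#' are recognized, and field width and precision are left out
entirely, so this is a simplified outline and not a proposed patch.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 10             /* arbitrary limit for this sketch */

struct spec { const char *ptr; size_t len; };  /* points into the format */

/* Fill SPECS, indexed by argument position.  Return the number of
   arguments, or -1 if the format string is ill-formed.  */
static int
scan_specs (const char *fmt, struct spec specs[MAX_ARGS])
{
  int nargs = 0, next_implicit = 0;
  int seen_explicit = 0, seen_implicit = 0;

  memset (specs, 0, MAX_ARGS * sizeof specs[0]);
  for (const char *p = fmt; *p; p++)
    {
      if (*p != '%')
        continue;
      p++;
      if (*p == '%')
        continue;               /* "%%" is a literal percent sign */

      int n = 0, index;
      const char *q = p;
      while (isdigit ((unsigned char) *q) && n < 1000)
        n = n * 10 + (*q++ - '0');
      if (q > p && *q == '$')
        {
          seen_explicit = 1;    /* "%n$" form */
          index = n - 1;
          p = q + 1;
        }
      else
        {
          seen_implicit = 1;    /* plain "%" form */
          index = next_implicit++;
        }
      if (seen_explicit && seen_implicit)
        return -1;              /* mixed explicit and implicit position */
      if (index < 0 || index >= MAX_ARGS)
        return -1;

      const char *start = p;
      while (*p == '+' || *p == '#')
        p++;
      if (!isalpha ((unsigned char) *p))
        return -1;              /* no conversion letter */
      size_t len = (size_t) (p - start) + 1;

      if (specs[index].ptr != NULL
          && (specs[index].len != len
              || memcmp (specs[index].ptr, start, len) != 0))
        return -1;              /* same argument, different format codes */
      specs[index].ptr = start;
      specs[index].len = len;
      if (index + 1 > nargs)
        nargs = index + 1;
    }
  for (int i = 0; i < nargs; i++)
    if (specs[i].ptr == NULL)
      return -1;                /* gap in the referenced arguments */
  return nargs;
}
```

For both example strings above this returns 2, with element 0
describing "s" and element 1 describing "+#D".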

Phase 2: loop over the array and the variable arguments.  We have
enough information at this point to render each argument into a
string, so the result of this stage is another array of strings, one
for each argument.  Continuing the previous example, if the variable
arguments were "constructor" and a CLASS_DECL named "foo", the array
would be { "constructor", "foo" }.  This may involve communicating
with the front end -- a plausible hook function signature might be

 const char *(*render_argument) (const char *format_spec,
                                 diagnostic_context *dc);

This is expected to consume the next variable argument from the
va_list field stored in the diagnostic_context structure, and return a
pointer to a string, allocating memory from a scratch pad in the
diagnostic_context if necessary.
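To make the hook concrete, here is a rough sketch of phase 2.  The
diagnostic_context below is a cut-down stand-in and not GCC's real
structure; struct decl, render_all and the 256-byte scratch pad are
assumptions for illustration, and the specifiers are passed as
NUL-terminated strings for brevity instead of { pointer, length }
pairs.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

struct decl { const char *name; };   /* hypothetical stand-in for a DECL */

/* Cut-down stand-in for the real diagnostic_context.  */
typedef struct diagnostic_context
{
  va_list *args;                /* remaining variable arguments */
  char scratch[256];            /* scratch pad for rendered strings */
  size_t used;                  /* bytes of the scratch pad in use */
} diagnostic_context;

/* Render the next variable argument according to FORMAT_SPEC.
   Returns NULL for an unknown code or a full scratch pad.  */
static const char *
render_argument (const char *format_spec, diagnostic_context *dc)
{
  while (*format_spec == '+' || *format_spec == '#')
    format_spec++;              /* flags are ignored in this sketch */

  switch (*format_spec)
    {
    case 's':
      return va_arg (*dc->args, const char *);
    case 'D':
      return va_arg (*dc->args, const struct decl *)->name;
    case 'd':
      {
        char *out = dc->scratch + dc->used;
        size_t room = sizeof dc->scratch - dc->used;
        int n = snprintf (out, room, "%d", va_arg (*dc->args, int));
        if (n < 0 || (size_t) n >= room)
          return NULL;
        dc->used += (size_t) n + 1;
        return out;
      }
    default:
      return NULL;
    }
}

/* Phase 2 driver: walk the specifier array in argument order and
   store one rendered string per argument.  Returns 1 on success.  */
static int
render_all (diagnostic_context *dc, const char **specs,
            const char **rendered, int nargs, ...)
{
  va_list ap;
  int ok = 1;

  va_start (ap, nargs);
  dc->args = &ap;
  for (int i = 0; i < nargs && ok; i++)
    {
      rendered[i] = render_argument (specs[i], dc);
      ok = rendered[i] != NULL;
    }
  dc->args = NULL;
  va_end (ap);
  return ok;
}
```

Because the array is indexed by argument position, the va_list is
always consumed in order, whatever order the format string refers to
the arguments in.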

Phase 3: scan the string again.  At this point we don't care what the
format specifiers mean - we just copy the requested argument from the
array of formatted strings into the output buffer.  Therefore this can
be done without further communication with the front end.
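A sketch of phase 3, under the same simplifying assumptions (the name
substitute is made up, and the format string is assumed to have passed
the phase 1 checks already):

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy FMT into OUT, replacing each format specifier with the matching
   entry of RENDERED.  Returns 0 on success, -1 if OUT is too small or
   a specifier refers to an argument that does not exist.  */
static int
substitute (char *out, size_t outsize, const char *fmt,
            const char **rendered, int nargs)
{
  size_t used = 0;
  int next_implicit = 0;

  if (outsize == 0)
    return -1;
  for (const char *p = fmt; *p; p++)
    {
      const char *piece = p;    /* default: copy one literal character */
      size_t len = 1;

      if (*p == '%' && p[1] == '%')
        p++;                    /* "%%" emits a single '%' */
      else if (*p == '%')
        {
          int n = 0, index;
          const char *q = ++p;
          while (isdigit ((unsigned char) *q) && n < 1000)
            n = n * 10 + (*q++ - '0');
          if (q > p && *q == '$')
            {
              index = n - 1;    /* explicit "%n$" position */
              p = q + 1;
            }
          else
            index = next_implicit++;
          while (*p && !isalpha ((unsigned char) *p))
            p++;                /* skip flags; their meaning is irrelevant */
          if (*p == '\0' || index < 0 || index >= nargs)
            return -1;
          piece = rendered[index];
          len = strlen (piece);
        }
      if (used + len >= outsize)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }
  out[used] = '\0';
  return 0;
}
```

With the rendered array { "constructor", "foo" }, the first example
string comes out as "too many arguments to constructor `foo'" and the
reordered one as "too many arguments to foo the constructor".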


The trick is, the render_argument hook (and the language-independent
formatter that calls it) has access to the complete diagnostic
context.  Normally it uses this to get at the variable argument list
and the scratch pad.  But there's no reason it can't modify other
fields of the context if appropriate.  In particular, we can implement
both %H and C++'s + modifier here, by updating the location field of
the diagnostic_context.  Since no output occurs until phase 3, the
diagnostic prefix has not yet been emitted when this happens, so it
isn't too late to do that.
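Roughly, the hook might treat %H and the + modifier like this.  All
the types here are simplified stand-ins (the real location_t,
diagnostic_context and decl nodes look different), and render_one is
only a hypothetical driver that supplies a single variable argument.

```c
#include <stdarg.h>
#include <stddef.h>

typedef struct { const char *file; int line; } location_t;  /* stand-in */
struct decl { const char *name; location_t where; };        /* stand-in */

typedef struct diagnostic_context
{
  va_list *args;                /* remaining variable arguments */
  location_t location;          /* where the diagnostic is reported */
} diagnostic_context;

/* Render one argument, updating DC->location for %H and for any
   %D that carries the '+' modifier.  */
static const char *
render_argument (const char *format_spec, diagnostic_context *dc)
{
  int set_location = 0;

  for (; *format_spec == '+' || *format_spec == '#'; format_spec++)
    if (*format_spec == '+')
      set_location = 1;

  switch (*format_spec)
    {
    case 'H':
      /* %H prints nothing; it only moves the diagnostic.  */
      dc->location = *va_arg (*dc->args, const location_t *);
      return "";
    case 'D':
      {
        const struct decl *d = va_arg (*dc->args, const struct decl *);
        if (set_location)
          dc->location = d->where;
        return d->name;
      }
    case 's':
      return va_arg (*dc->args, const char *);
    default:
      return NULL;
    }
}

/* Hypothetical driver: render a single specifier with one argument.  */
static const char *
render_one (diagnostic_context *dc, const char *format_spec, ...)
{
  va_list ap;

  va_start (ap, format_spec);
  dc->args = &ap;
  const char *s = render_argument (format_spec, dc);
  dc->args = NULL;
  va_end (ap);
  return s;
}
```

Since the prefix is built only in phase 3, whichever location the hook
stored last is the one that gets printed; a plain %#D leaves the
location untouched.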

...
> The next line in dwarfout.c probably should not use warning(), since
> its message would be subject to translation. (ABOUT-GCC-NLS mentions
> that internal error messages should probably be output as untranslated
> ASCII strings for eventual inspection by the developer community,
> which mostly uses English.)
>
> ./dwarfout.c:      warning ("%Hinternal regno botch: '%D' has regno = %d\n",

The intent documented in ABOUT-GCC-NLS, as I understand it, is that
these internal messages do not *need* to be translated, but it does
no particular harm if they are.  Developers can deal with inspecting
error messages in unfamiliar languages - if no one who speaks the
language can be found, we can fall back on grepping gcc/po/*.po for
the translation...

> I have no idea what the following two lines are used for.
> (But there is NOTHING to translate except for "<near match>", so
> the original order of reference to arguments could probably be
> preserved in translation.)
> ./cp/call.c:    inform ("%H%s %+#D <near match>",
> ./cp/call.c:    inform ("%H%s %+#D",

This is probably a case where someone thought they could get away with
assembling an error message sentence in pieces.  These are bugs that
need to be dealt with, but they are a separate problem from the one we
are discussing right now.  If you want to make sure the issue doesn't
get forgotten, file a report in bugzilla.

zw

