Special handling of "%H" (Re: Support for %d$c format specifier in diagnostics.c)

Ishikawa ishikawa@yk.rim.or.jp
Thu Jul 31 00:23:00 GMT 2003


"Joseph S. Myers" wrote:
> 
> On Thu, 31 Jul 2003, Ishikawa wrote:
> 
> > Ok, while I am waiting for the patch to support positional format
> > specifiers in diagnostic (and formated output) routines being merged
> > into mainline tree (hint, hint), I looked at
> 
> For the detailed review process to start, please submit a version of the
> patch formatted according to the GNU Coding Standards.  (There may then
> need to be several rounds of technical review of the patch following
> that.)

I will modify the change to use a suggestion from Gabriel Dos Reis
mentioned in his separate follow-up.

> > I ran this modified msgfmt on the PO files under
> > gcc/gcc/po directory.
> >
> > Funny, there was NO warning at all!
> 
> Please investigate why the specific translations there have been bug
> reports about (e.g. bug 9390) did not generate warnings - have those
> problems been fixed, or have the messages changed so that the translation
> is now fuzzy?

Looking at the Bugzilla entry for bug 9390, its last comment (as of
today), and the URL mentioned there, I think I know part of the
answer. This is Wolfgang Bangerth's comment #10 in
the Bugzilla report:
>I rewrote a similar script as the one that has gone missing. It can be found in this 
>message: 
>  http://gcc.gnu.org/ml/gcc-patches/2003-07/msg02191.html 
 
As I mentioned in my previous post:
>There was also a very minor problem in msgfmt itself in that if msgid
>contained invalid specifier such as %Z, msgfmt would not complain and
>not check the translation at all(!) and move on to the next message.
>Considering that msgid is a (presumably) valid format string taken out
>from correct(!) C source code, the chance of having invalid format
>character there is small, but I would rather see a check here, too.

If format messages are indeed passed to an unmodified msgfmt,
they won't be checked IF the original msgid (the untranslated
string) contains a specifier that is invalid for C printf, such as %T.
This may explain the past failure to detect problems.
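
To make this failure mode concrete, here is a minimal Python sketch.
It is not gettext's code: the specifier sets and the names
check_lenient and check_strict are assumptions for illustration only,
and the parsing ignores flags, widths and positional arguments.

```python
import re

# Assumption: a simplified set of standard printf conversion characters.
STANDARD = set("diouxXeEfgGcsp%")
# Assumption: the GCC-specific conversions mentioned in this thread.
GCC_EXTRA = set("HTD")

# Capture the single character that follows each '%'.
SPEC = re.compile(r"%(.)", re.S)


def conversions(text):
    """Return the conversion characters in order (simplified parsing)."""
    return SPEC.findall(text)


def check_lenient(msgid, msgstr, known=STANDARD):
    """Mimic the reported behavior: an unknown specifier in msgid
    silently disables the check for the whole entry."""
    src = conversions(msgid)
    if any(c not in known for c in src):
        return "skipped"
    return "ok" if src == conversions(msgstr) else "mismatch"


def check_strict(msgid, msgstr, known=STANDARD):
    """Report an unknown specifier in msgid instead of skipping."""
    src = conversions(msgid)
    bad = [c for c in src if c not in known]
    if bad:
        return "unknown specifier in msgid: %" + bad[0]
    return "ok" if src == conversions(msgstr) else "mismatch"


# Hypothetical entry: the translation wrongly replaces %T with %d.
msgid = "invalid type %T"
msgstr = "type invalide %d"
print(check_lenient(msgid, msgstr))
print(check_strict(msgid, msgstr))
print(check_strict(msgid, msgstr, known=STANDARD | GCC_EXTRA))
```

Only the last call, where the checker is told about the extra
specifiers, actually compares the translation against the original.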

But I found it a little strange that the PO files
in my current CVS checkout didn't produce any errors
even after I modified msgfmt to complain about unknown
format characters in msgid as well as in msgstr (the translation
string).

My guess is that most of the problematic lines either have the
fuzzy flag (which seems to disable format checking by msgfmt) or
have had the c-format flag removed, because unmodified
msgfmt chokes on specifiers such as %T that are specific
to GCC warning routines.
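
The guess above can be written down as a small Python sketch. The
rule it encodes (an entry is format-checked only if it carries
c-format and is not fuzzy) is my guess as stated, not verified msgfmt
behavior, and the sample entries and function names are hypothetical.

```python
def parse_flags(entry):
    """Collect the flags from the '#,' comment lines of one PO entry."""
    flags = set()
    for line in entry.splitlines():
        if line.startswith("#,"):
            for flag in line[2:].split(","):
                if flag.strip():
                    flags.add(flag.strip())
    return flags


def is_format_checked(entry):
    """Assumed rule: checked only when c-format is set and fuzzy is not."""
    flags = parse_flags(entry)
    return "c-format" in flags and "fuzzy" not in flags


# Hypothetical PO entries (not taken from the real GCC PO files).
healthy = "\n".join([
    "#, c-format",
    'msgid "value %d"',
    'msgstr "valeur %d"',
])
fuzzy = "\n".join([
    "#, fuzzy, c-format",
    'msgid "invalid type %T"',
    'msgstr "type invalide %d"',
])
unflagged = "\n".join([
    'msgid "invalid type %T"',
    'msgstr "type invalide %d"',
])

for name, entry in [("healthy", healthy), ("fuzzy", fuzzy),
                    ("unflagged", unflagged)]:
    print(name, is_format_checked(entry))
```

Under this assumed rule, both the fuzzy entry and the entry that has
lost its c-format flag escape the format check entirely.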

Also, the particular problems in, say, c++/7765
http://gcc.gnu.org/ml/gcc-patches/2002-10/msg01694.html
seem to have been fixed. (Or the message strings have
changed significantly and the corresponding messages
are gone.)

So it looks to me that the current PO files
are marked up so that invalid/untranslated messages
are not checked at all, and checked message lines
are healthy. The practical problem is that
there are many unchecked messages.

Here we have a chicken-and-egg problem.
We need an msgfmt that understands the format specifiers
used by GCC's warning/error output formatting
before the translation community starts to
put the c-format flag back on various messages (if indeed
some have been taken off intentionally).
msgfmt probably needs to be made aware of such format specifiers
by giving it a flag like "--gcc", as I used.
But unless such an msgfmt is widespread,
we can't put such a flag into the msgfmt check in GCC's mainline
CVS. I am not sure how to go about this distribution/sync
problem between gettext and GCC.

Well, I will start by sending a patch for msgfmt.c and
format-awk.c in gettext so that they understand
the extended format specifiers used in GCC PO files, and
hope for the best.

> > Now I believe that translators without being able to obtain help about
> > unknown format-like specifiers such as %H, %T, %D simply marked dubious
> > translation as "fuzzy" so that msgfmt won't complain :-(
> 
> Translators do not mark translations fuzzy - msgmerge does; see the
> gettext manual.  The sequence is: a translator translates a message

Thank you for explaining the detailed process.
Now I have a better grasp of the detailed steps.

> 
> > It seems that most of the PO files were rather outdated.  (At least
> > the ones under my gcc mainline cvs check out hierarchy.)
> 
> Some mainline files may be out of date, because Mark checked updated files
> into the 3.3 branch only before the release.  Anyone with CVS write access
> may download the current files from the TP site (see translation.html for
> details) and check them into mainline CVS (check with Mark before updating
> the branch files, but I suspect any updates of those will be welcome as
> well as it saves work just before the release).  Anyone may also
> regenerate gcc.pot on mainline (with current gettext) if they want.
> 
> Note that translators only work on .pot files from release branches.  So
> changes to messages since 3.3 was branched will not be reflected in
> translation work until 3.4 has branched and a new .pot file has been sent
> to the TP.  There is a finite translation effort and mainline is rather a
> moving target, so we only try to have translators working on translations
> of the most recent release branch.

I see that there will always be a lag between the
messages used in mainline CVS and the
translated messages. Hmm. I guess we have to live with it.
 
> > We need some serious awareness campaign efforts to the translation
> > community to fix this state of the affairs by offering the improved
> > tool and hopefully the positional parameter support soon.  (I attach
> > the draft note to translation community. Comments/feedback welcome.)
> 
> Please *only* discuss what is GCC-specific - i.e., the GCC-specific
> formats and usage of %H.  Translators will generally already know about
> printf formats in general and positional parameters.  The general parts of
> the documentation might be of use in a general document for new
> translators, but you'll need to work with the TP and gettext maintainers -
> not the GCC maintainers - regarding getting such text into any such
> document.

Thank you for the comment. I will re-organize the document.


-- 
int main(void){int j=2003;/*(c)2003 cishikawa. */
char t[] ="<CI> @abcdefghijklmnopqrstuvwxyz.,\n\"";
char *i ="g>qtCIuqivb,gCwe\np@.ietCIuqi\"tqkvv is>dnamz";
while(*i)((j+=strchr(t,*i++)-(int)t),(j%=sizeof t-1),
(putchar(t[j])));return 0;}/* under GPL */


