This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Checking format specifiers
- From: Wolfgang Bangerth <bangerth at ices dot utexas dot edu>
- To: Florian Weimer <fw at deneb dot enyo dot de>
- Cc: gcc-patches at gcc dot gnu dot org, Ishikawa <ishikawa at yk dot rim dot or dot jp>
- Date: Tue, 22 Jul 2003 11:06:40 -0500
- Subject: Re: Checking format specifiers
- References: <200307211749.08083.bangerth@ices.utexas.edu> <878yqrw6xo.fsf@deneb.enyo.de>
> > I am not familiar with the translation community, but if you are: one of
> > the really necessary additions in their processes would be a small
> > program that does what my small script tried to do in a thorough way
> > (i.e., for example, taking into account positional parameters).
>
> Doesn't 'msgfmt --check-format' already do this?
If it does, then it is not very successful: it doesn't give an error for any
of the translations we have. On the other hand, attached is a quick hack that
simply checks the number of % characters in messages and compares them. And
it finds lots of errors. For example, run it like so
cat da.po | perl x.pl
to get 119 error messages that look like this, for example:
incompatible numbers of formats in
msgid =call to function which throws incomplete type `%#T'
msgstr=kan ikke %s en henvisning til en ufuldstændig type '%T'
at line 18100.
incompatible numbers of formats in
msgid =The -shared option is not currently supported for VAX ELF.
msgstr=den indbyggede funktion '%s' understøttes i øjeblikket ikke
at line 20845.
Now running this script against all the translations we have yields
gcc/po> for i in *po ; do echo $i `cat $i | perl x.pl | wc -l` ; done
be.po 470
da.po 255
de.po 15
el.po 1680
es.po 10
fr.po 255
ja.po 1905
nl.po 1830
sv.po 1415
tr.po 245
Each error is 5 lines, so we have between 2 and 381 errors in each of the
translations (at least -- these are only errors in the number of % signs, not
counting wrong formats or wrong order of formats!). So much, I guess, for
msgfmt --check...
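
The cases the count misses (a wrong conversion, or the right conversions in
the wrong order) could be caught along the following lines. This is only a
sketch under assumptions: the names $DIRECTIVE, directives and check_formats
are made up for illustration, the pattern is a simplified printf-style grammar
rather than the full set of GCC format codes, flags such as # are ignored, and
a * width is not counted as an argument. It skips %% and honours positional
parameters such as %1$s.

```perl
use strict;
use warnings;

# Simplified printf-style directive; an assumption, not the GCC grammar.
my $DIRECTIVE = qr/
    %
    (?:
        (%)                          # group 1: doubled percent sign, a literal
      |
        (?: (\d+) \$ )?              # group 2: optional positional index
        [\#+\-0\ ]*                  # flags, ignored in the comparison
        (?: \d+ | \* )?              # width, ignored
        (?: \. (?: \d+ | \* )? )?    # precision, ignored
        ( (?: hh | h | ll | l | L | z | j | t )? [A-Za-z] )  # group 3: length and conversion
    )
/x;

# Map each argument index to the conversion that consumes it.
sub directives {
    my ($text) = @_;
    my %by_index;
    my $next = 1;                    # index used by non-positional directives
    while ($text =~ /$DIRECTIVE/g) {
        next if defined $1;          # literal percent sign, takes no argument
        my $index = defined $2 ? $2 : $next++;
        $by_index{$index} = $3;
    }
    return \%by_index;
}

# Return one message per argument whose conversion differs.
sub check_formats {
    my ($msgid, $msgstr) = @_;
    my $want = directives($msgid);
    my $have = directives($msgstr);
    my %seen = map { $_ => 1 } keys %$want, keys %$have;
    my @problems;
    for my $i (sort { $a <=> $b } keys %seen) {
        my $w = exists $want->{$i} ? '%' . $want->{$i} : 'nothing';
        my $h = exists $have->{$i} ? '%' . $have->{$i} : 'nothing';
        push @problems, "argument $i: msgid has $w, msgstr has $h" if $w ne $h;
    }
    return @problems;
}

# Same number of % signs, but the order is swapped: two problems reported.
print "$_\n" for check_formats('%s has %d items', '%d items in %s');
# Positional parameters restore the order: nothing is reported.
print "$_\n" for check_formats('%s has %d items', '%2$d items in %1$s');
```

The first call passes the simple count test and is still flagged, which is the
kind of error the numbers above do not include.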
W.
--------------------------------------------------------
$line = 0;
while (<>) {
  chomp;
  ++$line;
  # find next msgid line
  if (/^msgid \"(.*)\"$/) {
    $msgid = $1;
    # concatenate with possible follow-up lines (stop cleanly at end of input)
    while (defined($_ = <>) && /^\"(.*)\"$/) {
      ++$line;
      $msgid .= $1;
    }
    ++$line;
    # parse corresponding msgstr line and make sure it is there
    die ("no msgstr in line $line\n")
      if (!defined($_) || !/^msgstr \"(.*)\"$/);
    $msgstr = $1;
    # concatenate with possible follow-up lines (stop cleanly at end of input)
    while (defined($_ = <>) && /^\"(.*)\"$/) {
      ++$line;
      $msgstr .= $1;
    }
    ++$line;
    # if a translation string is empty, this means that the
    # original string is to be used. then there can't be an error
    if (!($msgstr eq "")) {
      # count the number of % signs in both messages. don't even
      # attempt to make sure they refer to the same formats
      $msgidcount = 0;
      $x = $msgid;
      $msgidcount++ while ($x =~ s/%//);
      $msgstrcount = 0;
      $x = $msgstr;
      $msgstrcount++ while ($x =~ s/%//);
      if ($msgstrcount != $msgidcount) {
        print ("incompatible numbers of formats in\n"
               ." msgid =$msgid\n"
               ." msgstr=$msgstr\n"
               ."at line $line.\n\n");
      }
    }
  }
}
-------------------------------------------------------------------------
Wolfgang Bangerth email: bangerth@ices.utexas.edu
www: http://www.ices.utexas.edu/~bangerth/