This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Checking format specifiers


> > I am not familiar with the translation community, but if you are: one of
> > the really necessary additions in their processes would be a small
> > program that does what my small script tried to do in a thorough way
> > (i.e., for example, taking into account positional parameters).
>
> Doesn't 'msgfmt --check-format' already do this?

If it does, then it is not very successful: it doesn't report an error for any
of the translations we have. On the other hand, attached is a quick hack that
simply counts the % characters in each msgid and in its msgstr and compares
the two counts. It finds lots of errors. For example, run it like so
  cat da.po | perl x.pl
to get 119 error messages that look like this:
  incompatible numbers of formats in
    msgid =call to function which throws incomplete type `%#T'
    msgstr=kan ikke %s en henvisning til en ufuldstændig type '%T'
  at line 18100.

  incompatible numbers of formats in
    msgid =The -shared option is not currently supported for VAX ELF.
    msgstr=den indbyggede funktion '%s' understøttes i øjeblikket ikke
  at line 20845.


Now running this script against all the translations we have yields
  gcc/po> for i in *po ; do echo $i `cat $i | perl x.pl | wc -l` ; done
  be.po 470
  da.po 255
  de.po 15
  el.po 1680
  es.po 10
  fr.po 255
  ja.po 1905
  nl.po 1830
  sv.po 1415
  tr.po 245
Each error is 5 lines, so we have between 2 and 381 errors in each of the
translations (at least -- these are only errors in the number of % signs, not
counting wrong formats or wrong order of formats!). So much, I guess, for
msgfmt --check...
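
To make the arithmetic explicit, here is a small sketch that converts the
line counts from the table above into error counts. The helper name
errors_from_lines is made up for this illustration; the only assumption is
that every report is exactly 5 output lines (4 lines of text plus one blank
line), which is what the attached script prints.

```perl
use strict;
use warnings;

# Line counts exactly as reported by `wc -l` in the table above.
my %lines = (
    'be.po' => 470,  'da.po' => 255,  'de.po' => 15,   'el.po' => 1680,
    'es.po' => 10,   'fr.po' => 255,  'ja.po' => 1905, 'nl.po' => 1830,
    'sv.po' => 1415, 'tr.po' => 245,
);

# Hypothetical helper: one report is 5 output lines, so divide by 5.
sub errors_from_lines {
    my ($count) = @_;
    return int($count / 5);
}

# Print one "file  errors" row per translation, sorted by file name.
for my $file (sort keys %lines) {
    printf "%-6s %4d\n", $file, errors_from_lines($lines{$file});
}
```

The smallest entry (es.po, 10 lines) and the largest (ja.po, 1905 lines)
give the two ends of the range quoted above.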

W.

--------------------------------------------------------
$line = 0;
while (<>) {
    chomp;
    ++$line;

    # find next msgid line
    if (/^msgid \"(.*)\"$/) {
	$msgid = $1;

	# concatenate with possible follow-up lines
	while (($_ = <>) =~ /^\"(.*)\"$/) {
	    ++$line;
	    $msgid .= $1;
	}
	++$line;

	# parse corresponding msgstr line and make sure it is there
	die ("no msgstr in line $line: $_\n") if (!/^msgstr \"(.*)\"$/);
	$msgstr = $1;
	# concatenate with possible follow-up lines
	while (($_ = <>) =~ /^\"(.*)\"$/) {
	    ++$line;
	    $msgstr .= $1;
	}
	++$line;

	# if a translation string is empty, this means that the
	# original string is to be used. then there can't be an error
	if (!($msgstr eq "")) {
	    # count the number of % signs in both messages. don't even
	    # attempt to make sure they refer to the same formats
	    $msgidcount = 0;
	    $x = $msgid;
	    $msgidcount++ while ($x =~ s/%//);
	    
	    $msgstrcount = 0;
	    $x = $msgstr;
	    $msgstrcount++ while ($x =~ s/%//);
	    
	    if ($msgstrcount != $msgidcount) {
		print ("incompatible numbers of formats in\n"
		       ."  msgid =$msgid\n"
		       ."  msgstr=$msgstr\n"
		       ."at line $line.\n\n");
	    }
	}
    }
}
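
The script above only counts % signs. A more thorough check would also
compare which conversions are used and honor positional parameters such as
%2$d. The following is only a rough sketch of that idea, not what msgfmt
does: the function names directives and same_formats are hypothetical, the
pattern assumes directives of the form %[N$][flags][width][.precision]
followed by an optional length modifier and one conversion letter, flags are
ignored when comparing, and "*" widths are not handled. The test strings at
the end are made up for illustration.

```perl
use strict;
use warnings;

# Return the conversions of a format string, indexed by argument position.
# "%%" is a literal percent sign and consumes no argument.
sub directives {
    my ($s) = @_;
    my @args;
    my $next = 0;    # next implicit argument index for non-positional forms
    while ($s =~ /%(?:(%)|(?:(\d+)\$)?[-#0 +']*\d*(?:\.\d+)?((?:hh|ll|[hlLjzt])?[A-Za-z]))/g) {
        next if defined $1;                       # skip literal "%%"
        my $pos = defined $2 ? $2 - 1 : $next++;  # "%N$" is 1-based
        $args[$pos] = $3;
    }
    # Positions that no directive refers to are reported as '?'.
    return map { defined($_) ? $_ : '?' } @args;
}

# True if msgid and msgstr expect the same conversion for every argument.
sub same_formats {
    my ($msgid, $msgstr) = @_;
    my @a = directives($msgid);
    my @b = directives($msgstr);
    return 0 if @a != @b;
    for my $i (0 .. $#a) {
        return 0 if $a[$i] ne $b[$i];
    }
    return 1;
}

# Made-up examples: reordered positional arguments are accepted,
# a changed conversion is not.
print same_formats('%s has %d errors', '%2$d fejl i %1$s') ? "ok\n" : "mismatch\n";
print same_formats('%d items', '%s items')                 ? "ok\n" : "mismatch\n";
```

In the attached script, the comparison of $msgstrcount and $msgidcount could
be replaced by a call like same_formats($msgid, $msgstr); whether the
pattern covers all of GCC's own format letters is untested.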

-------------------------------------------------------------------------
Wolfgang Bangerth              email:            bangerth@ices.utexas.edu
                               www: http://www.ices.utexas.edu/~bangerth/

