[PATCH] gettext support for GCC 4.0 internal format, fix for PR translation/21364

Jakub Jelinek jakub@redhat.com
Wed May 18 17:06:00 GMT 2005


(Resent with bzipped patch, wonder if it is otherwise considered spam):

On Tue, May 17, 2005 at 01:46:30PM +0000, Joseph S. Myers wrote:
> Once there's a gettext release supporting GCC 4 formats (NB that %' is a 
> format which is not expected to appear in the translated messages, only 
> the untranslated ones, and as %< %> %' don't format any arguments it 
> doesn't really matter whether they match in translated/untranslated 
> formats; they can be handled like %%), I see no problem with updating 
> install.texi regarding the required gettext version, patching exgettext 
> (mainline and 4.0) and regenerating gcc.pot.

Ok, I wrote the necessary gettext support for the newly introduced format
string stuff and then ran it over the current po/ files.

The second attachment is the gettext patch.

ftp://sunsite.mff.cuni.cz/private/gcc-l10n/l10n-bugs.bz2
contains all errors it found (well, if there is already one bug reported
for a msgid/msgstr pair, I have not included the rest of the bugs), including
those in fuzzy strings.  Note that I have just used --language=GCC-source and
the current --keyword= machinery, so it won't catch bugs where a msgid that is
used in varargish functions, but does not use any va_arg, is translated into a
format string that needs va_arg.
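
To illustrate the kind of mismatch described above, here is a minimal sketch
of a directive counter. It is not the gettext code: count_arg_directives is a
hypothetical helper, it ignores the '+' and '#' flags and the %J placement
rule, and it only follows the rules stated in this thread (%%, %m, %<, %> and
%' take no argument; 'q', 'l', 'll' and 'w' are skipped as flags).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of gettext: count the arguments that the
   directives of a GCC 4.0 style format string would consume.
   Simplified: '+' and '#' are not treated as flags.  */
static int
count_arg_directives (const char *format)
{
  int count = 0;

  assert (format != NULL);
  for (; *format != '\0'; format++)
    {
      if (*format != '%')
        continue;
      format++;

      /* Skip the quoting flag and the size flags.  */
      if (*format == 'q')
        format++;
      if (*format == 'l')
        {
          format++;
          if (*format == 'l')
            format++;
        }
      else if (*format == 'w')
        format++;

      if (*format == '\0')
        break;                  /* Dangling '%' at end of string.  */

      if (format[0] == '.' && format[1] == '*' && format[2] == 's')
        {
          count += 2;           /* Precision argument plus string.  */
          format += 2;
        }
      else if (strchr ("%m<>'", *format) == NULL)
        count++;                /* Any other specifier takes one argument.  */
    }
  return count;
}
```

A checker built on this could flag a pair where count_arg_directives (msgid)
is 0 but count_arg_directives (msgstr) is not, which is the case the current
--keyword= setup does not see.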

The first patch is a patch against current GCC CVS (applies to both
HEAD and 4.0 branch), that fixes bugs that happen in non-fuzzy strings.
These are very serious bugs, as they usually crash the compiler, so IMHO we
should apply this immediately, then talk to the @li.org teams.
Where the solution was obvious, I have fixed the format string; where
it was not obvious, I have added ", fuzzy", so that it will not be
used in *.gmo files and translators can check it out.

BTW, shouldn't we implement %n$ style arguments in addition to % ones?
It seems e.g. several Turkish translations relied on this...
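
For reference, a rough sketch of what recognizing a printf-style "n$" position
prefix could look like. This is only an illustration of the idea raised above,
not existing GCC or gettext code; parse_arg_position is a hypothetical name,
and overflow of very long digit strings is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: *PP points just past a '%'.  If a "n$" prefix
   follows, return n (1-based) and advance *PP past the '$'.  Otherwise
   leave *PP unchanged and return 0, meaning "next unnumbered argument".  */
static unsigned int
parse_arg_position (const char **pp)
{
  const char *p;
  unsigned int n = 0;

  assert (pp != NULL && *pp != NULL);
  p = *pp;
  while (*p >= '0' && *p <= '9')
    {
      n = 10 * n + (unsigned int) (*p - '0');
      p++;
    }
  if (n == 0 || *p != '$')
    return 0;

  *pp = p + 1;
  return n;
}
```

With something like this, a translation could reorder arguments by writing
e.g. "%2$s" before "%1$s", provided the formatter itself fetched arguments by
position.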

2005-05-18  Jakub Jelinek  <jakub@redhat.com>

        PR translation/21364
        * de.po: Fix errors in format specifiers or make translations
        fuzzy if non-obvious.
        * es.po: Likewise.
        * tr.po: Likewise.
        * zh_CN.po: Likewise.

	Jakub
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcc41-po-fixes.patch.bz2
Type: application/x-bzip2
Size: 4740 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20050518/591a6eaf/attachment.bz2>
-------------- next part --------------
2005-05-18  Jakub Jelinek  <jakub@redhat.com>

	* format-gcc-internal.c: Include string.h.
	Update comment with GCC 4.0 format specifiers.
	(FAT_POINTER, FAT_SIZE_LONGLONG, FAT_SIZE_WIDE): Add.
	(format_parse): Handle %J at the beginning of format string,
	q, ll and w flags and <, >, ', m and p specifiers.
	(format_print): Handle FAT_POINTER, FAT_SIZE_LONGLONG and
	FAT_SIZE_WIDE.

--- gettext-0.14.3/gettext-tools/src/format-gcc-internal.c.jj	2004-09-07 13:41:25.000000000 +0200
+++ gettext-0.14.3/gettext-tools/src/format-gcc-internal.c	2005-05-18 10:40:15.000000000 +0200
@@ -1,5 +1,5 @@
 /* GCC internal format strings.
-   Copyright (C) 2003-2004 Free Software Foundation, Inc.
+   Copyright (C) 2003-2005 Free Software Foundation, Inc.
    Written by Bruno Haible <bruno@clisp.org>, 2003.
 
    This program is free software; you can redistribute it and/or modify
@@ -22,6 +22,7 @@
 
 #include <stdbool.h>
 #include <stdlib.h>
+#include <string.h>
 
 #include "format.h"
 #include "c-ctype.h"
@@ -37,37 +38,45 @@
    output_format), plus some frontend dependent extensions:
      - for the C/ObjC frontend in gcc-3.3/gcc/c-objc-common.c
      - for the C++ frontend in gcc-3.3/gcc/cp/error.c
+   It also handles GCC internal format of GCC 4.0, implemented
+   mainly in gcc-4.0/gcc/pretty-print.c.
    Taking these together, GCC internal format strings are specified as follows.
    A directive
    - starts with '%',
-   - is optionally followed by a size specifier 'l',
+   - is optionally followed by 'q' specifier (must come immediately after '%'),
+   - is optionally followed by a size specifier 'l', 'll' or 'w',
    - is optionally followed by '+' (only the specifiers of gcc/cp/error.c),
    - is optionally followed by '#' (only the specifiers of gcc/cp/error.c),
    - is finished by a specifier
 
-       - '%', that needs no argument,
+       - '%', 'm', '<', '>', '\'', that needs no argument,
        - 'c', that needs a character argument,
        - 's', that needs a string argument,
        - 'i', 'd', that need a signed integer argument,
        - 'o', 'u', 'x', that need an unsigned integer argument,
        - '.*s', that needs a signed integer argument and a string argument,
+       - 'p', that needs a pointer argument,
        - 'H', that needs a 'location_t *' argument,
-         [see gcc/diagnostic.c]
+       - 'J', that needs a general declaration argument (%J must be at the
+	      beginning of the format string),
+	 [see gcc/diagnostic.c resp. gcc/pretty-print.c]
 
        - 'D', that needs a general declaration argument,
        - 'F', that needs a function declaration argument,
        - 'T', that needs a type argument,
-         [see gcc/c-objc-common.c and gcc/cp/error.c]
+	 [see gcc/c-objc-common.c (resp. gcc/toplev.c) and gcc/cp/error.c]
+
+       - 'E', that needs an expression argument,
+	 [see gcc/c-objc-common.c (in GCC 4.0+) and gcc/cp/error.c]       
 
        - 'A', that needs a function argument list argument,
        - 'C', that needs a tree code argument,
-       - 'E', that needs an expression argument,
        - 'L', that needs a language argument,
        - 'O', that needs a binary operator argument,
        - 'P', that needs a function parameter argument,
        - 'Q', that needs an assignment operator argument,
        - 'V', that needs a const/volatile qualifier argument.
-         [see gcc/cp/error.c]
+	 [see gcc/cp/error.c]
  */
 
 enum format_arg_type
@@ -78,21 +87,24 @@ enum format_arg_type
   FAT_CHAR		= 2,
   FAT_STRING		= 3,
   FAT_LOCATION		= 4,
-  FAT_TREE		= 5,
-  FAT_TREE_CODE		= 6,
-  FAT_LANGUAGES		= 7,
+  FAT_POINTER		= 5,
+  FAT_TREE		= 6,
+  FAT_TREE_CODE		= 7,
+  FAT_LANGUAGES		= 8,
   /* Flags */
-  FAT_UNSIGNED		= 1 << 3,
-  FAT_SIZE_LONG		= 1 << 4,
-  FAT_TREE_DECL		= 1 << 5,
-  FAT_TREE_FUNCDECL	= 2 << 5,
-  FAT_TREE_TYPE		= 3 << 5,
-  FAT_TREE_ARGUMENT	= 4 << 5,
-  FAT_TREE_EXPRESSION	= 5 << 5,
-  FAT_TREE_CV		= 6 << 5,
-  FAT_TREE_CODE_BINOP	= 1 << 8,
-  FAT_TREE_CODE_ASSOP	= 2 << 8,
-  FAT_FUNCPARAM		= 1 << 10
+  FAT_UNSIGNED		= 1 << 4,
+  FAT_SIZE_LONG		= 1 << 5,
+  FAT_SIZE_LONGLONG	= 2 << 5,
+  FAT_SIZE_WIDE		= 3 << 5,
+  FAT_TREE_DECL		= 1 << 7,
+  FAT_TREE_FUNCDECL	= 2 << 7,
+  FAT_TREE_TYPE		= 3 << 7,
+  FAT_TREE_ARGUMENT	= 4 << 7,
+  FAT_TREE_EXPRESSION	= 5 << 7,
+  FAT_TREE_CV		= 6 << 7,
+  FAT_TREE_CODE_BINOP	= 1 << 10,
+  FAT_TREE_CODE_ASSOP	= 2 << 10,
+  FAT_FUNCPARAM		= 1 << 12
 };
 
 struct unnumbered_arg
@@ -114,6 +126,7 @@ format_parse (const char *format, bool t
 {
   struct spec spec;
   struct spec *result;
+  const char *orig_format = format;
 
   spec.directives = 0;
   spec.unnumbered_arg_count = 0;
@@ -128,15 +141,29 @@ format_parse (const char *format, bool t
 
 	spec.directives++;
 
+	/* Quoted argument.  */
+	if (*format == 'q')
+	  format++;
+
 	/* Parse size.  */
 	size = 0;
 	if (*format == 'l')
 	  {
+	    if (*++format == 'l')
+	      {
+		size = FAT_SIZE_LONGLONG;
+		format++;
+	      }
+	    else
+	      size = FAT_SIZE_LONG;
+	  }
+	else if (*format == 'w')
+	  {
+	    size = FAT_SIZE_WIDE;
 	    format++;
-	    size = FAT_SIZE_LONG;
 	  }
 
-	if (*format != '%')
+	if (strchr ("%m<>'", *format) == NULL)
 	  {
 	    enum format_arg_type type;
 
@@ -148,6 +175,8 @@ format_parse (const char *format, bool t
 	      type = FAT_INTEGER | size;
 	    else if (*format == 'o' || *format == 'u' || *format == 'x')
 	      type = FAT_INTEGER | FAT_UNSIGNED | size;
+	    else if (*format == 'p')
+	      type = FAT_POINTER;
 	    else if (*format == '.' && format[1] == '*' && format[2] == 's')
 	      {
 		if (spec.allocated == spec.unnumbered_arg_count)
@@ -161,6 +190,10 @@ format_parse (const char *format, bool t
 	      }
 	    else if (*format == 'H')
 	      type = FAT_LOCATION;
+	    /* %J must only appear at the beginning, and without
+	       any flags.  */
+	    else if (*format == 'J' && format == orig_format + 1)
+	      type = FAT_TREE | FAT_TREE_DECL;
 	    else
 	      {
 		if (*format == '+')
@@ -316,9 +349,19 @@ format_print (void *descr)
 	printf (" ");
       if (spec->unnumbered[i].type & FAT_UNSIGNED)
 	printf ("[unsigned]");
-      if (spec->unnumbered[i].type & FAT_SIZE_LONG)
-	printf ("[long]");
-      switch (spec->unnumbered[i].type & ~(FAT_UNSIGNED | FAT_SIZE_LONG))
+      switch (spec->unnumbered[i].type & FAT_SIZE_WIDE)
+	{
+	case FAT_SIZE_LONG:
+	  printf ("[long]");
+	  break;
+	case FAT_SIZE_LONGLONG:
+	  printf ("[long long]");
+	  break;
+	case FAT_SIZE_WIDE:
+	  printf ("[wide]");
+	  break;
+	}
+      switch (spec->unnumbered[i].type & ~(FAT_UNSIGNED | FAT_SIZE_WIDE))
 	{
 	case FAT_INTEGER:
 	  printf ("i");
@@ -332,6 +375,9 @@ format_print (void *descr)
 	case FAT_STRING:
 	  printf ("s");
 	  break;
+	case FAT_POINTER:
+	  printf ("p");
+	  break;
 	case FAT_LOCATION:
 	  printf ("H");
 	  break;
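
A note on the size encoding used in the patch above: FAT_SIZE_LONG (1 << 5),
FAT_SIZE_LONGLONG (2 << 5) and FAT_SIZE_WIDE (3 << 5) share a two-bit field,
and since FAT_SIZE_WIDE has both bits set it doubles as the mask for that
field. A minimal standalone sketch, with the enum values copied from the patch
and the helper names size_name and base_type being hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the patched enum format_arg_type.  */
enum
{
  FAT_POINTER       = 5,
  FAT_UNSIGNED      = 1 << 4,
  FAT_SIZE_LONG     = 1 << 5,
  FAT_SIZE_LONGLONG = 2 << 5,
  FAT_SIZE_WIDE     = 3 << 5    /* Also the mask for the size field.  */
};

/* Hypothetical helper: name the size encoded in TYPE, or "" if none.  */
static const char *
size_name (int type)
{
  assert (type >= 0);
  switch (type & FAT_SIZE_WIDE)
    {
    case FAT_SIZE_LONG:
      return "long";
    case FAT_SIZE_LONGLONG:
      return "long long";
    case FAT_SIZE_WIDE:
      return "wide";
    default:
      return "";
    }
}

/* Hypothetical helper: drop the unsigned and size bits, as format_print
   does before its second switch.  Any other flag bits are kept.  */
static int
base_type (int type)
{
  return type & ~(FAT_UNSIGNED | FAT_SIZE_WIDE);
}
```

Testing the bits individually, as the old code did with FAT_SIZE_LONG, would
no longer work here, because "wide" sets the same bit as "long".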

