This is the mail archive of the
mailing list for the GCC project.
Using Unicode quotes (was: Re: Ada files now checked in)
- To: <dewar at gnat dot com>
- Subject: Using Unicode quotes (was: Re: Ada files now checked in)
- From: "Joseph S. Myers" <jsm28 at cam dot ac dot uk>
- Date: Sun, 7 Oct 2001 15:22:45 +0100 (BST)
- cc: <gcc at gcc dot gnu dot org>, <zack at codesourcery dot com>
On Sun, 7 Oct 2001 email@example.com wrote:
> The character ' is, by the way, an apostrophe, not a quote. The normal
> English use is in possessives, and there is a special rule about using
> it for nested quotations in place of normal quote marks.
> Note that this is not completely idle discussion; the GNAT message insertions
> do use quotations as in:
> j.adb:3:04: "xyz" is undefined
> compared to the C message
> j.c:1: `asdf' undeclared (first use this function)
For all non-English languages, the problem can simply be solved by having
the .po message catalogs be UTF-8 encoded, and using the proper Unicode
quotes (U+201C and U+201D for double quotes or U+2018 and U+2019 for
single quotes) since gettext will transliterate when converting to other
locale character sets (at least with glibc 2.2).
For English, messages still look prettier (given a UTF-8 terminal, e.g. a
recentish xterm with appropriate options and fonts) if Unicode quotes are
used. Since the output must use ASCII double quotes when LC_CTYPE is not a
UTF-8 locale, but Unicode quotes where available, and since knowledge of
quotes should not be hardcoded everywhere, this suggests adding printf
modifiers to handle quoting to GCC's extensible printf reimplementation.
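
As a rough illustration of the LC_CTYPE-dependent choice described above,
here is a minimal sketch in C. The names quote_pair and quotes_for_codeset
are hypothetical and not part of GCC; the sketch assumes the caller obtains
the character set name from nl_langinfo (CODESET) after calling
setlocale (LC_CTYPE, ""), and that a UTF-8 locale reports it as "UTF-8".

```c
#include <string.h>

/* Hypothetical helper type: the opening and closing quote strings. */
struct quote_pair {
    const char *open;
    const char *close;
};

/* Pick a quote pair for the given character set name.  The name is
   assumed to be what nl_langinfo (CODESET) returns for the current
   LC_CTYPE; anything other than "UTF-8" falls back to ASCII.  */
static struct quote_pair
quotes_for_codeset (const char *codeset)
{
    struct quote_pair q;

    if (codeset != NULL && strcmp (codeset, "UTF-8") == 0) {
        q.open = "\xe2\x80\x9c";    /* U+201C, encoded as UTF-8 */
        q.close = "\xe2\x80\x9d";   /* U+201D, encoded as UTF-8 */
    } else {
        q.open = "\"";              /* ASCII double quote */
        q.close = "\"";
    }
    return q;
}
```

Taking the character set name as a parameter, rather than querying the
locale inside the function, keeps the decision in one place and makes it
easy to exercise without changing the process locale.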
The common case of quoting is simply `%s', but the C++ front end also
quotes some longer strings with multiple conversions. Perhaps the flags
should be ` for an open quote to appear before the converted output and '
for a close quote to appear afterwards; the individual left and right
quotes would be translated once in each .po file, and there would be
special-case handling for when no message catalog is being used to produce
Unicode or ASCII quotes according to the value of LC_CTYPE.
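
To make the proposed flags concrete, here is a small sketch in C of a
formatter that accepts ` and ' between the % and the conversion. It is not
GCC's printf reimplementation; format_quoted and append are hypothetical
names, only %s, %d and %% are handled, and the quote strings are passed in
by the caller (in the proposal they would come from the .po file or from
the LC_CTYPE special case).

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append SRC to BUF (capacity CAP > 0, current length *LEN),
   truncating silently if it does not fit.  */
static void
append (char *buf, size_t cap, size_t *len, const char *src)
{
    while (*src != '\0' && *len + 1 < cap)
        buf[(*len)++] = *src++;
    buf[*len] = '\0';
}

/* Format FMT into BUF.  A ` flag emits OPEN_Q before the converted
   output; a ' flag emits CLOSE_Q after it.  */
static void
format_quoted (char *buf, size_t cap, const char *open_q,
               const char *close_q, const char *fmt, ...)
{
    va_list ap;
    size_t len = 0;

    if (cap == 0)
        return;
    buf[0] = '\0';

    va_start (ap, fmt);
    while (*fmt != '\0') {
        int want_open = 0, want_close = 0;
        char tmp[32];

        if (*fmt != '%') {
            /* Ordinary character: copy it through unchanged.  */
            tmp[0] = *fmt++;
            tmp[1] = '\0';
            append (buf, cap, &len, tmp);
            continue;
        }
        fmt++;                          /* skip the '%' */

        /* Collect the quoting flags, in any order.  */
        for (;; fmt++) {
            if (*fmt == '`')
                want_open = 1;
            else if (*fmt == '\'')
                want_close = 1;
            else
                break;
        }

        if (want_open)
            append (buf, cap, &len, open_q);

        if (*fmt == 's') {
            const char *s = va_arg (ap, const char *);
            if (s != NULL)
                append (buf, cap, &len, s);
        } else if (*fmt == 'd') {
            snprintf (tmp, sizeof tmp, "%d", va_arg (ap, int));
            append (buf, cap, &len, tmp);
        } else if (*fmt == '%') {
            append (buf, cap, &len, "%");
        } else {
            break;                      /* unsupported conversion: stop */
        }
        fmt++;                          /* skip the conversion character */

        if (want_close)
            append (buf, cap, &len, close_q);
    }
    va_end (ap);
}
```

With this sketch, the common case `%s' would be written as %`'s, while a
longer quoted span built from several conversions would put ` on the first
conversion and ' on the last, for example "%`s::%'s is private".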
Joseph S. Myers