This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Add input character set infrastructure
- From: neroden at twcny dot rr dot com (Nathanael Nerode)
- To: echristo at redhat dot com, zack at codesourcery dot com, gcc-patches at gcc dot gnu dot org
- Date: Wed, 14 Jan 2004 22:21:38 -0500
- Subject: Re: [PATCH] Add input character set infrastructure
Eric Christopher wrote, in http://gcc.gnu.org/ml/gcc-patches/2004-01/msg01397.html:
>This is cleaning up and adding the infrastructure that was started by
>Chandra a while back. Actually using it is in a future patch :)
>
>OK?
Some nitpicks.
+static int
+one_iso88591_to_utf8 (iconv_t bigend ATTRIBUTE_UNUSED, const uchar
**inbufp,
+ size_t *inbytesleftp, uchar **outbufp, size_t *outbytesleftp)
+{
+ const uchar *inbuf = *inbufp;
^^^^^^^^^^^^^^^^^^
This is a pointer to a 8-bit quantity.
+ int rval;
+
+ if (*inbytesleftp > 1)
+ return EINVAL;
+
+ rval = one_cppchar_to_utf8 (*inbuf, outbufp, outbytesleftp);
^^^^^^
The signature for the above function is
one_cppchar_to_utf8 (cppchar_t c, uchar **outbufp, size_t *outbytesleftp)
cppchar_t is either unsigned int or unsigned char; either way it's 32 bits.
Wouldn't it be clearer to use an explicit cast rather than using the implicit
conversion from uchar to cppchar_t?... that would be more self-documenting,
and would be easier to identify as an error if cppchar_t ever changed.
+ /* Default the input character set to iso-8859-1 for now. */
+ CPP_OPTION (pfile, input_charset) = "ISO-8859-1";
+
Is this really the right *default*? I actually think UTF-8 is a more common
default nowadays. :-/
+ /* Holds the name of the input character set. */
+ const char *input_charset;
+
This looks like the input_charset will only be alterable at compile time,
not at runtime. Is that right?
--
Nathanael Nerode <neroden at gcc.gnu.org>
http://home.twcny.rr.com/nerode/neroden/fdl.html