Revised proposal for GCC and non-ASCII source files

Paul Eggert eggert@twinsun.com
Sun Jan 31 23:58:00 GMT 1999


   Date: Fri, 01 Jan 1999 18:20:07 -0400
   From: Horst von Brand <vonbrand@sleipnir.valparaiso.cl>

   It's confusing enough to have to handle different charsets, and then
   different charsets in the same file, switching in the middle! Maybe it
   _can_ be done, but I'd vote it _shouldn't_ be done.

Good point, and this suggests another problem with #ctype: it doesn't
work well with #include.  For example, suppose we have:

	main.c:
		#ctype "ja_JP.PCK" // Shift-JIS in Solaris 7
		#include "myfile.h"
		char s[] = S;

	myfile.h:
		#ctype "ja" // EUC in Solaris 7
		#define S "some EUC string"

This would be difficult to implement, because it requires
mixed-charset translation, which raises many problems of its own
(e.g. how do you concatenate identifiers with different ctypes?).
It would also be nearly impossible to explain -- e.g. should `s' use
Shift-JIS or EUC in the example above?
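
To make the ambiguity concrete, here is a rough sketch of how the same
JIS X 0208 character comes out under the two encodings. The function
names are hypothetical and this is not GCC code; it assumes the input
is a two-byte JIS X 0208 code with both bytes in 0x21..0x7E.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only (not part of GCC).
   j1, j2 are the two 7-bit bytes of a JIS X 0208 code, 0x21..0x7E. */

/* EUC-JP: set the high bit of each byte. */
void jis_to_euc(unsigned char j1, unsigned char j2, unsigned char out[2])
{
	assert(j1 >= 0x21 && j1 <= 0x7E && j2 >= 0x21 && j2 <= 0x7E);
	out[0] = j1 | 0x80;
	out[1] = j2 | 0x80;
}

/* Shift-JIS: two JIS rows share one lead byte. */
void jis_to_sjis(unsigned char j1, unsigned char j2, unsigned char out[2])
{
	unsigned char lead;

	assert(j1 >= 0x21 && j1 <= 0x7E && j2 >= 0x21 && j2 <= 0x7E);
	lead = ((j1 - 0x21) >> 1) + 0x81;
	if (lead > 0x9F)
		lead += 0x40;	/* skip 0xA0..0xDF (half-width katakana) */
	out[0] = lead;
	if (j1 & 1)		/* odd row: trail byte in the lower half */
		out[1] = j2 + (j2 < 0x60 ? 0x1F : 0x20);
	else			/* even row: trail byte in the upper half */
		out[1] = j2 + 0x7E;
}
```

For the kanji with JIS code 0x467C, jis_to_euc yields C6 FC while
jis_to_sjis yields 93 FA. Once S has been expanded into main.c, nothing
in those bytes says which of the two encodings they were written in.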

To avoid these problems, #ctype would have to be allowed only at the
start of the compilation unit.
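
As a minimal sketch of what that restriction might look like, the
function below scans a source buffer and reports the first #ctype line
that follows any other non-blank line. The function name is
hypothetical, and the sketch makes simplifying assumptions: it looks at
a single buffer only, it does not handle comments, and it does not
accept whitespace between `#' and `ctype'.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical check, for illustration only (not part of GCC).
   Return 0 if every "#ctype" line precedes all other non-blank lines;
   otherwise return the 1-based number of the first misplaced one. */
int misplaced_ctype_line(const char *src)
{
	int line = 1;
	int seen_other = 0;

	while (*src) {
		const char *p = src;

		/* Skip leading blanks on this line. */
		while (*p == ' ' || *p == '\t')
			p++;

		if (strncmp(p, "#ctype", 6) == 0
		    && !isalnum((unsigned char) p[6]) && p[6] != '_') {
			if (seen_other)
				return line;
		} else if (*p != '\n' && *p != '\0') {
			seen_other = 1;	/* some other non-blank line */
		}

		/* Advance to the start of the next line. */
		while (*src && *src != '\n')
			src++;
		if (*src == '\n')
			src++;
		line++;
	}
	return 0;
}
```

Applied to the text of main.c after myfile.h has been included, a check
of this kind would reject the second #ctype, since it no longer comes
first in the compilation unit.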

The more I think about the #ctype directive, the less I like it -- it
has so many funny restrictions, and its operands are so unportable.
Perhaps it'd be better if we supported only a -ctype option, at least
at first.  We can add a #ctype directive later if the need arises.


