This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: Input charsets - What's going on?

From: Tom Tromey <tromey at redhat dot com>
To: Zack Weinberg <zack at codesourcery dot com>
Cc: GCC Development <gcc at gcc dot gnu dot org>,Eric Christopher <echristo at redhat dot com>
Date: 20 May 2004 16:29:04 -0600
Subject: Re: Input charsets - What's going on?
References: <40AB8CF9.1050906@gnu.org> <878yfogo88.fsf@codesourcery.com>
Reply-to: tromey at redhat dot com

>>>>> "Zack" == Zack Weinberg <zack@codesourcery.com> writes:

Zack> Paolo Bonzini <bonzini@gnu.org> writes:
>> cppcharset.c's _cpp_default_encoding uses nl_langinfo (CODESET) to find
>> the default input charset.  This usage is guarded by a configure
>> script symbol, HAVE_LANGINFO_CODESET.  Too bad, in the current GCC
>> sources, HAVE_LANGINFO_CODESET is never defined and is not even in the
>> auto-host.h template.  This means that GCC never attempts any input
>> charset conversion except if -finput-charset is given.

Zack> Oops.  I had thought that was there, because Java uses the same
Zack> construct...

This worked at one point.  I did a little research and found a patch
that moved AM_LANGINFO_CODESET to config/gettext.m4.  I'm sure
AM_LANGINFO_CODESET did the appropriate check, but its invocation
probably got removed in some later reorganization (which I didn't look
for).

We don't have a test for this functionality AFAIK, which explains how
it would go unnoticed.  It's probably a pain to write one, since
encoding names aren't standardized across platforms (and since iconv
isn't available everywhere, we would need to explain that to the test
suite as well).

Zack> - There may be another common convention over in Java land, which
Zack>   would be friendly to support.

Nope, in Java sources the standard is that everything compiled by a
single invocation of the compiler has the same encoding.  Java doesn't
have the header file problem, since already-compiled code comes in a
portable binary format whose internal encodings are predefined.

Tom

References:
- Input charsets - What's going on?
  - From: Paolo Bonzini
- Re: Input charsets - What's going on?
  - From: Zack Weinberg
