Bug 15575 - HAVE_LANGINFO_CODESET never defined
Status: RESOLVED FIXED
Product: gcc
Classification: Unclassified
Component: java
Version: unknown
Importance: P2 normal
Target Milestone: 4.0.0
Assigned To: Not yet assigned to anyone
Keywords: patch
Blocks: 17574
Reported: 2004-05-21 20:13 UTC by Tom Tromey
Modified: 2004-11-06 15:49 UTC
Last reconfirmed: 2004-05-21 21:13:06

Description Tom Tromey 2004-05-21 20:13:39 UTC
See this note and its enclosing thread:

http://gcc.gnu.org/ml/gcc/2004-05/msg01090.html

Apparently, HAVE_LANGINFO_CODESET is never defined by configure,
meaning that the user's locale's encoding will never be detected
by gcj.

I believe this is a regression against some earlier version of gcj.
I haven't verified the facts of the report personally.
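
To illustrate why the missing macro matters, here is a minimal sketch of the usual guard pattern. It is not gcj's actual source: the function name default_source_encoding is hypothetical, and returning NULL as the "unknown" value is an assumption made for the example.

```c
#include <locale.h>
#include <stddef.h>
#ifdef HAVE_LANGINFO_CODESET
#include <langinfo.h>
#endif

/* Hypothetical helper: return the name of the locale's character
   encoding, or NULL when it cannot be determined.  */
const char *
default_source_encoding (void)
{
#ifdef HAVE_LANGINFO_CODESET
  /* Adopt the user's locale, then ask the C library for its codeset.  */
  setlocale (LC_CTYPE, "");
  return nl_langinfo (CODESET);
#else
  /* If configure never defines HAVE_LANGINFO_CODESET, this branch is
     the only one compiled and the locale is never consulted.  */
  return NULL;
#endif
}
```

Compiled without -DHAVE_LANGINFO_CODESET, the function always returns NULL, so a caller would fall back to whatever built-in default it has regardless of the user's locale.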
Comment 1 Andrew Pinski 2004-05-21 21:13:06 UTC
Confirmed.
Comment 2 Andrew Pinski 2004-05-22 00:40:41 UTC
Patch here: <http://gcc.gnu.org/ml/gcc-patches/2004-05/msg01414.html>.
Comment 3 Paolo Bonzini 2004-05-22 07:48:22 UTC
Please wait until libcpp is moved to the toplevel before applying that. Otherwise 
the patch is going to break libgfortran, which has 8-bit characters in it (and 
until a few days ago, C and Java did too).

For more information, see http://gcc.gnu.org/ml/gcc/2004-05/msg01007.html and 
the reply at http://gcc.gnu.org/ml/gcc/2004-05/msg01026.html
Comment 4 Paolo Bonzini 2004-06-30 10:02:54 UTC
This can be applied now. 
Comment 5 Bryce McKinlay 2004-10-20 17:52:27 UTC
Do we really want to fix this?

The "buggy" behaviour actually seems better here because it more closely matches
what other Java compilers do and seems to have resulted in fewer complaints from
users since it "broke". 

I propose we close this as WONTFIX and update the documentation to specify that
UTF-8 is the default encoding for input files unless specified otherwise with the
--encoding flag. Comments?
Comment 6 Joseph S. Myers 2004-10-20 17:59:49 UTC
Subject: Re:  HAVE_LANGINFO_CODESET never defined

On Wed, 20 Oct 2004, mckinlay at redhat dot com wrote:

> Do we really want to fix this?
> 
> The "buggy" behaviour actually seems better here because it more closely matches
> what other Java compilers do and seems to have resulted in fewer complaints from
> users since it "broke". 
> 
> I propose we close this as WONTFIX and update the documentation to specify that
> UTF-8 is the default encoding for input files unless specified otherwise with the
> --encoding flag. Comments?

I don't know what is best for Java, but for the C compiler POSIX specifies 
use of locale to determine the encoding of source files.  In addition, if 
HAVE_LANGINFO_CODESET were set properly then people using UTF-8 locales 
would get proper quotes in error messages.  If particular languages do not 
want this or don't work with it at present, they need not use the locale 
for source files, but the configure test should go in for the use of 
diagnostics if nothing else.

I understand Zack has proposals for changes to cpplib which would mean 
that for well-behaved locale character sets (supersets of ASCII, roughly) 
stray invalid characters in comments can be ignored rather than causing an 
error through not being in the locale character set (and speed up cpplib 
by not needing to pass most of most files through iconv).
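
The point about quotes in diagnostics can be sketched as follows. This is only an illustration under stated assumptions, not GCC's code: the struct quote_pair type and quotes_for_codeset function are hypothetical, the curly quotes are assumed to be U+2018 and U+2019, the ASCII fallback is assumed to be a plain apostrophe, and only the exact codeset name "UTF-8" is recognised.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pair of opening and closing quote strings.  */
struct quote_pair
{
  const char *open;
  const char *close;
};

/* Pick quotes for diagnostics from a codeset name such as the one
   returned by nl_langinfo (CODESET).  NULL means "codeset unknown".  */
struct quote_pair
quotes_for_codeset (const char *codeset)
{
  /* Assumed ASCII-safe fallback.  */
  struct quote_pair q = { "'", "'" };

  if (codeset != NULL && strcmp (codeset, "UTF-8") == 0)
    {
      q.open = "\xe2\x80\x98";   /* U+2018 encoded as UTF-8 */
      q.close = "\xe2\x80\x99";  /* U+2019 encoded as UTF-8 */
    }
  return q;
}
```

If the codeset is never detected, a caller of this kind can only ever pass NULL, so users in UTF-8 locales get the fallback quotes.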

Comment 7 Tom Tromey 2004-10-20 18:03:17 UTC
My understanding is that other Java compilers do use the locale's
default encoding.  However, unlike the glibc iconv() converter, 
javac typically treats ASCII as equivalent to Latin 1.
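
The iconv() strictness mentioned here can be demonstrated with a small sketch. The helper name to_utf8 is hypothetical, and the behaviour described assumes an iconv implementation such as glibc's that accepts the codeset names "ASCII", "ISO-8859-1" and "UTF-8".

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: convert LEN bytes of SRC from FROM_CODE to UTF-8.
   Returns the number of bytes written to DST, or -1 on failure with
   errno set by iconv_open or iconv.  */
long
to_utf8 (const char *from_code, const char *src, size_t len,
         char *dst, size_t dst_size)
{
  iconv_t cd = iconv_open ("UTF-8", from_code);
  if (cd == (iconv_t) -1)
    return -1;

  char *in = (char *) src;   /* iconv takes a non-const pointer */
  char *out = dst;
  size_t in_left = len;
  size_t out_left = dst_size;

  size_t rc = iconv (cd, &in, &in_left, &out, &out_left);
  int saved_errno = errno;   /* preserve the error across iconv_close */
  iconv_close (cd);

  if (rc == (size_t) -1)
    {
      errno = saved_errno;
      return -1;
    }
  return (long) (dst_size - out_left);
}
```

With this helper, the single byte 0xE9 converts successfully when from_code is "ISO-8859-1", but the same byte is rejected with EILSEQ when from_code is "ASCII". A source file containing Latin 1 characters therefore fails to convert under a plain ASCII locale unless ASCII is treated as Latin 1.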
Comment 8 Bryce McKinlay 2004-10-20 18:10:22 UTC
Forget what I said, Tom is right. I just tested this again, and javac from JDK
1.5 does indeed use the locale setting to determine the default encoding.
Furthermore, javac does appear to distinguish between ASCII and Latin 1 now. I
will re-test the patch and ping it to gcc-patches.
Comment 9 CVS Commits 2004-10-20 21:36:52 UTC
Subject: Bug 15575

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	bryce@gcc.gnu.org	2004-10-20 21:36:48

Modified files:
	gcc            : ChangeLog configure.ac aclocal.m4 configure 
	                 config.in 

Log message:
	2004-10-20  Bryce McKinlay  <mckinlay@redhat.com>
	
	PR java/15575
	* configure.ac: Declare AM_LANGINFO_CODESET.
	* aclocal.m4: Define AM_LANGINFO_CODESET.
	* configure, config.in: Rebuilt.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.5960&r2=2.5961
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/configure.ac.diff?cvsroot=gcc&r1=2.77&r2=2.78
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/aclocal.m4.diff?cvsroot=gcc&r1=1.98&r2=1.99
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/configure.diff?cvsroot=gcc&r1=1.868&r2=1.869
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config.in.diff?cvsroot=gcc&r1=1.199&r2=1.200
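
For context, a configure check of this kind typically compiles and links a tiny probe and defines HAVE_LANGINFO_CODESET only if that succeeds. The sketch below shows roughly what such a probe exercises. It is an assumption about the general shape of the test, not the text of the macro committed here, and the function name langinfo_codeset_probe is hypothetical.

```c
#include <langinfo.h>

/* Hypothetical probe: succeeds only where <langinfo.h> exists and
   nl_langinfo (CODESET) compiles and links.  Returns 1 when a codeset
   string is available.  */
int
langinfo_codeset_probe (void)
{
  char *cs = nl_langinfo (CODESET);
  return cs != 0;
}
```

On a system where this builds, the generated config.h would be expected to carry a definition of HAVE_LANGINFO_CODESET, enabling the guarded code paths that depend on it.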

Comment 10 Bryce McKinlay 2004-10-20 21:38:26 UTC
Fix checked in.