GCJ manual changed
Per Bothner
per@bothner.com
Wed Jan 30 10:40:00 GMT 2002
Joseph S. Myers wrote:
> In that case, the manual should state that references to UTF-8 are to the
> Java dialect meaning rather than the standard Unicode meaning. And there
> still shouldn't be references to "UTF", unqualified, as here - if it means
> some form of UTF-8, it should say so.
We should probably just say UTF-8, but at some convenient point
clarify that we're talking about Java's non-standard UTF-8 variant.
> Does Java define that, except for the special encoding of the null byte,
> over-long sequences must be treated as invalid, to avoid the usual
> security holes associated with them?
I guess. UTF-8 is primarily used to represent string literals and names
in .class files. It is not used at the Java programming level, except
as just another I/O encoding (in which case I assume it means the standard
UTF-8 encoding), when writing native code (either CNI or JNI), or if
explicitly reading/writing class files.
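
A minimal sketch of the null-byte difference, assuming a JDK recent enough to have java.nio.charset.StandardCharsets (Java 7 or later). The class name ModifiedUtf8Demo and its helper methods are made up for illustration. It relies on DataOutputStream.writeUTF, which is documented to emit the same modified UTF-8 used in .class files, preceded by a two-byte length that the sketch strips off.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    // Encode with Java's modified UTF-8 (the class-file variant).
    // writeUTF prefixes a 2-byte length, which is dropped here.
    public static byte[] modifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeUTF(s);
        out.flush();
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Encode with standard UTF-8, as an ordinary I/O encoding would.
    public static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes as space-separated lowercase hex.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String s = "a\u0000b";
        System.out.println(hex(standardUtf8(s)));  // 61 00 62
        System.out.println(hex(modifiedUtf8(s)));  // 61 c0 80 62
    }
}
```

The embedded U+0000 comes out as the single byte 00 in standard UTF-8 but as the two-byte sequence c0 80 in the modified form, which is exactly the over-long encoding a strict standard decoder would reject.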
--
--Per Bothner
per@bothner.com http://www.bothner.com/per/