This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: GCJ manual changed


Joseph S. Myers wrote:

>On 30 Jan 2002, Tom Tromey wrote:
>
>>I don't recall seeing text to that effect in anything I've read.  And
>>I'd be willing to bet that at least some versions of the JDK from Sun
>>don't reject such sequences.  For that matter, we don't reject such
>>sequences.  It's unclear whether we should change our implementation
>>here; this is yet another under-specified aspect of Java.
>>
>
>We ought to reject them (unless it is specifically specified otherwise).  
>Both the Unicode and ISO 10646 standards were changed to disallow
>interpretation (not just generation) of such sequences as representing the
>characters they would appear to represent when a naive UTF-8 decoder is
>used, because of the security issues associated with multiple
>representations.
>
>If there is some way of influencing Java standards, it would be worthwhile
>to represent that the standards should be changed to make it clear such
>over-long sequences must be rejected.
>

The online docs say:

    *Unicode 3.0 Support*
    Character handling in J2SE 1.4 is based on version 3.0 of the
    Unicode standard. This affects the Character and String classes in
    the java.lang package as well as the collation and bidirectional
    text analysis functionality in the java.text package.


So, if the Unicode standard has something to say on the matter, then 
that is what we should implement.
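
For illustration only, here is a rough sketch of what rejecting over-long
sequences could look like in a UTF-8 decoder. It is not taken from libgcj or
the JDK; the class name StrictUtf8 and the method decodeOne are hypothetical,
and the sketch assumes standard (not modified) UTF-8. The idea is to decode
the sequence as a naive decoder would, then refuse the result if the code
point could have been encoded in fewer bytes.

```java
public class StrictUtf8 {
    // Smallest code point that legitimately needs a sequence of N bytes
    // (index N; index 0 is unused).
    private static final int[] MIN_FOR_LENGTH = {0, 0x0, 0x80, 0x800, 0x10000};

    /**
     * Decodes the single UTF-8 sequence starting at in[offset] and returns
     * its code point. Throws IllegalArgumentException for malformed input,
     * including over-long (non-shortest-form) sequences.
     */
    public static int decodeOne(byte[] in, int offset) {
        int b0 = in[offset] & 0xFF;
        int length;
        int codePoint;

        // The lead byte determines the sequence length and the top bits.
        if (b0 < 0x80) {
            length = 1;
            codePoint = b0;
        } else if ((b0 & 0xE0) == 0xC0) {
            length = 2;
            codePoint = b0 & 0x1F;
        } else if ((b0 & 0xF0) == 0xE0) {
            length = 3;
            codePoint = b0 & 0x0F;
        } else if ((b0 & 0xF8) == 0xF0) {
            length = 4;
            codePoint = b0 & 0x07;
        } else {
            throw new IllegalArgumentException("invalid lead byte");
        }

        if (offset + length > in.length) {
            throw new IllegalArgumentException("truncated sequence");
        }

        // Each continuation byte must look like 10xxxxxx and adds six bits.
        for (int i = 1; i < length; i++) {
            int b = in[offset + i] & 0xFF;
            if ((b & 0xC0) != 0x80) {
                throw new IllegalArgumentException("invalid continuation byte");
            }
            codePoint = (codePoint << 6) | (b & 0x3F);
        }

        // The check under discussion: a naive decoder would stop above and
        // accept, say, C0 AF as '/'. A strict decoder rejects it because
        // U+002F fits in one byte.
        if (codePoint < MIN_FOR_LENGTH[length]) {
            throw new IllegalArgumentException("over-long sequence");
        }

        // Surrogates and values beyond U+10FFFF are not valid scalar values.
        if (codePoint > 0x10FFFF || (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
            throw new IllegalArgumentException("invalid code point");
        }
        return codePoint;
    }
}
```

With this sketch, the single byte 2F decodes to U+002F, while the two-byte
form C0 AF, which a naive decoder would also map to U+002F, is rejected.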

regards

Bryce.


