Patch: libcpp -vs- UTF-8 BOM

Paolo Bonzini bonzini@gnu.org
Thu Apr 17 20:56:00 GMT 2008


> Paolo> The comment could instead remind the reader that SOURCE_CHARSET
> Paolo> *is* "UTF-8" if the host charset is ASCII, and that's why we
> Paolo> look at the BOM as UTF-8.
> 
> Thanks.  This idiom is used in a few spots in the file, but I don't mind
> adding a comment to make it clearer.  How does this sound to you?
> 
>   /* The HOST_CHARSET test just above ensures that the source charset
>      is UTF-8.  So, ignore a UTF-8 BOM if we see one.  Note that
>      glibc's UTF-8 iconv() provider (as of glibc 2.7) does not ignore a
>      BOM -- however, even if it did, we would still need this code due
>      to the 'convert_no_conversion' case.  */

Great.
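
For readers following along, the check that the proposed comment documents
can be sketched roughly as below.  This is an illustration only, not the
code from the patch: skip_utf8_bom is a hypothetical helper name, and the
sketch assumes the caller has already established that the source charset
is UTF-8 (the HOST_CHARSET test mentioned in the comment).

```c
#include <stddef.h>

/* Hypothetical helper, not the actual libcpp code.  Return the number
   of leading bytes to skip if BUF (of length LEN) begins with the
   UTF-8 encoding of U+FEFF, i.e. the bytes EF BB BF; otherwise
   return 0.  The caller is assumed to know the buffer is UTF-8.  */
static size_t
skip_utf8_bom (const unsigned char *buf, size_t len)
{
  if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
    return 3;
  return 0;
}
```

A caller would advance its read pointer by the returned count before
lexing.  Doing the check on the raw buffer, rather than relying on the
converter, is what lets it also cover the case where no conversion is
performed at all.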

>>> libcpp/ChangeLog:
> 
> Paolo> Missing entry for charset.c.
> 
> Thanks for catching this.  I'm pretty sure I would not have noticed.

I went headlong doing "less files.c" based on the ChangeLog and, well, 
did not find SOURCE_CHARSET anywhere. :-)

Paolo


