This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: Universal Character Names, v2


Neil Booth <neil@daikokuya.co.uk> writes:

> I've had more thoughts about arbitrary charsets.  Rather than converting
> to UTF-8 on a per-character basis, the obvious place is to convert
> a line-at-a-time from the new-line handler (plus a call when starting
> a buffer to get the process started).  

Would there be anything wrong with converting the entire *file*?

Some encodings have shift states that can extend beyond the end of a
line (although I think many encodings discourage this), so you might
have difficulty recognizing the line end before performing the
charset conversion.
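
As a rough illustration of that ordering problem (not taken from the
patch under discussion), here is a small Python sketch.  It uses
UTF-16LE as a stand-in: it is not a shift-state encoding, but it shows
the same kind of difficulty, because the byte 0x0A occurs inside an
ordinary character and so looking for line ends in the unconverted
bytes gives the wrong answer.  The sample text and the variable names
are assumptions made up for the example.

```python
# Hypothetical input: one line of text, "a", U+010A, "b", then a newline,
# encoded as UTF-16LE.  U+010A encodes to the bytes 0x0A 0x01, so a raw
# newline byte appears in the middle of the line.
text = "a\u010ab\n"
data = text.encode("utf-16-le")

# Approach 1: look for line ends in the unconverted bytes.
# This cuts U+010A in half and also splits the real newline (0x0A 0x00).
raw_lines = data.split(b"\n")

# Approach 2: convert the whole buffer first, then look for line ends.
decoded_lines = data.decode("utf-16-le").splitlines()

print(len(raw_lines))      # number of pieces found in the raw bytes
print(len(decoded_lines))  # number of lines in the converted text
```

Converting the whole buffer first sidesteps the question of where a
line ends in the source encoding, at the cost of holding the converted
file in memory.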

Regards,
Martin

