This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Help : File does not end with newline
- To: gcc at gcc dot gnu dot org, kaih at khms dot westfalen dot de
- Subject: Re: Help : File does not end with newline
- From: dewar at gnat dot com
- Date: Sun, 30 Sep 2001 20:43:29 -0400 (EDT)
<<However, I can't remember seeing a Windows-based editor that did this, or
a Windows-based textfile that had that 0x1A.
>>
Well, there are some. Another test: in my experience, *all* Windows
editors, except those hailing from Unix sources, can at least
*recognize* the 1A. Yes, the 1A came from CP/M, but it was incorporated
into DOS (which, remember, was derived from CP/M), and this convention has
continued in at least half-baked form in the DOS/Windows environment
ever since.
<<No, this is false. Standard Windows textfiles do not contain a terminating
0x1A.
>>
I am not sure there is any defining standard that says what "Standard
Windows textfiles" look like, but in practice all Windows software
recognizes either the hard EOF or a 1A at the end of the file. Quite
a few software components will also recognize a 1A in the middle of the
file, though this is much less universal. Indeed, quite a lot of DOS-based
software is still widely used in Windows environments (e.g.
the SPITBOL/386 compiler :-)
Any software running on Windows that does NOT recognize a terminating 1A
and ignore it seems hostile to me.
<<It is true, however, that various Windows libc variants tend to get upset
at seeing a 0x1A on input, telling upper layers they saw EOF. These days,
this should probably be regarded as a bug.
>>
I think a good compromise is to allow 1A only as the last character in the
file, and then to ignore it and report end of file. A 1A in any other
position can be regarded as a valid character. Note that this is appropriate
for text mode files only, not binary files, of course, but then text files
are usually treated specially anyway, to deal with the CR/LF => NL
translation for Unix purposes.
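
To make the compromise concrete, here is a minimal sketch in Python. It is
only an illustration, not the actual GNAT code, and the names CTRL_Z,
strip_trailing_ctrl_z and normalize_text_bytes are made up for this example.
It assumes the whole file has already been read as raw bytes:

```python
CTRL_Z = b"\x1a"  # the DOS/CP/M "soft" end-of-file marker (hypothetical name)


def strip_trailing_ctrl_z(data: bytes) -> bytes:
    """Drop one 0x1A if, and only if, it is the very last byte.

    A 0x1A anywhere else is left alone and treated as ordinary data.
    """
    if data.endswith(CTRL_Z):
        return data[:-1]
    return data


def normalize_text_bytes(data: bytes) -> bytes:
    """Text-mode handling only: ignore a final 0x1A, then map CR/LF to NL."""
    return strip_trailing_ctrl_z(data).replace(b"\r\n", b"\n")
```

For example, normalize_text_bytes(b"x := 1;\r\n\x1a") yields b"x := 1;\n",
while a 1A in the middle of the data passes through unchanged. Binary files
should not go through either function.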
Certainly we found that making the GNAT compiler do this on source input
files is useful in practice. We do this even in a Unix environment, since
quite often these contaminated 1A files wander from DOS/Windows systems
to Unix systems.