[Bug ada/66390] New: Text_IO.Get_Line does not correctly handle missing line marker for last line in all cases
tornenvi at gmail dot com
gcc-bugzilla@gcc.gnu.org
Wed Jun 3 06:51:00 GMT 2015
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66390
Bug ID: 66390
Summary: Text_IO.Get_Line does not correctly handle missing
line marker for last line in all cases
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: ada
Assignee: unassigned at gcc dot gnu.org
Reporter: tornenvi at gmail dot com
Target Milestone: ---
Created attachment 35685
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35685&action=edit
Program file
For a "non-canonical" file, as described in the reference manual at
gcc\ada\doc\gnat_rm\the_implementation_of_standard_i_o.html#text-io,
that has no line marker at the end of the file, Get_Line can return an
incorrect length.
Note that such a "non-canonical" file is a "standard" file on Windows, so
this use case is common.
The attached program, FileRead.adb, simply reads the first line of a
text file into a 6-character buffer and outputs the length of the line
read, the line, and the line again as hex.
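For readers without access to the attachment, the following is a minimal
sketch of a program matching that description. It is a reconstruction, not
the attached file: the 'X' pre-fill is inferred from the outputs in this
report, and the names Hex_Digits, Line and Last are assumptions.

```ada
with Ada.Text_IO; use Ada.Text_IO;

--  Hypothetical reconstruction of the reproducer; NOT the attached file.
procedure FileRead is
   Hex_Digits : constant String := "0123456789abcdef";

   F    : File_Type;
   --  Pre-fill with 'X' so positions Get_Line did not assign stay visible.
   Line : String (1 .. 6) := (others => 'X');
   Last : Natural;
begin
   Open (F, In_File, "Test.txt");
   Get_Line (F, Line, Last);   --  read the first line into the buffer
   Close (F);

   Put_Line ("Length=" & Natural'Image (Last));
   Put_Line ("Line=[" & Line & "]");

   --  Dump every buffer position, assigned or not, as two hex digits.
   Put ("Line as hex=[");
   for I in Line'Range loop
      Put (Hex_Digits (Character'Pos (Line (I)) / 16 + 1));
      Put (Hex_Digits (Character'Pos (Line (I)) mod 16 + 1));
   end loop;
   Put_Line ("]");
end FileRead;
```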
I compile FileRead.adb with
gnatmake FileRead.adb
I use Windows Notepad to create the test file Test.txt. Note that
Windows Notepad does not place an end-of-line marker at the end of
the file.
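To create the same kind of file without Notepad (for example on another
platform), a small helper along these lines should work. This is only a
sketch: the procedure name Make_Test is hypothetical, and Stream_IO is used
because Text_IO would add a line terminator when the file is closed.

```ada
with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;

--  Hypothetical helper: writes Test.txt with no trailing line marker.
procedure Make_Test is
   F : File_Type;
begin
   Create (F, Out_File, "Test.txt");
   --  String'Write emits only the characters: no bounds, no line terminator.
   String'Write (Stream (F), "ABCDE");
   Close (F);
end Make_Test;
```

Changing the literal to "ABCD" gives the file used for output 1.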
When using Test.txt with the one line of text
ABCD
with no blank line at the end of the file I get output 1 below
-- Output 1 ------------------------------------------------------
Length= 4
Line=[ABCDXX]
Line as hex=[414243445858]
------------------------------------------------------------------
as expected. However if Test.txt is the one line of text
ABCDE
with no blank line at the end of the file then I get output 2 below
-- Output 2 -----------------------------------------------------
Length= 6
Line=[ABCDE ]
Line as hex=[4142434445ff]
-----------------------------------------------------------------
The reported length is 6, but I expected 5. The fact that the last two
characters are not 'XX' is acceptable since ARM A.10.7 19 says
"The values of characters not assigned are not specified."
I get the same output 2 if I compile the program with
GNAT GPL 2014, 2013, 2012 or 2011, or with mingw FSF GNAT 4.7.0-1 or 4.8.1-4.
However, if I compile with mingw FSF GNAT 4.5.2-1, 4.5.0-1, 4.4.0 or
3.4.5, then I get the expected output 3 below
-- Output 3 -----------------------------------------------------
Length= 5
Line=[ABCDEX]
Line as hex=[414243444558]
-----------------------------------------------------------------
Under Windows the one-line text file does not have an end-of-line
marker (usually CRLF), which means it is not a canonical text file
as described here
gcc\ada\doc\gnat_rm\the_implementation_of_standard_i_o.html#text-io
However, that section also states that Text_IO can be used to read
non-canonical text files, and the very last line of that section says
"Every LF character is considered to end a line, and there is an
implied LF character at the end of the file."
This indicates to me that there should be an implied LF at the end of the
file and that I should be getting the expected output 3 result with
all the compilers.
The program works correctly when the line in Test.txt has 4 or fewer, or
6 or more, characters.
I believe the problem is in
\gcc\ada\a-tigeli.adb
at line 187 a comment says
-- If we get EOF after already reading data, this is an incomplete
-- last line, in which case no End_Error should be raised.
However, at line 193 there is no test for whether the last character
read, 'ch', is the EOF marker; instead, the code adds it to the Item
buffer and increments the length.
I suggest line 193 be changed from
elsif ch /= LM then
to
elsif ch /= LM and ch /= EOF then