[Bug libstdc++/20806] basic_filebuf::xsgetn() fails with text mode and DOS line endings and large buffers

dannysmith at users dot sourceforge dot net gcc-bugzilla@gcc.gnu.org
Thu Apr 7 08:36:00 GMT 2005


------- Additional Comments From dannysmith at users dot sourceforge dot net  2005-04-07 08:35 -------

This optimization in basic_filebuf::xsgetn() causes problems on DOS based file 
systems when ifstreams are opened in text mode (\r\n line endings) and the 
user supplied buffer length exceeds _M_buf_size. 

2004-09-13  Paolo Carlini  <pcarlini@suse.de>

	PR libstdc++/11722
	* include/std/std_fstream.h (xsgetn): Declare only.
	* include/bits/fstream.tcc (xsgetn): Define, optimize for the
	always_noconv() case: when __n > __buflen, copy the available
	buffer and issue a direct read.

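The problem is that the native read translates the input \r\n to
\n, returning the number of chars written rather than the number of
bytes consumed. For a text mode file that count can be less than __n
even though EOF has not been reached, so this test: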

fstream.tcc:line 550 
	   if (__len == __n)
	     {
	       _M_set_buffer(0);
	       _M_reading = true;
	     }
 
fails.

Attached is a testcase submitted by a mingw user. The input file,
c:\winnt\schedlog.txt, has DOS line endings and is 32667 bytes in size.
The testcase stops after the first read with:

Is EOF? 1	gcount: 494

The first 512 bytes contain 18 line endings, so the read returns
512 - 18 = 494 chars.
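
To make the short count concrete, here is a minimal sketch of what a
translating read does. text_mode_read() is a hypothetical stand-in written
for this note, not the C runtime's read() and not libstdc++ code; it only
models the \r\n to \n collapse described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of a text-mode native read (an assumption for
// illustration only). It consumes up to `request` raw bytes of `raw`
// starting at `pos`, drops the '\r' of every "\r\n" pair, appends the
// result to `out`, and returns the number of chars written.
std::size_t text_mode_read(const std::string& raw, std::size_t& pos,
                           std::string& out, std::size_t request)
{
  const std::size_t end =
    pos + request < raw.size() ? pos + request : raw.size();
  std::size_t written = 0;
  while (pos < end)
    {
      if (raw[pos] == '\r' && pos + 1 < raw.size() && raw[pos + 1] == '\n')
        {
          ++pos;            // consume the '\r' without writing it
          continue;
        }
      out += raw[pos++];
      ++written;
    }
  return written;
}
```

Fed 40 lines of 26 letters plus \r\n (28 bytes each), a 512-byte request
covers 18 complete line endings and returns 494 while 608 raw bytes are
still unread. A caller that treats "returned less than requested" as end
of input therefore stops early.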

After reverting the xsgetn patch, or disabling the optimization for
text mode files with:

*** fstream.tcc.orig	Sun Jan 30 17:44:23 2005
--- fstream.tcc	Wed Apr  6 21:48:11 2005
*************** namespace std
*** 521,530 ****
         // future: when __n > __buflen we read directly instead of using the
         // buffer repeatedly.
         const bool __testin = _M_mode & ios_base::in;
         const streamsize __buflen = _M_buf_size > 1 ? _M_buf_size - 1
  	                                                 : 1;
         if (__n > __buflen && __check_facet(_M_codecvt).always_noconv()
! 	   && __testin && !_M_writing)
  	 {
  	   // First, copy the chars already present in the buffer.
  	   const streamsize __avail = this->egptr() - this->gptr();
--- 521,531 ----
         // future: when __n > __buflen we read directly instead of using the
         // buffer repeatedly.
         const bool __testin = _M_mode & ios_base::in;
+        const bool __testbinary = _M_mode & ios_base::binary;
         const streamsize __buflen = _M_buf_size > 1 ? _M_buf_size - 1
  	                                                 : 1;
         if (__n > __buflen && __check_facet(_M_codecvt).always_noconv()
! 	   && __testin && __testbinary && !_M_writing)
  	 {
  	   // First, copy the chars already present in the buffer.
  	   const streamsize __avail = this->egptr() - this->gptr();

the testcase produces:

Is EOF? 0	gcount: 512
Is EOF? 0	gcount: 512

 <snip  57 reads of 512 bytes > 

Is EOF? 0	gcount: 512
Is EOF? 1	gcount: 294


Disabling this optimization for non-binary input streams on all platforms is a
bit extreme.   Should this be conditional on an os_defines.h define?
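
As a rough sketch of what such a conditional could look like, the guard
below only adds the binary-mode requirement when a configuration macro is
defined. EXAMPLE_TEXT_MODE_TRANSLATES and use_direct_read() are
hypothetical names invented for this illustration; they are not taken
from os_defines.h or from fstream.tcc.

```cpp
#include <cassert>
#include <ios>

// Hypothetical macro (an assumption, not an existing libstdc++ define).
// A port whose native read translates "\r\n" would define it in its
// os_defines.h:
// #define EXAMPLE_TEXT_MODE_TRANSLATES 1

// Returns true when the direct-read path of the optimization may be taken.
inline bool
use_direct_read(std::ios_base::openmode mode, std::streamsize n,
                std::streamsize buflen, bool always_noconv, bool writing)
{
  const bool testin = (mode & std::ios_base::in) != 0;
#ifdef EXAMPLE_TEXT_MODE_TRANSLATES
  // Translating platforms: only binary streams may bypass the buffer.
  const bool testbinary = (mode & std::ios_base::binary) != 0;
#else
  // Elsewhere text and binary reads are identical, so keep the fast path.
  const bool testbinary = true;
#endif
  return n > buflen && always_noconv && testin && testbinary && !writing;
}
```

With the macro undefined, text mode streams keep the direct read, so
platforms without line-ending translation would lose nothing.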

Danny


Testcase modified from:
https://sourceforge.net/tracker/?func=detail&atid=102435&aid=1171379&group_id=2435

#include <cstdio>   // fprintf, stderr
#include <fstream>
using namespace std;

#define BS 512

int main (void)
{
  char buf [BS+1];
  // change to any DOS text file > BS bytes
  ifstream f ("c:\\winnt\\schedlog.txt") ;
  int r;
  while (!f.eof ())
    {
      f.read (buf, BS);
      r = f.gcount ();
      buf [r] = 0;   // terminate so buf can be printed with %s

      // prints: eof flag, gcount, then the chars read
      fprintf (stderr, "%d %d: %s\n", f.eof (), r, buf);
    }

  return 0;
}
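
For anyone without that particular file, the helper below is one way to
produce a suitable input. It is only a sketch: write_crlf_sample() and
the line contents are made up for this note, and the file is opened in
binary mode so the \r\n pairs are written exactly as given.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper (not part of the original testcase). Writes `lines`
// lines of 26 letters, each terminated by "\r\n", i.e. 28 bytes per line.
// Returns the number of bytes written, or 0 on failure.
std::size_t write_crlf_sample(const char* path, std::size_t lines)
{
  std::ofstream out(path, std::ios_base::out | std::ios_base::binary
                          | std::ios_base::trunc);
  if (!out)
    return 0;
  const std::string line = "abcdefghijklmnopqrstuvwxyz\r\n";
  for (std::size_t i = 0; i < lines; ++i)
    out << line;
  out.flush();
  return out ? lines * line.size() : 0;
}
```

Calling write_crlf_sample("sample.txt", 1200) gives a 33600-byte file
with DOS line endings, comfortably larger than BS; the testcase can then
be pointed at that path.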


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|basic_filebuf::xsgetn()     |basic_filebuf::xsgetn()
                   |fails with text mode and    |fails with text mode and DOS
                   |                            |line endings and large
                   |                            |buffers


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20806


