Bug 21286 - [3.4/4.0/4.1 Regression] filebuf::xsgetn vs pipes
Summary: [3.4/4.0/4.1 Regression] filebuf::xsgetn vs pipes
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 4.0.0
: P2 normal
Target Milestone: 4.0.1
Assignee: Paolo Carlini
URL:
Keywords:
: 27168 (view as bug list)
Depends on:
Blocks:
 
Reported: 2005-04-29 15:54 UTC by Ralf Fassel
Modified: 2006-04-14 19:31 UTC (History)
4 users (show)

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:


Attachments
C++ source file to show the problem (845 bytes, text/plain)
2005-04-29 15:56 UTC, Ralf Fassel
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ralf Fassel 2005-04-29 15:54:09 UTC
Attach stdin (or any other FILE*) to an iostream via the rdbuf() created from a
FILE* by using the GNU extension stdio_filebuf() (see attached C++-File).

Input is just some big file with +11MB:

% wc -c big_file
11601584 big_file

This reads incorrect number of bytes when stdin is a pipe and using an block
size which is not some power of 2. The number of bytes read differs from run to run.
% cat big_file | ./t400 8687
total bytes read 9347072, blocks 1076
% cat big_file | ./t400 8687
total bytes read 978944, blocks 113
% cat big_file | ./t400 8687
total bytes read 8679424, blocks 1000
% cat big_file | ./t400 8687
total bytes read 3424256, blocks 395
% cat big_file | ./t400 8687
total bytes read 11601584, blocks 1336
% cat big_file | ./t400 2048
total bytes read 11601584, blocks 5665
% cat big_file | ./t400 2048
total bytes read 11601584, blocks 5665

No errors when stdin is redirected directly to the file.  

% ./t400 8676 < big_file
total bytes read 11601584, blocks 1338
% ./t400 8676 < big_file
total bytes read 11601584, blocks 1338
% ./t400 8676 < big_file
total bytes read 11601584, blocks 1338
% ./t400 8676 < big_file
total bytes read 11601584, blocks 1338

Also no problems when reading in 2048 or 4096 blocks:

% cat big_file | ./t400 2048
total bytes read 11601584, blocks 5665
% cat big_file | ./t400 2048
total bytes read 11601584, blocks 5665
% cat big_file | ./t400 2048
total bytes read 11601584, blocks 5665

% cat big_file | ./t400 4096
total bytes read 11601584, blocks 2833
% cat big_file | ./t400 4096
total bytes read 11601584, blocks 2833
% cat big_file | ./t400 4096
total bytes read 11601584, blocks 2833

Same problem on HP-UX 10.20 with gcc-4.0.0 (and 3.4.3 if it matters).

The same program runs w/o problems in gcc-3.4.1 and gcc-3.3.3 on the same platforms.

I actually had a shell loop trying block sizes from 1 to 10000 to find the
problematic ones, so if 8687 does not reproduce the bug, try the follwoing shell
loop:

i=8000
while test $i -lt 9000 ; do
  echo -e "$i: \\c"
  cat big_file | ./t400 $i
  i=`echo $i+1 | bc`
done


R'
Comment 1 Ralf Fassel 2005-04-29 15:56:24 UTC
Created attachment 8765 [details]
C++ source file to show the problem

g++ -o t400 t.cxx
usage: t400 [blocksize]

t400 reads all input and counts the bytes.  At EOF, a summary is printed.  The
total bytes read should match the input file size.
Comment 2 Ralf Fassel 2005-04-29 15:58:46 UTC
Comment on attachment 8765 [details]
C++ source file to show the problem

The last paragraph in the attachted file is wrong, the bug is reproducable on
linux now.
Comment 3 Paolo Carlini 2005-04-29 17:28:39 UTC
Indeed, this is easily explained considering that we have a new optimization
that issues larger underlying read sys-calls (have a look to xsgetn in fstream.tcc
to see what I mean): when everything goes well ;) are you noticing measurable
performance improvements? When pipes are involved, I'm not sure we can call this
a bug, because, when 27.6.1.3/28 talks about "end-of-file occurs on the input
sequence" it refers, in this case, not to the underlying file that you are piping
but to the pipe itself and, at that moment, indeed the pipe cannot make available
all the characters requested. I'd like to know your opinion, as a user: are you
noticing worthwhile performance improvements? Would you consider very annoying
trying to read again (calling clear()), when pipes are used? In case, disabling
the optimization to be safe would be very easy.
Comment 4 Paolo Carlini 2005-04-29 21:31:32 UTC
Ok, I have asked privately Nathan Myers and what we need is rather simple: add
looping on short reads inside filebuf::sgetn, rather straightforward to
implement. Stay tuned!
Comment 5 ncm-nospam@cantrip.org 2005-04-30 02:11:44 UTC
Just to add... if sgetn() loops reading until it gets n bytes, but 
underflow() accepts a pipe's short reads, then in_avail() will report 
the size of the short read.  Then, istream::read_some() will 
rdbuf->sgetn(p,rdbuf->in_avail()), thus matching POSIX pipe semantics 
when attached to a pipe.

(For those who don't know: as long as whoever is writing the
other end of the pipe writes no more than PIPE_BUF bytes into 
it, then POSIX requires that read() deliver the bytes on write 
boundaries.  Thus, if you're writing structs, the reader will 
see half a struct.  If you're writing short-enough lines, the 
reader will only ever see complete lines -- although you might
see more than one.  Watch out that PIPE_BUF is 4K or more 
on Linux, but only 512 bytes on BSD!)

It's exciting to finally get this right.  With a bit more work, 
I see now that we might be able to support nonblocking file 
descriptors, and maybe even zero-copy I/O, too!  Thank you, Ralf.
Comment 6 ncm-nospam@cantrip.org 2005-04-30 03:49:11 UTC
(In reply to comment #5)
> ...  Thus, if you're writing structs, the reader will 
> see half a struct.  

Sorry, that's "will never see half a struct".

Comment 7 GCC Commits 2005-04-30 06:54:18 UTC
Subject: Bug 21286

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	paolo@gcc.gnu.org	2005-04-30 06:54:08

Modified files:
	libstdc++-v3   : ChangeLog 
	libstdc++-v3/include/bits: fstream.tcc 

Log message:
	2005-04-29  Paolo Carlini  <pcarlini@suse.de>
	Nathan Myers  <ncm@cantrip.org>
	
	PR libstdc++/21286
	* include/bits/fstream.tcc (basic_filebuf<>::xsgetn):
	Loop on short reads; remove the work-around for
	libstdc++/20806, not needed anymore.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&r1=1.2986&r2=1.2987
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/bits/fstream.tcc.diff?cvsroot=gcc&r1=1.129&r2=1.130

Comment 8 Ralf Fassel 2005-04-30 14:16:26 UTC
Paolo Carlini wrote:
> I'd like to know your opinion, as a user: are you
> noticing worthwhile performance improvements? 
> Would you consider very annoying trying to read again
> (calling clear()), when pipes are used? 

The issue seems already solved, but I'd like to add a comment to this question:
Yes, I would find it very annoying, since my program simply does not know
whether  a file descriptor is connected to a pipe or not.  Note that I have oly
checked for eof() im my program, so the above solution basically means: on
eof(), clear() and try to read again.  How often should I try this?  Etc.

Performance improvement or not, the stream should (must?) give the same results
regardless of the data source IMHO.  Performance can only be second here.

BTW, the program timing is as follows on my machine (reading from local file
system /tmp):

gcc 3.3.3: 
0.01user 0.01system 0:00.08elapsed 31%CPU (0avgtext+0avgdata 0maxresident)k
0.01user 0.00system 0:00.08elapsed 23%CPU (0avgtext+0avgdata 0maxresident)k
0.01user 0.01system 0:00.11elapsed 25%CPU (0avgtext+0avgdata 0maxresident)k
0.01user 0.01system 0:00.08elapsed 27%CPU (0avgtext+0avgdata 0maxresident)k

gcc 4.0.0 (when reading all bytes from the pipe, not with the short reads)
0.00user 0.01system 0:00.04elapsed 38%CPU (0avgtext+0avgdata 0maxresident)k
0.00user 0.01system 0:00.04elapsed 48%CPU (0avgtext+0avgdata 0maxresident)k
0.00user 0.01system 0:00.04elapsed 27%CPU (0avgtext+0avgdata 0maxresident)k
0.00user 0.01system 0:00.04elapsed 35%CPU (0avgtext+0avgdata 0maxresident)k

So yes, there seems to be a performance improvement (1.x CPU, half wall clock time).

Best regards
R'
Comment 9 ncm-nospam@cantrip.org 2005-04-30 15:48:57 UTC
A note for Ralf: It is incorrect to use cin.eof() to watch for the end
of a stream.  The correct flag to check is fail().  eof() is really 
meant for seeing if that's why op>> failed.  
Comment 10 GCC Commits 2005-04-30 22:55:50 UTC
Subject: Bug 21286

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-4_0-branch
Changes by:	paolo@gcc.gnu.org	2005-04-30 22:55:34

Modified files:
	libstdc++-v3   : ChangeLog 
	libstdc++-v3/include/bits: fstream.tcc 

Log message:
	2005-04-30  Paolo Carlini  <pcarlini@suse.de>
	Nathan Myers  <ncm@cantrip.org>
	
	PR libstdc++/21286
	* include/bits/fstream.tcc (basic_filebuf<>::xsgetn):
	Loop on short reads; remove the work-around for
	libstdc++/20806, not needed anymore.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=1.2917.2.35&r2=1.2917.2.36
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/bits/fstream.tcc.diff?cvsroot=gcc&only_with_tag=gcc-4_0-branch&r1=1.128.10.1&r2=1.128.10.2

Comment 11 Paolo Carlini 2005-04-30 22:57:20 UTC
Should now be fixed for 4.1.0 and 4.0.1. Ralf, if you get a chance to test a
4.0.1 snapshot...
Comment 12 Ralf Fassel 2005-05-01 17:44:09 UTC
ncm-nospam@cantrip.org:
> A note for Ralf: It is incorrect to use cin.eof() 
> to watch for the end of a stream.  
> The correct flag to check is fail().  
> eof() is really meant for seeing if that's why op>> failed.  

Yup.  Thanks for pointing that out.  As for the 4.0.1 snapshot, is there a
specific place where to get it, or should I just use the CVS links on
http://www.gnu.org/software/gcc/cvs.html?

R'
Comment 13 Paolo Carlini 2005-05-01 18:12:33 UTC
You can find weekly snapshots both of mainline and 4_0 here:

  ftp://gcc.gnu.org/pub/gcc/snapshots/
Comment 14 Ralf Fassel 2005-05-01 19:12:42 UTC
>  ftp://gcc.gnu.org/pub/gcc/snapshots/

Just tried gcc-4.0-20050430: the problem is gone.
However, I could not reintroduce the problem by just reverting the patch for
fstream.cc (still works after I undo the patch), but I think this is not
important.  Maybe there were other changes as well which prevent the problem
from manifesting in gcc-4.0-20050430.

Best regards and thanks for the quick solution.
R'
Comment 15 Paolo Carlini 2005-05-01 19:28:56 UTC
OK, thanks. Maybe you cannot see the problem by just reverting the patch,
because at least part of the data remained in the cache: I found your procedure
very useful to expose the problem but very sensitive to that detail. If you can,
please keep on testing 4.0.1 prereleases: we really hate introducing regressions
between minor releases. Thanks.
Comment 16 Aaron M. Ucko 2005-07-14 20:45:23 UTC
FWIW, the same broken optimization, and hence the same bug, also appears to have
materialized in the 3.4 branch somewhere between 3.4.2 and 3.4.3.
Comment 17 Paolo Carlini 2005-07-14 23:51:33 UTC
Thanks. At this point in 3_4-branch go only very-very safe changes. Therefore,
before considering fixing in that branch too the problem (*), let's test the
new algorithm in mainline and 4_0-branch for a while.

(*) By the way, I keep on wondering why people like so much the word "broken"
about software?!? Maybe because it's something immaterial, e.g., would not be
so happy to use the same term about a leg or an arm?!? ;)
Comment 18 Ralf Fassel 2005-07-18 12:03:55 UTC
> Thanks. At this point in 3_4-branch go only very-very safe changes. Therefore,
> before considering fixing in that branch too the problem (*), let's test the
> new algorithm in mainline and 4_0-branch for a while.

I'm wondering why this problem does not break tons of code, since all pipes are
basically broken (yes, broken: they don't supply all the data).  So each program
which runs as a filter should suffer from this problem.
Comment 19 Paolo Carlini 2005-07-18 12:28:15 UTC
> I'm wondering why this problem does not break tons of code,

Just read the audit trail, about that. In particular my comment #3: portable C++
code using pipes has better being very very careful with short reads, because
the current Standard is way too vague in this area (improvements in system
programming support are certainly planned for C++0x).

That said, the next 3.4.x is not scheduled for tomorrow, we shall fix it in time,
don't worry. By the way, I would suggest pronouncing "broken" using the same 
intonation of Martin Landau's Bela Lugosi in Ed Wood: "Beware, beware... broken,
broken..." ;)
Comment 20 Ralf Fassel 2005-07-18 15:01:07 UTC
> portable C++ code using pipes has better being very very 
> careful with short reads, because the current Standard is 
> way too vague in this area

I prefer C++ for the abstractions it allows.  Having to code my I/O with the
type of the underlying data stream type in mind defeats this abstraction to a
great deal...

> By the way, I would suggest pronouncing "broken" using the same 
> intonation of Martin Landau's Bela Lugosi in Ed Wood:
> "Beware, beware... broken, broken..." ;)

Let's pronounce it [k&#601;&#712;p&#650;t] then :-)
http://en.wiktionary.org/wiki/Kaput
Comment 21 Paolo Carlini 2005-07-18 15:56:53 UTC
> Let's pronounce it [k&#601;&#712;p&#650;t] then :-)
> http://en.wiktionary.org/wiki/Kaput

Yes. My point was simply that in order to have the attention of the maintainers
you don't need to use exagerated expressions. Generally, technical explanations
are much better for that. Overstating the issue is more suited to make someone
nervous and, likely, obtain exactly the opposite effect.
Comment 22 GCC Commits 2005-07-18 18:32:06 UTC
Subject: Bug 21286

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-3_4-branch
Changes by:	paolo@gcc.gnu.org	2005-07-18 18:32:00

Modified files:
	libstdc++-v3   : ChangeLog 
	libstdc++-v3/include/bits: fstream.tcc 

Log message:
	2005-07-18  Paolo Carlini  <pcarlini@suse.de>
	Nathan Myers  <ncm@cantrip.org>
	
	PR libstdc++/21286
	* include/bits/fstream.tcc (basic_filebuf<>::xsgetn):
	Loop on short reads.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.2224.2.227&r2=1.2224.2.228
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/include/bits/fstream.tcc.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.116.4.8&r2=1.116.4.9

Comment 23 Paolo Carlini 2005-07-18 18:39:35 UTC
Fixed for 3.4.4 too.
Comment 24 Paolo Carlini 2005-07-18 18:40:49 UTC
Sorry, I meant 3.4.5.
Comment 25 Paolo Carlini 2006-04-14 19:31:42 UTC
*** Bug 27168 has been marked as a duplicate of this bug. ***