Querying the position in a file uses filebuf::seekoff( 0, ios::cur ). Although the Standard specifies that this combination of parameters does not flush putback or the output sequence, it does anyway, which results in a performance hit. (§18.104.22.168/11)
Regression: http://gcc.gnu.org/viewcvs/trunk/libstdc%2B%2B-v3/include/bits/fstream.tcc?r1=68163&r2=68420 (@@ -473,41 +486,26 @@)
This has been broken a long time!
Created attachment 21762
This little program attempts to read three characters from its own source, checking the position each time. Reading three bytes from a buffered file should only underflow once, right?
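The attachment itself is not reproduced in this thread, so the following is only a sketch of the kind of program described. It assumes a scratch file instead of the program's own source, and the names counting_filebuf, tell_result and read_three_with_tell are hypothetical. It counts refills by overriding underflow() in a class derived from std::filebuf.

```cpp
#include <cstdio>
#include <fstream>
#include <istream>
#include <vector>

// Hypothetical helper: a filebuf that counts how often the get area is refilled.
struct counting_filebuf : std::filebuf {
    int underflows = 0;
protected:
    int_type underflow() override {
        ++underflows;
        return std::filebuf::underflow();
    }
};

struct tell_result {
    std::vector<std::streamoff> positions;  // position reported after each get()
    int underflows;                         // number of underflow() calls observed
};

// Reads three characters from a scratch file, asking for the position each time.
tell_result read_three_with_tell(const char* path) {
    {
        std::ofstream out(path, std::ios::binary);
        out << "abcdefghij";
    }
    tell_result r{{}, 0};
    counting_filebuf fb;
    if (!fb.open(path, std::ios::in | std::ios::binary))
        return r;
    std::istream in(&fb);
    for (int i = 0; i < 3; ++i) {
        in.get();
        r.positions.push_back(std::streamoff(in.tellg()));  // seekoff(0, cur)
    }
    r.underflows = fb.underflows;
    fb.close();
    std::remove(path);
    return r;
}
```

If tellg() discards the get area, the underflow count grows with every read; if the buffer survives, a single refill should be enough for all three characters.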
_M_terminate_output, correctly, does nothing in this case, cannot be the problem, and there is nothing wrong wrt the standard mandated behavior. The "problem" is that in our implementation, similarly to traditional C stdio impls, reading and writing are completely separate operations, and the user switches between the two with seeks, essentially. Any seek puts back the internal status to "uncommitted" (_M_reading = false, _M_writing = false) and afterwards the user can start *either* reading or writing, irrespective of the previous history, and the seek logic doesn't know what will actually happen in the future, of course. The user should not perform redundant seeks, because they have a cost, they do something more than just "seeking". On the other hand, a series of read or write operations has maximum performance, we don't think we could possibly do better. Thus, I'm open to ideas, but I don't think that within the current design one can change / improve much. Note that the patch you linked is exactly the one implementing the above semantics.
I *think* it can work to minimally change what we have now to not reset the get area buffers when (0, ios::cur) and we have been reading: as far as I can see, if in that specific case we get back to reading again, the get area remains completely valid indeed.
Does not work: when we reach the end of the buffer and access the file again to refill it, we start reading from the wrong position, the position we seeked to.
To clarify: when we start reading in a buffered mode, the first underflow reads the buffer and leaves the physical file at the first char beyond the buffer. If we do afterwards a seek to the current reading position, belonging to the buffer, the physical position along the file also changes of course, an underlying fseek is performed. Then, if we don't refresh the buffer with a new underflow, an inconsistency is born: the physical position along the file doesn't correspond to the first char after the buffer and the next underflow will read from the wrong position.
Re: comment 5 - what is needed is for filebuf::seekoff(0,ios::cur) to:
1) *not* invalidate the buffer
2) *not* move the file pointer
since all that special case asks is "where am I in the 'logical' file?"
This can be accomplished by having filebuf::seekoff() recognize an off_type of 0, and a seekdir of cur, and special-case the code to call _M_file.seekoff(0,cur) (not moving the file pointer, right?) and then adjusting the resulting pos_type to reflect the true, earlier position held by gptr().
Am I missing something?
Then, seekoff would also return a position beyond the buffer, right? Or do you want it to return 1 anyway? Actually, I think the standard wants us to use width * off for the underlying fseek anyway, not only for off == 0, and this is not what we have been doing. I think there is something seriously different here, beyond the performance issue, which we should ponder much more, after so many years.
Paolo, yes, _M_file.seekoff(0,cur) would return the current physical file position, and then filebuf::seekoff would adjust the returned pos_type to reflect the position within the *logical* file, framed by the buffer and pointed to by gptr().
As for the mechanics of width*off, I confess that locale issues leave me completely befuddled, so I won't try to address that.
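The adjustment described above can be sketched outside libstdc++ with a toy read-only stream buffer. This is not the filebuf code: toy_filebuf, positions and demo_logical_tell are hypothetical names, a C FILE* stands in for the underlying file, and only the byte-oriented (0, cur) case is handled. The point is that the query asks the file where it is, without moving it, and then subtracts the unread part of the get area.

```cpp
#include <cstddef>
#include <cstdio>
#include <ios>
#include <istream>
#include <streambuf>

// Hypothetical read-only buffer over a C FILE*, for illustration only.
class toy_filebuf : public std::streambuf {
    std::FILE* file_;
    char buf_[8];
public:
    explicit toy_filebuf(std::FILE* f) : file_(f) { setg(buf_, buf_, buf_); }
protected:
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        std::size_t n = std::fread(buf_, 1, sizeof buf_, file_);
        if (n == 0)
            return traits_type::eof();
        setg(buf_, buf_, buf_ + n);
        return traits_type::to_int_type(*gptr());
    }
    pos_type seekoff(off_type off, std::ios_base::seekdir dir,
                     std::ios_base::openmode) override {
        if (off == 0 && dir == std::ios_base::cur) {
            long physical = std::ftell(file_);  // query only, no movement
            if (physical < 0)
                return pos_type(off_type(-1));
            // Logical position = physical position minus unread buffered bytes.
            return pos_type(off_type(physical) - (egptr() - gptr()));
        }
        return pos_type(off_type(-1));  // other seeks omitted in this sketch
    }
};

struct positions { long logical; long physical; };

// Reads three bytes through the toy buffer, then reports both positions.
positions demo_logical_tell(const char* path) {
    positions p{-1, -1};
    std::FILE* f = std::fopen(path, "w+b");
    if (!f)
        return p;
    std::fputs("0123456789abcdef", f);
    std::rewind(f);
    {
        toy_filebuf fb(f);
        std::istream in(&fb);
        in.get(); in.get(); in.get();
        p.logical = static_cast<long>(std::streamoff(in.tellg()));
        p.physical = std::ftell(f);
    }
    std::fclose(f);
    std::remove(path);
    return p;
}
```

With an 8-byte buffer, the file itself sits past the whole buffer after the first refill, while the reported logical position only counts the characters actually consumed. The get area is left untouched, so the next refill continues from the right place.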
Ok. I don't think we should change the code to deal so specially with off == 0; if we are going to change it, we should decouple the return value from what the underlying seek returns, and always call fseek(..., width * off, ...) as the standard mandates. Then dealing with off == 0 becomes simple.
(In reply to comment #9)
> Ok. I don't think we should change the code to deal so specially with off ==
> 0; if we are going to change it, we should decouple the return value from what
> the underlying seek returns, and always call fseek(..., width * off, ...) as
> the standard mandates. Then dealing with off == 0 becomes simple.
I'm not sure I understand what you are saying. My concern is that calls to filebuf::seekoff(0,ios::cur) should not invalidate the buffer just to return a correct answer to "where am I?". Whether this is an accident of history or not, I've seen this usage enshrined in code as a respected idiom.
Does your text above specify this?
Sure. What I meant - contrary to what you said, I think - is that an elegant and complete solution to this issue involves changing much more generally our code to *always* behave as if fseek(off * width) were called, not just fseek(0) in the special case you care about.
(In reply to comment #11)
> Sure. What I meant - contrary to what you said, I think - is that an elegant
> and complete solution to this issue involves changing much more generally our
> code to *always* behave as if fseek(off * width) were called, not just fseek(0)
> in the special case you care about.
Okay, I understand better now. My apologies, and thanks for your comments on this issue.
Good, I think we are close to a fix, I'm already testing something. So, do we have a symmetric issue with the put area or not? I'm not sure.
(In reply to comment #13)
> Good, I think we are close to a fix, I'm already testing something. So, do we
> have a symmetric issue with the put area or not? I'm not sure.
I believe so. tellg and tellp are both handled by seekoff( 0, ios::cur ), and the required behavior doesn't differentiate between them, partly because for an fstream the get and put pointers are the same. (The result doesn't depend on the pointers, it comes from fseek.)
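As a small illustration of the point that tellg and tellp both go through seekoff(0, ios::cur), the sketch below asks an fstream for its position in three ways after a few reads. The names tell_pair and tell_both and the scratch file are assumptions made for this example.

```cpp
#include <cstdio>
#include <fstream>
#include <ios>

struct tell_pair {
    std::streamoff g;    // from tellg()
    std::streamoff p;    // from tellp()
    std::streamoff raw;  // from pubseekoff(0, cur) directly
};

// Reads three bytes from a scratch file and queries the position three ways.
tell_pair tell_both(const char* path) {
    {
        std::ofstream out(path, std::ios::binary);
        out << "abcdef";
    }
    tell_pair r{-1, -1, -1};
    {
        std::fstream fs(path, std::ios::in | std::ios::out | std::ios::binary);
        fs.get(); fs.get(); fs.get();
        r.g = fs.tellg();
        r.p = fs.tellp();
        r.raw = fs.rdbuf()->pubseekoff(0, std::ios_base::cur);
    }
    std::remove(path);
    return r;
}
```

Since a file stream keeps a single file position, the three values would be expected to agree, which is why a fix on the seekoff side covers both tell functions.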
(In reply to comment #14)
> (The result doesn't depend on
> the pointers, it comes from fseek.)
I re-read Comment 5 and understand it this time ;v) . Well, any solution should fix both tellg and tellp, since the pointers are the same upon synchronization.
Then, yes, we need to check which type of operation occurred last, and either update whichever pointer is stale, or selectively use the correct one.
Actually, however, I don't think we can really always call fseek(off * width) as the Standard wants us to do. In a sense I'm happy because the change is gonna be less invasive, on the other hand I'm a bit puzzled.
(In reply to comment #16)
> Actually, however, I don't think we can really always call fseek(off * width)
> as the Standard wants us to do. In a sense I'm happy because the change is gonna
> be less invasive, on the other hand I'm a bit puzzled.
Could you post a patch? I'm not sure what you mean by generalizing (off * width), if off == 0 then width is irrelevant. If width < 0 then the only valid value for off is 0. If width > 0, off != 0 then repositioning is the primary effect.
The task is to call fseek(0,cur), and then subtract the number of bytes in the put area plus the "external characters," right?
I'm almost ready for the patch, please be patient ;) If you look at the standard, it says that the last step of seekoff is *always* as if calling fseek(..., off * width, ...). If you look at the current code, we have the concept of __computed_off and, in many cases, we end up calling the equivalent of fseek with something != off * width. I'm changing that to (0, cur) for the case you care about, but not changing anything else otherwise.
Of course here I'm always under the assumption width > 0.
(In reply to comment #17)
> The task is to call fseek(0,cur), and then subtract the number of bytes in the
> put area plus the "external characters," right?
Er, I don't mean "bytes in the put area" exactly, but you know what I mean… what I'm asking is, how does your simplification relate to the task of figuring out how many file bytes the buffers hold, without flushing them, which the code does not currently seem designed to do.
For regression, note that the code previous to the linked patch was
    // NB: Need to do this in case _M_file in indeterminate
    // state, ie _M_file._offset == -1
    pos_type __tmp = _M_file.seekoff(__off, ios_base::cur, __mode);
    if (__tmp >= 0)
    // Seek successful.
    __ret = __tmp;
    __ret += std::max(this->_M_out_cur, this->_M_in_cur)
which does not appear to do multibyte compensation correctly. Was _M_filepos the number of file bytes in whichever area was currently being used?
(In reply to comment #18)
> I'm almost ready for the patch, please be patient ;) If you look at the standard,
> it says that the last step of seekoff is *always* as if calling fseek(..., off
> * width, ...). If you look at the current code, we have the concept of
> __computed_off and, in many cases we end up calling the equivalent of fseek
> with something != off * width. I'm changing that to (0, cur) for the case you
> care about, but not changing anything else otherwise.
The standard says always to use (off * width, whence) but that is just the external effect if buffering is transparent. __computed_off compensates for the file pointer being necessarily different from gptr(), pptr(). (You can't seek for every putc!)
Don't mean to be impatient, just trying to follow along the discussion…
Good. Then I have a draft almost ready ;)
(In reply to comment #22)
> Good. Then I have a draft almost ready ;)
I have a very straightforward, low-impact solution, but I haven't tested it. (My tree is pretty out of date.) Would you like to try it, if you're having trouble?
Created attachment 21768
This is what I have so far, unfortunately I cannot work only on this today. Anyway, it passes testing and this specific testcase, but is incomplete vs wchar_t. If you have something which you are confident works fine for wchar_t too, I can give it a try later today or over the next days, thanks.
Created attachment 21769
alternative approach. untested
I hope this compiles ;v) . But it seems to "color within the lines."
Why does your patch call setp/setg to (re?)invalidate the opposite area, and then declare that it is neither reading nor writing? Also, a -1 return from _M_seek is not handled in seekoff.
(In reply to comment #25)
> Created an attachment (id=21769) 
> alternative approach. untested
> I hope this compiles ;v) . But it seems to "color within the lines."
Bah, it doesn't. I missed an underscore at
__is_tell? __state_type(_M_state_last) : move(_M_state_last);
Note that certainly we don't want to use C++0x stuff here. Also, one thing at a time of course, thus if we have been missing some error checking, etc, it's for another time.
PS: you are right that we have to check that _M_seek succeeds before adding back __computed_off.
And, please, if you want to help, manage to run the testsuite, we have got some pretty nasty testcases ;)
(In reply to comment #29)
> And, please, if you want to help, manage to run the testsuite, we have got some
> pretty nasty testcases ;)
I'll see if I can compile the latest… guess it's more useless to have an old tree than one that merely doesn't compile.
Also, I failed to account for overflow() called from _M_terminate_output flushing the put sequence and setting the file position to the actual location. So my patch is buggy with tell after write. It appears that your patch will still flush upon tell after write.
C++0x, easy enough to eliminate. I'm not aware of my patch fixing anything besides the problem at hand…
I'm afraid that the situation I outlined in Comment #5 is just the simple one. The real problem with the new scheme - which tries to deal specially with (0, cur) by not moving the file pointer - is when *writes* follow the seek. After a while the buffer becomes full and must be flushed to the file starting at the logical position corresponding to the previous seek. Thus - it seems to me - the file pointer must be finally adjusted. How to do that without saving anything in the filebuf? (note that within the current ABI we cannot add data members)
(In reply to comment #31)
> I'm afraid that the situation I outlined in Comment #5 is just the simple one.
> The real problem with the new scheme - which tries to deal specially with (0,
> cur) by not moving the file pointer - is when *writes* follow the seek. After a
> while the buffer becomes full and must be flushed to the file starting at the
> logical position corresponding to the previous seek. Thus - it seems to me -
> the file pointer must be finally adjusted. How to do that without saving
> anything in the filebuf? (note that within the current ABI we cannot add data
I don't see how this is particularly difficult. A seekoff(0,ios::cur) operation should only ever call lseek(0,SEEK_CUR). It does not point the file position inside the buffer.
For simple byte-oriented case:
read, tell, write: egptr corresponds to the file marker. Tell finds logical position using gptr-egptr. Write invalidates get area and starts fresh.
write, tell, write: pbase corresponds to file marker. Tell finds logical position using pptr-pbase. Pointers and marker are still valid for next write.
To handle writes, we simply have to avoid calling _M_seek and overflow.
Note, I am not attempting to tell after write with a nontrivial codecvt installed. Maybe the issue of Comment #5 is only in the general case?
I suppose it leaves UTF-8 files still a bit slow, but I still think that's pretty well justified.
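The byte-oriented bookkeeping in this comment can be written down as plain arithmetic. The sketch below is not taken from either patch: buffer_view, last_op and logical_tell are hypothetical names, and buffer pointers are modelled as integer offsets so the example stays independent of any real stream buffer.

```cpp
#include <cstdint>

enum class last_op { none, read, write };

// Hypothetical snapshot of a byte-oriented file buffer (no codecvt involved).
struct buffer_view {
    std::int64_t file_marker;  // where the underlying file pointer currently is
    last_op op;                // which area is active
    std::int64_t gptr, egptr;  // get area: next unread char, end of buffered data
    std::int64_t pbase, pptr;  // put area: start of pending data, next free slot
};

// Answers "where am I?" without flushing, refilling, or moving the file pointer.
std::int64_t logical_tell(const buffer_view& b) {
    switch (b.op) {
    case last_op::read:
        // After a read, the file marker corresponds to egptr, so step back
        // over the bytes that were buffered but not yet consumed.
        return b.file_marker + (b.gptr - b.egptr);
    case last_op::write:
        // After a write, the file marker corresponds to pbase, so step forward
        // over the bytes that are pending in the put area.
        return b.file_marker + (b.pptr - b.pbase);
    default:
        return b.file_marker;
    }
}
```

For instance, with 8 bytes buffered and 3 consumed, a file marker of 8 corresponds to logical position 3; with 4 bytes pending in the put area and a file marker of 8, the logical position is 12. Neither query needs to touch the buffers.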
Run the full testsuite, and you will see. In general, if you simply do fseek(0, cur) and then start writing, when eventually you have to flush you need the actual logical position in the file - the last fseek(0, cur) - 'something' - which is not available anywhere. I'm not saying it cannot be implemented, I'm very dubious it can without breaking the ABI by adding an additional data member, which we cannot do at the moment. To be honest I also don't think the issue is very serious if only because nobody complained in 7 years, and we have a lot to do for C++0x. Thus, if you can help with something concrete minimally passing the testsuite and clearly addressing the concerns above, excellent, otherwise, I cannot anticipate now when we are going to do something here. Just to be honest.
(In reply to comment #34)
> Run the full testsuite, and you will see.
Lol, you're still looking at this too? I *just* got those pesky four testcases done. I wasn't manually putting the codecvt state into the fpos in the special case. I'll rerun the entire suite and post the patch here.
> In general, if you simply do fseek(0,
> cur) and then start writing, when eventually you have to flush you need the
> actual logical position in the file - the last fseek(0, cur) - 'something' -
> which is not available anywhere.
No, if fseek(0,cur) is implemented to have no side effects, it has NO side effects. Nothing is lost. When the flush happens, it uses the logical position obtained at the last flush, just as if fseek(0,cur) never occurred.
I'm traveling. Note, I don't understand how you are addressing my concerns, so whatever results you get from the testsuite, make sure we are not regressing on the situation I outlined: write a new testcase reading into the buffer, say, 0123456789, then seeking to 0, reading consecutive positions up to 5 via simple get, calling seekoff(0, cur), and putting x in the place of 5. Then close, reopen, and check that you have 01234x6789.
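A minimal sketch of the testcase requested above might look like the following. It is not one of the testcases that were eventually committed; the function name read_seek_write_roundtrip and the scratch file are assumptions, and the real testsuite uses its own harness.

```cpp
#include <cstdio>
#include <fstream>
#include <ios>
#include <string>

// Writes 0123456789, reads five characters, queries the position with
// seekoff(0, cur), overwrites the next character, and returns the final
// file contents.
std::string read_seek_write_roundtrip(const char* path) {
    {
        std::ofstream out(path, std::ios::binary | std::ios::trunc);
        out << "0123456789";
    }
    {
        std::fstream fs(path, std::ios::in | std::ios::out | std::ios::binary);
        fs.seekg(0);
        for (int i = 0; i < 5; ++i)
            fs.get();                                   // consume '0'..'4'
        fs.rdbuf()->pubseekoff(0, std::ios_base::cur);  // the (0, cur) query
        fs.put('x');                                    // should land on '5'
    }                                                   // destructor flushes and closes
    std::string contents;
    {
        std::ifstream in(path, std::ios::binary);
        std::getline(in, contents);
    }
    std::remove(path);
    return contents;
}
```

If the position query leaves the file pointer at the end of the get area and nothing corrects it before the flush, the x ends up somewhere other than index 5, which is exactly the regression this check is meant to catch.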
Created attachment 21819
Tested x86_64-linux, mainline
This is a carefully tested patch (tested in mainline, per the normal policy, where I also added two additional seekoff correctness testcases), which works in limited circumstances (enough to fix the testcase, anyway) when I can convince myself it's fully correct and consistent with our general framework. My plan is committing it first and then possibly generalizing it, always together with additional accompanying testcases, anyway.
Created attachment 21822
Works with codecvt. Tested x86_64-darwin, mainline
Ah, now I see the trick:
if (__off == 0 && !(_M_mode & ios_base::out))
So if the file is open for writing, disable the optimization.
I had a problem with this condition for these testcases:
which contain code such as
    strmsz_2 = fb_01.sputn(", i wanna reach out and", 10);
    fb_01.pubseekoff(0, std::ios_base::cur); // if this doesn't flush
    c1 = fb_01.sgetc(); // this underflow is ignored
    c2 = fb_01.sputbackc('z'); // as well as this putback
Essentially, pubseekoff(0,cur) is being used as a sync(). I see nothing in the Standard to support that, and indeed the sync() shouldn't be needed either, so I was planning to open a new bug.
Anyway, if I apply your limitation to my patch, it passes the unmodified testsuite, so here it is.
Oops, no, I'm on the 4.5.2 series, not mainline.
In general, our users know that seeking allows switching from reading to writing, and vice versa (when the stream has been appropriately opened of course). This assumption remained true for years and years. Thus, for now at least, I would rather not change it, whether the Standard is completely clear in this area or not.
Also, I don't think the name __is_tell is appropriate, because of course this kind of situation in principle can occur also when tell is not involved (like in your testcase ;)
Modulo the above comments, I think we can enable the optimization for codecvt too, yes, let me reformat your other tweaks and more cleanly incorporate the !(_M_mode & ios_base::out) thing.
Well, I'm happy if you'd like to merge the diffs. My code was written with the intent of optimizing the output case, too, but I guess it's not too inefficient or awkward from the perspective of input only.
I just filed a bug http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45708 about the requirement to separate reads and writes. I suppose it will be a duplicate, but maybe it is worded a little better than existing bugs. Although your current users might know the restriction, I don't think it's really well-documented. I personally observed this behavior when getting started with iostreams, and it was quite discouraging. GNU having a large part of the iostreams marketshare, I wouldn't be surprised if this were a minor stumbling block to iostreams adoption overall.
Before any other bug or analysis, I would recommend going back to the ton of discussions in 2002 / 2003 when the design of basic_filebuf was changed to use _M_reading and _M_writing, **on purpose**. It didn't happen by chance, it was a deliberate redesign of the previous design which allowed major performance improvements. And, to be clear, nobody complained anymore, *ever* all these years. After you have analyzed those discussions (look in particular for Nathan Myers and me), we can consider, for the future, alternate designs.
Thanks for the pointer, I'll read that.
By the way, if, for the future, you mean to contribute in these areas, if you are really interested in these topics, I would recommend starting immediately the Copyright assignment paperwork http://gcc.gnu.org/contribute.html send an email to assignments@
Already did copyright assignment :vP
To further clarify: what you have in mind isn't something which can belong to a casual PR, is a major redesign of basic_filebuf, according to a different basic philosophy, which at the time, Nathan called unified vs non-unified, if I remember correctly. At the time we moved *away* from what you essentially want, because we believed the new design to be superior in terms of performance. Anyway, if you want to propose something different, or a variant of the old design, please post messages to the libstdc++ mailing list, not here, remember to involve Nathan in the discussions, benchmark in various circumstances the various options, in particular, if you want something similar to the old scheme make sure you are *improving* on it. Remember, in practice, that in the 7 years since we moved to new scheme, **nobody ever** asked for the old behavior, nobody complained about the performance of basic_filebuf, thus, if, in the future, we are going to change it again, we really want to be sure to do it after a **very** serious and public analysis.
I'm having trouble finding the discussion that precedes the June 24, 2003 redesign, but I'll add "unified" to the search terms.
Actually, last week I started some changes with the aim of fixing the bug, not changing the philosophy. Rather than eliminate the state variables, just call overflow and underflow in the cases that otherwise fail. So if I finish that, and it performs well, I suppose I'll start an RFC.
It was **a ton** of work and discussions in public and among the maintainers, in private. Anyway, if you have something which doesn't touch basic_streambuf, keeps the get and put areas of basic_filebuf completely separate, with seeks switching between reading and writing via state variables, then it's fine, in principle.
(In reply to comment #49)
> with seeks
> switching between reading and writing via state variables, then it's fine, in
Why require seeks? The whole point is that they are extraneous. I'm not proposing to break code that calls seekoff, just to make it properly redundant.
I'm experimenting with switching states on any operation that requires it. Overlapping get and put areas are obviously suicide; if that is what you attempted before, then I'm encouraged about remaining alternatives. Outside the context of the closed bug, I'm not proposing anything yet.
If you can allow writes after reads and vice versa *also* without seeks in the middle, and without affecting performance and without adding data members, that's fine. Let's see what you come up with. By the way, get and put areas never overlapped in the past, just look at the code in SVN, the movement of the pointers was synced, true, exactly because one wanted each area to somehow "know" what the other area was doing in order to "more easily" switch between reads and writes without seeks. And nobody liked that scheme, was slow and buggy. That happened way before I started contributing, for the record. Then the new design came, outlined by Nathan, and simply inspired by C stdio, nothing strange. As a matter of fact, many users found it also quite easy to use, because - in case wasn't clear already from my previous comments - **nobody ever** complained.
(In reply to comment #51)
> ...As a matter of fact, many users found it also quite easy to
> use, because - in case wasn't clear already from my previous comments -
> **nobody ever** complained.
I just have to butt in here. I don't expect any sympathy, but the only reason my company never complained about the performance issue with filebuf::seekoff(0,cur) is that we only just started using compilers which included that change. Our company can't be the only one beholden to conservative customers, so I won't be surprised if you get other questions similar to ours, years after the change went in.
In any case, I appreciate all the effort you have put into this.
What can I say, I don't know anybody still using GCCs dating back to 2003. In any case, my point wasn't really about seek(0, cur) and its optimization, etc, my point was about the general design, where you use seeks to switch, you have get and put areas completely independent, etc. As far as I know, **nobody** asked to have the old behavior back, not in Bugzilla, not in the mailing lists, nowhere in public discussions. Nobody **ever** commented **anywhere** that using seeks to switch was unusual, entire Linux distros have been quickly recompiled to use the new filebuf, we are now able to perform series of consecutive get or put almost as fast as "C" getc_unlocked and putc_unlocked (see the performance testsuite), etc. To summarize, I have nothing in principle against speeding up this and that (of course) but I do not accept comments implying that the current design is just wrong and should be changed to something else without a careful analysis, benchmarks, a discussion on the mailing list with all the experts involved, etc.
Well, for my part, a few years ago I played around with fstream a little, noticed the tellg requirement was weird (I only discovered it by throwing the kitchen sink at the problem), wasn't able to interpret the standard well enough to confirm that GCC was wrong, and decided never to use that part of fstream. It's not really a common use case anyway.
I think that a minority of people actually consider submitting bugs.
Companies are often very conservative with compilers and qualification, and are less motivated to report something the further behind they fall. Under what conditions does a manager actually decide to upgrade? If you're still using 2.95 and considering upgrading to 3.4, why bother reporting a bug against a compiler which is already obsolete?
> but I do not accept comments
> implying that the current design is just wrong and should be changed to
> something else without a careful analysis, benchmarks, a discussion on the
> mailing list with all the experts involved, etc.
Changes should never occur without analysis one way or another. It's easy to see that there is a bug… just look at the Standard's requirements on underflow. As for whether this bug is a symptom of a "wrong design," I don't think so. I'm getting the impression that you guys got tired after a long redesign process and oversimplified the state machine.
While fixing bugs in the overambitious design with simultaneous get and put areas (they might as well be overlapping as synchronized, both are impractical and not the intent of the Standard), the decision was made to cut back features, and non-explicit mode switches got lost.
> getting the impression that you guys got tired after a long redesign process
> and oversimplified the state machine.
Not me. What I remember is that Nathan Myers explained that C stdio, at least traditionally, worked exactly like that, and since Nathan *designed* parts of the first C++ Standard itself and actively participated in all the meetings which led to C++98, I trusted him by and large and found the new design straightforward and well performing in most of our benchmarks. I still believe he was quite right. Anyway, when you post something to the mailing list, remember to add him in CC.
David himself is on it.
Subject: Bug 45628
Date: Wed Sep 22 19:40:43 2010
New Revision: 164529
2010-09-22 David Krauss <email@example.com>
* include/bits/fstream.tcc (basic_filebuf::underflow): Add state
transition to avoid modality requiring seekoff(0,ios::cur).
(basic_filebuf::_M_seek): Avoid minor unnecessary conversion.
(basic_filebuf::seekoff): Remove code to _M_get_ext_pos; make
(0, ios::cur) a special case preserving buffer contents.
(basic_filebuf::_M_get_ext_pos): New function to obtain status
about codecvt extern_t buffer for overflow and seekoff.
* include/std/fstream (basic_filebuf::_M_get_ext_pos): Likewise.
* config/abi/pre/gnu.ver: Export new symbols.
* testsuite/27_io/basic_filebuf/seekoff/char/45628-1.cc: New,
verifies that seekoff(0, ios::cur) preserves buffers.
* testsuite/27_io/basic_filebuf/seekoff/char/45628-2.cc: Likewise,
for codecvt case. More lenient as it may still flush put area.
* testsuite/27_io/basic_filebuf/seekoff/char/4.cc: Modify to
check that seekoff is not required between read and write.
* testsuite/27_io/basic_filebuf/seekoff/wchar_t/4.cc: Likewise.
* testsuite/27_io/basic_filebuf/sync/wchar_t/1.cc: Remove.
* testsuite/27_io/basic_filebuf/sync/wchar_t/1.cc: Likewise.
* testsuite/util/testsuite_character.h (codecvt::do_length): Comply
with 22.214.171.124.2/10 "Returns ... the LARGEST value in the range..."
Fixed for 4.6.0.