This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Shouldn't this work?


This is a long story involving Phil and a User.

User has a text-based data file.  User would like to add comments, e.g.,

    # this is a comment (*nix style)
    ; so is this (MS style)
    number1 number2 number3 number4  # hey, another comment
    number5 number6

and so on.

User recalls hearing about "filtering streambufs" and other magical
creatures, and decides to replace

    ifstream -----> filebuf -----> file

with

    ifstream -----> comment-remover-buf ----> filebuf -----> file

where the new buffer class looks like this

    class comment_filter : public std::streambuf
    {
    public:
        typedef  std::streambuf   base;

        explicit comment_filter (base *p) : real(p) {}

        virtual int_type underflow()
        {
            int_type   ret = real->sbumpc();

            /* another statement will go here later */

            if (ret == '#' || ret == ';')
            {
                do
                {
                    ret = real->sbumpc();
                }
                while (ret != '\n' && ret != traits_type::eof());
            }
            return ret;
        }

    private:
        base   *real;
    };

and was originally used along the lines of

    ifstream foo ("something");
    comment_filter* filter = new comment_filter(foo.rdbuf());
    //foo.rdbuf(filter);        // duh, won't compile: ifstream::rdbuf()
                                // hides the one-argument ios::rdbuf()
    foo.ios::rdbuf(filter);

    int i1, i2, ....;
    foo >> i1 >> ..... >> i7 >> i18 >> ....;  // ignores comments


Obviously, this won't work.  Why?  Because our ifstream doesn't use
ios::rdbuf() or the _M_buf pointer to talk to "the current buffer".  It just
uses its _M_filebuf member directly, which is a filebuf object, not a pointer.
When it does call rdbuf(), naturally it calls ifstream::rdbuf(), which just
returns the address of that member filebuf, rather than whatever the user has
installed in its place.

(And that's okay by the standard.  So we won't go into that here.)

So some of the functions are dispatched through the base rdbuf, and some
through the derived rdbuf, and some not at all, leading to some characters
seen by the comment_filter, and others pulled directly from the filebuf.
Chaos ensues, data makes no sense.
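
The split between the two rdbuf() functions can be seen without any file at
all.  The following is a minimal sketch, not code from the original exchange:
rdbufs_disagree() is a hypothetical helper name, and a std::stringbuf stands
in for comment_filter.  It relies only on the standard's description of
basic_ios::rdbuf() and basic_ifstream::rdbuf().

```cpp
#include <fstream>
#include <ios>
#include <sstream>
#include <streambuf>

// Hypothetical helper: returns true if, after the base-class buffer pointer
// has been replaced, ios::rdbuf() and ifstream::rdbuf() report different
// buffers.
bool rdbufs_disagree()
{
    std::ifstream  foo;      // never opened; no file is needed for this check
    std::stringbuf other;    // stand-in for the comment_filter object

    std::ios& base = foo;    // view foo through its basic_ios base class
    base.rdbuf(&other);      // same effect as foo.ios::rdbuf(&other)

    std::streambuf* via_base    = base.rdbuf();  // the pointer just installed
    std::streambuf* via_derived = foo.rdbuf();   // ifstream's own filebuf

    bool result = (via_base == &other) && (via_derived != &other);

    base.rdbuf(via_derived); // put the original buffer back before returning
    return result;
}
```

Calling rdbufs_disagree() should return true: the base class hands back the
replacement, while the ifstream overload still hands back its member filebuf.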


User becomes confused and asks Phil for help.  On Phil's advice, User changes
the code to avoid ifstream altogether:

    filebuf  *t = new filebuf();
    t->open("something", ios::in);
    comment_filter* filter = new comment_filter(t);

    istream  foo (filter);

    int i1, i2, ....;
    foo >> i1 >> ..... >> i7 >> i18 >> ....;  // same as before

And now Phil is as confused as the User, because given a test input file
containing the number 123456, and no other text at all:

    - i1 is 2
    - EOF is reached before getting to any of the other numbers

Phil tries it himself using 3.2 and current CVS branches.  Phil verifies
the odd results.  Phil tries an experiment:  at the place reading "another
statement will go here later", Phil adds

    std::cerr << "read " << (char)ret << std::endl;

and gets

    read 1
    read 2
    read 3
    read 4

No "5".  No "6".  Phil tries various other combinations, using numbers
and text and whatnot, and finds that:

    - comment_filter::underflow() is only called 4 times, ever,
    - the second character is returned and interpreted as the entire
      integer (i1, in this case),
    - foo.eof() is true immediately after extracting i1.

Any thoughts on what's happening?
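
One thing that may be worth ruling out (an assumption, not something this
message establishes): the standard describes underflow() as returning the
next character without consuming it.  A filter with no get area that calls
sbumpc() on the underlying buffer inside underflow() consumes one source
character on every call, including calls that were only meant to peek.
Whether that accounts for the exact numbers above is not shown here.  The
sketch below is a variant of the class with a one-character get area;
buffered_comment_filter and read_ints are hypothetical names, and a
std::stringbuf replaces the file so the example is self-contained.

```cpp
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical variant of comment_filter that keeps the character returned
// by underflow() in a one-character get area, so peeking does not consume it.
class buffered_comment_filter : public std::streambuf
{
public:
    explicit buffered_comment_filter(std::streambuf* p) : real(p), ch() {}

protected:
    virtual int_type underflow()
    {
        // A character is still pending: return it again, unconsumed.
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());

        int_type ret = real->sbumpc();

        // Skip from a comment marker to the end of the line; the newline
        // itself is passed through so it still separates tokens.
        if (ret == '#' || ret == ';')
        {
            do
            {
                ret = real->sbumpc();
            }
            while (ret != '\n' && ret != traits_type::eof());
        }

        if (ret == traits_type::eof())
            return ret;

        // Publish the character through the get area.  The base class
        // advances gptr() when the caller actually consumes it.
        ch = traits_type::to_char_type(ret);
        setg(&ch, &ch, &ch + 1);
        return ret;
    }

private:
    std::streambuf* real;
    char            ch;
};

// Hypothetical helper: run text through the filter and extract every int.
std::vector<int> read_ints(const std::string& text)
{
    std::stringbuf          source(text, std::ios_base::in);
    buffered_comment_filter filter(&source);
    std::istream            in(&filter);

    std::vector<int> values;
    int v;
    while (in >> v)
        values.push_back(v);
    return values;
}
```

With this variant, read_ints("123456") is expected to yield the single value
123456, and comment text introduced by '#' or ';' is dropped up to the end of
its line.  Overriding uflow() as well would be another way to go; the
one-character buffer was chosen here only because it keeps the sketch short.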


You may be wondering why I haven't tried stepping through the calls
in a debugger.  Well, I have.  Text-mode GDB quickly becomes just too
damn confusing with IOstreams, and then crashes.  The packaged distro
Insight-mode GDB gives some DWARF format error.  The freshly built
Insight-mode GDB crashes.  And DDD (which gets farther than all the
others) can't find the v3 header/source files to display the source.
So I'm debugging this the old-fashioned way.


Phil

-- 
I would therefore like to posit that computing's central challenge, viz. "How
not to make a mess of it," has /not/ been met.
                                                 - Edsger Dijkstra, 1930-2002

