This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.


Re: libstdc++/4150: catastrophic performance decrease in C++ code

From: Jason Merrill <jason at redhat dot com>
To: rittle at labs dot mot dot com
Cc: libstdc++ at gcc dot gnu dot org, rth at redhat dot com, gcc-gnats at gcc dot gnu dot org
Date: Tue, 16 Apr 2002 11:02:46 +0100
Subject: Re: libstdc++/4150: catastrophic performance decrease in C++ code
References: <200204152308.g3FN86s09742@latour.rsch.comm.mot.com> <wvl1ydgy15s.fsf@prospero.cambridge.redhat.com> <200204160014.g3G0E6T09935@latour.rsch.comm.mot.com> <200204160556.g3G5uSO87484@latour.rsch.comm.mot.com>

>>>>> "Loren" == Loren James Rittle <rittle@latour.rsch.comm.mot.com> writes:

> Under the current architecture (which I have only ever tweaked for
> performance, compliance and QoS of interactive cases not dictated by
> standard), the whole reason for the backup to the point before the
> read is that until a character is actually consumed by the
> higher-layer of libstdc++-v3 IO, the lower-layer C stdio file-pointer
> must not appear to move forward w.r.t. other C stdio.  Granted it
> seems less-than-ideal to always use that algorithm even when not
> sync'd to stdio.

Then I suppose we should use a buffer size of 0.
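
To make the "buffer size of 0" idea concrete, here is a minimal sketch of an
unbuffered streambuf layered on a C FILE*.  The class name
unbuffered_stdio_buf is hypothetical and this is not the libstdc++-v3
implementation; it only illustrates how a peek can leave the stdio position
untouched (getc followed by ungetc) while a consume advances it.

```cpp
#include <cassert>
#include <cstdio>
#include <istream>
#include <streambuf>

// Hypothetical illustration, not the libstdc++-v3 filebuf.
// No get area is ever set up, so every access goes through
// underflow() or uflow(), i.e. an effective buffer size of 0.
class unbuffered_stdio_buf : public std::streambuf {
public:
    explicit unbuffered_stdio_buf(std::FILE* f) : file_(f) {
        assert(f != nullptr);
    }

protected:
    // Peek: read one character, then push it back so that other C stdio
    // users do not see the file position move.  ungetc guarantees at
    // least one character of pushback.
    int_type underflow() override {
        int c = std::getc(file_);
        if (c == EOF) return traits_type::eof();
        std::ungetc(c, file_);
        return traits_type::to_int_type(static_cast<char>(c));
    }

    // Consume: read one character and leave the position advanced.
    int_type uflow() override {
        int c = std::getc(file_);
        if (c == EOF) return traits_type::eof();
        return traits_type::to_int_type(static_cast<char>(c));
    }

private:
    std::FILE* file_;
};
```

With this shape, an istream peek() followed by a direct getc() on the same
FILE* both see the same character, which is the property the sync'd case
needs.  The cost is one stdio call (or two, for a peek) per character.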

> It is why I told RTH the other day in an e-mail that I thought some
> basic re-architecture would be required to solve all performance
> issues related to outstanding libstdc++-v3 PRs.  When the higher-layer
> knows it will consume more than X characters in sync'd IO cases, it
> should be able to pull >1 and <X characters from the lower-layer (current
> architecture limits us to pulls of 1 character).  Or, if the
> higher-layer knows it is looking for a newline character (another very
> common case), it should be able to use the C stdio optimized routine
> to pull >1 character from the lower layer (bounded only by newline or
> the provided buffer size, aka the fgets function call).  Under a
> re-architecture, it seems to me that only when the higher-layer of
> libstdc++-v3 is in a scanning mode not directly supported by libc that
> it must conditionally pull 1 character at a time through the layer
> when sync'd to stdio.

Makes sense to me.

> Now, I actually have no idea if the abstraction layer dictated by the
> standard even allows these optimizations.  I looked at this situation >6
> months ago and I actually think not.

Why not?  It seems to me that the optimizations you suggest would conform
fine to the spec for xs{put,get}n.  The spec for basic_streambuf::xsgetn
talks about implementation "as if" by repeated calls to sbumpc, but then
also says that derived classes can provide more efficient implementations.
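
As a rough sketch of what such a "more efficient implementation" could look
like, the following derived streambuf overrides xsgetn to pull a whole block
with a single fread instead of repeated single-character reads.  The class
name bulk_stdio_buf and the bulk_calls counter are hypothetical, added only
for illustration; this is not the patch under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <istream>
#include <streambuf>

// Hypothetical illustration: an unbuffered stdio-backed streambuf whose
// xsgetn is overridden to read in bulk.
class bulk_stdio_buf : public std::streambuf {
public:
    explicit bulk_stdio_buf(std::FILE* f) : file_(f), bulk_calls(0) {
        assert(f != nullptr);
    }

protected:
    // Single-character peek: read, then push back.
    int_type underflow() override {
        int c = std::getc(file_);
        if (c == EOF) return traits_type::eof();
        std::ungetc(c, file_);
        return traits_type::to_int_type(static_cast<char>(c));
    }

    // Single-character consume.
    int_type uflow() override {
        int c = std::getc(file_);
        if (c == EOF) return traits_type::eof();
        return traits_type::to_int_type(static_cast<char>(c));
    }

    // Bulk read: the caller has promised to consume n characters, so the
    // stdio position may advance by however many fread delivers.
    std::streamsize xsgetn(char* s, std::streamsize n) override {
        if (n <= 0) return 0;
        ++bulk_calls;
        std::size_t got = std::fread(s, 1, static_cast<std::size_t>(n), file_);
        return static_cast<std::streamsize>(got);
    }

private:
    std::FILE* file_;

public:
    int bulk_calls;  // counts xsgetn invocations, for demonstration only
};
```

Because the observable result matches a sequence of sbumpc calls (the same
characters are consumed and the stdio position ends up in the same place),
this appears to fit the "as if" wording.  An istream::read on top of this
buffer reaches xsgetn through sgetn.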

To optimize getline, we'd need to introduce a virtual helper function in
streambuf, but I don't see any reason why that would violate the standard.
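
A sketch of that helper, under stated assumptions: std::streambuf has no such
member, so the hook is modelled here on an intermediate class.  The names
line_streambuf, sgetline, xsgetline and stdio_line_buf are all hypothetical.
The default version loops over sbumpc; the stdio-backed override delegates to
fgets, so a whole line is pulled with one C library call.

```cpp
#include <cassert>
#include <climits>
#include <cstdio>
#include <cstring>
#include <streambuf>

// Hypothetical extension point; not part of the standard streambuf.
class line_streambuf : public std::streambuf {
public:
    // Stores at most n-1 characters in s, stopping after a newline
    // (which is kept, as fgets does), NUL-terminates, and returns the
    // number of characters stored.
    std::streamsize sgetline(char* s, std::streamsize n) {
        return xsgetline(s, n);
    }

protected:
    // Default: one character at a time through sbumpc.
    virtual std::streamsize xsgetline(char* s, std::streamsize n) {
        std::streamsize i = 0;
        while (i + 1 < n) {
            int_type c = sbumpc();
            if (traits_type::eq_int_type(c, traits_type::eof())) break;
            s[i++] = traits_type::to_char_type(c);
            if (s[i - 1] == '\n') break;
        }
        if (n > 0) s[i] = '\0';
        return i;
    }
};

// stdio-backed buffer that overrides the helper with fgets.
class stdio_line_buf : public line_streambuf {
public:
    explicit stdio_line_buf(std::FILE* f) : file_(f) { assert(f != nullptr); }

protected:
    int_type underflow() override {
        int c = std::getc(file_);
        if (c == EOF) return traits_type::eof();
        std::ungetc(c, file_);
        return traits_type::to_int_type(static_cast<char>(c));
    }

    int_type uflow() override {
        int c = std::getc(file_);
        if (c == EOF) return traits_type::eof();
        return traits_type::to_int_type(static_cast<char>(c));
    }

    // Note: strlen miscounts if the line contains an embedded NUL.
    std::streamsize xsgetline(char* s, std::streamsize n) override {
        if (n <= 0) return 0;
        if (n > INT_MAX) n = INT_MAX;
        if (std::fgets(s, static_cast<int>(n), file_) == nullptr) {
            s[0] = '\0';
            return 0;
        }
        return static_cast<std::streamsize>(std::strlen(s));
    }

private:
    std::FILE* file_;
};
```

A getline built on this hook would call sgetline and strip the trailing
newline itself; other delimiters would presumably still need the
character-at-a-time path.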

> With your patch (plus the removal of the related _GLIBCPP_AVOID_FSEEK
> region in src/ios.cc), I see one automatic regression here:

> assertion "(off_2 == (off_1 + 2 + 1 + 1))" failed: file
> "[...]/27_io/filebuf_virtuals.cc", line 428
> FAIL: 27_io/filebuf_virtuals.cc execution test

Yep, I'm aware of that.  I knew that the patch I posted was incomplete; it
was meant more as a concrete illustration of my proposal.  I'm still
working on it.

> [1] I don't know if this is widely known information thus I want to
>     make sure you tested my patch to enable _GLIBCPP_AVOID_FSEEK on
>     Linux properly.  If you bootstrap all of gcc, then when
>     libstdc++-v3 is built, it will be built with flags set by
>     top-level Makefile (nominally, `-O2 -g').  If you later run make
>     in libstdc++-v3, it will rebuild (some/all?) files with `-O0 -g'
>     (except stuff built in libmath which appears to get top-level
>     flags)...  IMHO, the only way to test performance patches in
>     libstdc++-v3, is to `rm -rf <target>/libstdc++-v3' and rerun make
>     at top-level.  This way libstdc++-v3 is built exactly as when it
>     is bootstrapped.

I'm aware of the difference; in all cases, I was building without
optimization, on the assumption that the calls to the C layer would be
where we were spending our time.  So I was comparing apples to apples, but
perhaps not the most useful apples.  :)

Jason
