This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: Pthread options (was: Re: Memory allocators / STL)


Thanks for the input.
Since we don't think we have the knowledge to write our own memory
allocator, I have the following questions:
- If the standard STL way of doing things uses one global pool of
memory, could one use the pthread_alloc method? Is that part of
libstdc++, or is it purely SGI-specific (found it on their site...)?

- Does the "global" approach also mean that this applies to all string
allocation? That would be the main reason for the slowdown. If that
is the case, does anyone have an example of initializing strings with an
alternate memory allocator (the template does not seem to accept
that...)?

Thanks!

/Stefan

Ralph Loader wrote:

> Stefan,
> > The platform is Slackware 8.0 running on a dual PIII (HP lpr2000)
> ...
> > The symptom (when not compiled using __USE_MALLOC) is that the
> > application works fine when only one thread is running (or at least
> > so it seems from looking at the response times), but when the
> > internal STL caching memory allocator is used the performance drops
> > rapidly and the system CPU usage goes up to well over 50%.
> >
> > Please give me some input on what more information is needed in order to find this
> > bug...
>
> I suspect that what you are seeing is lock contention within the STL
> allocator.
>
> If I read the code correctly, the default libstdc++ allocator uses a
> single global lock for protecting its caches & so if you're using STL
> heavily from multiple threads on SMP your threads will probably spend
> lots of time blocked on that lock & doing associated system calls.
>
> The glibc malloc uses multiple arenas, and makes some attempt to tie
> arenas to threads, so you should see far less lock contention using
> it.
>
> Workarounds:
>
> (a) write your own STL allocator. If you're careful, you might be able
> to give yourself a lock-free cache by putting it in something that is
> already thread-protected. IIRC SGI STL has an allocator specifically
> designed for SMP scalability - search Google for it.
>
> (b) Don't use STL everywhere. With a std::list, an allocation is done
> every time you add something to a list. If you implement linked-lists
> yourself, and put the linked-list pointers in an object that already
> exists, then you avoid that overhead.
>
> (c) Instead of using __USE_MALLOC, use std::malloc_alloc selectively.
>
> (d) Don't use threads :-). If it hurts, then don't do it.
>
> Ralph.
>
> >
> > Brgds
> >
> > /Stefan Olsson, CTO @ Noname4us
> >
> > Loren James Rittle wrote:
> >
> > > >>> - Quite a few discussions on lists have been about __USE_MALLOC and other
> > >
> > > >> There should not be anything required except using -pthread. You
> > > >> shouldn't have to use __USE_MALLOC at all.
> > >
> > > > Your questions are covered in the libstdc++-v3 FAQ, although you
> > > > might need to read between the lines a bit to answer your exact
> > > > questions, so I will take a crack at answering them directly.
> > >
> > > [What about -D__USE_MALLOC?]
> > >
> > > Related to threading or otherwise, the current recommendation is that
> > > users not add any macro defines on the command line to enable features
> > > > out of libstdc++-v3.  There is no condition under which it will help
> > > > you without possibly causing other issues to arise (such as
> > > > linkage/ABI problems).
> > >
> > > In particular, __USE_MALLOC should only be added to a libstdc++-v3
> > > configuration file, include/bits/c++config (where such user action is
> > > cautioned against), and the entire library should be rebuilt.  If you
> > > do not, then you might be violating the one-definition rule of C/C++
> > > and you might cause yourself untold problems.
> > >
> > > Also, I will tell you that I personally performance tested the
> > > implication of enabling __USE_MALLOC against (threaded and
> > > > non-threaded) application code I have here.  The slowdown for heavy
> > > STL container code was incredible.  It is possible that my code is not
> > > representative, but others were complaining about a "massive" slowdown
> > > in container speed between 2.95 and pre-3.0 libstdc++-v3 releases so I
> > > think not.  The results were posted back when we flipped the default
> > > configuration of libstdc++-v3 just before gcc 3.0 was released to
> > > match that of libstdc++-v2 as was shipped with gcc 2.95.X.
> > >
> > > If you find any platform where gcc reports a threading model other
> > > than none and where libstdc++-v3 builds a buggy container allocator
> > > when used with threads unless you define __USE_MALLOC, I want to hear
> > > about it ASAP.  In the past, correctness was the main reason people
> > > were led to believe that they should define __USE_MALLOC when using
> > > threads.
> > >
> > > > What about -D_REENTRANT?
> > >
> > > libstdc++-v3 headers themselves do not directly require _REENTRANT on
> > > any platform.  If your system's libc headers require a special macro
> > > to be defined to enable proper compilation of threaded-code, then you
> > > will need to consult your local documentation.  I can tell you that my
> > > platform used to require that special macro to be defined but today,
> > > no macros need to be defined to get access to thread-safe libc
> > > headers.
> > >
> > > > I was under the impression that the requirements for pthreads were
> > > > -D_REENTRANT (or the equivalent #define in the code) at compile time and
> > > > -lpthread at link time. I've never heard of this -pthread compile time
> > > > option, and neither has the GCC manual. Could you explain please?
> > >
> > > This is a very non-standardized area of gcc.  Some ports support a
> > > special flag (the spelling isn't even standardized yet) to add all
> > > required macros to a compilation and link-library additions and/or
> > > replacements at link time.  The documentation is weak.  Here is a
> > > quick summary from memory to display how ad hoc this is (some efforts
> > > have occurred to rationalize this stuff: I and others have come close
> > > > to understanding enough to attempt to create a new order of things, but
> > > not quite yet):
> > >
> > > > On Solaris, both -pthreads and -threads (with subtly different
> > > meanings) are honored.  On OSF, -pthread and -threads (with subtly
> > > different meanings) are honored.  On Linux/i386, -pthread is honored
> > > with a meaning that matches your impression on the correct way to
> > > ensure proper thread support.  FreeBSD supports -pthread.  Some other
> > > ports use other switches.  AFAIK, none of this is properly documented
> > > anywhere other than in ``gcc -dumpspecs''.  This situation existed
> > > before I became a gcc developer so I don't know any more of the history.
> > >
> > > I am remiss in that I didn't finish improving documentation for
> > > > libstdc++-v3 before the 3.0.2 release.  Since I basically just wrote the
> > > remaining piece above, expect a posting to libstdc++ with an update of
> > > the FAQ I promised some time ago.
> > >
> > > Regards,
> > > Loren
> > > --
> > > Loren J. Rittle
> > > Senior Staff Software Engineer, Distributed Object Technology Lab
> > > Networks and Infrastructure Research Lab (IL02/2240), Motorola Labs
> > > rittle@rsch.comm.mot.com, KeyID: 2048/ADCE34A5, FDC0292446937F2A240BC07D42763672
> >
> > --
> > Military intelligence is a contradiction in terms.
> >
> >
> >
> >
>
--
Military intelligence is a contradiction in terms.