This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
debug mode performance patch
- From: François Dumont <francois dot cppdevs at free dot fr>
- To: libstdc++ at gcc dot gnu dot org
- Date: Mon, 01 Nov 2010 20:12:33 +0100
- Subject: debug mode performance patch
Hi
This patch improves libstdc++ performance in debug mode and also limits
the contention that debug mode introduces.
I finally preferred a simple approach, which is to always
synchronize access to the lists of safe iterators. I considered that
trying to make assumptions about which operations are valid in a
multithreaded environment was error prone: you might perform such
operations on unrelated parts of the sequence, which works in normal mode
but not in safe mode because the safe iterator lists get modified.
The _M_prior/_M_next safe iterator fields are now considered part of
the sequence lists, and only safe sequence methods modify them.
The safe iterator _M_get_mutex is no longer used but is kept for
binary compatibility.
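
To illustrate the ownership rule described above, here is a minimal sketch, not the code from the patch. The names SafeSequenceBase, SafeIteratorBase, attach and detach are hypothetical, and for brevity each sequence carries its own std::mutex, which the real debug mode cannot do without breaking binary compatibility. The point it shows is that the prior/next links are only ever touched by the sequence, and always under its lock.

```cpp
#include <mutex>

struct SafeSequenceBase;

// Hypothetical stand-in for a safe iterator: it only stores the links,
// it never modifies them itself.
struct SafeIteratorBase {
    SafeSequenceBase* sequence = nullptr;
    SafeIteratorBase* prior = nullptr;
    SafeIteratorBase* next = nullptr;
};

// Hypothetical stand-in for a safe sequence: it owns the intrusive list
// of attached iterators and is the only code that edits prior/next.
struct SafeSequenceBase {
    SafeIteratorBase* iterators = nullptr;
    std::mutex mutex;  // assumption: per-sequence mutex, for brevity only

    // Push the iterator at the head of the list, under the lock.
    void attach(SafeIteratorBase* it) {
        std::lock_guard<std::mutex> lock(mutex);
        it->sequence = this;
        it->prior = nullptr;
        it->next = iterators;
        if (iterators)
            iterators->prior = it;
        iterators = it;
    }

    // Unlink the iterator from the list, under the lock.
    void detach(SafeIteratorBase* it) {
        std::lock_guard<std::mutex> lock(mutex);
        if (it->sequence != this)
            return;  // not attached to this sequence, nothing to do
        if (it->prior)
            it->prior->next = it->next;
        else
            iterators = it->next;
        if (it->next)
            it->next->prior = it->prior;
        it->sequence = nullptr;
        it->prior = nullptr;
        it->next = nullptr;
    }
};
```

Because every list update goes through attach or detach, two threads working on unrelated parts of the same sequence can no longer corrupt the iterator list.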
I introduced lambda expressions in the forward_list
implementation rather than adding new functors like _Equal_to and
others. I hope that, since it is a C++0x container, it is fine to use
lambdas in its implementation.
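
As an illustration of the trade-off (a sketch only, not the code from the patch), the two helpers below do the same job: one uses a hand-written functor, the other a lambda. EqualTo, remove_with_functor and remove_with_lambda are hypothetical names written against the public std::forward_list interface, not the library internals.

```cpp
#include <forward_list>

// Functor version: a dedicated predicate type has to be declared somewhere.
template <typename T>
struct EqualTo {
    const T& value;
    bool operator()(const T& elem) const { return elem == value; }
};

template <typename T, typename Alloc>
void remove_with_functor(std::forward_list<T, Alloc>& fl, const T& value)
{
    fl.remove_if(EqualTo<T>{value});
}

// Lambda version: the predicate is written at the point of use.
// Assumption: `value` does not refer to an element of `fl` itself,
// otherwise the reference could dangle once that element is erased.
template <typename T, typename Alloc>
void remove_with_lambda(std::forward_list<T, Alloc>& fl, const T& value)
{
    fl.remove_if([&value](const T& elem) { return elem == value; });
}
```

Both helpers remove the same elements; the lambda simply avoids adding one more named type to the header.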
To reduce the additional contention introduced by debug mode to a
minimum, the last modification would be to add a mutex
instance to each safe sequence. But as that would break binary
compatibility, I simply changed the _M_get_mutex_base function in debug.cc
to handle a pool of mutexes that are associated with safe sequences
according to their memory address. Just tell me if you know a better
technique to associate a safe sequence instance with a mutex from the pool.
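
The address-to-mutex association can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of debug.cc: the names pool_size, pool_index and mutex_for are hypothetical, and both the pool size of 16 and the 3-bit shift are arbitrary choices made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>

namespace sketch {

// Assumption: 16 mutexes; the real pool size may differ.
constexpr std::size_t pool_size = 16;

// Map an object address to a slot in the pool.
inline std::size_t pool_index(const void* address)
{
    auto bits = reinterpret_cast<std::uintptr_t>(address);
    // Assumption: objects are at least 8-byte aligned, so the low 3 bits
    // carry no information and are dropped before taking the modulus.
    return static_cast<std::size_t>((bits >> 3) % pool_size);
}

// Return the pool mutex associated with the given object. The same
// address always yields the same mutex; different objects may share one.
inline std::mutex& mutex_for(const void* address)
{
    static std::mutex pool[pool_size];
    return pool[pool_index(address)];
}

} // namespace sketch
```

Two sequences that hash to the same slot still contend with each other, but unrelated sequences mostly end up on different mutexes, and no data member has to be added to the sequence type.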
Tested on x86_64 linux. Also tested for performance using the performance
testsuite with debug mode activated. After the patch, performance is
equivalent or better; on some tests it is much better.
Here is the most impressive improvement:
Without the patch:
sort_search type: std::__debug::list<int, __gnu_cxx::new_allocator<int> >
    669r 665u 0s 0mem 0pf
sort_search type: std::__debug::list<int, __gnu_cxx::malloc_allocator<int> >
    697r 694u 0s 0mem 0pf
sort_search type: std::__debug::list<int, __gnu_cxx::__mt_alloc<int, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, false> > >
    620r 618u 0s 6063408mem 0pf
sort_search type: std::__debug::list<int, __gnu_cxx::bitmap_allocator<int> >
    578r 576u 1s 37996912mem 0pf
sort_search type: std::__debug::list<int, __gnu_cxx::__pool_alloc<int> >
    548r 547u 0s 3837568mem 0pf
sort_search-thread type: std::__debug::list<int, __gnu_cxx::new_allocator<int> >
    -thread 8617r 11847u 5056s 1216mem 0pf
sort_search-thread type: std::__debug::list<int, __gnu_cxx::malloc_allocator<int> >
    -thread 8355r 11534u 4885s 0mem 0pf
sort_search-thread type: std::__debug::list<int, __gnu_cxx::__mt_alloc<int, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >
    -thread 8664r 11596u 5386s 24891808mem 0pf
sort_search-thread type: std::__debug::list<int, __gnu_cxx::bitmap_allocator<int> >
    -thread 7644r 10619u 4408s 50607824mem 0pf
sort_search-thread type: std::__debug::list<int, __gnu_cxx::__pool_alloc<int> >
    -thread 8010r 11055u 4691s 14601856mem 0pf
After the patch:
sort_search type: std::__debug::list<int, __gnu_cxx::new_allocator<int> >
    171r 169u 0s 0mem 0pf
sort_search type: std::__debug::list<int, __gnu_cxx::malloc_allocator<int> >
    179r 180u 0s 0mem 0pf
sort_search type: std::__debug::list<int, __gnu_cxx::__mt_alloc<int, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, false> > >
    125r 125u 0s 6063408mem 0pf
sort_search type: std::__debug::list<int, __gnu_cxx::bitmap_allocator<int> >
    88r 87u 1s 37996912mem 0pf
sort_search type: std::__debug::list<int, __gnu_cxx::__pool_alloc<int> >
    77r 77u 0s 3837568mem 0pf
sort_search-thread type: std::__debug::list<int, __gnu_cxx::new_allocator<int> >
    -thread 810r 1554u 10s 1216mem 0pf
sort_search-thread type: std::__debug::list<int, __gnu_cxx::malloc_allocator<int> >
    -thread 820r 1606u 13s 0mem 0pf
sort_search-thread type: std::__debug::list<int, __gnu_cxx::__mt_alloc<int, __gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true> > >
    -thread 693r 1357u 7s 24891808mem 0pf
sort_search-thread type: std::__debug::list<int, __gnu_cxx::bitmap_allocator<int> >
    -thread 424r 820u 11s 50607824mem 0pf
sort_search-thread type: std::__debug::list<int, __gnu_cxx::__pool_alloc<int> >
    -thread 568r 1102u 7s 14601856mem 0pf
François
Attachment:
ChangeLog.entry
Description: Text document
Attachment:
performance.patch
Description: Text document