This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [Patch] mt_allocator: spare mem & fix alignment problems
- From: Dhruv Matani <dhruvbird at gmx dot net>
- To: Paolo Carlini <pcarlini at suse dot de>
- Cc: Stefan Olsson <stefan at xapa dot se>, libstdc++ <libstdc++ at gcc dot gnu dot org>, gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: 26 Mar 2004 18:42:13 +0530
- Subject: Re: [Patch] mt_allocator: spare mem & fix alignment problems
- References: <40630DF2.7040507@suse.de> <4063F17B.3070802@xapa.se> <40641560.8030806@suse.de> <1080303648.1453.29.camel@localhost.localdomain> <4064208D.1070206@suse.de>
On Fri, 2004-03-26 at 17:52, Paolo Carlini wrote:
> Hi,
>
> [snip]
>
> Thanks for your interesting observations: in my opinion first we must figure
> out *exactly* where that 10x comes from...
>
> >Another couple of issues that are pending would be:
> >
> >1. The use of the div instruction on every call of deallocate to
> >calculate the nodes to be removed. This is quite a high latency
> >instruction, but I don't know how much the performance is affected.
> >
> >
> Probably not much, but I agree that we had better avoiding it, if
> possible. I'll
> give it a try in a forthcoming batch of minor cleanups.
You could try this divide-by-10 algorithm. It would give even mul a run
for its money.
template <typename Int>
inline Int divide_by_10(register Int Num)
{
    register Int temp = Num;
    Num <<= 1;        // 2 * Num
    Num += temp;      // 3 * Num
    Num += temp >> 2; // 3.25 * Num
    Num >>= 5;        // 3.25/32 ~= 1/9.85, i.e. roughly Num/10
    return Num;
}
Or for more accuracy:
template <typename Int>
inline Int divide_by_10(register Int Num)
{
    Num -= Num >> 2; // 0.75 * Num
    Num += Num >> 4; // 0.796875 * Num
    Num += Num >> 8; // ~0.79998 * Num
    Num >>= 3;       // ~Num / 10.0002
    return Num;
}
>
> >2. The algorithm for giving back nodes to the global pool. Instead of
> >clipping out one node at a time, why not splice out a whole bunch of
> >nodes, and put them all in at one shot? Just a thought. I can prepare a
> >patch for this one if needed.
> >
> >
> Looks like a nice idea. Please put together a patch for public scrutiny.
It's attached.
Just wondering, why aren't we incrementing the used count for the global
pool while adding blocks to it?
--
-Dhruv Matani.
http://www.geocities.com/dhruvbird/
Proud to be a Vegetarian.
http://www.vegetarianstarterkit.com/
http://www.vegkids.com/vegkids/index.html
*** ./mt_allocator.h 2004-03-26 15:16:41.000000000 +0530
--- /home/dhruv/projects/modified_cvs_libstdc++/include/ext/mt_allocator.h 2004-03-26 18:32:53.000000000 +0530
*************** namespace __gnu_cxx
*** 459,475 ****
if (remove > __cond1 && remove > __cond2)
{
__gthread_mutex_lock(__bin.mutex);
! block_record* tmp;
! while (remove > 0)
{
! tmp = __bin.first[thread_id]->next;
! __bin.first[thread_id]->next = __bin.first[0];
! __bin.first[0] = __bin.first[thread_id];
!
! __bin.first[thread_id] = tmp;
! __bin.free[thread_id]--;
! remove--;
}
__gthread_mutex_unlock(__bin.mutex);
}
--- 459,478 ----
if (remove > __cond1 && remove > __cond2)
{
__gthread_mutex_lock(__bin.mutex);
! block_record* __tmp = __bin.first[thread_id];
! block_record* __first = __tmp;
! int __removed = remove;
!
! while (remove > 1)
{
! __tmp = __tmp->next;
! --remove;
}
+ __bin.first[thread_id] = __tmp->next;
+ __tmp->next = __bin.first[0];
+ __bin.first[0] = __first;
+ __bin.free[thread_id] -= __removed;
+
__gthread_mutex_unlock(__bin.mutex);
}