This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [Patch] mt_allocator: spare mem & fix alignment problems
- From: Dhruv Matani <dhruvbird at gmx dot net>
- To: Paolo Carlini <pcarlini at suse dot de>
- Cc: Stefan Olsson <stefan at xapa dot se>, libstdc++ <libstdc++ at gcc dot gnu dot org>, gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: 26 Mar 2004 18:42:13 +0530
- Subject: Re: [Patch] mt_allocator: spare mem & fix alignment problems
- References: <40630DF2.7040507@suse.de> <4063F17B.3070802@xapa.se> <40641560.8030806@suse.de> <1080303648.1453.29.camel@localhost.localdomain> <4064208D.1070206@suse.de>
On Fri, 2004-03-26 at 17:52, Paolo Carlini wrote:
> Hi,
>
> [snip]
>
> Thanks for your interesting observations: in my opinion first we must figure
> out *exactly* where that 10x comes from...
>
> >Another couple of issues that are pending would be:
> >
> >1. The use of the div instruction on every call of deallocate to
> >calculate the nodes to be removed. This is quite a high latency
> >instruction, but I don't know how much the performance is affected.
> >
> >
> Probably not much, but I agree that we had better avoiding it, if
> possible. I'll
> give it a try in a forthcoming batch of minor cleanups.
You could try this divide-by-10 algorithm. It would give even mul a run
for its money.
template <typename Int>
inline Int divide_by_10(register Int Num)
{
    register Int temp = Num;
    Num <<= 1;        // 2 * Num
    Num += temp;      // 3 * Num
    Num += temp >> 2; // 3.25 * Num
    Num >>= 5;        // 3.25/32 ~= 1/9.85, i.e. roughly Num/10
    return Num;
}
Or for more accuracy:
template <typename Int>
inline Int divide_by_10(register Int Num)
{
    Num -= Num >> 2; // 0.75 * Num
    Num += Num >> 4; // 0.796875 * Num
    Num += Num >> 8; // ~0.79998 * Num
    Num >>= 3;       // ~Num / 10.0002
    return Num;
}
>
> >2. The algorithm for giving back nodes to the global pool. Instead of
> >clipping out one node at a time, why not splice out a whole bunch of
> >nodes, and put them all in at one shot? Just a thought. I can prepare a
> >patch for this one if needed.
> >
> >
> Looks like a nice idea. Please put together a patch for public scrutiny.
It's attached.
Just wondering, why aren't we incrementing the used count for the global
pool while adding blocks to it?
--
-Dhruv Matani.
http://www.geocities.com/dhruvbird/
Proud to be a Vegetarian.
http://www.vegetarianstarterkit.com/
http://www.vegkids.com/vegkids/index.html
*** ./mt_allocator.h 2004-03-26 15:16:41.000000000 +0530
--- /home/dhruv/projects/modified_cvs_libstdc++/include/ext/mt_allocator.h 2004-03-26 18:32:53.000000000 +0530
*************** namespace __gnu_cxx
*** 459,475 ****
if (remove > __cond1 && remove > __cond2)
{
__gthread_mutex_lock(__bin.mutex);
! block_record* tmp;
! while (remove > 0)
{
! tmp = __bin.first[thread_id]->next;
! __bin.first[thread_id]->next = __bin.first[0];
! __bin.first[0] = __bin.first[thread_id];
!
! __bin.first[thread_id] = tmp;
! __bin.free[thread_id]--;
! remove--;
}
__gthread_mutex_unlock(__bin.mutex);
}
--- 459,478 ----
if (remove > __cond1 && remove > __cond2)
{
__gthread_mutex_lock(__bin.mutex);
! block_record* __tmp = __bin.first[thread_id];
! block_record* __first = __tmp;
! int __removed = remove;
!
! while (remove > 1)
{
! __tmp = __tmp->next;
! --remove;
}
+ __bin.first[thread_id] = __tmp->next;
+ __tmp->next = __bin.first[0];
+ __bin.first[0] = __first;
+ __bin.free[thread_id] -= __removed;
+
__gthread_mutex_unlock(__bin.mutex);
}