This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFC: Hint parameter for allocators.


On Fri, 2004-01-23 at 08:49, Loren James Rittle wrote:
> Dhruv Matani writes:
> > I'd like to discuss the specific use of the hint parameter of a standard
> > allocator when used to specify the region of memory the allocator should
> > prefer when choosing it for the current request.
> [...]
> 
> This is an interesting line of ideas.  I do wonder if the dynamic
> complexity outweighs memory layout advantage. 

That's the main question. I guess only studying a real application would
provide the answer.


>  In a static system
> build, where precompute time is a one-time cost, this can be a big win.

I did not get you here. What precompute time are you talking about?

> > 4. Does anyone know of an allocator that makes use of the hint parameter
> > for locality preservation?
> 
> Not in the context of a general allocator for use with STL.  How are
> you passing a hint in your current effort (code example, etc)?

First, we need to modify the list's base class so that get node accepts
hints. Then, modify the create_node function to accept hints. Then, in
the insert (iterator pos, const_reference Data) function, we change the
line that calls create_node (Data) to 
create_node (Data, pos._M_node->_M_prev);

pos._M_node->_M_prev is the last valid node in case of a non-empty list.

Then there's no need to change any driver program. Just re-compile any
existing test program that uses list and check the timings. I did that,
and did not notice any significant changes, so I suppose that:

1. It does not matter that much or:
2. In real application that runs for longer times, that would probably
make a difference?

Of course, you need to use an allocator that can make sense of the hint
parameter.


For the file attached, the controlling #define is USE_HINT_PARAMETER.
You can turn it on/off by simply commenting it away.



-- 
	-Dhruv Matani.
http://www.geocities.com/dhruvbird/


/* Copyright (c) 2003
 * Dhruv Matani.
 *
 * Permission to use, copy, modify, distribute and sell this software
 * and its documentation for any purpose is hereby granted without fee,
 * provided that the above copyright notice appear in all copies and
 * that both that copyright notice and this permission notice appear
 * in supporting documentation.  No representations about the 
 * suitability of this software for any purpose.  It is provided 
 * "as is" without express or implied warranty.
 */


//Bitmapped Allocator.
//====================


// This allocator will make use of 1 single bit to keep track of whther
// it has been allocated or not. A bit 1 indicates free, while 0
// indicates allocated. This has been done so that you can easily check a
// collection of bits for a free block. This kind of Bitmapped strategy
// works best for single object allocations, and with the STL type
// parameterized, we do not need to choose any size for the block which
// will be represented by a single bit. This will be the size of the
// parameter around which the allocator has been parameterized. Thus,
// close to optimal performace will result.

// Another issue would be whether to keep the all bitmaps in a
// separate area in memory, or to keep them near the actual blocks
// that will be given out or allocated for the client. After some
// testing, I've decided to keep these bitmaps close to the actual
// blocks. this will help in 2 ways. 1. Constant time access for the
// bitmap themselves, since no kind of look up will be needed to find
// the correct bitmap list or it's equivalent. 2. And also this would
// preserve the cache as far as possible. So in effect, this kind of
// an allocator might not prove beneficial from a purely cache point
// of view. But this allocator has been made to try and roll out the
// defects of the node_allocator, whrein the nodes get skewed about in
// memory, if they are not returned in the exact reverse order or in
// the same order in which they were allocated. Also, the
// vma_allocator's book keeping overhead is too much for small objects
// and single object allocations, though it preserves the locality of
// blocks very well when they are returned back to the allocator.

// Expected overhead per block would be 1 bit in memory. Also, once
// the address of the free list has been found, the cost for
// allocation/deallocation would be negligible, and is supposed to be
// conatant time. For these very reasons, it is very important to
// minimize the linear time costs, which include finding a free list
// with a free block while allocating, and finding the corresponding
// free list for a block while deallocating. Therefore, I have decided
// that the growth of the internal pool for this allocator will be
// exponential as compared to linear for node_allocator. There, linear
// time works well, because we are mainly concerned with speed of
// allocation/deallocation and memory consumption, whereas here, the
// allocation/deallocation part does have some linear/logarathmic
// complexity components in it. Thus, to try and minimize them would
// be a good thing to do at the cost of a little bit of memory.

// Another thing to be noted is the the pool size will double every
// time the internal pool gets exhausted, and all the free blocks have
// been given away. The initial size of the pool would be
// sizeof(int)*8 which is the nuber of bits in an integer, which can
// fit exactly in a CPU register. Hence, the term given is exponential
// growth of the internal pool.


#if !defined __STL_BITMAP_ALLOCATOR_HPP__
#define __STL_BITMAP_ALLOCATOR_HPP__

#include <cstddef>
//For std::size_t, and ptrdiff_t.
#include <utility>
//For std::pair.
#include <algorithm>
//std::find_if.
#include <vector>
//For the free list of exponentially growing memory blocks. At max,
//size of the vector should be  not more than the number of bits in an
//integer or an unsigned integer.
#include <functional>
//For greater_equal, and less_equal.

#include <new>

#include <cheap_alloc.hpp>
//For std::cheap_allocator for std::vector.

#include <cassert>

#define ARCH_HAS_BSF
#define MAGIC_STOP_GROWTH 1048576
//If USE_HINT_PARAMETER is defined, the allocator may get potentially
//slower, but it will ensure that the locality of reference for memory
//blocks is preserved as far as possible, if a correct hint parameter
//is passed. If an incorrect or useless hint parameter is passed, then
//the allocator's performance may degrade by a huge margin.
#define USE_HINT_PARAMETER
//#define CHECK_FOR_ERRORS

namespace __gnu_cxx
{


namespace _aux_balloc {

  static const unsigned int Bits_Per_Byte = 8;
  static const unsigned int Bits_Per_Block = sizeof(unsigned int) * Bits_Per_Byte;

#if defined USE_HINT_PARAMETER
  template <typename T>
  class RAII_Swap {
    T& T1;
    T& T2;
    RAII_Swap (RAII_Swap const&);
    RAII_Swap& operator= (RAII_Swap const&);
  public:
    RAII_Swap (T& _t1, T& _t2) : T1(_t1), T2(_t2)
    {
      std::swap (T1, T2);
    }
    ~RAII_Swap ()
    {
      std::swap (T1, T2);
    }
  };
#endif

  //T should be a pointer type.
  template <typename T>
  class inclusive_between {
    typedef T pointer;
    pointer ptr_value;
    typedef typename std::pair<T, T> block_pair;

  public:
    inclusive_between (pointer _ptr) : ptr_value(_ptr) { }
    bool operator () (block_pair _bp)
    {
      if (std::less_equal<pointer> ()(ptr_value, _bp.second) && 
	  std::greater_equal<pointer> ()(ptr_value, _bp.first))
	return true;
      else
	return false;
    }
  };

  //Used to pass a Functor to functions by reference.
  template <typename Functor>
  class Functor_Ref {
    Functor& FRef;

  public:
    Functor_Ref (Functor& _fref) : FRef(_fref) { }
    //operator Functor& () { return FRef; }
    template <typename X>
    bool operator() (X _x) { return FRef (_x); }
  };


  //T should be a pointer type, and A is the Allocator for the vector.
  template <typename T, typename A>
  class ffit_finder {
    typedef typename std::vector<std::pair<T, T>, A> BPVector;
    typedef typename BPVector::difference_type counter_type;
    typedef typename std::pair<T, T> block_pair;


    unsigned int *pbitmap;
    unsigned int _offset;

  public:
    ffit_finder () : pbitmap (0), _offset (0) { }

    bool operator() (block_pair _bp)
    {
      //Set the _rover to the last unsigned integer, which is the
      //bitmap to the first free block. Thus, the bitmaps are in exact
      //reverse order of the actual memory layout. So, we count down
      //the bimaps, which is the same as moving up the memory.
      unsigned int *_rover = reinterpret_cast<unsigned int*>(_bp.first) - 1;
      counter_type _diff = _bp.second - _bp.first + 1;
      _diff /= Bits_Per_Block;

      for (counter_type i = 0; i < _diff; ++i)
	{
	  _offset = i;
	  if (*_rover)
	    {
	      pbitmap = _rover;
	      return true;
	    }
	  --_rover;
	}
      return false;
    }

    unsigned int *get () { return pbitmap; }
    unsigned int offset () { return _offset * Bits_Per_Block; }
  };



  //T should be a pointer type.
  template <typename T, typename A>
  class Bit_Map_Counter {

    typedef typename std::vector<std::pair<T, T>, A> BPVector;
    typedef typename BPVector::size_type index_type;
    typedef T pointer;

    BPVector& _vbp;
    unsigned int *_curr_bmap;
    unsigned int *_last_bmap_in_block;
    index_type _curr_index;

  public:
    //Use the 2nd parameter with care. Make sure that such an entry
    //exists in the vector before passing that particular index to
    //this ctor.
    Bit_Map_Counter (BPVector& Rvbp, index_type Index = 0) : _vbp(Rvbp)
    {
      this->reset (Index);
    }

    void reset (index_type Index = 0)
    {
      _curr_index = Index;
      _curr_bmap = reinterpret_cast<unsigned int*>(_vbp[_curr_index].first) - 1;

      assert (Index <= _vbp.size() - 1);

      _last_bmap_in_block = _curr_bmap - 
	((_vbp[_curr_index].second - _vbp[_curr_index].first + 1) / Bits_Per_Block - 1);
    }

    //Dangerous Function! Use with extreme care. Pass to this
    //functions ONLY those values that are known to be correct,
    //otherwise this will mess up big time.
    void Set_Internal_Bit_Map (unsigned int *New_Internal_Marker)
    {
      _curr_bmap = New_Internal_Marker;
    }

    bool finished ()
    {
      return (_curr_bmap == 0);
    }


    Bit_Map_Counter& operator++ ()
    {
      if (_curr_bmap == _last_bmap_in_block)
	{
	  if (++_curr_index == _vbp.size())
	    {
	      _curr_bmap = 0;
	    }
	  else
	    {
	      this->reset (_curr_index);
	    }
	}
      else
	{
	  --_curr_bmap;
	}
      return *this;
    }
    
    unsigned int *get ()
    {
      return _curr_bmap;
    }

    pointer base () { return _vbp[_curr_index].first; }
    unsigned int offset ()
    {
      return Bits_Per_Block * ((reinterpret_cast<unsigned int*>(this->base()) - _curr_bmap) - 1);
    }

    unsigned int where () { return _curr_index; }
  };


}



template <class T> class bitmap_allocator;
// specialize for void:
template <> class bitmap_allocator<void> {
public:
  typedef void*       pointer;
  typedef const void* const_pointer;
  //  reference-to-void members are impossible.
  typedef void  value_type;
  template <class U> struct rebind { typedef bitmap_allocator<U> other; };
};

template <class T> class bitmap_allocator {
public:
  typedef size_t    size_type;
  typedef ptrdiff_t difference_type;
  typedef T*        pointer;
  typedef const T*  const_pointer;
  typedef T&        reference;
  typedef const T&  const_reference;
  typedef T         value_type;
  template <class U> struct rebind { typedef bitmap_allocator<U> other; };

private:

  static const unsigned int Bits_Per_Byte = 8;
  static const unsigned int Bits_Per_Block = sizeof(unsigned int) * Bits_Per_Byte;

  //The following function has been written by Apurva
  //Mehta. apurva@gmx.net.
  static inline unsigned int first_right_bit (unsigned int num)
  {
#if defined ARCH_HAS_BSF
    int tmp = 0;
    __asm__ ("bsf %1, %0"
	     : "=r" (tmp)
	     : "r" (num)
	     : "%eax", "%ebx", "%edx", "%ecx"
	     );
    //I don't know whether these registers need to be clobbered?
    
    return tmp;
#else
    int ret_val = 0;
    while (num % 2 == 0)
      ++ret_val, num = num/2;
    return ret_val;
#endif
  }



  static inline void bit_allocate (unsigned int *pbmap, unsigned int Pos)
  {
    unsigned int _mask = 1 << Pos;
    _mask = ~_mask;
    *pbmap &= _mask;
  }
  
  
  static inline void bit_free (unsigned int *pbmap, unsigned int Pos)
  {
    unsigned int _mask = 1 << Pos;
    *pbmap |= _mask;
  }

  static inline void *Memory_Get (size_t _sz) throw (std::bad_alloc)
  {
    return operator new (_sz);
  }

  static inline void Memory_Put (void *_ptr) throw ()
  {
    operator delete (_ptr);
  }


  typedef typename std::pair<pointer, pointer> block_pair;
  typedef typename std::cheap_allocator<block_pair> BPVec_Allocator_Type;
  typedef typename std::vector<block_pair, BPVec_Allocator_Type> BPVector;

#if defined CHECK_FOR_ERRORS
  static void Check_For_Free_Blocks ()
  {
	typedef typename __gnu_cxx::_aux_balloc::ffit_finder<pointer, BPVec_Allocator_Type> FFF;
	FFF _fff;
	typedef typename BPVector::iterator bpiter;
	bpiter _bpi = std::find_if (mem_blocks.begin(), mem_blocks.end(), 
				    __gnu_cxx::_aux_balloc::Functor_Ref<FFF> (_fff));
	assert (_bpi == mem_blocks.end());
  }
#endif

  
  static void Refill_Pool ()
  {
#if defined CHECK_FOR_ERRORS
    Check_For_Free_Blocks ();
#endif

    const unsigned int Num_Bit_Maps = Block_Size / Bits_Per_Block;
    const unsigned int _size_to_allocate = 
      Block_Size * sizeof(value_type) + Num_Bit_Maps*sizeof(unsigned int);

    unsigned int *temp = 
      reinterpret_cast<unsigned int*>(Memory_Get (_size_to_allocate));

    //The Header information goes at the Beginning of the Block.
    block_pair _bp = std::make_pair (reinterpret_cast<pointer>(temp + Num_Bit_Maps), 
				     reinterpret_cast<pointer>(temp + Num_Bit_Maps) + Block_Size - 1);


    //Fill the Vector with this information.
    mem_blocks.push_back (_bp);

    unsigned int Bit_Mask = 0; //0 Indicates all Allocated.
    Bit_Mask = ~Bit_Mask; //1 Indicates all Free.

    for (unsigned int i = 0; i < Num_Bit_Maps; ++i)
      temp[i] = Bit_Mask;

    //On some implementations, operator new might throw bad_alloc, or
    //malloc might fail if the size passed is too large, therefore, we
    //limit the size passed to malloc or operator new.
    if (Block_Size < MAGIC_STOP_GROWTH) Block_Size *= 2;
  }

  static void _init_me ()
  {
    if (!mem_blocks.empty ())
      return;

    mem_blocks.reserve (16);
    Refill_Pool ();
    static __gnu_cxx::_aux_balloc::Bit_Map_Counter<pointer, BPVec_Allocator_Type> _bmc (mem_blocks);
    //    Bit_Map_Counter *_bmc = new Bit_Map_Counter (mem_blocks);
    last_request = &_bmc;
  }

  static BPVector mem_blocks;
  static unsigned int Block_Size;
  static __gnu_cxx::_aux_balloc::Bit_Map_Counter<pointer, BPVec_Allocator_Type> *last_request;
#if defined USE_HINT_PARAMETER
  static __gnu_cxx::_aux_balloc::Bit_Map_Counter<pointer, BPVec_Allocator_Type> *hint_request;
#endif

public:
  bitmap_allocator()
  {
    _init_me ();
  }

  //Copy Ctor means that a class of this type has already been
  //instantiated. Meaning that we need not call _do_init() again.
  bitmap_allocator(const bitmap_allocator&) { }

  template <class U> bitmap_allocator(const bitmap_allocator<U>&) throw()
  {
    _init_me ();
  }

  //Dtor not needed.
  //~bitmap_allocator() throw();

  pointer Allocate_Single_Object ()
  {
    //The algorithm is something like this: The last_requst variable
    //points to the last accessed Bit Map. When such a condition
    //occurs, we try to find a free block in the current bitmap, or
    //succeeding bitmaps until the last bitmap is reached. If no free
    //block turns up, we resort to First Fit method. But, again, the
    //First Fit is used only upto the point where we started the
    //previous linear search.

    while (last_request->finished () == false && (*(last_request->get()) == 0))
      {
	last_request->operator++ ();
      }

    if (last_request->finished ())
      {
	//Fall Back to First Fit algorithm.

	typedef typename __gnu_cxx::_aux_balloc::ffit_finder<pointer, BPVec_Allocator_Type> FFF;
	FFF _fff;
	typedef typename BPVector::iterator bpiter;
	bpiter _bpi = std::find_if (mem_blocks.begin(), mem_blocks.end(), 
				    __gnu_cxx::_aux_balloc::Functor_Ref<FFF> (_fff));


	if (_bpi != mem_blocks.end())
	  {
	    //Search was successful. Ok, now mark the first bit from
	    //the right as 0, meaning Allocated. This bit is obtained
	    //by calling get() on _fff.
	    unsigned int nz_bit = first_right_bit (*_fff.get());
	    bit_allocate (_fff.get(), nz_bit);

	    last_request->reset (_bpi - mem_blocks.begin());

	    //Now, get the address of the bit we marked as allocated.
	    pointer Ret_Val = _bpi->first + _fff.offset() + nz_bit;
	    return Ret_Val;
	  }
	else
	  {
	    //Search was unsuccessful. We Add more memory to the pool
	    //by calling Refill_Pool().
	    Refill_Pool ();

	    //Reset the last_request structure to the first free
	    //block's bit map.
	    last_request->reset (mem_blocks.size () - 1);

	    //Now, mark that bit as allocated.
	    unsigned int nz_bit = first_right_bit (*last_request->get());
	    bit_allocate (last_request->get(), nz_bit);

	    pointer Ret_Val = last_request->base() + last_request->offset() + nz_bit;
	    return Ret_Val;
	  }
      }
    else
      {
	//last_request holds a pointer to a valid bit map, that points
	//to a free block in memory.
	unsigned int nz_bit = first_right_bit (*last_request->get());
	bit_allocate (last_request->get(), nz_bit);

	pointer Ret_Val = last_request->base() + last_request->offset() + nz_bit;
	return Ret_Val;
      }
  }


  pointer allocate (size_type n)
  {
    if (n != 1)
      return reinterpret_cast<pointer>(Memory_Get (n * sizeof(value_type)));
    else
      return Allocate_Single_Object ();
  }

  pointer allocate (size_type n, typename bitmap_allocator<void>::const_pointer hint)
  {
    //I will probably implement this beahviour later. For no, just
    //call the previous allocate function.
#if !defined USE_HINT_PARAMETER
    return allocate (n);
#else
    if (n != 1)
      return reinterpret_cast<pointer>(Memory_Get (n * sizeof(value_type)));
    else if (!hint)
      return Allocate_Single_Object ();


    typedef typename __gnu_cxx::_aux_balloc::Bit_Map_Counter<pointer, BPVec_Allocator_Type> bmc_type;
    static bmc_type _bmc (mem_blocks);
    hint_request = &_bmc;

    pointer _phint = reinterpret_cast<pointer> (const_cast<void*> (hint));

    if (hint_request->finished() == true)
      hint_request->reset (mem_blocks.size () - 1);

    if (__gnu_cxx::_aux_balloc::inclusive_between<pointer> (_phint) (mem_blocks[hint_request->where ()]))
      //Condition satisfied. We have assumed correctly. The hint
      //parameter is actually within the area returned by where().
      {
	//We first swap hint_request and last_request, and then call
	//the default Single Object Allocation routione.
	__gnu_cxx::_aux_balloc::RAII_Swap<bmc_type*> Auto_Kill (last_request, hint_request);
	pointer Ret_Val = Allocate_Single_Object ();
	return Ret_Val;
      }

    //Sorry! The current Area does not correspond well to the hint
    //parameter. We first need to find the corresponding area.
    typedef typename BPVector::iterator iterator;
    typedef typename BPVector::difference_type diff_type;

    iterator iter = std::find_if (mem_blocks.begin(), mem_blocks.end(), 
				  __gnu_cxx::_aux_balloc::inclusive_between<pointer> (_phint));

    assert (iter != mem_blocks.end());

    //Get the position of the iterator that has been found.
    diff_type _diff = iter - mem_blocks.begin();
    int _displacement = _phint - mem_blocks[_diff].first;
    unsigned int *_bit_mapC = reinterpret_cast<unsigned int*>(mem_blocks[_diff].first) - 1;
    _bit_mapC -= (_displacement / Bits_Per_Block);

    //Use a potentially Dangerous Function.
    hint_request->reset (_diff);
    hint_request->Set_Internal_Bit_Map (_bit_mapC);

    //We first swap hint_request and last_request, and then call
    //the default Single Object Allocation routione.
    __gnu_cxx::_aux_balloc::RAII_Swap<bmc_type*> Auto_Kill (last_request, hint_request);
    pointer Ret_Val = Allocate_Single_Object ();
    return Ret_Val;
#endif
  }


  void deallocate(pointer p, size_type n)
  {
    if (n != 1)
      Memory_Put (p);
    else
      Deallocte_Single_Object (p);
  }



  static void Deallocte_Single_Object (pointer p)
  {
    typedef typename BPVector::iterator iterator;
    typedef typename BPVector::difference_type diff_type;

    iterator iter = std::find_if (mem_blocks.begin(), mem_blocks.end(), 
				  __gnu_cxx::_aux_balloc::inclusive_between<pointer> (p));

    assert (iter != mem_blocks.end());

    //Get the position of the iterator that has been found.
    diff_type _diff = iter - mem_blocks.begin();
    int _displacement = p - mem_blocks[_diff].first;
    unsigned int _rotate = _displacement % Bits_Per_Block;
    unsigned int *_bit_mapC = reinterpret_cast<unsigned int*>(mem_blocks[_diff].first) - 1;
    _bit_mapC -= (_displacement / Bits_Per_Block);

    bit_free (_bit_mapC, _rotate);
  }


  pointer address (reference r) const { return &r; }
  const_pointer address (const_reference r) const { return &r; }

  size_type max_size (void) const throw () { return (size_type()-1)/sizeof(value_type); }



  void construct (pointer p, const_reference _data)
  {
    new (p) value_type (_data);
  }

  void destroy (pointer p)
  {
    p->~value_type();
  }


};

  template <typename T>
  bitmap_allocator<T>::BPVector bitmap_allocator<T>::mem_blocks;

  template <typename T>
  unsigned int bitmap_allocator<T>::Block_Size = bitmap_allocator<T>::Bits_Per_Block;

  template <typename T>
  __gnu_cxx::_aux_balloc::Bit_Map_Counter 
  <bitmap_allocator<T>::pointer, bitmap_allocator<T>::BPVec_Allocator_Type> 
  *bitmap_allocator<T>::last_request;

#if defined USE_HINT_PARAMETER
  template <typename T>
  __gnu_cxx::_aux_balloc::Bit_Map_Counter 
  <bitmap_allocator<T>::pointer, bitmap_allocator<T>::BPVec_Allocator_Type> 
  *bitmap_allocator<T>::hint_request;
#endif


  template <typename T1, typename T2>
  bool operator== (const bitmap_allocator<T1>&, const bitmap_allocator<T2>&) throw()
  {
    return true;
  }

  template <typename T1, typename T2>
  bool operator!= (const bitmap_allocator<T1>&, const bitmap_allocator<T2>&) throw()
  {
    return false;
  }
}


#endif //__STL_BITMAP_ALLOCATOR_HPP__



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]