[PATCH] Slightly better way to __USE_MALLOC

Loren James Rittle rittle@latour.rsch.comm.mot.com
Thu Oct 17 17:18:00 GMT 2002


This version incorporates Brad's initial feedback.  The code that
captures the state of the GLIBCPP_FORCE_NEW environment variable now
lives in ::allocate alone, and the result is shared with ::deallocate
in a way that respects all threading and initialization-order issues.
The only remaining assumption is that ::deallocate is never called
before ::allocate has been called at least once (the same assumption
the code made before).
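
As a rough illustration of the flag handling described above (not the
patch code itself), the sketch below uses std::atomic and std::getenv
in place of the library-internal _Atomic_word and __atomic_add.  The
class name pool_sketch and the state() accessor are hypothetical, and
the free-list path is replaced by a plain operator new call to keep
the example self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for __default_alloc_template; illustration only.
class pool_sketch
{
  // 0 = not yet decided, > 0 = bypass the pool, < 0 = use the pool.
  static std::atomic<int> force_new;

public:
  // Hypothetical accessor so the state can be inspected from outside.
  static int
  state()
  { return force_new.load(); }

  static void*
  allocate(std::size_t n)
  {
    if (force_new.load() == 0)
      {
        // Threads racing through here all see the same environment, so
        // every one of them moves the counter in the same direction and
        // it can never return to 0.
        if (std::getenv("GLIBCPP_FORCE_NEW"))
          force_new.fetch_add(1);
        else
          force_new.fetch_sub(1);
        assert(force_new.load() != 0);
      }
    // The real allocator would consult its free lists when the flag is
    // negative; this sketch always forwards to operator new.
    return ::operator new(n);
  }

  static void
  deallocate(void* p, std::size_t)
  {
    // Only reads the flag in the real code, never initializes it, hence
    // the assumption that allocate has already run at least once.
    ::operator delete(p);
  }
};

std::atomic<int> pool_sketch::force_new(0);
```

After the first call to pool_sketch::allocate, state() is nonzero for
the rest of the program, whichever way the environment variable was
set.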

The measured cost of being able to switch the concrete allocator
without recompiling is quite small.  For code in a tight
allocate/deallocate loop, it might measure around 5%.  However,
application code does not usually call the allocators directly.
Thus, the true overhead is smaller.
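
For readers who want to reproduce this kind of measurement, a tight
loop of the sort meant here could look like the sketch below.  The
helper name time_alloc_loop is hypothetical and this is not the
benchmark behind the figure quoted above.  It only exercises the
patched path if std::allocator is backed by the pool allocator, and
the comparison would come from running the program once with and once
without GLIBCPP_FORCE_NEW in the environment.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>

// Hypothetical helper: times `iterations` allocate/deallocate pairs of a
// single int through std::allocator and returns the elapsed seconds.
double
time_alloc_loop(std::size_t iterations)
{
  std::allocator<int> a;
  int* volatile sink = nullptr;   // discourages eliding the pair
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i)
    {
      int* p = a.allocate(1);
      sink = p;
      a.deallocate(p, 1);
    }
  const auto stop = std::chrono::steady_clock::now();
  (void) sink;
  return std::chrono::duration<double>(stop - start).count();
}
```

Absolute timings from such a loop vary by platform, so only the ratio
between the two runs is of interest.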

Final comments from fellow maintainers before installation on
mainline?  Given that there is a tradeoff, I will not commit it
without a nod from someone else.

	* docs/html/23_containers/howto.html (GLIBCPP_FORCE_NEW): Document
	new environment variable which replaces all uses of __USE_MALLOC
	macro.
	* docs/html/ext/howto.html (GLIBCPP_FORCE_NEW): Likewise.
	(__mem_interface): Remove all references to old internal typedef.
	* include/backward/alloc.h (__USE_MALLOC): Remove it and all
	guarded code.
	* include/bits/c++config (__USE_MALLOC): Update related error
	message and comment.
	* include/bits/stl_alloc.h (__USE_MALLOC): Remove it and all
	guarded code.  Update all related comments.
	(__mem_interface): Unconditionally replace it with __new_alloc.
	However, leave the typedef around in case anyone used it.
	(__default_alloc_template<>::_S_force_new): New class static.
	(__default_alloc_template<>::allocate, deallocate): Add
	run-time controlled feature similar to what __USE_MALLOC code
	path had provided.
	* src/stl-inst.cc (__USE_MALLOC): Remove it and all
	guarded code.
	* testsuite/21_strings/capacity.cc: Remove reference to __USE_MALLOC.
	* testsuite/ext/allocators.cc: Likewise.

Index: docs/html/23_containers/howto.html
===================================================================
RCS file: /cvs/gcc/gcc/libstdc++-v3/docs/html/23_containers/howto.html,v
retrieving revision 1.24
diff -c -r1.24 howto.html
*** docs/html/23_containers/howto.html	7 Oct 2002 18:11:20 -0000	1.24
--- docs/html/23_containers/howto.html	17 Oct 2002 23:55:46 -0000
***************
*** 251,291 ****
        solution would probably be more trouble than it's worth.
     </p>
     <p>The STL implementation is currently configured to use the
!       high-speed caching memory allocator.  If you absolutely think
!       you must change this on a global basis for your platform to better
!       support multi-threading, then please consult all commentary in
!       include/bits/stl_alloc.h and the allocators link below.
     </p> 
-    <blockquote>
-       <p>(Explicit warning since so many people get confused while
-       attempting this:)
-       </p>
-       <p><strong>Adding -D__USE_MALLOC on the command
-       line is almost certainly a bad idea.</strong>  Memory efficiency is
-       almost guaranteed to suffer as a result; this is
-       <a href="http://gcc.gnu.org/ml/libstdc++/2001-05/msg00136.html">why
-       we disabled it for 3.0 in the first place</a>.
-       </p>
-       <p>Related to threading or otherwise, the current recommendation is
-       that users not add any macro defines on the command line to remove or
-       otherwise disable features of libstdc++-v3.  There is
-       no condition under which it will help you without causing other
-       issues to perhaps raise up (possible linkage/ABI problems).  In
-       particular, __USE_MALLOC should only be added to a libstdc++-v3
-       configuration file, include/bits/c++config (where such user
-       action is cautioned against), and the entire library should be
-       rebuilt.  If you do not, then you might be violating the
-       one-definition rule of C/C++ and you might cause yourself untold
-       problems.
-       </p>
-    </blockquote>
-    <p>If you find any platform where gcc reports a
-       threading model other than single, and where libstdc++-v3 builds
-       a buggy container allocator when used with threads unless you
-       define __USE_MALLOC, we want to hear about it ASAP.  In the
-       past, correctness was the main reason people were led to believe
-       that they should define __USE_MALLOC when using threads.
-    </p>
     <p>There is a better way (not standardized yet):  It is possible to
        force the malloc-based allocator on a per-case-basis for some
        application code.  The library team generally believes that this
--- 251,260 ----
        solution would probably be more trouble than it's worth.
     </p>
     <p>The STL implementation is currently configured to use the
!       high-speed caching memory allocator.  If you want to test
!       or run your application without the cache, then please add
!       GLIBCPP_FORCE_NEW to your environment before running the program.
     </p> 
     <p>There is a better way (not standardized yet):  It is possible to
        force the malloc-based allocator on a per-case-basis for some
        application code.  The library team generally believes that this
Index: docs/html/ext/howto.html
===================================================================
RCS file: /cvs/gcc/gcc/libstdc++-v3/docs/html/ext/howto.html,v
retrieving revision 1.23
diff -c -r1.23 howto.html
*** docs/html/ext/howto.html	7 Oct 2002 18:11:21 -0000	1.23
--- docs/html/ext/howto.html	17 Oct 2002 23:55:46 -0000
***************
*** 280,298 ****
           same as <code>allocator<T></code>.
       </li>
     </ul>
-    <p>An internal typedef, <code> __mem_interface </code>, is defined to be
-       <code>__new_alloc</code> by default.
-    </p>
     <p>Normally,
        <code> __default_alloc_template<bool thr, int inst> </code>
        is also available.  This is the high-speed pool, called the default
        node allocator.  The reusable memory is shared among identical
        instantiations of
!       this type.  It calls through <code>__mem_interface</code> to obtain
        new memory when its lists run out.  If a client container requests a
        block larger than a certain threshold size, then the pool is bypassed,
        and the allocate/deallocate request is passed to
!       <code>__mem_interface</code> directly.
     </p>
     <p>Its <code>inst</code> parameter is described below.  The
        <code>thr</code> boolean determines whether the pool should be
--- 280,295 ----
           same as <code>allocator<T></code>.
       </li>
     </ul>
     <p>Normally,
        <code> __default_alloc_template<bool thr, int inst> </code>
        is also available.  This is the high-speed pool, called the default
        node allocator.  The reusable memory is shared among identical
        instantiations of
!       this type.  It calls through <code>__new_alloc</code> to obtain
        new memory when its lists run out.  If a client container requests a
        block larger than a certain threshold size, then the pool is bypassed,
        and the allocate/deallocate request is passed to
!       <code>__new_alloc</code> directly.
     </p>
     <p>Its <code>inst</code> parameter is described below.  The
        <code>thr</code> boolean determines whether the pool should be
***************
*** 313,329 ****
     </p>
     <h3>A cannon to swat a fly:<code>  __USE_MALLOC</code></h3>
     <p>If you've already read <a href="../23_containers/howto.html#3">this
!       advice</a> and decided to define this macro, then the situation changes
!       thusly:
     </p>
-    <ol>
-      <li><code>__mem_interface</code>, and</li>
-      <li><code>__alloc</code>, and</li>
-      <li><code>__single_client_alloc</code> are all typedef'd to
-          <code>__malloc_alloc_template</code>.</li>
-      <li><code>__default_alloc_template</code> is no longer available.
-          At all.  Anywhere.</li>
-    </ol>
     <h3>Writing your own allocators</h3>
     <p>Depending on your application (a specific program, a generic library,
        etc), allocator classes tend to be one of two styles:  "SGI"
--- 310,322 ----
     </p>
     <h3>A cannon to swat a fly:<code>  __USE_MALLOC</code></h3>
     <p>If you've already read <a href="../23_containers/howto.html#3">this
!       advice</a> but still think you remember how to use this macro from
!       SGI STL days, then compile normally and set GLIBCPP_FORCE_NEW
!       in your environment before running the program.  You will
!       obtain a similar effect without having to recompile your entire
!       program and the entire library (new in gcc is a light wrapper
!       around malloc).
     </p>
     <h3>Writing your own allocators</h3>
     <p>Depending on your application (a specific program, a generic library,
        etc), allocator classes tend to be one of two styles:  "SGI"
Index: include/backward/alloc.h
===================================================================
RCS file: /cvs/gcc/gcc/libstdc++-v3/include/backward/alloc.h,v
retrieving revision 1.11
diff -c -r1.11 alloc.h
*** include/backward/alloc.h	6 Dec 2001 20:29:30 -0000	1.11
--- include/backward/alloc.h	17 Oct 2002 23:55:46 -0000
***************
*** 53,62 ****
  using std::__alloc; 
  using std::__single_client_alloc; 
  using std::allocator;
- #ifdef __USE_MALLOC
- using std::malloc_alloc; 
- #else
  using std::__default_alloc_template; 
- #endif
  
  #endif 
--- 53,58 ----
Index: include/bits/c++config
===================================================================
RCS file: /cvs/gcc/gcc/libstdc++-v3/include/bits/c++config,v
retrieving revision 1.516
diff -c -r1.516 c++config
*** include/bits/c++config	17 Oct 2002 07:17:10 -0000	1.516
--- include/bits/c++config	17 Oct 2002 23:55:46 -0000
***************
*** 74,86 ****
  // so, please report any possible issues to libstdc++@gcc.gnu.org .
  // Do not define __USE_MALLOC on the command line.  Enforce it here:
  #ifdef __USE_MALLOC
! #error __USE_MALLOC should only be defined within \
! libstdc++-v3/include/bits/c++config before full recompilation of the library.
  #endif
- // Define __USE_MALLOC after this point in the file in order to aid debugging
- // or globally change allocation policy.  This breaks the ABI, thus
- // completely recompile the library.  A patch to better support
- // changing the global allocator policy would be probably be accepted.
  
  // The remainder of the prewritten config is mostly automatic; all the
  // user hooks are listed above.
--- 74,81 ----
  // so, please report any possible issues to libstdc++@gcc.gnu.org .
  // Do not define __USE_MALLOC on the command line.  Enforce it here:
  #ifdef __USE_MALLOC
! #error __USE_MALLOC should never be defined.  Read the release notes.
  #endif
  
  // The remainder of the prewritten config is mostly automatic; all the
  // user hooks are listed above.
Index: include/bits/stl_alloc.h
===================================================================
RCS file: /cvs/gcc/gcc/libstdc++-v3/include/bits/stl_alloc.h,v
retrieving revision 1.24
diff -c -r1.24 stl_alloc.h
*** include/bits/stl_alloc.h	23 Aug 2002 16:52:29 -0000	1.24
--- include/bits/stl_alloc.h	17 Oct 2002 23:55:46 -0000
***************
*** 89,94 ****
--- 89,96 ----
  #include <bits/functexcept.h>   // For __throw_bad_alloc
  #include <bits/stl_threads.h>
  
+ #include <bits/atomicity.h>
+ 
  namespace std
  {
    /**
***************
*** 210,223 ****
      }
  #endif
  
! 
!   // Determines the underlying allocator choice for the node allocator.
! #ifdef __USE_MALLOC
!   typedef __malloc_alloc_template<0>  __mem_interface;
! #else
    typedef __new_alloc                 __mem_interface;
- #endif
- 
  
    /**
     *  @if maint
--- 212,219 ----
      }
  #endif
  
!   // Should not be referenced within the library anymore.
    typedef __new_alloc                 __mem_interface;
  
    /**
     *  @if maint
***************
*** 307,329 ****
      };
  
  
- #ifdef __USE_MALLOC
- 
-   typedef __mem_interface __alloc;
-   typedef __mem_interface __single_client_alloc;
- 
- #else
- 
- 
    /**
     *  @if maint
!    *  Default node allocator.  "SGI" style.  Uses __mem_interface for its
!    *  underlying requests (and makes as few requests as possible).
!    *  **** Currently __mem_interface is always __new_alloc, never __malloc*.
     *
     *  Important implementation properties:
     *  1. If the clients request an object of size > _MAX_BYTES, the resulting
!    *     object will be obtained directly from the underlying __mem_interface.
     *  2. In all other cases, we allocate an object of size exactly
     *     _S_round_up(requested_size).  Thus the client has enough size
     *     information that we can return the object to the proper free list
--- 303,318 ----
      };
  
  
    /**
     *  @if maint
!    *  Default node allocator.  "SGI" style.  Uses various allocators to
!    *  fulfill underlying requests (and makes as few requests as possible
!    *  when in default high-speed pool mode).
     *
     *  Important implementation properties:
+    *  0. If globally mandated, then allocate objects from __new_alloc
     *  1. If the clients request an object of size > _MAX_BYTES, the resulting
!    *     object will be obtained directly from __new_alloc
     *  2. In all other cases, we allocate an object of size exactly
     *     _S_round_up(requested_size).  Thus the client has enough size
     *     information that we can return the object to the proper free list
***************
*** 394,447 ****
        } __attribute__ ((__unused__));
        friend struct _Lock;
  
      public:
        // __n must be > 0
        static void*
        allocate(size_t __n)
        {
!         void* __ret = 0;
  
!         if (__n > (size_t) _MAX_BYTES)
!           __ret = __mem_interface::allocate(__n);
!         else
!           {
!             _Obj* volatile* __my_free_list = _S_free_list
!                                              + _S_freelist_index(__n);
!             // Acquire the lock here with a constructor call.  This
!             // ensures that it is released in exit or during stack
!             // unwinding.
!             _Lock __lock_instance;
!             _Obj* __restrict__ __result = *__my_free_list;
!             if (__result == 0)
!               __ret = _S_refill(_S_round_up(__n));
!             else
!               {
!                 *__my_free_list = __result -> _M_free_list_link;
!                 __ret = __result;
!               }
!           }
!         return __ret;
!       };
  
        // __p may not be 0
        static void
        deallocate(void* __p, size_t __n)
        {
!         if (__n > (size_t) _MAX_BYTES)
!           __mem_interface::deallocate(__p, __n);
!         else
!           {
!             _Obj* volatile*  __my_free_list = _S_free_list
!               + _S_freelist_index(__n);
!             _Obj* __q = (_Obj*)__p;
! 
!             // Acquire the lock here with a constructor call.  This
!             // ensures that it is released in exit or during stack
!             // unwinding.
!             _Lock __lock_instance;
!             __q -> _M_free_list_link = *__my_free_list;
!             *__my_free_list = __q;
!           }
        }
  
  #ifdef _GLIBCPP_DEPRECATED
--- 383,451 ----
        } __attribute__ ((__unused__));
        friend struct _Lock;
  
+       static _Atomic_word _S_force_new;
+ 
      public:
        // __n must be > 0
        static void*
        allocate(size_t __n)
        {
! 	void* __ret = 0;
  
! 	// If there is a race through here, assume answer from getenv
! 	// will resolve in same direction.  Inspired by techniques
! 	// to efficiently support threading found in basic_string.h.
! 	if (_S_force_new == 0)
! 	  {
! 	    if (getenv("GLIBCPP_FORCE_NEW"))
! 	      __atomic_add(&_S_force_new, 1);
! 	    else
! 	      __atomic_add(&_S_force_new, -1);
! 	    // Trust but verify...
! 	    assert (_S_force_new != 0);
! 	  }
! 
! 	if ((__n > (size_t) _MAX_BYTES) || (_S_force_new > 0))
! 	  __ret = __new_alloc::allocate(__n);
! 	else
! 	  {
! 	    _Obj* volatile* __my_free_list = _S_free_list
! 	      + _S_freelist_index(__n);
! 	    // Acquire the lock here with a constructor call.  This
! 	    // ensures that it is released in exit or during stack
! 	    // unwinding.
! 	    _Lock __lock_instance;
! 	    _Obj* __restrict__ __result = *__my_free_list;
! 	    if (__result == 0)
! 	      __ret = _S_refill(_S_round_up(__n));
! 	    else
! 	      {
! 		*__my_free_list = __result -> _M_free_list_link;
! 		__ret = __result;
! 	      }
! 	  }
! 	return __ret;
!       }
  
        // __p may not be 0
        static void
        deallocate(void* __p, size_t __n)
        {
! 	if ((__n > (size_t) _MAX_BYTES) || (_S_force_new > 0))
! 	  __new_alloc::deallocate(__p, __n);
! 	else
! 	  {
! 	    _Obj* volatile*  __my_free_list = _S_free_list
! 	      + _S_freelist_index(__n);
! 	    _Obj* __q = (_Obj*)__p;
! 
! 	    // Acquire the lock here with a constructor call.  This
! 	    // ensures that it is released in exit or during stack
! 	    // unwinding.
! 	    _Lock __lock_instance;
! 	    __q -> _M_free_list_link = *__my_free_list;
! 	    *__my_free_list = __q;
! 	  }
        }
  
  #ifdef _GLIBCPP_DEPRECATED
***************
*** 450,455 ****
--- 454,461 ----
  #endif
      };
  
+   template<bool __threads, int __inst> _Atomic_word
+   __default_alloc_template<__threads, __inst>::_S_force_new = 0;
  
    template<bool __threads, int __inst>
      inline bool
***************
*** 465,472 ****
  
  
    // We allocate memory in large chunks in order to avoid fragmenting the
!   // malloc heap (or whatever __mem_interface is using) too much.  We assume
!   // that __size is properly aligned.  We hold the allocation lock.
    template<bool __threads, int __inst>
      char*
      __default_alloc_template<__threads, __inst>::
--- 471,478 ----
  
  
    // We allocate memory in large chunks in order to avoid fragmenting the
!   // heap too much.  We assume that __size is properly aligned.  We hold
!   // the allocation lock.
    template<bool __threads, int __inst>
      char*
      __default_alloc_template<__threads, __inst>::
***************
*** 503,509 ****
                ((_Obj*)_S_start_free) -> _M_free_list_link = *__my_free_list;
                *__my_free_list = (_Obj*)_S_start_free;
              }
!           _S_start_free = (char*) __mem_interface::allocate(__bytes_to_get);
            if (0 == _S_start_free)
              {
                size_t __i;
--- 509,515 ----
                ((_Obj*)_S_start_free) -> _M_free_list_link = *__my_free_list;
                *__my_free_list = (_Obj*)_S_start_free;
              }
!           _S_start_free = (char*) __new_alloc::allocate(__bytes_to_get);
            if (0 == _S_start_free)
              {
                size_t __i;
***************
*** 528,534 ****
                      }
                  }
                _S_end_free = 0;        // In case of exception.
!               _S_start_free = (char*)__mem_interface::allocate(__bytes_to_get);
                // This should either throw an exception or remedy the situation.
                // Thus we assume it succeeded.
              }
--- 534,540 ----
                      }
                  }
                _S_end_free = 0;        // In case of exception.
!               _S_start_free = (char*)__new_alloc::allocate(__bytes_to_get);
                // This should either throw an exception or remedy the situation.
                // Thus we assume it succeeded.
              }
***************
*** 618,624 ****
  
    typedef __default_alloc_template<true,0>    __alloc;
    typedef __default_alloc_template<false,0>   __single_client_alloc;
- #endif /* ! __USE_MALLOC */
  
  
    /**
--- 624,629 ----
***************
*** 628,637 ****
     *  of stl_alloc.h.)
     *
     *  The underlying allocator behaves as follows.
-    *  - if __USE_MALLOC then
-    *    - thread safety depends on malloc and is entirely out of our hands
-    *    - __malloc_alloc_template is used for memory requests
-    *  - else (the default)
     *    - __default_alloc_template is used via two typedefs
     *    - "__single_client_alloc" typedef does no locking for threads
     *    - "__alloc" typedef is threadsafe via the locks
--- 633,638 ----
***************
*** 908,914 ****
        typedef __allocator<_Tp, __malloc_alloc_template<__inst> > allocator_type;
      };
  
- #ifndef __USE_MALLOC
    template<typename _Tp, bool __threads, int __inst>
      struct _Alloc_traits<_Tp, __default_alloc_template<__threads, __inst> >
      {
--- 909,914 ----
***************
*** 918,924 ****
        typedef __allocator<_Tp, __default_alloc_template<__threads, __inst> >
        allocator_type;
      };
- #endif
  
    template<typename _Tp, typename _Alloc>
      struct _Alloc_traits<_Tp, __debug_alloc<_Alloc> >
--- 918,923 ----
***************
*** 941,947 ****
        typedef __allocator<_Tp, __malloc_alloc_template<__inst> > allocator_type;
      };
  
- #ifndef __USE_MALLOC
    template<typename _Tp, typename _Tp1, bool __thr, int __inst>
      struct _Alloc_traits<_Tp, __allocator<_Tp1, __default_alloc_template<__thr, __inst> > >
      {
--- 940,945 ----
***************
*** 951,957 ****
        typedef __allocator<_Tp, __default_alloc_template<__thr,__inst> >
        allocator_type;
      };
- #endif
  
    template<typename _Tp, typename _Tp1, typename _Alloc>
      struct _Alloc_traits<_Tp, __allocator<_Tp1, __debug_alloc<_Alloc> > >
--- 949,954 ----
***************
*** 967,977 ****
    // NB: This syntax is a GNU extension.
    extern template class allocator<char>;
    extern template class allocator<wchar_t>;
- #ifdef __USE_MALLOC
-   extern template class __malloc_alloc_template<0>;
- #else
    extern template class __default_alloc_template<true,0>;
- #endif
  } // namespace std
  
  #endif
--- 964,970 ----
Index: src/stl-inst.cc
===================================================================
RCS file: /cvs/gcc/gcc/libstdc++-v3/src/stl-inst.cc,v
retrieving revision 1.14
diff -c -r1.14 stl-inst.cc
*** src/stl-inst.cc	17 Apr 2002 06:20:19 -0000	1.14
--- src/stl-inst.cc	17 Oct 2002 23:55:46 -0000
***************
*** 39,47 ****
    template class allocator<char>;
    template class allocator<wchar_t>;
  
- #ifdef __USE_MALLOC
-   template class __malloc_alloc_template<0>;
- #else
    template class __default_alloc_template<true, 0>;
- #endif
  } // namespace std
--- 39,43 ----
Index: testsuite/21_strings/capacity.cc
===================================================================
RCS file: /cvs/gcc/gcc/libstdc++-v3/testsuite/21_strings/capacity.cc,v
retrieving revision 1.9
diff -c -r1.9 capacity.cc
*** testsuite/21_strings/capacity.cc	31 Jul 2002 02:47:34 -0000	1.9
--- testsuite/21_strings/capacity.cc	17 Oct 2002 23:55:46 -0000
***************
*** 209,215 ****
    sz02 = str011.length();
    VERIFY( sz02 > sz01 );
      
!   // trickster allocator (__USE_MALLOC, luke) issues involved with these:
    std::string str3 = "8-chars_8-chars_";
    const char* p3 = str3.c_str();
    std::string str4 = str3 + "7-chars";
--- 209,215 ----
    sz02 = str011.length();
    VERIFY( sz02 > sz01 );
      
!   // trickster allocator issues involved with these:
    std::string str3 = "8-chars_8-chars_";
    const char* p3 = str3.c_str();
    std::string str4 = str3 + "7-chars";
Index: testsuite/ext/allocators.cc
===================================================================
RCS file: /cvs/gcc/gcc/libstdc++-v3/testsuite/ext/allocators.cc,v
retrieving revision 1.1
diff -c -r1.1 allocators.cc
*** testsuite/ext/allocators.cc	11 Dec 2001 19:04:58 -0000	1.1
--- testsuite/ext/allocators.cc	17 Oct 2002 23:55:46 -0000
***************
*** 20,26 ****
  
  // 20.4.1.1 allocator members
  
- #undef __USE_MALLOC
  #include <memory>
  #include <cstdlib>
  #include <testsuite_hooks.h>
--- 20,25 ----


