GCC 3.0 and 3.0.3 bug of string implementation and the fix

Mike Lu MLu@dynamicsoft.com
Tue Jan 15 15:44:00 GMT 2002


     We have encountered a serious bug in multi-threaded code that uses
string with GCC 3.0 and GCC 3.0.3 on a dual-CPU sparc-solaris-2.8 system.
The problem can be easily reproduced using the attached program, which was
contributed by Jason Beardsley (jbeardsley@origin.ea.com) to this mail
archive under the subject "solaris 2.6, threads, and strings", dated
Feb. 11, 1999.

 Here is our test case:
    1.  We use ptmalloc, from Wolfram Gloger (wg@malloc.de), as our memory
allocator. Other allocators have the same problem, but with ptmalloc it is
easier to reproduce. All the scenarios reported by Jason seem to appear
again in GCC 3.0 and GCC 3.0.3.

    2. We have made some changes to Jason's code and used a debug STL
allocator from Austern's column in C++ Journal (we added some debug
printouts; the modified source code is also attached).

    3. We added some debug printouts in
".../include/g++-v3/bits/basic_string.h".

          basic_string.h::_Rep::_M_dispose() and
basic_string.h::_Rep::_M_refcopy() have been changed:

     void
     _M_dispose(const _Alloc& __a)
     {
       size_type __size = sizeof(_Rep) + (_M_capacity + 1) * sizeof(_CharT);

       // add following debug statement
#ifdef STL_STRING_DEBUG
       printf("basic_string::_Rep::_M_dispose called from thread %d with "
              "_M_references value: %d, with address: 0x%x, size: %d\n\n",
              pthread_self(), _M_references,
              reinterpret_cast<unsigned>(this), __size);
#endif

       if (__exchange_and_add(&_M_references, -1) <= 0)
         _M_destroy(__a);
     }

     _CharT*
     _M_refcopy() throw()
     {
       // add following debug printout
#ifdef STL_STRING_DEBUG
       printf("basic_string::_Rep::_M_refcopy called from thread %d with "
              "_M_references value: %d, with address: 0x%x\n\n",
              pthread_self(), _M_references,
              reinterpret_cast<unsigned>(this));
#endif
       __atomic_add(&_M_references, 1);
       return _M_refdata();
     }  // XXX MT


If compiled without the debug option, the program almost always dumps core
within the first few thousand iterations. If compiled with the debug
option, the program may finish 1 million iterations without any problem,
but one failing run produced the following output:

basic_string::_Rep::_M_refcopy called from thread 4 with _M_references
value: 0, with address: 0x28f20

basic_string::_Rep::_M_dispose called from thread 4 with _M_references
value: 1, with address: 0x28f20, size: 24

Allocating memory of size 24 bytes at 0x28f00

basic_string::_Rep::_M_refcopy called from thread 5 with _M_references
value: 1, with address: 0x28f20

basic_string::_Rep::_M_dispose called from thread 5 with _M_references
value: 1, with address: 0x28f20, size: 24

basic_string::_Rep::_M_dispose called from thread 5 with _M_references
value: 0, with address: 0x28f20, size: 24

Deallocating memory at 0x28f20
After Deallocated memory at 0x28f20

basic_string::_Rep::_M_refcopy called from thread 4 with _M_references
value: 0, with address: 0x28f00

basic_string::_Rep::_M_dispose called from thread 4 with _M_references
value: 1, with address: 0x28f00, size: 24

Allocating memory of size 24 bytes at 0x28f20

basic_string::_Rep::_M_refcopy called from thread 5 with _M_references
value: 1, with address: 0x28f00

basic_string::_Rep::_M_dispose called from thread 5 with _M_references
value: 1, with address: 0x28f00, size: 24

basic_string::_Rep::_M_dispose called from thread 5 with _M_references
value: 0, with address: 0x28f00, size: 24

Deallocating memory at 0x28f00
After Deallocated memory at 0x28f00

basic_string::_Rep::_M_refcopy called from thread 4 with _M_references
value: 0, with address: 0x28f20

basic_string::_Rep::_M_dispose called from thread 4 with _M_references
value: 1, with address: 0x28f20, size: 24

Allocating memory of size 24 bytes at 0x28f00

basic_string::_Rep::_M_refcopy called from thread 5 with _M_references
value: 1, with address: 0x28f20

basic_string::_Rep::_M_dispose called from thread 5 with _M_references
value: 1, with address: 0x28f20, size: 24

basic_string::_Rep::_M_dispose called from thread 5 with _M_references
value: 0, with address: 0x28f20, size: 24

Deallocating memory at 0x28f20
After Deallocated memory at 0x28f20

basic_string::_Rep::_M_refcopy called from thread 4 with _M_references
value: 0, with address: 0x28f00

basic_string::_Rep::_M_dispose called from thread 4 with _M_references
value: 1, with address: 0x28f00, size: 24

basic_string::_Rep::_M_refcopy called from thread 5 with _M_references
value: 1, with address: 0x28f00

basic_string::_Rep::_M_dispose called from thread 5 with _M_references
value: 2, with address: 0x28f00, size: 24

Allocating memory of size 24 bytes at 0x28f20

Deallocating memory at 0x28f00
After Deallocated memory at 0x28f00

basic_string::_Rep::_M_dispose called from thread 5 with _M_references
value: -1, with address: 0x28f00, size: 160893

Deallocating memory at 0x28f00
After Deallocated memory at 0x28f00

basic_string::_Rep::_M_refcopy called from thread 4 with _M_references
value: 0, with address: 0x28f20

basic_string::_Rep::_M_dispose called from thread 4 with _M_references
value: 1, with address: 0x28f20, size: 24

Allocating memory of size 24 bytes at 0x28ef0

basic_string::_Rep::_M_refcopy called from thread 5 with _M_references
value: 1, with address: 0x28f20

Segmentation Fault (core dumped)
mlu@ua-01{251}

    Something is definitely wrong: "_M_references" became "-1" before
"basic_string::_Rep::_M_dispose()" was called, and the size of the memory
chunk became 160893. The address 0x28f00 was freed twice, and then the
core dump happened.

    We have tested the same program with "basic_string" replaced by another
class. Everything worked fine, so we don't think it is an error in the test
program or the STL container.
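
    To make the expected protocol concrete, here is a minimal sketch of the
reference counting that the log above implies: the count starts at 0 for a
single owner, _M_refcopy adds 1, and _M_dispose subtracts 1 and destroys the
block only when the previous value was <= 0. The sketch uses C++11
std::atomic and std::thread, which postdate GCC 3.0, and the names Rep,
refcopy, dispose and run_refcount_demo are hypothetical, not libstdc++ code.
With a correct atomic read-modify-write, the destroy step should run exactly
once no matter how the threads interleave.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for basic_string::_Rep: only the reference count
// and a counter of how many times the block was "destroyed".
struct Rep {
  std::atomic<int> references{0};    // 0 means exactly one owner
  std::atomic<int> destroy_calls{0};

  // Like _M_refcopy(): take one more reference.
  void refcopy() { references.fetch_add(1); }

  // Like _M_dispose(): drop one reference. fetch_add returns the value
  // held before the decrement, so "<= 0" means we were the last owner.
  void dispose() {
    if (references.fetch_add(-1) <= 0)
      destroy_calls.fetch_add(1);    // the real code would free memory here
  }
};

// Runs `threads` workers that each copy and release the shared Rep
// `iterations` times, then releases the original owner's reference.
// Returns how many times the destroy step ran.
int run_refcount_demo(int threads, int iterations) {
  assert(threads > 0 && iterations > 0);
  Rep rep;
  std::vector<std::thread> workers;
  for (int t = 0; t < threads; ++t) {
    workers.emplace_back([&rep, iterations] {
      for (int i = 0; i < iterations; ++i) {
        rep.refcopy();
        rep.dispose();
      }
    });
  }
  for (auto& w : workers) w.join();
  rep.dispose();                     // the original owner lets go last
  return rep.destroy_calls.load();
}
```

Calling run_refcount_demo(2, 1000000) from main roughly mirrors the
two-thread, one-million-iteration run; a result other than 1 would
correspond to the kind of double free seen in the log.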


The fix:
     The only possible problem we can think of is the atomic operation on
"_M_references".  Unfortunately, we have no knowledge of the SPARC assembly
language code in
".../include/g++-v3/sparc-solaris-2.8/bits/atomicity.h",
so we changed the code of "basic_string.h::_Rep::_M_dispose()" and
"basic_string.h::_Rep::_M_refcopy()" to use the atomic library available
from Mike Bennett (mbennett@netcom.com):
   //add #include<libatom.h> in basic_string.h
 void
 _M_dispose(const _Alloc& __a)
 {
   size_type __size = sizeof(_Rep) + (_M_capacity + 1) * sizeof(_CharT);

#ifdef STL_STRING_DEBUG
   printf("basic_string::_Rep::_M_dispose called from thread %d with "
          "_M_references value: %d, with address: 0x%x, size: %d\n\n",
          pthread_self(), _M_references, reinterpret_cast<unsigned>(this),
          __size);
#endif

   //  if (__exchange_and_add(&_M_references, -1) <= 0)
   //  _M_destroy(__a);

   //if(atomicIncrement32_new(&_M_references, -1) == -1)
   short ret = atomicIncrement32_new(&_M_references, -1);

   if (ret == -1)
     _M_destroy(__a);
   else if(ret < -1)
     __throw_out_of_range("basic_string::_Rep::_M_references");
 }  // XXX MT

 void
 _M_destroy(const _Alloc&) throw();

 _CharT*
 _M_refcopy() throw()
 {
#ifdef STL_STRING_DEBUG
   printf("basic_string::_Rep::_M_refcopy called from thread %d with "
          "_M_references value: %d, with address: 0x%x\n\n",
          pthread_self(), _M_references, reinterpret_cast<unsigned>(this));
#endif
   // __atomic_add(&_M_references, 1);
   atomicIncrement32_new(&_M_references, 1);
   return _M_refdata();
 }  // XXX MT

After making these changes, we tested with different compiler options,
with different memory allocators, and with different STL allocators. The
core dump has never happened again and no exception is thrown.
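
The two tests, "__exchange_and_add(&_M_references, -1) <= 0" and
"atomicIncrement32_new(&_M_references, -1) == -1", are meant to detect the
same event. The sketch below illustrates why, assuming (from its name and
from how the patch uses it, not from the libatom documentation) that
atomicIncrement32_new returns the value after the update, whereas
__exchange_and_add returns the value before it. All helpers here are
hypothetical stand-ins built on C++11 std::atomic, not the real functions.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for __exchange_and_add: returns the value BEFORE adding.
inline int exchange_and_add(std::atomic<int>& word, int delta) {
  return word.fetch_add(delta);
}

// Stand-in for atomicIncrement32_new, ASSUMED to return the value AFTER
// adding (an assumption based on the "_new" suffix and the patch above).
inline int add_and_return_new(std::atomic<int>& word, int delta) {
  return word.fetch_add(delta) + delta;
}

// Original test: destroy when the previous count was <= 0.
inline bool last_owner_old_style(std::atomic<int>& refs) {
  return exchange_and_add(refs, -1) <= 0;
}

// Patched test: destroy when the new count is exactly -1.
inline bool last_owner_new_style(std::atomic<int>& refs) {
  return add_and_return_new(refs, -1) == -1;
}

// True if both styles reach the same decision for a count that starts
// at `start`.
inline bool styles_agree(int start) {
  std::atomic<int> a(start), b(start);
  return last_owner_old_style(a) == last_owner_new_style(b);
}
```

For starting counts of 0 and above the two styles agree. They differ only
when the count is already negative: the original test destroys again, while
the patched code takes the "ret < -1" branch and throws instead.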


Can somebody verify the problem and provide a fix without libatom from Mike
Bennett?


Thanks,

--Mike Lu (mlu@dynamicsoft.com)


Cut here for TestMalloc.cpp

// foo.cpp - producer/consumer thread test
//
// compile with: g++ -ggdb -o foo foo.cpp -lpthread -lthread
//
// two cases:
//  1. use_malloc   => chews up memory (leak? fragmentation?)
//  2. stl_pthreads => dumps core
//
// case 2: can prevent core dump by doing one of the following:
// - in produce: remove stmt A, comment stmt B, uncomment stmt B.1
// - in consume: remove stmt C
//
// case 1: can prevent memory explosion by doing the same as above

// choose one: malloc-based allocator, or the default with thread safety
//#define __STL_PTHREADS
#define __USE_MALLOC

#ifndef _REENTRANT
#define _REENTRANT
#endif

#include <iostream>

#include <pthread.h>
#include <thread.h>
#include <stdio.h>
#include <stdlib.h>  // atoi, rand
#include "src/malloc_allocator.h"
using namespace std;

#include <string>
#include <list>
#include <vector>

// globals

static list<basic_string<char, char_traits<char>, malloc_allocator<char> > >
foo;
//static vector<string> foo;
static pthread_mutex_t fooLock = PTHREAD_MUTEX_INITIALIZER;
static unsigned max_size = 10;
static int iters = 1000000;

// producer thread

void* produce(void*)
{
  int num = 0;
  while ((iters == 0) || (num < iters))
  {
//	pthread_mutex_lock(&fooLock);
//	{
    // create a new string
    basic_string<char, char_traits<char>, malloc_allocator<char> >
        str("test string"); // stmt A
    // add it to the list, unless the list is already too big
    bool added = false;
   	pthread_mutex_lock(&fooLock);
   if (foo.size() < max_size)
 //   if (foo.size() < iters)
    {
     foo.push_back(str); // stmt B
     // foo.push_back("TEST STRING"); // stmt B.1
      added = true;
    }
    // print heartbeat if it was added
    if (added)
    {
      num++;
      if ((num % 1000) == 0)
      {
        putc('p', stdout);
        fflush(stdout);
      }
    }

	pthread_mutex_unlock(&fooLock);
    // force a yield
	if(0 == rand() % 7)
		thr_yield();
	}
  //  pthread_mutex_unlock(&fooLock);
  //}
  return 0;
}

// consumer thread

void* consume(void*)
{
  int num = 0;
  while ((iters == 0) || (num < iters))
  {
    // consume entire list
    pthread_mutex_lock(&fooLock);
    while (foo.size() > 0)
    {
      // get string
//	 string str = foo.front(); // stmt C
//     foo.pop_front();
       basic_string<char, char_traits<char>, malloc_allocator<char> > str =
foo.back(); // stmt C
	   foo.pop_back();
      // print heartbeat
      num++;
      if ((num % 1000) == 0)
      {
        putc('c', stdout);
        fflush(stdout);
      }
    }
    pthread_mutex_unlock(&fooLock);
    // force a yield

	if(0 == rand() % 5)
		thr_yield();
  }
  return 0;
}

//
// main()
//
int main(int argc, char** argv)
{
  // if provided, get number of iterations
  if (argc > 1)
  {
    iters = atoi(argv[1]);
  }

  if (iters != 0)
  {
    printf("iterations: %d\n", iters);
  }
  else
  {
    printf("iterations: infinite\n");
  }

  // two threads
  thr_setconcurrency(2);

  // create and start threads
  pthread_t prod;
  pthread_create(&prod, NULL, produce, NULL);
  pthread_t cons;
  pthread_create(&cons, NULL, consume, NULL);

  // wait for them to exit
  pthread_join(prod, NULL);
  pthread_join(cons, NULL);

  printf("\nending list size = %d\n", foo.size());

  return 0;
}


//Cut here for malloc_allocator.h
/*
  malloc allocator as in Austern's column in C++ Journal
  The Standard Librarian: What are allocators good for?
  Dec 2000. 
  Allocator requirements are specified in Table 32 of the standard, section
20.1
*/

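#include <cstddef>   // std::size_t, std::ptrdiff_t
#include <cstdio>    // printf
#include <cstdlib>   // std::malloc, std::free
#include <iostream>  // std::cout
#include <new>       // std::bad_alloc, placement new
#include <pthread.h>
#include <thread.h>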
static pthread_mutex_t debugLock = PTHREAD_MUTEX_INITIALIZER;

template < class T > 
class malloc_allocator
{
 public:
  // standard typedefs required for STL containers
  typedef T value_type;
  typedef value_type* pointer;
  typedef const value_type* const_pointer;
  typedef value_type& reference;
  typedef const value_type& const_reference;
  // size of largest object in this allocation model
  typedef std::size_t size_type;
  // difference between any two pointers in this model
  typedef std::ptrdiff_t difference_type;

  // here, address is just a spelt out & !
 
  const_pointer address( const_reference s ) const { return &s; }
  pointer address( reference r ) const { return &r; }

  // the key function for the class.
  // allocates memory for n objects of this type T, but does not construct
  // them. The number of objects is the first argument; the second argument,
  // an address, is merely a locality hint that is ignored in this
  // implementation. The return value points to an adequately large and
  // correctly aligned block of memory. Note that the memory is
  // uninitialized.

  pointer allocate( size_type n, const_pointer = 0 ) {
    void* p = std::malloc( n * sizeof(T) );
    if( !p ) 
      throw std::bad_alloc();
    pthread_mutex_lock(&debugLock);
    //std::cout << "Allocating memory of size " << n * sizeof(T)
    //          << " bytes at " << p << std::endl;
    //printf("Allocating memory of size %d bytes at 0x%x from thread %d\n\n",
    //       n, (unsigned int)p, pthread_self());
    printf("Allocating memory of size %d bytes at 0x%x\n\n", n,
           (unsigned int)p);
    pthread_mutex_unlock(&debugLock);
    return static_cast< pointer >( p );
  }

  void deallocate( pointer p, size_type ) {
    pthread_mutex_lock(&debugLock);
    //std::cout << "Deallocating memory at " << p << endl;
    printf("Deallocating memory at 0x%x\n", (unsigned int)p );
    std::free(p);
    //std::cout << "Deallocated memory at " << static_cast<void*>(p) << endl;
    printf("After Deallocated memory at 0x%x\n\n",
           (unsigned) static_cast<void*>(p));
    //printf("After Deallocated memory at 0x%x from thread %d\n\n",
    //       (unsigned) static_cast<void*>(p), pthread_self() );
    pthread_mutex_unlock(&debugLock);
  }

  void construct( pointer p, const value_type& x ) {
    new(p) value_type(x);
    pthread_mutex_lock(&debugLock);
    std::cout << "Constructed object at " << p << std::endl;
    pthread_mutex_unlock(&debugLock);
  }
  
  void destroy( pointer p ) { 
        //pthread_mutex_lock(&debugLock);
	std::cout << "Destroyed object at " << p << std::endl; 
        //pthread_mutex_unlock(&debugLock);
	p -> ~value_type(); 
    }

  //constructors
  malloc_allocator() {} // nothing to do!
  malloc_allocator( const malloc_allocator& ) {} // nothing to do!
  ~malloc_allocator() {} // nothing to do!


  // the largest size that can be meaningfully passed to the allocate member
  // function
  size_type max_size() const { 
    return static_cast< size_type >(-1)/ sizeof(T);
  }

  // rebinding
  // an allocator for one type should have corresponding allocators of other
  // types
  template < class U >
    malloc_allocator( const malloc_allocator< U >& ) {}

  template < class U >
    struct rebind {
      typedef malloc_allocator< U > other;
    };
  
 private:
  // disable to prevent accidental use
  void operator=(const malloc_allocator& );
};

// in this implementation, any two allocator objects corresponding to a
// given type are interchangeable, i.e. memory allocated from one can be
// deallocated via the other.

template <class T>
inline bool operator==( const malloc_allocator<T>&, const
malloc_allocator<T>& ) {
  return true;
}
template < class T >
inline bool operator!=( const malloc_allocator<T>&, const
malloc_allocator<T>& ) {
  return false;
}


// template definition above uses sizeof(T) and T& which are illegal when T
// is void.
// specialization to void is needed.

template <>
class malloc_allocator< void >
{
 public:
  typedef void value_type;
  typedef value_type* pointer;
  typedef const value_type* const_pointer;

  template < class U >
    struct rebind {
      typedef malloc_allocator< U > other;
    };
};


//Cut here for Makefile
# Makefile for dstest on Solaris

all: dstest 

CC = /home/mlu/gcc/bin/c++  

#CC=/export/gcc3.0/usr/local/bin/c++

RM = rm -f
CFLAGS = -I/home/cppuabuild/prods/atomics_0.1/include -Wall -O2 -DSUNOS \
	-D_POSIX_THREADS -D_POSIX_THREAD_SAFE_FUNCTIONS -D_PTHREADS -D_REENTRANT

OBJS = TestMalloc.o

TestMalloc.o: TestMalloc.cpp
	$(CC) -c $(CFLAGS) $<

dstest:  $(OBJS)
	@rm -f dstest 
	$(CC) /home/cppuabuild/prods/ptmalloc/ptmalloc.o $(CFLAGS) $(OBJS) \
		-L/home/cppuabuild/prods/atomics_0.1/src -latom -lposix4 -lpthread \
		-lnsl -lsocket -lthread -o $@
#	$(CC) $(CFLAGS) $(OBJS) -L/home/cppuabuild/prods/atomics_0.1/src -latom -lposix4 -lpthread -lnsl -lsocket -lthread -o $@

clean:
	$(RM) $(OBJS) dstest 


