This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



Re: SIGSEGV: exceptions with thread specific data


At 23:47 1999.03.15 -0500, you wrote:
>On Fri, 12 Mar 1999 15:36:31 +0300 (MEST), Andrey Slepuhin wrote:
>>Hi,
>>
>>I have the same bug in the following configuration:
>>linux-2.2.1
>>egcs-1.1.1
>>binutils-2.9.1.0.19a
>>glibc-2.0.7-29 (from RH-5.2).
>>
>>I modified original t.cc file in the following way:
>[...]
>
>I'm afraid the malloc hooks are different in 2.1 and so your code
>won't compile, and I don't have time right now to fix it up.  There is
>some automatic malloc checking which can be enabled by setting
>MALLOC_CHECK_ in the environment, doing that I got a crash from inside
>the eh functions, only.
>
>So I don't think the bug you see is the same one I see.  I still think
>there is a memory corruption bug in egcs' EH routines.
>

I also think that the bug is in glibc, not egcs, so I have not posted my view
to this list until now (I sent it only to the author of the initial post). I
was able to reproduce the bug even without exceptions, by replacing the
throw-catch with several malloc-free pairs.
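
For illustration, an exception-free variant of the thread body might look
roughly like the sketch below. The exact code used is not shown in this
message, so the number of allocations and their sizes are assumptions, and
malloc_thr and run_variant are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t key;

/* Same shape as main_thr in t.cc, but the throw/catch is replaced by a
   few malloc/free pairs (count and sizes are assumptions). */
static void *malloc_thr(void *arg)
{
    (void)arg;
    pthread_setspecific(key, "asd");
    for (int i = 0; i < 4; i++) {
        void *p = malloc(16 * (size_t)(i + 1));
        assert(p != NULL);
        free(p);
    }
    pthread_setspecific(key, NULL);
    return NULL;
}

/* Create the key, run one thread, delete the key.
   Returns 0 if every pthread call reports success. */
int run_variant(void)
{
    pthread_t t_id;

    if (pthread_key_create(&key, NULL) != 0)
        return -1;
    if (pthread_create(&t_id, NULL, malloc_thr, NULL) != 0)
        return -1;
    if (pthread_join(t_id, NULL) != 0)
        return -1;
    if (pthread_key_delete(key) != 0)
        return -1;
    return 0;
}
```

Calling run_variant() from main() and then exit(0) follows the same sequence
as t.cc. On a library without the problem it simply returns 0.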

But since this thread is still alive, here is my theory:

(I also sent this to the author of LinuxThreads, Xavier Leroy, but have not
received an answer yet.)

==============================================

At 19:11 1999.03.10 +0100, you wrote:
>I would like to report a problem (possible bug) in exception
>code, but it could also be a problem in glibc-2.0.7 or linuxthreads.
>
>The problem is that when working with threads with threadspecific data
>and mixing it little bit with exception handling the program SIGSEGV
>in exit() function (when trying to deallocate something).
>
>I am running RedHat-5.2 with egcs-1.0.3 and glibc-2.0.7,
>linuxthreads-0.7
>
>
>Please let me know if this problem exists on glibc-2.1.X or egcs-1.1.X
>or if it is a bug of egcs or glibc or linuxthreads
>
>Please reply personally as I don't get this mailing list
>
>Thanks in advance
>Pavel Krauz <kra@cri.cz>
>
>
>
>Here is the bug reproduction:
>
>Makefile:
>-----------------------------------------
>
>LDFLAGS=
>LDLIB=-lpthread -lstdc++
>CFLAGS=-D_REENTRANT
>CXX=egcs
>
>all: t
>
>
>.cc.o:
>        $(CXX) ${CFLAGS} -c -o $@ $<
>
>t: t.o
>        $(CXX) -o $@ $< -L.. $(LDLIB)
>
>-----------------------------------------
>t.cc
>-----------------------------------------
>#include <pthread.h>
>#include <stdio.h>
>#include <stdlib.h>
>
>pthread_key_t key;
>
>void *main_thr(void *)
>{
>        int i;
>
>        // here linuxthreads allocate through calloc thread specific data
>        pthread_setspecific(key, "asd");// try move after catch
>
>        try {
>                throw 1;                // try disable
>        }
>        catch (int i) {
>                printf("caught\n");
>        }
>        pthread_setspecific(key, NULL);
>        return NULL;
>}
>
>int main()
>{
>        pthread_t t_id;
>
>        pthread_key_create(&key, NULL);
>
>        pthread_create(&t_id, NULL, main_thr, NULL);
>        pthread_join(t_id, NULL);
>        pthread_key_delete(key);        // try disable
>
>        exit(0);
>}
>
>

I tried it and successfully reproduced the bug. Exception handling was of
course the first suspect, because it is inherently buggy, but the actual
cause of the SIGSEGV is heap corruption. After a lot of work (this was the
first time I debugged a multithreaded program with DDD) I found the following:

pthread_key_delete() does this:

  do {
    if (th->p_specific[idx1st] != NULL)
      th->p_specific[idx1st][idx2nd] = NULL;
    th = th->p_nextlive;
  } while (th != self);


The chain of threads linked by p_nextlive is maintained by the manager thread,
which does so without any locking or synchronization with the other threads.
So when a thread exits, its p_specific pointers are freed, but when the main
thread enters pthread_key_delete, that already terminated thread has not yet
been removed from the chain, and pthread_key_delete, in the line

th->p_specific[idx1st][idx2nd] = NULL

writes NULL into the freed memory area and so corrupts the heap, because the
heap routines keep some information at the beginning of freed memory blocks
(pointers to the previous and next free blocks).
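
The sequence described above can be replayed in a small single-threaded
model. This is only a sketch: struct model_thr and the functions below are
hypothetical stand-ins, not LinuxThreads code, the two-level p_specific table
is collapsed to a single block, and the "free" is simulated by a flag so that
the model itself never touches freed memory.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 4   /* size of one second-level block (arbitrary here) */

/* Hypothetical stand-in for a thread descriptor. */
struct model_thr {
    void **p_specific;             /* a single first-level slot */
    int block_freed;               /* model bookkeeping only */
    struct model_thr *p_nextlive;  /* circular chain of "live" threads */
};

/* Thread exit as described above: the block is released, but the
   descriptor stays in the chain and keeps the stale pointer. */
static void model_thread_exit(struct model_thr *t)
{
    t->block_freed = 1;            /* stands in for free(t->p_specific) */
}

/* The walk from pthread_key_delete. Stores that would land in freed
   memory are counted instead of performed; the count is returned. */
static int model_key_delete(struct model_thr *self, int idx2nd)
{
    int bad_stores = 0;
    struct model_thr *th = self;

    assert(idx2nd >= 0 && idx2nd < SLOTS);
    do {
        if (th->p_specific != NULL) {
            if (th->block_freed)
                bad_stores++;      /* the real code would corrupt the heap */
            else
                th->p_specific[idx2nd] = NULL;
        }
        th = th->p_nextlive;
    } while (th != self);
    return bad_stores;
}

/* Main thread (no thread-specific data) plus one worker that has
   already exited but is still linked into the chain. */
int simulate_race(void)
{
    void *worker_block[SLOTS] = { NULL };
    struct model_thr main_t = { NULL, 0, NULL };
    struct model_thr worker = { worker_block, 0, NULL };

    main_t.p_nextlive = &worker;
    worker.p_nextlive = &main_t;

    model_thread_exit(&worker);           /* worker terminates first ... */
    return model_key_delete(&main_t, 0);  /* ... then main deletes the key */
}
```

In this model simulate_race() returns 1: the single store aimed at the
terminated worker's block is the one that would overwrite the allocator's
free-list pointers in the real program.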


I grepped all the linuxthreads sources and found that pthread_key_delete is
the only function (besides the thread manager code) that uses p_nextlive. I
suppose that people rarely (if ever) call this function, since thread-specific
keys are usually allocated when the program starts and never deleted, and
that is why this obvious design bug has gone unnoticed so far.
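
To show what the missing synchronization would have to guarantee, here is a
sketch in which unlinking a descriptor and walking the chain take the same
lock, and a descriptor is unlinked before its block is freed. It is one
possible direction under stated assumptions, not a description of how
LinuxThreads works or was changed: struct guarded_thr, chain_lock and the
functions are hypothetical, and a doubly linked chain is assumed for
simplicity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define SLOTS 4   /* size of one thread-specific block (arbitrary here) */

/* Hypothetical descriptor with a doubly linked circular chain. */
struct guarded_thr {
    void **p_specific;
    struct guarded_thr *next, *prev;
};

static pthread_mutex_t chain_lock = PTHREAD_MUTEX_INITIALIZER;

/* Exit path: unlink under the lock first, free afterwards, so a walker
   can never reach a descriptor whose block is already gone. */
static void guarded_exit(struct guarded_thr *t)
{
    pthread_mutex_lock(&chain_lock);
    t->prev->next = t->next;
    t->next->prev = t->prev;
    t->next = t->prev = t;
    pthread_mutex_unlock(&chain_lock);

    free(t->p_specific);
    t->p_specific = NULL;
}

/* Key deletion: the whole walk holds the lock.
   Returns the number of slots that were cleared. */
static int guarded_key_delete(struct guarded_thr *self, int idx)
{
    int cleared = 0;
    struct guarded_thr *th = self;

    assert(idx >= 0 && idx < SLOTS);
    pthread_mutex_lock(&chain_lock);
    do {
        if (th->p_specific != NULL) {
            th->p_specific[idx] = NULL;
            cleared++;
        }
        th = th->next;
    } while (th != self);
    pthread_mutex_unlock(&chain_lock);
    return cleared;
}

/* Same scenario as the reported crash: the worker exits, then the main
   thread deletes the key. Returns the number of slots cleared, or -1 if
   the allocation fails. */
int guarded_demo(void)
{
    struct guarded_thr main_t = { NULL, NULL, NULL };
    struct guarded_thr worker = { NULL, NULL, NULL };

    worker.p_specific = calloc(SLOTS, sizeof(void *));
    if (worker.p_specific == NULL)
        return -1;
    main_t.next = main_t.prev = &worker;
    worker.next = worker.prev = &main_t;

    guarded_exit(&worker);
    return guarded_key_delete(&main_t, 0);
}
```

Here the walk only ever sees the main thread, which has no thread-specific
block, so nothing is written and guarded_demo() returns 0.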

=================================


Rimantas Plaipa,

Department of Biochemistry and Biophysics,
Vilnius University,
Ciurlionio 21/27, Vilnius 2009, Lithuania.

E-mail: rp010gf@voruta.vu.lt
Phone: (370-2)-650381 
Fax: (370-2)-235049


