This is the mail archive of the gcc-prs@gcc.gnu.org mailing list for the GCC project.



Re: target/8004: All C++ binaries crash in __register_frame_info_bases on Sparc Solaris 2.7


The following reply was made to PR target/8004; it has been noted by GNATS.

From: "Aaron Williams" <aaron_williams@net.com>
To: davem@gcc.gnu.org, gcc-bugs@gcc.gnu.org, gcc-prs@gcc.gnu.org,
        nobody@gcc.gnu.org, gcc-gnats@gcc.gnu.org
Cc:  
Subject: Re: target/8004: All C++ binaries crash in __register_frame_info_bases
 on Sparc Solaris 2.7
Date: Tue, 01 Oct 2002 21:13:03 -0700

 I should have all the required patches installed.  I have Sun's patch 
 cluster as of 9/11 installed.  I believe this may be due to a bug in 
 ld.so.  I am attaching a copy of an email I received from someone else 
 who appears to have the same problem.  My current workaround is to use 
 Sun's /usr/ccs/bin/ld instead of the one from binutils 2.13.
 
 I am having other stability problems with gcc 3.2 on Solaris and will 
 likely go back to 2.95.3. Konqueror in KDE 3.0.3 and qt-3.0.5 compiled 
 with gcc 3.2 is unstable, for example.  I was hoping 3.2 would fix a 
 problem where I see static destructors being called in a shared library 
 when the shared library is no longer present (causing a crash in the 
 exit handler).  This too, unfortunately, sounds like it might be a Solaris 
 bug.
 
 -Aaron
 
 Email follows:
 
 Dear Aaron Williams,
 
 
 >> After searching the web regarding a problem I am having with GCC 3.2 on 
 >> Solaris I came across your bug report at :
 >> 
 >> http://www.geocrawler.com/lists/3/GNU/361/0/9566991/
 >> 
 >> I am experiencing exactly the same problem but with Solaris 2.7.  I was 
 >> wondering if you were successful in resolving this problem and if so how you 
 >> did it?
 >  
 >
 
 One of my colleagues, Christian Ehrhardt <ehrhardt@mathematik.uni-ulm.de>,
 analyzed this problem further and believes that it is a bug in
 ld.so.1. Here is his report:
 
 
 >> The dynamic runtime linker fails to relocate valid shared libraries
 >> generated by recent versions of GNU-ld. /usr/local/bin/ld is from
 >> the GNU binutils-2.13 package:
 >> 
 >>        turing$ /usr/local/bin/ld -v
 >>        GNU ld version 2.13
 >> 
 >> How to reproduce:
 >> 
 >> Script started on Fri Sep 20 19:46:43 2002
 >> turing$ cat t2.c
 >> struct object {
 >>         int i;
 >>         int j;
 >>         int k;
 >>         int l;
 >> };
 >> 
 >> 
 >> 
 >> int func ()
 >> {
 >>         static struct object x;
 >>         struct object * p;
 >>         p = &x;
 >>         p->i = 3;
 >>         return 0;
 >> }
 >> 
 >> turing$ cat t3.c
 >> extern int func();
 >> 
 >> int main ()
 >> {
 >>         func();
 >>         return 0;
 >> }
 >> turing$ cat Makefile.sun
 >> .PHONY: clean
 >> all:    a.out
 >> t2.o:   t2.c
 >>         CC  -c -KPIC t2.c
 >> libt2.so:       t2.o
 >>         /usr/local/bin/ld -G t2.o -olibt2.so
 >> t3.o:   t3.c
 >>         CC  -c t3.c
 >> a.out: libt2.so t3.o
 >>         CC  -lt2 t3.o -L. -R.
 >> clean:
 >>         rm -f *.so *.o a.out
 >> 
 >> turing$ cat Makefile
 >> .PHONY: clean
 >> all:    a.out
 >> t2.o:   t2.c
 >>         gcc -c -fPIC t2.c
 >> libt2.so: t2.o
 >>         /usr/local/bin/ld -nostdlib -shared -olibt2.so t2.o
 >> a.out: libt2.so t3.c
 >>         gcc -nostdlib t3.c libt2.so -L. -R. 
 >> clean:
 >>         rm -f *.so *.o a.out core
 >> 
 >> turing$ make -f Makefile.sun clean
 >> rm -f *.so *.o a.out
 >> turing$ make -f Makefile.sun 
 >> CC  -c -KPIC t2.c
 >> /usr/local/bin/ld -G t2.o -olibt2.so
 >> CC  -c t3.c
 >> CC  -lt2 t3.o -L. -R.
 >> turing$ a.out
 >> Segmentation Fault (core dumped)
 >> turing$ exit
 >> 
 >> script done on Fri Sep 20 19:47:32 2002
 >> 
 >> Note that I compiled everything with /opt/SUNCspro/bin/CC to
 >> rule out bugs in gcc. This problem can be reproduced using
 >> the second Makefile and gcc with an even smaller executable.
 >> 
 >> 
 >> Analyzing the core shows the following:
 >> turing$ pmap core | grep libt2.so
 >> FF370000      8K read/exec         libt2.so
 >> FF380000      8K read/write/exec   libt2.so
 >> 
 >> Script started on Fri Sep 20 19:53:10 2002
 >> turing$ gdb a.out core
 >> GNU gdb 5.0
 >> [ ... ]
 >> #0  0xff370318 in __1cEfunc6F_i_ ()
 >>    from /home/thales/ehrhardt/ld.so.1-bug/./libt2.so
 >> (gdb) disass
 >> Dump of assembler code for function __1cEfunc6F_i_:
 >> 0xff3702e0 <__1cEfunc6F_i_>:    save  %sp, -112, %sp
 >> 0xff3702e4 <__1cEfunc6F_i_+4>:  call  0xff3702ec <__1cEfunc6F_i_+12>
 >> 0xff3702e8 <__1cEfunc6F_i_+8>:  sethi  %hi(0), %o1
 >> 0xff3702ec <__1cEfunc6F_i_+12>: mov  %o1, %o1   ! 0x0
 >> 0xff3702f0 <__1cEfunc6F_i_+16>: add  %o7, %o1, %o1
 >> 0xff3702f4 <__1cEfunc6F_i_+20>: st  %o1, [ %fp + -12 ]
 >> 0xff3702f8 <__1cEfunc6F_i_+24>: sethi  %hi(0x10000), %o0
 >> 0xff3702fc <__1cEfunc6F_i_+28>: or  %o0, 0xc4, %o0      ! 0x100c4
 >> 0xff370300 <__1cEfunc6F_i_+32>: add  %o1, %o0, %l7
 >> 0xff370304 <__1cEfunc6F_i_+36>: sethi  %hi(0), %g1
 >> 0xff370308 <__1cEfunc6F_i_+40>: or  %g1, 4, %g1 ! 0x4
 >> 0xff37030c <__1cEfunc6F_i_+44>: ld  [ %l7 + %g1 ], %o0
 >> 0xff370310 <__1cEfunc6F_i_+48>: st  %o0, [ %fp + -8 ]
 >> 0xff370314 <__1cEfunc6F_i_+52>: mov  3, %o1
 >> 0xff370318 <__1cEfunc6F_i_+56>: st  %o1, [ %o0 ]
 >> 0xff37031c <__1cEfunc6F_i_+60>: clr  [ %fp + -4 ]
 >> 0xff370320 <__1cEfunc6F_i_+64>: mov  %g0, %i0
 >> 0xff370324 <__1cEfunc6F_i_+68>: ret 
 >> 0xff370328 <__1cEfunc6F_i_+72>: restore 
 >> 0xff37032c <__1cEfunc6F_i_+76>: mov  %g0, %i0
 >> 0xff370330 <__1cEfunc6F_i_+80>: ret 
 >> 0xff370334 <__1cEfunc6F_i_+84>: restore 
 >> ---Type <return> to continue, or q <return> to quit---
 >> End of assembler dump.
 >> (gdb) bt
 >> #0  0xff370318 in __1cEfunc6F_i_ ()
 >>    from /home/thales/ehrhardt/ld.so.1-bug/./libt2.so
 >> #1  0x10884 in main ()
 >> (gdb) info reg o0
 >> o0             0xff370000       -13172736
 >> (gdb) info reg o1
 >> o1             0x3      3
 >> (gdb) info reg l7
 >> l7             0xff3803a8       -13106264
 >> (gdb) info reg g1
 >> g1             0x4      4
 >> (gdb) turing$ exit
 >> 
 >> script done on Fri Sep 20 19:54:46 2002
 >> 
 >> Looking back at function func from t2.c shows:
 >> int func ()
 >> {
 >> 	static struct object x;
 >> 	struct object * p;
 >> 	p = &x;
 >> 	p->i = 3;      <====== crash is here.
 >> 	return 0;
 >> }
 >> 
 >> The value of the pointer p is obviously in register o0, i.e. it is
 >> 0xff370000. This is precisely the BASE address where the shared library
 >> libt2.so has been mapped to. Register l7 contains the base address of
 >> the .got section (the global offset table of this library). The
 >> questionable address is loaded from offset 4 into the global offset table.
 >> 
 >> Looking at the contents of the global offset table in the shared
 >> library shows the following:
 >> 
 >> turing$ elfdump -G libt2.so 
 >> 
 >> Global Offset Table: 2 entries
 >>  ndx     addr      value    reloc              addend   symbol
 >> [00000]  000103a8  00010338 R_SPARC_NONE       00000000 
 >> [00001]  000103ac  000103b0 R_SPARC_RELATIVE   00000000 
 >> turing$ 
 >> 
 >> Note that we have indeed
 >> %l7(0xff3803a8) = Offset of .got(0x000103a8) + library base address(0xFF370000)
 >> 
 >> The Solaris Linker and Libraries Guide (freshly downloaded from
 >> docs.sun.com) has this explanation about R_SPARC_RELATIVE:
 >> 
 >> |Some relocation types have semantics beyond simple calculation:
 >> |[ ... ]
 >> |R_SPARC_RELATIVE
 >> |  Created by the link-editor for dynamic objects. Its offset member
 >> |  gives the location within a shared object that contains a value
 >> |  representing a relative address. The runtime linker computes the
 >> |  corresponding virtual address by adding the virtual address at which
 >> |  the shared object is loaded to the relative address. Relocation
 >> |  entries for this type must specify 0 for the symbol table index.
 >> 
 >> This means that the value at offset 0x4 in the global offset
 >> table should be
 >>       library base address  + Value in .got
 >>       0xFF370000            + 0x000103B0     = 0xFF3803B0
 >> after relocation. However, looking at the value of register o0 we
 >> see that the .got section obviously contains the value 0xFF370000
 >> instead.
 >> 
 >> Checking the source code of the /usr/lib/ld.so.1 from Solaris 7 (the
 >> latest that we currently have access to) I found the following
 >> concerning R_SPARC_RELATIVE relocations.
 >> 
 >> os_net/src_ws/usr/src/cmd/sgs/rtld/sparc/sparc_elf.c function elf_reloc:
 >> | if ((rtype == R_SPARC_RELATIVE) &&
 >> |     !(FLAGS(lmp) & FLG_RT_FIXED) && !dbg_mask) {
 >> |         if (relacount) {
 >> |                 relbgn = elf_reloc_relacount(relbgn, relacount,
 >> |                         relsiz, basebgn);
 >> | 
 >> |                 relacount = 0;
 >> |         } else
 >> |                 relbgn = elf_reloc_relative(relbgn, relend,
 >> |                         relsiz, basebgn, etext, emap);
 >> |         if (relbgn >= relend)
 >> |                 break;
 >> |         rtype = ELF_R_TYPE(((Rel *)relbgn)->r_info);
 >> | }
 >> 
 >> i.e. there are two functions that may be called to perform an
 >> R_SPARC_RELATIVE relocation, elf_reloc_relacount or elf_reloc_relative.
 >> 
 >> However, these functions do fundamentally different things to resolve
 >> these relocations:
 >> 
 >> elf_reloc_relative (in file common_sparc.c) does the following:
 >> 
 >> | /*
 >> |  * Perform the actual relocation.
 >> |  */
 >> | *((ulong_t *) roffset) +=
 >> |     basebgn + (long)(((Rel *)relbgn)->r_addend);
 >> 
 >> whereas elf_reloc_relacount (in file common_sparc.c) does this:
 >> 
 >> | /*
 >> |  * Perform the actual relocation.
 >> |  */
 >> | *((ulong_t *) roffset) =
 >> |     basebgn + (long)(((Rel *)relbgn)->r_addend);
 >> 
 >> Note the plain assignment (``='') instead of the addition (``+='').
 >> I highly suspect that changing this will fix the problem.
 >  
 >
 
 Regards, Andreas Borchert.
 
 -- Andreas Borchert, Universitaet Ulm, SAI, Helmholtzstr. 18, 89069 Ulm, Germany
 E-Mail: borchert@mathematik.uni-ulm.de
 WWW: http://www.mathematik.uni-ulm.de/sai/borchert/
 PGP: http://www.mathematik.uni-ulm.de/sai/borchert/pgp.html
 
 
 
 
 davem@gcc.gnu.org wrote:
 
 >Synopsis: All C++ binaries crash in __register_frame_info_bases on Sparc Solaris 2.7
 >
 >State-Changed-From-To: open->feedback
 >State-Changed-By: davem
 >State-Changed-When: Tue Oct  1 20:59:26 2002
 >State-Changed-Why:
 >    Do you have all the fixes installed which are mentioned
 >    in:
 >    
 >    http://gcc.gnu.org/install/specific.html#sparc-sun-solaris2.7
 >    
 >    These are necessary to get gcc working on 2.7
 >
 >http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=8004
 >  
 >
 
 

