This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
- From: Justus Winter <4winter at informatik dot uni-hamburg dot de>
- To: svante dot signell at gmail dot com, Svante Signell <svante dot signell at gmail dot com>, "Samuel Thibault" <samuel dot thibault at gnu dot org>
- Cc: bug-hurd at gnu dot org, fotis dot koutoulakis at gmail dot com, "Roland McGrath" <roland at hack dot frob dot com>, "Ian Lance Taylor" <iant at google dot com>, gcc-patches at gcc dot gnu dot org, "Thomas Schwinge" <thomas at codesourcery dot com>
- Date: Sat, 26 Apr 2014 08:53:08 +0200
- Subject: Re: GCC's -fsplit-stack disturbing Mach's vm_allocate
Quoting Svante Signell (2014-04-24 10:39:10)
> On Fri, 2014-04-18 at 10:03 +0200, Samuel Thibault wrote:
> > Samuel Thibault wrote on Thu 17 Apr 2014 00:03:45 +0200:
> > > Thomas Schwinge wrote on Wed 09 Apr 2014 09:36:42 +0200:
> > > > Well, the first step is to verify that TARGET_THREAD_SPLIT_STACK_OFFSET
> > > > and similar configury is correct for the Hurd,
> > >
> > > I have added the corresponding field, so we can just use the same offset
> > > as on Linux.
> >
> > I have uploaded packages on http://people.debian.org/~sthibault/tmp/ so
> > Svante can try setting TARGET_THREAD_SPLIT_STACK_OFFSET to 0x30 with
> > them.
>
> Status report:
> - Without split stack enabled, around 70 libgo tests pass and 50 fail,
> most of them with a segfault.
> - With split stack enabled and the libc Samuel built, all 122 libgo
> tests fail with a segfault.
> - In both cases simple Go programs work, like hello+sqrt.go below.
> - The segfault seems to occur in the same piece of code according to
> gdb (maybe due to exception handling).
>
> cat hello+sqrt.go
> package main
>
> import (
>     "fmt"
> )
>
> func main() {
>     fmt.Printf("Hello, world. Sqrt(2) = %v\n", Sqrt(2))
> }
How is that even a valid Go program? Sqrt is not defined.
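A minimal sketch of a version that does compile, assuming the intent was
the standard library's math.Sqrt (the quoted mail does not say which Sqrt
was meant; the message helper is only there for illustration):

```go
package main

import (
	"fmt"
	"math"
)

// message builds the greeting. Sqrt is taken from the standard "math"
// package here, which is an assumption about what the original meant.
func message() string {
	return fmt.Sprintf("Hello, world. Sqrt(2) = %v\n", math.Sqrt(2))
}

func main() {
	fmt.Print(message()) // prints "Hello, world. Sqrt(2) = 1.4142135623730951"
}
```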
> I have not been able to use a local go library function, e.g. package
> newmath, and the go frontend is not yet available for GNU/Hurd.
What do you mean exactly by "local go library function"?
> However, it seems that something triggers the segfaults when running
> make -C build/i486-gnu/libgo check (both with and w/o split-stack),
> whereas when setting the keep parameter in ./src/libgo/testsuite/gotest
> and running the tests manually, some of them work?? At first glance, about
> the same number of tests succeed with and w/o split stack :) Some of
> the failing tests still seem random: sometimes they pass, sometimes
> they fail.
For reference, here are my notes about one of these crashes (Svante,
is this still current?):
~~~ snip ~~~
First, there is an rpctrace bug (or I'm misinterpreting the output):
93<--142(pid1182)->dir_lookup ("etc/hostname" 1 0) = 0 1 "" 158<--160(pid1182)
Here, we do a dir_lookup and get port 158.
158<--160(pid1182)->io_read_request (-1 255) = 0 "hurd-2013\n"
158<--160(pid1182)->io_readable_request () = 0 0
Here, we use it to do stuff with that file.
task130(pid1182)->mach_port_deallocate (pn{ 23}) = 0
Here, we deallocate the port. Note how the port name (pn?) says 23,
even though it's clearly port 158 that is getting deallocated, b/c we
get port 158 back from the next rpc:
93<--142(pid1182)->dir_lookup ("lib/i386-gnu/libnss_files.so.2" 4194305 0) = 0 1 "" 158<--157(pid1182)
Now, to get to the real issue. From the backtrace (http://paste.debian.net/95410/)
we know that it segfaults in mmap:
Program received signal SIGSEGV, Segmentation fault.
0x019977b7 in _hurd_intr_rpc_msg_in_trap () at intr-msg.c:132
132 intr-msg.c: No such file or directory.
[...]
Thread 4 (Thread 1205.4):
#0 0x019977b7 in _hurd_intr_rpc_msg_in_trap () at intr-msg.c:132
err = <optimized out>
err = <optimized out>
user_option = 3
user_timeout = 48
m = 0x532370
msgh_bits = 0
remote_port = 268509186
msgid = 21118
save_data = <optimized out>
__PRETTY_FUNCTION__ = "_hurd_intr_rpc_mach_msg"
#1 0x00000005 in ?? ()
No symbol table info available.
#2 0x01a7a8dd in __mmap (addr=0x0, len=49880, prot=5, flags=33, fd=8, offset=0)
at ../sysdeps/mach/hurd/mmap.c:92
__ulink = {resource = {next = 0x0, prevp = 0x2cfcc}, thread = {next = 0x0,
prevp = 0x1b81c5c}, cleanup = 0x19a2c70 <_hurd_port_cleanup>, cleanup_data = 0x99}
__ctty_ulink = {resource = {next = 0x0, prevp = 0x19fc6bc <_int_malloc+12>}, thread = {
next = 0x17, prevp = 0x5}, cleanup = 0x0, cleanup_data = 0x700f2}
__result = <optimized out>
descriptor = 0x1b5e467 <__io_map+7>
robj = 0
wobj = 4608
err = <optimized out>
vmprot = 0
memobj = <optimized out>
mapaddr = 0
#3 0x00007b27 in _dl_map_object_from_fd (name=name@entry=0x532b58 "libnss_files.so.2", fd=8,
[...]
Note how weird the remote_port = 268509186 looks. Here is the rpctrace again:
158<--157(pid1182)->term_getctty () = 0xfffffed1 ((ipc/mig) bad request message ID)
158<--157(pid1182)->io_read_request (-1 512) = 0 "\x7fELF\x01\x01\x01"
158<--157(pid1182)->io_stat_request () = 0 {23 5 0 458994 0 1885249733 0 33188 1 0 0 46752 0 1398335701 0 1397789836 0 1398160744 0 8192 96 0 0 0 0 0 0 0 0 0 0 0}
158<--157(pid1182)->io_map_request () = 0 133<--160(pid1182) (null)
So we call io_map, get a read memobj and no write memobj.
task130(pid1182)->vm_map (0 49880 0 1 133<--160(pid1182) 0 1 5 7 1) = 0 2453504
We map that somewhere.
task130(pid1182)->mach_port_deallocate (pn{ 25}) = 0
Deallocate the port. Again, for some strange reason 133 == pn{ 25}.
158<--157(pid1182)->io_map_request () = 0 133<--162(pid1182) (null)
Some more io_map.
task130(pid1182)->vm_map (2498560 8192 0 0 133<--162(pid1182) 40960 1 3 7 1) = 0x3 ((os/kern) no space available)
task130(pid1182)->vm_deallocate (2498560 8192) = 0
Hum?
task130(pid1182)->vm_map (2498560 8192 0 0 133<--162(pid1182) 40960 1 3 7 1) = 0 2498560
task130(pid1182)->mach_port_deallocate (pn{ 25}) = 0
Success!
task130(pid1182)->mach_port_deallocate (pn{ 23}) = 0
Get rid of port 158. That looks all right from the IPC perspective.
Why do we see the process crash at presumably this very moment? I guess it
could still crash here because rpctrace cannot differentiate
between different threads in the tracee.
task130(pid1182)->vm_protect (2498560 4096 0 1) = 0
93<--142(pid1182)->dir_lookup ("etc/hosts" 4194305 0) = 0 1 "" 158<--160(pid1182)
158<--160(pid1182)->term_getctty () = 0xfffffed1 ((ipc/mig) bad request message ID)
158<--160(pid1182)->io_stat_request () = 0 {23 5 0 81987 0 1368845833 0 33188 1 0 0 248 0 1398335660 0 1368797592 0 1368797592 0 8192 8 0 0 0 0 0 0 0 0 0 0 0}
158<--160(pid1182)->io_read_request (-1 8192) = 0 "127.0.0.1 localhost\n127.0.1.1 hurd-2013.my.own.domain hurd-2013\n\n# The following"
158<--160(pid1182)->io_read_request (-1 8192) = 0 ""
task130(pid1182)->mach_port_destroy (pn{ 24}) ...159
task130(pid1182)->mach_port_deallocate (pn{ 23}) ...134
159... = 0
134... = 0
task130(pid1182)->vm_allocate (0 36864 1) = 0 2506752
task130(pid1182)->mach_port_deallocate (pn{ 11}) = 0
task130(pid1182)->mach_port_deallocate (pn{ 21}) ...134
task130(pid1182)->mach_port_deallocate (pn{ 11}) ...159
134... = 0
159... = 0
task130(pid1182)->vm_allocate (33562796 8364 0) = 0x3 ((os/kern) no space available)
task130(pid1182)->vm_allocate (33571160 8364 0) = 0 33570816
task130(pid1182)->mach_port_allocate (1) = 0 pn{ 23}
task130(pid1182)->mach_port_insert_right (pn{ 23} 133) = 0
task130(pid1182)->mach_port_set_qlimit (pn{ 23} 1) = 0
task130(pid1182)->thread_create () = 0 160<--157(pid1182)
task130(pid1182)->vm_protect (33570816 1 0 0) = 0
146<--150(pid-1)-> 2400 ( thread151(pid1182) task130(pid1182) 1 2 33557926) ...159
139<--144(pid1182)->proc_dostop_request ( thread138(pid1182)) = 0
93<--142(pid1182)->dir_lookup ("servers/crash" 0 0) = 0 1 "" 163<--161(pid1182)
task130(pid1182)->mach_port_mod_refs (pn{ 6} 0 1) = 0
109<--141(pid1182)->dir_mkfile (18 384) = 0 165<--164(pid1182)
163<--161(pid1182)->crash_dump_task ( task130(pid1182) 165<--164(pid1182) 11 2 2 1 2 33557926 118<--145(pid1182)) ...134
159-> 71 ();
134... = 0
Child 1182 Segmentation fault
~~~ snip ~~~
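To make a few of the raw numbers in the trace above easier to read, here
is a small Go sketch. It assumes that the vm_map arguments printed by
rpctrace follow the Mach vm_map signature (address, size, mask, anywhere,
memory object, offset, copy, cur_protection, max_protection, inheritance)
and that protections use the usual Mach vm_prot_t bits. The helper names
are made up for illustration.

```go
package main

import "fmt"

// Mach vm_prot_t bits (assumed to be what rpctrace prints for vm_map).
const (
	vmProtRead    = 0x1
	vmProtWrite   = 0x2
	vmProtExecute = 0x4
)

// protString renders a vm_prot_t value as an "rwx"-style string.
func protString(p int) string {
	out := []byte("---")
	if p&vmProtRead != 0 {
		out[0] = 'r'
	}
	if p&vmProtWrite != 0 {
		out[1] = 'w'
	}
	if p&vmProtExecute != 0 {
		out[2] = 'x'
	}
	return string(out)
}

// asSigned reinterprets a 32-bit return code that rpctrace printed in hex.
func asSigned(code uint32) int32 {
	return int32(code)
}

// overlaps reports whether [a, a+alen) and [b, b+blen) intersect.
func overlaps(a, alen, b, blen uint32) bool {
	return a < b+blen && b < a+alen
}

func main() {
	// term_getctty returned 0xfffffed1, "(ipc/mig) bad request message ID".
	fmt.Println(asSigned(0xfffffed1)) // -303

	// cur_protection of the first and second vm_map, and max_protection.
	fmt.Println(protString(5), protString(3), protString(7)) // r-x rw- rwx

	// First mapping: 49880 bytes at 2453504. Second request: 8192 bytes
	// at the fixed address 2498560.
	fmt.Println(overlaps(2453504, 49880, 2498560, 8192)) // true

	// The odd-looking remote_port from the backtrace, in hex.
	fmt.Printf("%#x\n", 268509186) // 0x10012002
}
```

The overlap is plain arithmetic on the numbers in the trace. Whether it
is the reason the first fixed-address vm_map returned "no space
available" before the vm_deallocate and retry is only a guess.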
Justus