Bug 61303 - gccgo: segfault, regression since 4.8.2
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: go
Version: 4.9.2
Importance: P3 normal
Target Milestone: ---
Assignee: Ian Lance Taylor
Reported: 2014-05-24 14:16 UTC by Maciej Bliziński
Modified: 2015-11-26 00:25 UTC (History)

Description Maciej Bliziński 2014-05-24 14:16:15 UTC
> ./bin/gen-catalog-index --help
Usage of ./bin/gen-catalog-index:
  -arch="sparc": { sparc | i386 }
  -catalog-release="unstable": e.g. unstable, bratislava, kiel, dublin
  -os-release="SunOS5.10": e.g. SunOS5.10
  -output="catalog": The name of the file to generate.
  -pkgdb-url="http://buildfarm.opencsw.org/pkgdb/rest": Web address of the pkgdb app.
> ./bin/gen-catalog-index -output foo
2014/05/24 13:50:23 Making a request to http://buildfarm.opencsw.org/pkgdb/rest/catalogs/unstable/sparc/SunOS5.10/for-generation/as-dicts/
2014/05/24 13:50:32 Retrieved {unstable sparc SunOS5.10} with 3704 packages
2014/05/24 13:50:32 Writing {unstable sparc SunOS5.10} to foo
2014/05/24 13:50:32 Catalog index written successfully
> cp ./bin/gen-catalog-index{,.bak}
> rm ./bin/gen-catalog-index
> make bin/gen-catalog-index
gccgo -g -o bin/gen-catalog-index src/gen-catalog-index/gen-catalog-index.o src/opencsw/diskformat/diskformat.o
> rm foo
> ./bin/gen-catalog-index -output foo
2014/05/24 13:51:05 Making a request to http://buildfarm.opencsw.org/pkgdb/rest/catalogs/unstable/sparc/SunOS5.10/for-generation/as-dicts/
Segmentation Fault (core dumped)
> dbx - core
 Corefile specified executable: "/home/raos/opencsw/.buildsys/v2/go/bin/gen-catalog-index"
 For information about new features see `help changes'
 To remove this message, put `dbxenv suppress_startup_message 7.6' in your .dbxrc
 Reading gen-catalog-index
 dbx: warning: core object name "gen-catalog-ind" matches
 object name "gen-catalog-index" within the limit of 14. assuming they match
 core file header read successfully
 Reading ld.so.1
 Reading libgo.so.5.0.0
 dbx: warning: unknown location expression code (0x9e)
 Reading libm.so.2
 Reading libgcc_s.so.1
 Reading libc.so.1
 Reading libpthread.so.1
 Reading libsocket.so.1
 Reading libnsl.so.1
 Reading librt.so.1
 Reading libaio.so.1
 Reading libmd.so.1
 Reading libc_psr.so.1
 t@3 (l@3) terminated by signal SEGV (no mapping at the fault address)
 0xfe730e98: _memcpy+0x04bc:     st       %o4, [%o0]
 (dbx) where
 current thread: t@3
 =>[1] _memcpy(0x0, 0xff0d1f6c, 0x61, 0x60100, 0x30, 0x0), at 0xfe730e98
   [2] runtime_netpoll(0x1, 0x80, 0xce1e1e3b, 0x0, 0xff0d1ff0, 0x180), at 0xfeb8e004
   [3] schedule(0x1, 0xfef32f6c, 0x0, 0x0, 0xff0f3d28, 0xff0f5800), at 0xfeb92e90
   [4] runtime_mstart(0xde811800, 0xce1e2000, 0x0, 0xce310a38, 0x4, 0xce310a38), at 0xfeb93130
 (dbx) threads
       t@1  a  l@1   ?()   sleep on 0x3f3c0  in  __lwp_park()
       t@2  a  l@2   ?()   sleep on 0x3f3f8  in  __lwp_park()
 o>    t@3  a  l@3   ?()   signal SIGSEGV in  _memcpy()
       t@4  a  l@4   ?()   sleep on 0x3f810  in  __lwp_park()
 (dbx) quit

This is a regression since gcc-4.8.2. The same Go code works when compiled with gccgo-4.8.2 and fails when compiled with gccgo-4.9.0.
Comment 1 Maciej Bliziński 2014-05-24 15:23:31 UTC
I did some more testing. Running it under truss makes the binary work:

> truss -f -o /tmp/crashtest.truss bin/crashtest
2014/05/24 17:15:34 Making a request to http://buildfarm.opencsw.org/pkgdb/rest/catalogs/unstable/sparc/SunOS5.10/for-generation/as-dicts/
2014/05/24 17:15:44 Retrieved {unstable sparc SunOS5.10} with 3704 packages
2014/05/24 17:15:44 Writing {unstable sparc SunOS5.10} to catalog
2014/05/24 17:15:44 Catalog index written successfully

Running the same binary without truss makes it crash:

> bin/crashtest
2014/05/24 17:16:03 Making a request to http://buildfarm.opencsw.org/pkgdb/rest/catalogs/unstable/sparc/SunOS5.10/for-generation/as-dicts/
[hangs and eventually segfaults]

Running the same binary under gdb makes it segfault immediately:

> gdb bin/crashtest
GNU gdb (GDB) 7.7
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.10".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from bin/crashtest...done.
(gdb) run
Starting program: /home/maciej/src/opencsw-gar/v2/go/bin/crashtest
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New LWP    2        ]
[New LWP    3        ]
2014/05/24 17:22:40 Making a request to http://buildfarm.opencsw.org/pkgdb/rest/catalogs/unstable/sparc/SunOS5.10/for-generation/as-dicts/
[New Thread 3 (LWP 3)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 3 (LWP 3)]
0xfe730e98 in memcpy () from /platform/SUNW,SPARC-Enterprise-T5220/lib/libc_psr.so.1
(gdb) where
#0  0xfe730e98 in memcpy () from /platform/SUNW,SPARC-Enterprise-T5220/lib/libc_psr.so.1
#1  0xfeb8e00c in runtime_netpoll (block=block@entry=1 '\001')
    at /home/maciej/src/opencsw/pkg/gcc4/trunk/work/solaris10-sparc/build-isa-sparcv8plus/gcc-4.9.0/libgo/runtime/netpoll_select.c:163
#2  0xfeb92e98 in findrunnable ()
    at /home/maciej/src/opencsw/pkg/gcc4/trunk/work/solaris10-sparc/build-isa-sparcv8plus/gcc-4.9.0/libgo/runtime/proc.c:1653
#3  schedule () at /home/maciej/src/opencsw/pkg/gcc4/trunk/work/solaris10-sparc/build-isa-sparcv8plus/gcc-4.9.0/libgo/runtime/proc.c:1751
#4  0xfeb93138 in runtime_mstart (mp=0xde810800)
    at /home/maciej/src/opencsw/pkg/gcc4/trunk/work/solaris10-sparc/build-isa-sparcv8plus/gcc-4.9.0/libgo/runtime/proc.c:1000
#5  0xff1faee8 in _lwp_start () from /lib/libc.so.1
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Comment 2 Maciej Bliziński 2014-07-20 12:33:00 UTC
I've just reproduced this with gcc-4.9.1 (Solaris 10 sparc).
Comment 3 Ian Lance Taylor 2014-07-20 15:18:32 UTC
Do you have a test case that I can use to reproduce the problem?

Do you know whether the problem occurs on GNU/Linux?
Comment 4 Maciej Bliziński 2014-08-09 12:31:21 UTC
Just checked on Linux 3.13.0 x86_64 with gccgo 4.9.0: it works fine. gccgo version:

$ gccgo --version
gccgo (Ubuntu 4.9-20140406-0ubuntu1) 4.9.0 20140405 (experimental) [trunk revision 209157]

In the meantime, I keep working with gccgo 4.8.2, which also works fine.
Comment 5 Maciej Bliziński 2014-08-09 15:04:18 UTC
How to reproduce:

Requirements:
Solaris 10
svn
gmake
ginstall
gccgo 4.9.x

(Everything available as binary packages at OpenCSW)

svn checkout http://svn.code.sf.net/p/gar/code/csw/mgar/gar/v2/go gar-code
cd gar-code
mkdir bin
gmake bin/promote-packages
bin/promote-packages -dry-run -html-report-path=. -package-times-json-file=times.json

With gccgo-4.8.2, this program completes within 80-90 s on Intel, or in slightly over 3 minutes on sparc. When compiled with gccgo-4.9.0, it hangs for some time and eventually crashes.
Comment 6 Maciej Bliziński 2014-11-29 11:23:13 UTC
Just checked, the problem is still there in GCC 4.9.2.
Comment 7 Maciej Bliziński 2015-05-03 21:02:46 UTC
Checked again, the problem is still there in GCC-5.1.
Comment 8 Maciej Bliziński 2015-05-04 12:35:48 UTC
Here's my attempt to get some information:

experimental10s 14:35:13 ~/src/opencsw-gar/v2/go $ gcc -v
Reading specs from /opt/csw/lib/gcc/sparc-sun-solaris2.10/5.1.0/specs
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/csw/libexec/gcc/sparc-sun-solaris2.10/5.1.0/lto-wrapper
Target: sparc-sun-solaris2.10
Configured with: /home/dam/mgar/pkg/gcc5/trunk/work/solaris10-sparc/build-isa-sparcv8plus/gcc-5.1.0/configure --prefix=/opt/csw --exec_prefix=/opt/csw --bindir=/opt/csw/bin --sbindir=/opt/csw/sbin --libexs
Thread model: posix
gcc version 5.1.0 (GCC) 

experimental10s 14:31:43 ~/src/opencsw-gar/v2/go $ export CFLAGS="-g"
experimental10s 14:31:55 ~/src/opencsw-gar/v2/go $ gmake bin/package-gar-status
gccgo -o src/opencsw/diskformat/diskformat.o -g -c src/opencsw/diskformat/diskformat.go
ginstall -m 755 src/opencsw/diskformat/diskformat.o opencsw/diskformat.o
gccgo -o src/opencsw/mantis/mantis.o -g -c src/opencsw/mantis/mantis.go
# ginstall
cp src/opencsw/mantis/mantis.o opencsw/mantis.o
gccgo -g -o bin/package-gar-status src/package-gar-status/package-gar-status.go opencsw/diskformat.o opencsw/mantis.o
experimental10s 14:32:15 ~/src/opencsw-gar/v2/go $ # bin/package-gar-status -output-file=foo.md 
experimental10s 14:32:29 ~/src/opencsw-gar/v2/go $ gdb bin/package-gar-status
GNU gdb (GDB) 7.7
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.10".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from bin/package-gar-status...done.
(gdb) run -output-file=foo.md
Starting program: /home/maciej/src/opencsw-gar/v2/go/bin/package-gar-status -output-file=foo.md
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New LWP    2        ]
[New LWP    3        ]
2015/05/04 14:33:05 Program start
2015/05/04 14:33:05 Looking at catalog  {unstable i386 SunOS5.10}  only.
2015/05/04 14:33:05 Making a request to http://buildfarm.opencsw.org/pkgdb/rest/catalogs/unstable/i386/SunOS5.10/for-generation/as-dicts/
[New LWP    4        ]
[New LWP    5        ]
[New Thread 4 (LWP 4)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 4 (LWP 4)]
0xfe3c0d88 in memcpy () from /platform/SUNW,SPARC-Enterprise-T5220/lib/libc_psr.so.1
(gdb) where
#0  0xfe3c0d88 in memcpy () from /platform/SUNW,SPARC-Enterprise-T5220/lib/libc_psr.so.1
#1  0xfe9453d0 in runtime_netpoll (block=block@entry=1 '\001') at /home/dam/mgar/pkg/gcc5/trunk/work/solaris10-sparc/build-isa-sparcv8plus/gcc-5.1.0/libgo/runtime/netpoll_select.c:163
#2  0xfe94a794 in findrunnable () at /home/dam/mgar/pkg/gcc5/trunk/work/solaris10-sparc/build-isa-sparcv8plus/gcc-5.1.0/libgo/runtime/proc.c:1667
#3  schedule () at /home/dam/mgar/pkg/gcc5/trunk/work/solaris10-sparc/build-isa-sparcv8plus/gcc-5.1.0/libgo/runtime/proc.c:1765
#4  0xfe94aad8 in runtime_mstart (mp=0xde315c00) at /home/dam/mgar/pkg/gcc5/trunk/work/solaris10-sparc/build-isa-sparcv8plus/gcc-5.1.0/libgo/runtime/proc.c:1000
#5  0xff1baec8 in _lwp_start () from /lib/libc.so.1
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Comment 9 Ian Lance Taylor 2015-05-08 23:27:22 UTC
From looking at the code it seems that mmap may be returning MAP_FAILED.  That could lead to this crash.  I don't know why mmap would fail, though.  It's a shame that the program works when run under truss.

If you feel like experimenting, look at the call to runtime_SysAlloc near the start of runtime_netpoll in libgo/runtime/netpoll_select.c.  Add something like
    if(prfds == nil)
        runtime_throw("runtime_SysAlloc returned nil");
and check whether you get that error instead of the crash.
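
A rough standalone illustration of the check suggested above, not the libgo code: it assumes the allocator wraps mmap and reports failure as a null pointer, and it refuses to memcpy into the result when the mapping failed. The names sys_alloc and copy_fd_set are hypothetical and exist only for this sketch.

```c
#define _DEFAULT_SOURCE 1 /* exposes MAP_ANON on glibc; harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/select.h>

/* Hypothetical helper: map `size` bytes of anonymous memory.
 * Returns NULL on failure instead of MAP_FAILED, so callers can
 * use an ordinary null check. */
static void *sys_alloc(size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
	               MAP_PRIVATE | MAP_ANON, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: copy `src` into freshly mapped scratch memory.
 * Returns NULL rather than copying through an invalid pointer, which
 * is the kind of crash seen in the backtraces in this report. */
static fd_set *copy_fd_set(const fd_set *src)
{
	fd_set *dst = sys_alloc(sizeof *dst);
	if (dst == NULL)
		return NULL; /* report the failure; do not memcpy */
	memcpy(dst, src, sizeof *dst);
	return dst;
}
```

A zero-length request is an easy way to exercise the failure path, since mmap rejects it with EINVAL.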
Comment 10 ian@gcc.gnu.org 2015-11-26 00:24:53 UTC
Author: ian
Date: Thu Nov 26 00:24:21 2015
New Revision: 230922

URL: https://gcc.gnu.org/viewcvs?rev=230922&root=gcc&view=rev
Log:
	PR go/61303
    runtime: don't overallocate in select code
    
    If we've already allocated an fd_set, don't allocate another one.
    
    Also, don't bother to read from rdwake if it wasn't returned in select.
    
    Fixes https://gcc.gnu.org/PR61303.
    
    Reviewed-on: https://go-review.googlesource.com/17243

Modified:
    trunk/gcc/go/gofrontend/MERGE
    trunk/libgo/runtime/netpoll_select.c
Comment 11 ian@gcc.gnu.org 2015-11-26 00:25:05 UTC
Author: ian
Date: Thu Nov 26 00:24:33 2015
New Revision: 230923

URL: https://gcc.gnu.org/viewcvs?rev=230923&root=gcc&view=rev
Log:
	PR go/61303
    runtime: don't overallocate in select code
    
    If we've already allocated an fd_set, don't allocate another one.
    
    Also, don't bother to read from rdwake if it wasn't returned in select.
    
    Fixes https://gcc.gnu.org/PR61303.
    
    Reviewed-on: https://go-review.googlesource.com/17243

Modified:
    branches/gcc-5-branch/libgo/runtime/netpoll_select.c
Comment 12 Ian Lance Taylor 2015-11-26 00:25:44 UTC
This seems to have been due to a bug in the select support, which is only used on Solaris.  It should now be fixed on mainline and on the GCC 5 branch.
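
The commit log's "if we've already allocated an fd_set, don't allocate another one" can be pictured with the sketch below. It is not the actual patch: the retry loop, the counter, and the names alloc_scratch and poll_with_retries are assumptions made for illustration, and calloc stands in for the runtime's allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/select.h>

static int alloc_calls; /* counts allocations, for illustration only */

/* Hypothetical allocator for one scratch fd_set. */
static fd_set *alloc_scratch(void)
{
	alloc_calls++;
	return calloc(1, sizeof(fd_set));
}

/* Simulates a poll routine that goes around its retry path `retries`
 * times. The `allocated` flag ensures the scratch set is allocated
 * once and reused on every retry. Returns the number of allocations
 * performed, or -1 if allocation failed. */
static int poll_with_retries(int retries)
{
	fd_set *scratch = NULL;
	bool allocated = false;
	int before = alloc_calls;

	for (int attempt = 0; attempt <= retries; attempt++) {
		if (!allocated) {
			scratch = alloc_scratch();
			if (scratch == NULL)
				return -1;
			allocated = true;
		}
		FD_ZERO(scratch); /* refresh the reused set on each pass */
		/* a real implementation would call select() here */
	}
	free(scratch);
	return alloc_calls - before;
}
```

Without the flag, each pass through the retry path would allocate a fresh set, so the allocation count would grow with the number of retries.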