Bug 88462 - All D execution tests FAIL on Solaris/SPARC
Summary: All D execution tests FAIL on Solaris/SPARC
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: d
Version: 9.0
Importance: P3 normal
Target Milestone: 9.0
Assignee: Iain Buclaw
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-12-12 10:13 UTC by Rainer Orth
Modified: 2019-04-01 21:14 UTC
0 users

See Also:
Host:
Target: sparc*-sun-solaris2.*
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Rainer Orth 2018-12-12 10:13:38 UTC
Even with the latest patch for PR d/88150 to use sections_elf_shared.d on Solaris,
which allows the vast majority of gdc execution tests to PASS on Solaris 11.4/x86,
I have no such luck on Solaris 11.5/SPARC: all tests FAIL with

Aborting from local/libphobos/libdruntime/core/sync/mutex.d(95) Error: pthread_mutex_init failed.

Thread 2 received signal SIGABRT, Aborted.
[Switching to Thread 1 (LWP 1)]
0xfec7e044 in __lwp_sigqueue () from /lib/libc.so.1
(gdb) where
#0  0xfec7e044 in __lwp_sigqueue () from /lib/libc.so.1
#1  0xfebb9898 in raise () from /lib/libc.so.1
#2  0xfeb8b1d0 in abort () from /lib/libc.so.1
#3  0x000c150c in core.internal.abort.abort(immutable(char)[], immutable(char)[], uint) (msg=..., 
    filename=<error reading variable: Cannot access memory at address 0x5f>, 
    line=95)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/core/internal/abort.d:44
#4  0x000fe32c in core.sync.mutex.Mutex.this!(core.sync.mutex.Mutex).this(bool)
    (this=0x12de34 <core.thread.Thread._locks+44>, _unused_=true)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/core/sync/mutex.d:94
#5  0x000fdeac in core.sync.mutex.Mutex.this() (
    this=0x12de34 <core.thread.Thread._locks+44>)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/core/sync/mutex.d:63
#6  0x000c64d0 in core.thread.Thread.initLocks() ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/core/thread.d:1726
#7  0x000c69d8 in thread_init ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/core/thread.d:2022
#8  0x000a232c in gc_init ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/gc/proxy.d:56
#9  0x00080684 in rt_init ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/dmain2.d:187
#10 0x000810c8 in runAll (this=0xffbfe904)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/dmain2.d:485
#11 0x00081020 in tryExec (this=0xffbfe904, dg=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/dmain2.d:461
#12 0x00080f2c in _d_run_main (argc=1, argv=0xffbfea34, 
    mainFunc=0x6b2bc <D main>)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/dmain2.d:494
#13 0x0006b24c in main (argc=1, argv=0xffbfea34)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/__entrypoint.di:44
#14 0x0006b014 in _start ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

            !pthread_mutex_init(cast(pthread_mutex_t*) &m_hndl, &attr) ||
                abort("Error: pthread_mutex_init failed.");

After much digging and head scratching, I found what's wrong: pthread_mutex_init
expects the mutex to be long long (i.e. 8 byte) aligned.  I'd thought this
would happen automatically given the declaration in core/sys/posix/sys/types.d
with the ulong __pthread_mutex_data field which has a natural alignment of
64 bits.  Whatever I do, however, I only end up with the mutex being 4-byte aligned:

* apply align(8): at the beginning of the pthread_mutex_t fields,

* apply align(8) to the struct pthread_mutex_t declaration, or

* apply align(8) to the m_hndl member of class Mutex in core/sync/mutex.d.

To guard against incomplete dependencies that could lead to some parts not
being recompiled when they should, I've always recompiled all of libphobos to
be sure I didn't miss something.

I've no idea what I'm doing wrong here.
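
A quick way to see what the mutex actually gets at run time is a small check along these lines. This is only a sketch: `isAligned` is a hypothetical helper, and the 8-byte requirement is taken from the observation above, not from any documented interface.

```d
import core.stdc.stdio : printf;
import core.sys.posix.pthread : pthread_mutex_destroy, pthread_mutex_init;
import core.sys.posix.sys.types : pthread_mutex_t;

/// Hypothetical helper: true if `address` is a multiple of `alignment`
/// (which must be a power of two).
bool isAligned(size_t address, size_t alignment) pure nothrow @nogc @safe
{
    return (address & (alignment - 1)) == 0;
}

void main()
{
    pthread_mutex_t m;
    const addr = cast(size_t) &m;

    // What the compiler believes versus where the object really ended up.
    printf("pthread_mutex_t.alignof = %zu\n", pthread_mutex_t.alignof);
    printf("8-byte aligned at run time: %d\n", isAligned(addr, 8) ? 1 : 0);

    // Default attributes; a non-zero result means initialisation failed.
    const rc = pthread_mutex_init(&m, null);
    printf("pthread_mutex_init returned %d\n", rc);
    if (rc == 0)
        pthread_mutex_destroy(&m);
}
```

A stack variable like `m` normally does get its natural alignment, so the interesting case is a mutex embedded in some other storage.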
Comment 1 Iain Buclaw 2018-12-12 14:04:31 UTC
Does it appear to be the correct alignment at compile-time?

This just prints what the front-end determines.  I'm not ruling out that these values somehow change during translation to gcc trees.

---

import core.sys.posix.sys.types;
import core.sync.mutex;

pragma(msg, pthread_mutex_t.alignof);
pragma(msg, Mutex.alignof);
pragma(msg, Mutex.m_hndl.offsetof);
Comment 2 ro@CeBiTec.Uni-Bielefeld.DE 2018-12-12 14:13:06 UTC
> pragma(msg, pthread_mutex_t.alignof);
> pragma(msg, Mutex.alignof);
> pragma(msg, Mutex.m_hndl.offsetof);

I get

8u
4u
/homes/ro/mutex_align.d:6:13: error: class core.sync.mutex.Mutex member m_hndl is not accessible
16u

The first is right, but Mutex alignment is off, probably leading to the
4-byte alignment of m_hndl I'm seeing.
Comment 3 Iain Buclaw 2018-12-12 15:29:39 UTC
(In reply to ro@CeBiTec.Uni-Bielefeld.DE from comment #2)
> > pragma(msg, pthread_mutex_t.alignof);
> > pragma(msg, Mutex.alignof);
> > pragma(msg, Mutex.m_hndl.offsetof);
> 
> I get
> 
> 8u
> 4u
> /homes/ro/mutex_align.d:6:13: error: class core.sync.mutex.Mutex member
> m_hndl is not accessible
> 16u
> 
> The first is right, but Mutex alignment is off, probably leading to the
> 4-byte alignment of m_hndl I'm seeing.

Oh wait, Mutex is a class, so of course the alignment is 4.  Classes are reference types, so alignof gives the alignment of the reference (a pointer), not of the instance.
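
The difference can be reproduced without druntime internals. In the sketch below `Lock` is a made-up stand-in class (not the real Mutex), and `std.traits.classInstanceAlignment` is used to ask for the alignment of the instance instead of the reference.

```d
import std.traits : classInstanceAlignment;

// Made-up stand-in for Mutex: a class holding one 64-bit field.
class Lock
{
    ulong handle;
}

// .alignof on a class type describes the reference, i.e. a pointer.
static assert(Lock.alignof == (void*).alignof);

// The instance needs at least the alignment of its strictest field.
static assert(classInstanceAlignment!Lock >= ulong.alignof);

void main()
{
    // Printed by the compiler, in the same way as the pragmas in comment 1.
    pragma(msg, "reference alignment: ", Lock.alignof);
    pragma(msg, "instance alignment:  ", classInstanceAlignment!Lock);
    pragma(msg, "instance size:       ", __traits(classInstanceSize, Lock));
}
```

The printed values depend on the target, so none are listed here.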
Comment 4 Iain Buclaw 2018-12-12 18:30:34 UTC
Stepping through the backtrace, I see the following at Thread.initLocks (core/thread.d around line 1719).


---

__gshared align(Mutex.alignof) void[__traits(classInstanceSize, Mutex)][2] _locks;

static void initLocks()
{
    foreach (ref lock; _locks)
    {
        lock[] = typeid(Mutex).initializer[];
        (cast(Mutex)lock.ptr).__ctor();
    }
}


---

So there are two things.  Firstly, the object instance is type punned from a void[N] array.  Secondly it is aligned to pointer size, not the alignment of the underlying record type.

So I'm certain that the problem will be fixed if `align(Mutex.alignof)` is replaced with `align(8)`.
Comment 5 ro@CeBiTec.Uni-Bielefeld.DE 2018-12-13 14:21:18 UTC
> --- Comment #4 from Iain Buclaw <ibuclaw at gdcproject dot org> ---
> Stepping through the backtrace, I see the following at Thread.initLocks
> (core/thread.d around line 1719).
[...]
> So there are two things.  Firstly, the object instance is type punned from a
> void[N] array.  Secondly it is aligned to pointer size, not the alignment of
> the underlying record type.
>
> So I'm certain that the problem will be fixed if `align(Mutex.alignof)` is
> replaced with `align(8)`.

Unfortunately, this doesn't work: the first time through, _locks[0] was
already 8-byte aligned and everything worked fine.  This remained when
using align(8) instead.  However, Mutex is 44 bytes on 32-bit
Solaris/x86, so again _locks[1] lands on a non-8 byte boundary and
pthread_mutex_init fails.

I tried rounding up the size of the _locks array members to a multiple of
8, but that let the constructor already fail the first time through
where _d_arraycopy checks that the right amount of data is copied:

_d_arraycopy -> rt.util.array.enforceRawArraysConformable ->
rt.util.array._enforceSameLength
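
The arithmetic behind the second element going wrong can be written down as below. It is a sketch that assumes the array itself starts on an 8-byte boundary, and it uses the 44-byte instance size from this comment and the m_hndl offset of 16 printed in comment 2; `elementOffset` is a hypothetical helper.

```d
import std.stdio : writefln;

/// Hypothetical helper: byte offset of element `index` in an array whose
/// elements are `elemSize` bytes each.
size_t elementOffset(size_t index, size_t elemSize) pure nothrow @nogc @safe
{
    return index * elemSize;
}

enum size_t mutexSize = 44;  // instance size quoted in this comment
enum size_t hndlOffset = 16; // Mutex.m_hndl.offsetof as printed in comment 2
enum size_t required = 8;    // alignment pthread_mutex_init appears to expect

// _locks[0]: m_hndl sits at 0 + 16, a multiple of 8.
static assert((elementOffset(0, mutexSize) + hndlOffset) % required == 0);
// _locks[1]: m_hndl sits at 44 + 16 = 60, and 60 % 8 == 4.
static assert((elementOffset(1, mutexSize) + hndlOffset) % required == 4);

void main()
{
    foreach (size_t i; 0 .. 2)
    {
        const off = elementOffset(i, mutexSize) + hndlOffset;
        writefln("_locks[%s]: m_hndl at offset %s, remainder mod %s = %s",
                 i, off, required, off % required);
    }
}
```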
Comment 6 Iain Buclaw 2018-12-13 22:19:06 UTC
(In reply to ro@CeBiTec.Uni-Bielefeld.DE from comment #5)
> 
> Unfortunately, this doesn't work: the first time through, _locks[0] was
> already 8-byte aligned and everything worked fine.  This remained when
> using align(8) instead.  However, Mutex is 44 bytes on 32-bit
> Solaris/x86, so again _locks[1] lands on a non-8 byte boundary and
> pthread_mutex_init fails.
> 

Oh, yeah. I didn't consider that size of Mutex would be a problem.


> I tried rounding up the size of the _locks array members to a multiple of
> 8, but that let the constructor already fail the first time through
> where _d_arraycopy checks that the right amount of data is copied:
> 
> _d_arraycopy -> rt.util.array.enforceRawArraysConformable ->
> rt.util.array._enforceSameLength

What you're doing rounding up the array size is correct.  The bit you're missing is fixing up the slice assignment as well.

lock[0 .. __traits(classInstanceSize, Mutex)] = typeid(Mutex).initializer[];
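
Put together, the padded-slot idea might look like the following. It is a self-contained sketch with a made-up `Lock` class standing in for Mutex, and the names `lockSize`, `lockAlign`, `slotSize` and `slots` are illustrative only; it is not necessarily what ends up in druntime.

```d
import std.traits : classInstanceAlignment;

// Made-up stand-in for Mutex.
class Lock
{
    ulong handle;
    this() { handle = 1; }
}

enum size_t lockSize = __traits(classInstanceSize, Lock);
enum size_t lockAlign = classInstanceAlignment!Lock;
// Round each slot up so that every element starts on a lockAlign boundary.
enum size_t slotSize = (lockSize + lockAlign - 1) & ~(lockAlign - 1);

align(lockAlign) __gshared void[slotSize][2] slots;

void main()
{
    foreach (ref slot; slots)
    {
        // Copy only the instance image; the slot may be larger than that.
        slot[0 .. lockSize] = typeid(Lock).initializer[];
        (cast(Lock) slot.ptr).__ctor();

        assert(cast(size_t) slot.ptr % lockAlign == 0);
        assert((cast(Lock) slot.ptr).handle == 1);
    }
}
```

Slicing the destination to the instance size keeps both sides of the copy the same length, which is what the _d_arraycopy check mentioned in comment 5 enforces.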
Comment 7 ro@CeBiTec.Uni-Bielefeld.DE 2018-12-13 23:19:43 UTC
> --- Comment #6 from Iain Buclaw <ibuclaw at gdcproject dot org> ---
>> 8, but that let the constructor already fail the first time through
>> where _d_arraycopy checks that the right amount of data is copied:
>> 
>> _d_arraycopy -> rt.util.array.enforceRawArraysConformable ->
>> rt.util.array._enforceSameLength
>
> What you're doing rounding up the array size is correct.  The bit you're
> missing is fixing up the slice assignment as well.
>
> lock[0 .. __traits(classInstanceSize, Mutex)] = typeid(Mutex).initializer[];

That got me over this issue indeed, thanks.

However, I now hit the next issue: a SIGBUS (which gdb incorrectly
reports as SIGSEGV) due to an alignment issue:

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x0007c970 in object.ModuleInfo.flags() const (this=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/object.d:1541
1541	    @property uint flags() nothrow pure @nogc { return _flags; }
(gdb) where
#0  0x0007c970 in object.ModuleInfo.flags() const (this=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/object.d:1541
#1  0x0007d118 in object.ModuleInfo.importedModules() const (this=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/object.d:1580
#2  0x0008ed74 in rt.minfo.ModuleGroup.sortCtors(immutable(char)[]) (this=..., 
    cycleHandling=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/minfo.d:259
#3  0x00091110 in rt.minfo.ModuleGroup.sortCtors() (this=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/minfo.d:533
#4  0x00092d24 in __foreachbody1 (this=0x0, sg=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/minfo.d:795
#5  0x00097a08 in rt.sections_elf_shared.DSO.opApply(scope int(ref rt.sections_elf_shared.DSO) delegate) (dg=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/sections_elf_shared.d:68
#6  0x00092ce8 in rt_moduleCtor ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/minfo.d:793
#7  0x00085dbc in rt_init ()
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/dmain2.d:190
#8  0x000867e8 in runAll (this=0xffbfe78c)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/dmain2.d:485
#9  0x00086740 in tryExec (this=0xffbfe78c, dg=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/dmain2.d:461
#10 0x0008664c in _d_run_main (argc=1, argv=0xffbfe8bc, 
    mainFunc=0x6c164 <D main>)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/dmain2.d:494
#11 0x0006b7e4 in main (argc=1, argv=0xffbfe8bc)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/__entrypoint.di:44
#12 0x0006b5d4 in _start ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) p this
$5 = (const object.ModuleInfo &) @0x12ab33: {_flags = 4100, _index = 0}
(gdb) x/i $pc
=> 0x7c970 <_D6object10ModuleInfo5flagsMxFNaNbNdNiZk+152>:	
    ld  [ %g1 ], %g1
(gdb) p/x $g1
$6 = 0x12ab33

Trying to load 32 bits from a non-4 byte aligned pointer is a no-no on a
strict-alignment target like sparc...
Comment 8 Iain Buclaw 2018-12-13 23:31:56 UTC
(In reply to ro@CeBiTec.Uni-Bielefeld.DE from comment #7)
> Trying to load 32 bits from a non-4 byte aligned pointer is a no-no on a
> strict-alignment target like sparc...

I saw that on HPPA as well when testing under QEMU.

ModuleInfo is a variably-sized packed struct - what is in the variable part is determined by the value of _flags.

This is compiler generated, so I'll have a look into giving it proper alignment on the compiler side.
Comment 9 ro@CeBiTec.Uni-Bielefeld.DE 2018-12-14 09:30:22 UTC
> --- Comment #8 from Iain Buclaw <ibuclaw at gdcproject dot org> ---
>> Trying to load 32 bits from a non-4 byte aligned pointer is a no-no on a
>> strict-alignment target like sparc...
>
> I saw that on HPPA as well when testing under QEMU.
>
> ModuleInfo is a variably-sized packed struct - what is in the variable part is
> determined by the value of _flags.
>
> This is compiler generated, so I'll have a look into giving it proper alignment
> on the compiler side.

An alternative might be to leave the on-disk representation as is and
only handle alignment on input/startup.  However, that's probably a bad
tradeoff of some on-disk space savings vs. the runtime cost at every
startup.
Comment 10 Johannes Pfau 2018-12-14 15:02:05 UTC
I guess the proper fix to the alignment problem is using 'https://dlang.org/phobos/std_traits.html#classInstanceAlignment' (or rather the druntime equivalent) instead of Mutex.alignof + the rounding / slice assignment fixes?

Regarding the ModuleInfo problem: Although ModuleInfo does have a variable size, _flags is the first field in the struct. So the whole struct instance has to be misaligned for some reason? Is the minfo section aligned properly?
Comment 11 ro@CeBiTec.Uni-Bielefeld.DE 2018-12-14 15:30:01 UTC
> --- Comment #10 from Johannes Pfau <johannespfau at gmail dot com> ---
> I guess the proper fix to the alignment problem is using
> 'https://dlang.org/phobos/std_traits.html#classInstanceAlignment' (or rather
> the druntime equivalent) instead of Mutex.alignof + the rounding / slice
> assignment fixes?

Seems plausible: the current situation is nothing more than a hack to
get me further along, and I've only just started reading up on D.

> Regarding the ModuleInfo problem: Although ModuleInfo does have a variable
> size, _flags is the first field in the struct. So the whole struct instance
> has to be misaligned for some reason? Is the minfo section aligned properly?

It is: both minfo sections on libgdruntime.so and libgphobos.so are:

libdruntime/.libs/libgdruntime.so:


Section Header[28]:  sh_name: minfo
    sh_addr:      0x17b834        sh_flags:   [ SHF_WRITE SHF_ALLOC ]
    sh_size:      0x344           sh_type:    [ SHT_PROGBITS ]
    sh_offset:    0x16b834        sh_entsize: 0
    sh_link:      0               sh_info:    0
    sh_addralign: 0x4       

src/.libs/libgphobos.so:


Section Header[28]:  sh_name: minfo
    sh_addr:      0x6ff014        sh_flags:   [ SHF_WRITE SHF_ALLOC ]
    sh_size:      0x224           sh_type:    [ SHT_PROGBITS ]
    sh_offset:    0x6ef014        sh_entsize: 0
    sh_link:      0               sh_info:    0
    sh_addralign: 0x4       

And looking at a statically linked test program (gdc283.exe), I see

Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0x0007c970 in object.ModuleInfo.flags() const (this=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/object.d:1541
1541        @property uint flags() nothrow pure @nogc { return _flags; }
(gdb) p this
$1 = (const object.ModuleInfo &) @0x12ab33: {_flags = 4100, _index = 0}
(gdb) up
#1  0x0007d118 in object.ModuleInfo.importedModules() const (this=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/object.d:1580
1580            if (flags & MIimportedModules)
(gdb) up
#2  0x0008ed74 in rt.minfo.ModuleGroup.sortCtors(immutable(char)[]) (this=..., 
    cycleHandling=...)
    at /vol/gcc/src/hg/trunk/solaris/libphobos/libdruntime/rt/minfo.d:259
259                     foreach (imp; m.importedModules)
(gdb) p this
$2 = (rt.minfo.ModuleGroup &) @0x12f228: {_modules = {
      0x12932c <ModuleInfo for gdc283>, 
      0x1297ac <ModuleInfo for core.exception>, 
      0x129acc <ModuleInfo for gcc.deh>, 
      0x129ae4 <ModuleInfo for gcc.unwind.pe>, 
      0x12a99c <ModuleInfo for object>, 0x12aafc <ModuleInfo for rt.aaA>, 
      0x12ab33 <ModuleInfo for rt.adi>, 0x12ab42 <ModuleInfo for rt.arraycat>, 
      0x12ab56 <ModuleInfo for rt.cast_>, 0x12ab67 <ModuleInfo for rt.deh>, 
      0x12ab77 <ModuleInfo for rt.dmain2>, 
      0x12ab89 <ModuleInfo for invariant>, 

i.e. everything starts off alright, but goes astray from 0x12ab33
<ModuleInfo for rt.adi> onwards.
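
Checking the addresses from the gdb output mechanically gives the same picture. The snippet is only a sketch; `misaligned` is a hypothetical helper and the list is copied from the `_modules` dump above.

```d
import std.stdio : writefln;

// ModuleInfo addresses as printed by gdb in the session above.
immutable size_t[] moduleInfoAddrs = [
    0x12932c, 0x1297ac, 0x129acc, 0x129ae4, 0x12a99c, 0x12aafc,
    0x12ab33, 0x12ab42, 0x12ab56, 0x12ab67, 0x12ab77, 0x12ab89,
];

/// Hypothetical helper: the addresses that are not a multiple of `alignment`.
size_t[] misaligned(const size_t[] addrs, size_t alignment) pure nothrow @safe
{
    size_t[] result;
    foreach (a; addrs)
        if (a % alignment != 0)
            result ~= a;
    return result;
}

void main()
{
    // _flags is a uint, so each record has to start on a 4-byte boundary.
    foreach (a; misaligned(moduleInfoAddrs, uint.sizeof))
        writefln("0x%x is off by %s byte(s)", a, a % uint.sizeof);
}
```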
Comment 12 Iain Buclaw 2018-12-16 21:57:48 UTC
(In reply to Johannes Pfau from comment #10)
> I guess the proper fix to the alignment problem is using
> 'https://dlang.org/phobos/std_traits.html#classInstanceAlignment' (or rather
> the druntime equivalent) instead of Mutex.alignof + the rounding / slice
> assignment fixes?
> 
> Regarding the ModuleInfo problem: Although ModuleInfo does have a variable
> size, _flags is the first field in the struct. So the whole struct instance
> has to be misaligned for some reason? Is the minfo section aligned properly?

ModuleInfo is forced to an alignment of 1.  It should instead be aligned to `max(uint.sizeof, size_t.sizeof)` so that both the named and the variable data parts can be read without alignment problems.
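
As a rough illustration of the effect, using the addresses from comment 11: the record for rt.aaA starts at 0x12aafc and the next one at 0x12ab33, so the packed record takes 55 bytes. The sketch below shows where the next record would land if records were padded to a 4-byte boundary, assuming nothing else in the layout changes. `alignUp` and `minfoAlign` are illustrative names, not the compiler's.

```d
import std.algorithm.comparison : max;
import std.stdio : writefln;

/// Hypothetical helper: round `n` up to the next multiple of `alignment`
/// (which must be a power of two).
size_t alignUp(size_t n, size_t alignment) pure nothrow @nogc @safe
{
    return (n + alignment - 1) & ~(alignment - 1);
}

// The alignment suggested above; its value depends on the target.
enum size_t minfoAlign = max(uint.sizeof, size_t.sizeof);

// Distance between two consecutive records in the gdb dump.
enum size_t packedSize = 0x12ab33 - 0x12aafc;
static assert(packedSize == 55);

// Padded to 4 bytes, the following record would start one byte later.
static assert(alignUp(0x12ab33, 4) == 0x12ab34);

void main()
{
    writefln("alignment on this target: %s", minfoAlign);
    writefln("55-byte record padded to 4 bytes: %s", alignUp(packedSize, 4));
}
```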
Comment 13 Iain Buclaw 2019-03-31 14:35:13 UTC
Author: ibuclaw
Date: Sun Mar 31 14:34:41 2019
New Revision: 270043

URL: https://gcc.gnu.org/viewcvs?rev=270043&root=gcc&view=rev
Log:
d: Fix run-time SIGSEGV reading ModuleInfo.flags()

The current forced alignment is not necessary, and is problematic on
targets that have strict alignment rules.

gcc/d/ChangeLog:

2019-03-31  Iain Buclaw  <ibuclaw@gdcproject.org>

	PR d/88462
	* modules.cc (layout_moduleinfo_fields): Properly align ModuleInfo,
	instead of forcing alignment to be 1.

Modified:
    trunk/gcc/d/ChangeLog
    trunk/gcc/d/modules.cc
Comment 14 Iain Buclaw 2019-04-01 14:44:36 UTC
Author: ibuclaw
Date: Mon Apr  1 14:44:04 2019
New Revision: 270057

URL: https://gcc.gnu.org/viewcvs?rev=270057&root=gcc&view=rev
Log:
    PR d/88462
libphobos: Fix abort in pthread_mutex_init on Solaris.

Merges upstream druntime d57fa1ff.

Reviewed-on: https://github.com/dlang/druntime/pull/2534

Modified:
    trunk/libphobos/libdruntime/MERGE
    trunk/libphobos/libdruntime/core/internal/traits.d
    trunk/libphobos/libdruntime/core/thread.d
Comment 15 Iain Buclaw 2019-04-01 14:48:36 UTC
Commits r270043 and r270057 deal with the immediate problems here.  I think the other problems raised in PR 89255 should be handled on a case-by-case basis, which makes it easier to keep track of each failing test.
Comment 16 ro@CeBiTec.Uni-Bielefeld.DE 2019-04-01 15:16:58 UTC
> --- Comment #15 from Iain Buclaw <ibuclaw at gdcproject dot org> ---
> Commits r270043 and r270057 deal with the immediate problems here.  I think
> the other problems raised in PR 89255 should be handled on a case-by-case
> basis, which makes it easier to keep track of each failing test.

Absolutely.  I had a workaround for the second commit in my tree already
and tried a sparc-sun-solaris2.11 bootstrap with the first last night.
The execution tests get along far further now, but many (all of them?)
are spinning in repeated calls to nanosleep:

nanosleep(0xFFBFE140, 0xFFBFE148)               = 0
        tmout: 0.001000000 sec
        resid: 0.000000000 sec

pstack shows

6706:   gdc94/link11069a.exe
 fe78fe58 nanosleep (ffbfe140, ffbfe148)
 fefac72c core.thread.Thread.sleep(core.time.Duration) (ffbfe1c0, 1, ffbfe348, 0, ffbfe148, ffbfe140) + bc
 fefa26b0 core.internal.spinlock.SpinLock.lock() shared (ff0582c0, 1, ffbfe1c0, 299163c, 0, 4) + 78
 ff01270c ???????? (23ad0, ffbfe2a0, ffbfe2a4, ffbfe2ac, ffbfe2a8, ff0582c0) + 50
 ff00e07c _DT8_D2gc4impl12conservative2gc14ConservativeGC6mallocMFNbkkxC8TypeInfoZPv (0, 28, 0, ff052208, 0, 0) + 3c
 ff019560 gc_malloc (28, 0, ff052208, ff06b790, fef7fc7c, ffbfe398) + 30
 fefd0fd0 _d_newclass (ff052208, fef1c964, ffbfe418, ff06b790, fef7fc34, ff052208) + 10c
 fefa0818 onAssertErrorMsg (ffbfe4a0, 37c, ffbfe498, ff06b790, fef80444, ff1de314) + 68
 fefa0fa4 _d_assert_msg (ffbfe518, ffbfe510, 37c, 8, fef81410, ff1de314) + 2c
 fefae600 core.thread.suspend(core.thread.Thread) (0, ff05ac00, 4c4b3f, ff05abd0, ffbfe540, 1) + 374
 ff00cc00 gc.impl.conservative.gc.Gcx.fullcollect(bool) (24118, 1, 24158, 8, ff05a38c, 24118) + 4c
 ff00d0ec gc.impl.conservative.gc.ConservativeGC.fullCollectNoStack() (23ad0, fef18f14, fefd09fc, ff06b790, fef80b04, 0) + 60
 ff002e74 _DT8_D2gc4impl12conservative2gc14ConservativeGC14collectNoStackMFNbZv (23ad8, 0, fe7e6a80, 0, 0, ffffffff) + 18
 ff019380 gc_term  (ff05ae1c, 1, ff05abe0, ff05abc4, ff05abd0, 0) + 28
 fefd09fc rt_term  (1, ffbfe828, 4, ffbfeb90, ffbfe7a8, ffbfe95c) + 68
 fefd0ab4 rt.dmain2._d_run_main(int, char**, extern(C) int(char[][]) function).runAll() (ffbfe95c, 6e6b, 14, ffbfe878, 0, ffbfeb7e) + 28 (dmain2.d:489)
 fefd05d0 rt.dmain2._d_run_main(int, char**, extern(C) int(char[][]) function).tryExec(scope void() delegate) (ffbfe95c, ffbfe930, ffbfe928, 4, 14, ffbfe95c) + 1c (dmain2.d:460)
 fefd07e8 _d_run_main (1, ffbfe914, 1, ffbfe920, 14, 1) + 1c4
 00012844 main     (1, ffbfea34, ffbfea3c, 0, 0, 12ed0) + 1c
 00012634 _start   (0, 0, 0, 0, 0, 0) + 5c

Once I'd recompiled libphobos at -g3 -O0, the problem vanished, though.
I'll look closer and report my findings separately.
Comment 17 ro@CeBiTec.Uni-Bielefeld.DE 2019-04-01 21:14:30 UTC
> --- Comment #16 from ro at CeBiTec dot Uni-Bielefeld.DE <ro at CeBiTec dot
> Uni-Bielefeld.DE> ---
[...]
> Once I'd recompiled libphobos at -g3 -O0, the problem vanished, though.
> I'll look closer and report my findings separately.

Before going to bed, here are the gdc testresults for the -g3 -O0
libphobos (32 and 64-bit combined):

                === gdc Summary ===

# of expected passes            55351
# of unexpected failures        3109
# of unresolved testcases       256
# of unsupported tests          40

and for libphobos it's

                === libphobos Summary ===

# of expected passes            128
# of unexpected failures        53

I still had to kill off a couple of tests that were looping in calls to
nanosleep.  Will check in more detail later.