Bug 89123 - Too many go test failures on s390x-linux
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: go
Version: 9.0
Importance: P3 normal
Target Milestone: ---
Assignee: Ian Lance Taylor
URL:
Keywords:
Duplicates: 89277
Depends on:
Blocks:
 
Reported: 2019-01-30 18:29 UTC by Jakub Jelinek
Modified: 2019-02-16 01:38 UTC
CC List: 2 users

See Also:
Host:
Target: s390x-linux
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments
Sketch of patch (781 bytes, patch)
2019-02-01 19:06 UTC, Ian Lance Taylor
Tentative patch for libgo on s390x (1.54 KB, patch)
2019-02-06 16:40 UTC, rdapp

Description Jakub Jelinek 2019-01-30 18:29:06 UTC
As can be seen in 
https://kojipkgs.fedoraproject.org/packages/gcc/9.0.1/0.2.fc30/data/logs/s390x/build.log
go has way too many failed tests on s390x:
                === go Summary for unix/ ===
# of expected passes            2506
# of unexpected failures        654
# of expected failures          1
# of untested testcases         18
# of unsupported tests          3
                === libgo Summary ===
# of unexpected failures        368

Detailed failures are available by uudecoding the above build.log.
Comment 1 Jakub Jelinek 2019-01-30 19:03:58 UTC
Most of the tests don't print anything interesting into the go.log, just
Execution timeout is: 300
spawn [open ...]
FAIL: go.go-torture/execute/const-1.go execution,  -O2
and similar.
Comment 2 Jakub Jelinek 2019-01-30 19:09:02 UTC
For comparison, in pretty much the same build environment (20 days earlier) with 8.2.1 20190109 I see
                === go tests ===
Running target unix/
FAIL: ./index0-out.go execution,  -O0 -g -fno-var-tracking-assignments
FAIL: go.test/test/ken/cplx2.go execution,  -O2 -g
                === go Summary for unix/ ===
# of expected passes            7278
# of unexpected failures        2
# of expected failures          1
# of untested testcases         7
# of unsupported tests          3
                === libgo Summary ===
# of expected passes            163
# of unexpected failures        163
where the unexpected failures for libgo are with -m31 (only very few 31-bit libraries around).
Comment 3 Ian Lance Taylor 2019-01-30 21:10:07 UTC
Clearly something is badly broken, but I don't know how to find out what it is.  There is no S/390 machine on the GCC compile farm.  Added Dominik Vogt, who contributed the initial S/390 support to gccgo.
Comment 4 rdapp 2019-01-31 10:17:25 UTC
I'm going to have a look.
Comment 5 rdapp 2019-02-01 13:44:26 UTC
I performed a bisect using const-1.go as the check and got the following likely culprit:

b0751b120f1b83d8e48a7c2cac831aabbb0bc048 is the first bad commit
commit b0751b120f1b83d8e48a7c2cac831aabbb0bc048
Author: ian <ian@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Mon Sep 24 21:46:21 2018 +0000

        libgo: update to Go 1.11

        Reviewed-on: https://go-review.googlesource.com/136435

(rev. 264546)


Dominik Vogt is no longer with IBM, so I'm going to look into it.  I have no experience with Go yet, though.  Might this simply be an oversight regarding big endian?  Do we have another big-endian backend where Go works?
Comment 6 Ian Lance Taylor 2019-02-01 13:51:25 UTC
Thanks.  I could have predicted that that would be the change.  Unfortunately, that isn't very useful, because it is a huge change that brings in a large number of upstream changes from the master Go library.

While anything is possible I think it's relatively unlikely to be an endianness problem.  The Go code works on a range of different processors, including big-endian ones like SPARC.

It seems that programs are crashing fairly early in their run, so I recommend that you build a failing program and run it under the debugger to see where it crashes.  That will probably suggest something.

Or I'm willing to look at it if there is guest access available on an S/390 GNU/Linux system.
Comment 7 rdapp 2019-02-01 15:40:53 UTC
I did a full debug build of libgo and noticed that this changes the behavior of the executable.  Where it segfaulted with the default -O2 before, it now seems to rapidly allocate gigabytes of memory.

This happens in

doInit() in cpu_s390x.go:121

where we detect the CPU facilities.

Apparently the stfle call used to be in cpu_s390x.s which does not exist anymore. Hence, the

{ panic("not implemented for gccgo") }

gets triggered and we end up in

panic.go:133
if len(pp.deferpool) == 0 && sched.deferpool != nil {

pp is 0x0 here and "__go_runtime_error" tries to handle this by calling runtime_panicstring, which itself tries to defer again, and so on.

Is cpu_s390x.s missing on purpose, i.e., should it have been replaced by something else?
Comment 8 Ian Lance Taylor 2019-02-01 19:06:25 UTC
Created attachment 45590
Sketch of patch

Thanks.  That does make the problem obvious.  I've attached a sketch of what a patch should look like.  Basically, we want to call instructions like stfle and km.  As far as I can tell these are not available as GCC intrinsics, and as such will have to be invoked using __asm__.  I'm not sure quite what that would look like on S/390.  Hopefully this patch sketch will let you make some forward progress.  Let me know if I can help.
Comment 9 rdapp 2019-02-04 16:26:03 UTC
Thanks for the pointer, I implemented the functions and now the startup seems to be fully functional again.  I'm still checking whether the remaining 50ish libgo test suite failures I see are due to my changes or something else.
Comment 10 rdapp 2019-02-06 16:40:44 UTC
Created attachment 45621
Tentative patch for libgo on s390x

I didn't manage to make much progress with analyzing the remaining FAILs but I guess this can wait until after this bug. Is there an easy/preferred way to build and debug a single test case without having to manually add a plethora of dependency arguments?

Attached is a tentative patch that works for me on s390x and reduces the number of FAILs significantly.  Does it look reasonable?
Comment 11 rdapp 2019-02-15 10:48:29 UTC
Ping.
Comment 12 Ian Lance Taylor 2019-02-15 14:34:36 UTC
Sorry for the delay, will look at the patch now.

You can test a single libgo package by using make to build its /check target.  For example, to test the bytes package, cd to the libgo build directory and run "make bytes/check".
Comment 13 ian@gcc.gnu.org 2019-02-15 14:51:42 UTC
Author: ian
Date: Fri Feb 15 14:51:10 2019
New Revision: 268941

URL: https://gcc.gnu.org/viewcvs?rev=268941&root=gcc&view=rev
Log:
	PR go/89123
    internal/cpu, runtime: add S/390 CPU capability support
    
    Patch by Robin Dapp.
    
    Updates https://gcc.gnu.org/PR89123
    
    Reviewed-on: https://go-review.googlesource.com/c/162887

Modified:
    trunk/gcc/go/gofrontend/MERGE
    trunk/libgo/go/internal/cpu/cpu_gccgo.c
    trunk/libgo/go/internal/cpu/cpu_s390x.go
    trunk/libgo/go/runtime/os_linux_s390x.go
Comment 14 Ian Lance Taylor 2019-02-15 14:52:55 UTC
OK, patch committed.  Should we leave this bug report open?
Comment 15 Ian Lance Taylor 2019-02-16 01:38:09 UTC
*** Bug 89277 has been marked as a duplicate of this bug. ***