Bug 101407 - non-determinism in -fdump-go-spec
Summary: non-determinism in -fdump-go-spec
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: go
Version: 11.1.0
Importance: P3 normal
Target Milestone: ---
Assignee: Jakub Jelinek
Reported: 2021-07-10 21:34 UTC by Toolybird
Modified: 2022-05-11 06:36 UTC (History)

Last reconfirmed: 2021-07-12 00:00:00


Attachments
gcc12-pr101407.patch (468 bytes, patch)
2021-07-12 13:09 UTC, Jakub Jelinek

Description Toolybird 2021-07-10 21:34:29 UTC
I'm seeing a reproducibility issue when building gcc-go in GCC 11. Every build run produces a different binary.

I've narrowed down the test case to essentially this:

$ gcc -fdump-go-spec=tmp-1.go -S -o sysinfo.s sysinfo.c
$ gcc -fdump-go-spec=tmp-2.go -S -o sysinfo.s sysinfo.c

$ diff -u tmp-1.go tmp-2.go
--- tmp-1.go	2021-07-11 07:26:58.512916883 +1000
+++ tmp-2.go	2021-07-11 07:27:07.976340655 +1000
@@ -8519,10 +8519,10 @@
 const _PRIxFAST8 = "x"
 const ___POSIX_FADV_DONTNEED = 4
 const _IPPROTO_MTP = 92
-type ___dirstream struct {}
-type ___va_list_tag struct {}
 type _iface struct {}
 type ___locale_data struct {}
 type __IO_marker struct {}
 type __IO_codecvt struct {}
 type __IO_wide_data struct {}
+type ___dirstream struct {}
+type ___va_list_tag struct {}

The file in question is libgo/sysinfo.c

The output seems to differ each time. Occasionally it is the same.

This didn't happen with GCC 10. I suppose I should try the trunk but haven't done so yet.

Any thoughts? Thanks.
Comment 1 Toolybird 2021-07-10 23:45:36 UTC
The bug is present on trunk. Will try to bisect...
Comment 2 Toolybird 2021-07-11 04:49:44 UTC
> Will try to bisect

Well, that was a complete waste of time. There seems to be an element of randomness to the problem. It turns out that GCC 10 is also affected, as I was able to trigger it all the way back to

basepoints/gcc-10

I give up for the moment. Hopefully someone who cares about binary reproducibility sees this, is able to replicate and fix.
Comment 3 Jakub Jelinek 2021-07-12 13:09:38 UTC
Created attachment 51140 [details]
gcc12-pr101407.patch

I can reproduce it.  The problem is that hash_set for pointers by default uses
ptr_hash, which hashes the pointer values rather than the strings they point to,
and that hash_set is then traversed and type lines are emitted during that
traversal.  So, with address space randomization the strings hash differently
between different runs.  This patch changes it to hash the strings instead, and
that should be reproducible.
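
To illustrate the mechanism, here is a minimal standalone C++ sketch. It is not GCC code: hash_by_address, hash_by_contents and emit_order are hypothetical names, FNV-1a is just a convenient stand-in for a string hash, and sorting by hash value is only a simplified model of walking hash table slots in order.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Assumed stand-in for a pointer hash: the address with the low 3 bits dropped.
// The result depends on where the string happens to live in memory.
inline std::uintptr_t hash_by_address(const char *s) {
  return reinterpret_cast<std::uintptr_t>(s) >> 3;
}

// FNV-1a over the bytes of the string (illustrative choice only).
// The result depends only on the characters, not on the address.
inline std::uint64_t hash_by_contents(const char *s) {
  std::uint64_t h = 14695981039346656037ull;
  for (; *s != '\0'; ++s) {
    h ^= static_cast<unsigned char>(*s);
    h *= 1099511628211ull;
  }
  return h;
}

// Simplified model of "traverse the table and emit a line per entry":
// entries come out ordered by their hash value.
template <typename Hash>
std::vector<std::string> emit_order(const std::vector<const char *> &names,
                                    Hash hash) {
  std::multimap<std::uint64_t, std::string> slots;
  for (const char *name : names) {
    assert(name != nullptr);
    slots.emplace(static_cast<std::uint64_t>(hash(name)), name);
  }
  std::vector<std::string> out;
  for (const auto &slot : slots)
    out.push_back(slot.second);
  return out;
}
```

Calling emit_order with hash_by_contents on two sets of copies of the same names gives the same order no matter where the copies are stored. With hash_by_address the order follows the addresses, so it can change from run to run once the addresses are randomized.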
Comment 4 Toolybird 2021-07-13 00:49:41 UTC
Your patch solves the problem for me. Thank you very much!
Comment 5 CVS Commits 2021-07-14 08:23:46 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:3be762c2ed79e36b9c8faaea2be04725c967a34e

commit r12-2293-g3be762c2ed79e36b9c8faaea2be04725c967a34e
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Jul 14 10:22:50 2021 +0200

    godump: Fix -fdump-go-spec= reproduceability issue [PR101407]
    
    pot_dummy_types is a hash_set from whose traversal the code prints some type
    lines.  hash_set normally uses default_hash_traits which for pointer types
    (the hash set hashes const char *) uses pointer_hash which hashes the
    addresses of the pointers except for the least significant 3 bits.
    With address space randomization, that results in non-determinism in the
    -fdump-go-spec= generated file, each invocation can have a different order of
    the lines emitted from pot_dummy_types traversal.
    
    This patch fixes it by hashing the string contents instead to make the
    hashes reproducible.
    
    2021-07-14  Jakub Jelinek  <jakub@redhat.com>
    
            PR go/101407
            * godump.c (godump_str_hash): New type.
            (godump_container::pot_dummy_types): Use string_hash instead of
            ptr_hash in the hash_set.
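
For readers unfamiliar with the idea, a rough analogue with the C++ standard library may help. This is only a sketch under assumptions: std::unordered_set stands in for GCC's hash_set, and CStrHash, CStrEqual, NameSet and add_name are hypothetical names that do not come from godump.c. The point it shows is that once a set of const char * is hashed by contents, equality has to compare contents as well, otherwise two copies of the same name would still count as different entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_set>

// Hash the characters of a C string with FNV-1a (illustrative choice),
// so the hash does not depend on the string's address.
struct CStrHash {
  std::size_t operator()(const char *s) const {
    std::uint64_t h = 14695981039346656037ull;
    for (; *s != '\0'; ++s) {
      h ^= static_cast<unsigned char>(*s);
      h *= 1099511628211ull;
    }
    return static_cast<std::size_t>(h);
  }
};

// Compare by contents, matching the content-based hash above.
struct CStrEqual {
  bool operator()(const char *a, const char *b) const {
    return std::strcmp(a, b) == 0;
  }
};

// A set of names keyed by string contents rather than pointer identity.
using NameSet = std::unordered_set<const char *, CStrHash, CStrEqual>;

// Returns true if the name was newly inserted, false if an equal
// string (possibly at another address) was already present.
inline bool add_name(NameSet &set, const char *name) {
  assert(name != nullptr);
  return set.insert(name).second;
}
```

With these traits, inserting two separately stored copies of "___dirstream" leaves a single entry, and a lookup with any pointer to an equal string finds it.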
Comment 6 CVS Commits 2021-07-18 23:29:13 UTC
The releases/gcc-11 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:31b76a815fc177dd579adc03b671ba9a8846ae6c

commit r11-8771-g31b76a815fc177dd579adc03b671ba9a8846ae6c
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Jul 14 10:22:50 2021 +0200

    godump: Fix -fdump-go-spec= reproduceability issue [PR101407]
    
    (cherry picked from commit 3be762c2ed79e36b9c8faaea2be04725c967a34e)
Comment 7 CVS Commits 2022-05-10 08:19:42 UTC
The releases/gcc-10 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:2c7087f46bb8c3f698cc475ece3786582bd34da0

commit r10-10631-g2c7087f46bb8c3f698cc475ece3786582bd34da0
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Jul 14 10:22:50 2021 +0200

    godump: Fix -fdump-go-spec= reproduceability issue [PR101407]
    
    (cherry picked from commit 3be762c2ed79e36b9c8faaea2be04725c967a34e)
Comment 8 CVS Commits 2022-05-11 06:21:20 UTC
The releases/gcc-9 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:eb253e4148ba1a79789c623a062d43a126ec4c31

commit r9-10088-geb253e4148ba1a79789c623a062d43a126ec4c31
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Jul 14 10:22:50 2021 +0200

    godump: Fix -fdump-go-spec= reproduceability issue [PR101407]
    
    (cherry picked from commit 3be762c2ed79e36b9c8faaea2be04725c967a34e)
Comment 9 Jakub Jelinek 2022-05-11 06:36:43 UTC
Fixed.