Bug 98731 - s390x: Large classes of std::bitset and std::vector<bool> hash the same
Summary: s390x: Large classes of std::bitset and std::vector<bool> hash the same
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: libstdc++ (show other bugs)
Version: 11.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: ABI
: 102531 (view as bug list)
Depends on:
Blocks:
 
Reported: 2021-01-18 14:41 UTC by Matthias Klose
Modified: 2021-09-29 18:15 UTC (History)
5 users (show)

See Also:
Host:
Target: s390x-linux-gnu, powerpc-*-*, powerpc64-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2021-01-19 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Matthias Klose 2021-01-18 14:41:46 UTC
[forwarded from https://bugs.debian.org/977638]

same behavior with GCC 8, 9 and 10.

On s390x, std::hash returns identical values for large classes of
std::bitset and std::vector<bool>:

    $ cat bug.cc
    #include <bitset>
    #include <functional>
    #include <iostream>
    #include <vector>

    int main() {
      std::bitset<2> a("00"), b("01");
      std::vector<bool> c = {false, true, false, true};
      std::vector<bool> d = {true, false, true, false};

      std::bitset<9> e("000000000"), f("010101010");
      std::vector<bool> g = {true, true, true, true, true, true, true, true, true};
      std::vector<bool> h = {false, false, false, true, true,
                             false, false, false, false};

      std::hash<std::bitset<2>> h1;
      std::hash<std::bitset<9>> h2;
      std::hash<std::vector<bool>> h3;

      std::cout << h1(a) << '\n'
                << h1(b) << '\n'
                << h3(c) << '\n'
                << h3(d) << "\n\n"
                << h2(e) << '\n'
                << h2(f) << '\n'
                << h3(g) << '\n'
                << h3(h) << '\n';
    }
    $ g++ -o bug bug.cc
    $ ./bug
    7857072875483051545
    7857072875483051545
    7857072875483051545
    7857072875483051545

    4158372090644325695
    4158372090644325695
    4158372090644325695
    4158372090644325695

It appears that the hash value is completely dependent on the size of
the object in bytes. 1–8-bit values all hash to 7857072875483051545;
9–16-bit values all hash to 4158372090644325695; and though bug.cc
doesn’t demonstrate it, 17-bit values hash to 14756137038141193723.
Comment 1 Richard Biener 2021-01-19 07:43:58 UTC
I bet it's the default specialization.
Comment 2 Jonathan Wakely 2021-01-19 10:00:13 UTC
I think we just hash the bytes of the storage, but assume the first bytes are populated. For a bitset<2> we have an unsigned long long where only the two least significant bits are used. For LE they're in the first byte, but for BE they're in the last byte.
Comment 3 Jonathan Wakely 2021-01-19 18:02:50 UTC
Yes the problem is that a bitset<2> uses the two least significant bits of an unsigned long, so we want to hash a single byte. But we take the address of the unsigned long and then hash the first byte. For BE we need to hash the last byte.

With a bitset of more than 64 bits we have a layout like this:

[76543210][...43210]

We assume the value bytes are contiguous and just hash N bytes from the address of the first long, and then the next 5 bytes. We should be hashing the last 5 bytes of the last word instead.

I haven't looked at hash<vector<bool>> but it's probably the same bug.
Comment 4 Jonathan Wakely 2021-01-21 11:35:55 UTC
(In reply to Matthias Klose from comment #0)
> It appears that the hash value is completely dependent on the size of
> the object in bytes.

It's not *completely* dependent on the size. Only the last x.size()%64 bits are hashed incorrectly. For x.size() < 32 it's completely dependent, because we never look at the right bits. For larger numbers of bits we look at *some* of them.

Fixing it is an ABI break though, because it would mean that the same value produces a different hash when compiled with an old GCC or a fixed GCC. Inserting an element into an unordered_map in one TU and then looking it up in another TU could fail.
Comment 5 Jonathan Wakely 2021-09-29 18:15:07 UTC
*** Bug 102531 has been marked as a duplicate of this bug. ***