[Bug libstdc++/102531] New: std::hash does not work correctly on Big Endian platforms

miladfarca at gmail dot com gcc-bugzilla@gcc.gnu.org
Wed Sep 29 16:44:05 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102531

            Bug ID: 102531
           Summary: std::hash does not work correctly on Big Endian
                    platforms
           Product: gcc
           Version: 8.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: miladfarca at gmail dot com
  Target Milestone: ---

std::hash does not work correctly on BE machines using the following key types:
- std::hash<std::bitset>
- std::hash<std::vector<bool>>

Both break the 5th rule listed at
https://en.cppreference.com/w/cpp/utility/hash
because a large number of distinct inputs end up with the same hash.

### std::bitset
```
#include <iostream>
#include <functional>
#include <bitset>

int main(){
  std::bitset<2> a(0b01);
  std::bitset<2> b(0b10);

  std::size_t h1 = std::hash<std::bitset<2>>{}(a);
  std::size_t h2 = std::hash<std::bitset<2>>{}(b);

  std::cout << h1 << std::endl;
  std::cout << h2 << std::endl;
  return 0;
}
```
Both lines of output are the same on BE machines.
The length of the input passed to the hash is calculated incorrectly here:
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/bitset#L1575
`bitset` writes zero-extended, size_t-sized values into memory. Reading fewer
bytes than that from the start of a value might return only 0s on BE, because
the significant bytes sit at the end of the value and bytes are not reversed.

### std::vector<bool>
```
#include <iostream>
#include <vector>
#include <functional>

int main(){

  std::vector<bool> a = {static_cast<bool>(1), static_cast<bool>(2)};
  std::vector<bool> b = {static_cast<bool>(1), static_cast<bool>(2),
                         static_cast<bool>(3)};

  std::size_t h1 = std::hash<std::vector<bool>>{}(a);
  std::size_t h2 = std::hash<std::vector<bool>>{}(b);

  std::cout << h1 << std::endl;
  std::cout << h2 << std::endl;

  return 0;
}
```
This is a similar issue to bitset: vector<bool> also writes size_t-sized values
into memory.
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/vector.tcc#L987
The length calculated at the above link might be smaller than the value that
holds the input, so only 0s will get returned as bytes are not reversed on BE.

