Created attachment 49302 [details] A few simple source files demonstrating the problem, plus a .ii file generated on my system Hello, I have recently been diagnosing a fault with the 32-bit build of an application, and this has led me to find some strange behaviour with unordered_set, when compiling for 32-bit x86 (i686). I have included an example which demonstrates the behaviour. Consider the following small program: #include <unordered_set> #include <cstdint> uint64_t getBegin(const std::unordered_set<uint64_t> & set) { return *set.begin(); } It returns the first value in a uint64_t unordered_set. This can be compiled into a shared library with the following command: g++ -Wall -Wextra -fpic -shared -Os Working.cpp -o libworking.so This works as intended on all 32 and 64 bit systems which I have tested it on, and on all optimisation levels. However, consider the next small program: #include <vector> #include <unordered_set> #include <cstdint> uint64_t getBegin(const std::unordered_set<uint64_t> & set) { return *set.begin(); } The only change is the addition of a "#include <vector>". This ought not to have any impact on the program, and it ought to work exactly the same. It can also be compiled into a shared library with the following command: g++ -Wall -Wextra -fpic -shared -Os Broken.cpp -o libbroken.so Indeed, when compiled on a 64 bit x86 system there is no difference - both functions work as intended. However, when compiled either natively on a 32-bit x86 system, or cross compiled to 32-bit using the "-m32" flag, "*set.begin()" returns an incorrect value. I tested both of these with the following simple test program: #include <unordered_set> #include <iostream> uint64_t getBegin(const std::unordered_set<uint64_t> & set); int main(void) { std::unordered_set<uint64_t> set({1, 2, 3, 4, 5, 6, 7}); std::cout << getBegin(set) << "\n"; } Which can be compiled using the following commands (linking to either working or broken versions respectively): g++ -Wall -Wextra -Os test.cpp -L. -lworking -o test g++ -Wall -Wextra -Os test.cpp -L. -lbroken -o test Both versions work correctly when compiled and running natively on a 64-bit x86 system, (they print "7"). But, when when either compiled with "-m32" on a 64-bit system, or compiled and running natively on a 32-bit x86 system, the second (broken) variant prints "30064771072". This is certainly not one of the numbers in the unordered_set. I have reproduced the problem across several different 32-bit and 64-bit systems. These are the versions reported by g++ --version, on the systems which I have tried: Debian 10 (stable): g++ (Debian 8.3.0-6) 8.3.0 Debian 11 (testing): g++ (Debian 10.2.0-9) 10.2.0 Fedora 32 (Workstation Edition): g++ (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1) Some source files and a simple makefile, which can reproduce the problem, are attached. I have also included the .ii file which is produced when running the following command, on the broken variant of the function: g++ -save-temps -m32 -Wall -Wextra -fpic -shared -Os Broken.cpp -o libbroken.so Other things which I have noticed: > The problem happens with optimisation levels O1, O2, O3, Os and Ofast, but does not occur with O0. > The problem occurs whenever vector is included BEFORE unordered_set. If vector is included after unordered_set, behaviour is correct. > When broken, the memory address pointed to by the iterator returned by begin(), seems to always be wrong by 4 bytes, (4 bytes before the address which it should be). > When I compared the assembly output of the two functions, the offsets in three movl instructions differ - see below. I assume this is causing the wrong address to be returned. Working version: movl 8(%eax), %eax # MEM[(struct _Hash_node_base * *)set_2(D) + 8B], MEM[(struct _Hash_node_base * *)set_2(D) + 8B] movl 12(%eax), %edx # MEM[(const long long unsigned int &)_4 + 8], MEM[(const long long unsigned int &)_4 + 8] movl 8(%eax), %eax # MEM[(const long long unsigned int &)_4 + 8], MEM[(const long long unsigned int &)_4 + 8] Broken version: movl 8(%eax), %eax # MEM[(struct _Hash_node_base * *)set_2(D) + 8B], MEM[(struct _Hash_node_base * *)set_2(D) + 8B] movl 8(%eax), %edx # MEM[(const long long unsigned int &)_4 + 4], MEM[(const long long unsigned int &)_4 + 4] movl 4(%eax), %eax # MEM[(const long long unsigned int &)_4 + 4], MEM[(const long long unsigned int &)_4 + 4] If there is any more information about this problem which you require, then please let me know. Thank you very much, Dan Wilson.
I've been trying to reduce this a bit. So far I've cut it down to: #include <bits/stl_algobase.h> #include <bits/allocator.h> #include <bits/stl_construct.h> #include <bits/stl_uninitialized.h> #include <bits/stl_vector.h> #include <unordered_set> #include <cstdint> uint64_t getBegin(const std::unordered_set<uint64_t> & set) { return *set.begin(); } which is less than everything in <vector> and still generates the bad code: _3 = MEM[(const long long unsigned int &)_4 + 4]; The problem appeared between gcc 7 and gcc 8, apparently due to something added in the libstdc++ headers. But I'm not sure yet if the actual bug is in the compiler or library.
It looks like the bug is that cp_tree_equal considers __alignof__(FOO) the same as alignof(FOO), but they have different semantics ever since the fix for PR69650. This manifests here because when <vector> is included before <unordered_set>, the specialization cache conflates the dependent specialization aligned_storage<sizeof(_Tp), alignof(_Tp)> used in include/bits/stl_vector.h:1726 with the later dependent specialization aligned_storage<sizeof(_Tp), __alignof__(_Tp)> used in include/ext/aligned_buffer.h:91. We later instantiate the latter type with _Tp=uint64_t as part of the implementation of unordered_set<uint64_t>, but 4 == alignof(uint64_t) != __alignof__(uint64_t) == 8 on x86 so our assumptions about the alignment of the type become incoherent. I think PR88115 reports the same issue. The following patch to cp_tree_equal seems to fix it: gcc/cp/tree.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c index 8b7c6798ee9..8ed9eed1ea5 100644 --- a/gcc/cp/tree.c +++ b/gcc/cp/tree.c @@ -3787,8 +3787,11 @@ cp_tree_equal (tree t1, tree t2) return true; } - case SIZEOF_EXPR: case ALIGNOF_EXPR: + if (ALIGNOF_EXPR_STD_P (t1) != ALIGNOF_EXPR_STD_P (t2)) + return false; + /* Fall through. */ + case SIZEOF_EXPR: { tree o1 = TREE_OPERAND (t1, 0); tree o2 = TREE_OPERAND (t2, 0);
Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555465.html
The master branch has been updated by Patrick Palka <ppalka@gcc.gnu.org>: https://gcc.gnu.org/g:592fe221735bdaa375b1834dd49ce125d0b600d8 commit r11-3704-g592fe221735bdaa375b1834dd49ce125d0b600d8 Author: Patrick Palka <ppalka@redhat.com> Date: Wed Oct 7 10:49:00 2020 -0400 c++: Distinguish alignof and __alignof__ in cp_tree_equal [PR97273] cp_tree_equal currently considers alignof the same as __alignof__, but these operators are semantically different ever since r8-7957. In the testcase below, this causes the second static_assert to fail on targets where alignof(double) != __alignof__(double) because the specialization table (which uses cp_tree_equal as its equality predicate) conflates the two dependent specializations integral_constant<__alignof__(T)> and integral_constant<alignof(T)>. This patch makes cp_tree_equal distinguish between these two operators by inspecting the ALIGNOF_EXPR_STD_P flag. gcc/cp/ChangeLog: PR c++/88115 PR libstdc++/97273 * tree.c (cp_tree_equal) <case ALIGNOF_EXPR>: Return false if ALIGNOF_EXPR_STD_P differ. gcc/testsuite/ChangeLog: PR c++/88115 PR libstdc++/97273 * g++.dg/template/alignof3.C: New test.
The releases/gcc-10 branch has been updated by Patrick Palka <ppalka@gcc.gnu.org>: https://gcc.gnu.org/g:aeb69dda51e93379fce10fb03ec57650fbf73f31 commit r10-8872-gaeb69dda51e93379fce10fb03ec57650fbf73f31 Author: Patrick Palka <ppalka@redhat.com> Date: Thu Oct 8 19:31:57 2020 -0400 c++: Distinguish alignof and __alignof__ in cp_tree_equal [PR97273] cp_tree_equal currently considers alignof the same as __alignof__, but these operators are semantically different ever since r8-7957. In the testcase below, this causes the second static_assert to fail on targets where alignof(double) != __alignof__(double) because the specialization table (which uses cp_tree_equal as its equality predicate) conflates the two dependent specializations integral_constant<__alignof__(T)> and integral_constant<alignof(T)>. This patch makes cp_tree_equal distinguish between these two operators by inspecting the ALIGNOF_EXPR_STD_P flag. gcc/cp/ChangeLog: PR c++/88115 PR libstdc++/97273 * tree.c (cp_tree_equal) <case ALIGNOF_EXPR>: Return false if ALIGNOF_EXPR_STD_P differ. gcc/testsuite/ChangeLog: PR c++/88115 PR libstdc++/97273 * g++.dg/template/alignof3.C: New test. (cherry picked from commit 592fe221735bdaa375b1834dd49ce125d0b600d8)
Fixed for GCC 11 and 10.3 so far.
The releases/gcc-9 branch has been updated by Patrick Palka <ppalka@gcc.gnu.org>: https://gcc.gnu.org/g:9df05884b3a30d32744a070d3fc5780b7323231a commit r9-9043-g9df05884b3a30d32744a070d3fc5780b7323231a Author: Patrick Palka <ppalka@redhat.com> Date: Wed Oct 7 10:49:00 2020 -0400 c++: Distinguish alignof and __alignof__ in cp_tree_equal [PR97273] cp_tree_equal currently considers alignof the same as __alignof__, but these operators are semantically different ever since r8-7957. In the testcase below, this causes the second static_assert to fail on targets where alignof(double) != __alignof__(double) because the specialization table (which uses cp_tree_equal as its equality predicate) conflates the two dependent specializations integral_constant<__alignof__(T)> and integral_constant<alignof(T)>. This patch makes cp_tree_equal distinguish between these two operators by inspecting the ALIGNOF_EXPR_STD_P flag. gcc/cp/ChangeLog: PR c++/88115 PR libstdc++/97273 * tree.c (cp_tree_equal) <case ALIGNOF_EXPR>: Return false if ALIGNOF_EXPR_STD_P differ. gcc/testsuite/ChangeLog: PR c++/88115 PR libstdc++/97273 * g++.dg/template/alignof3.C: New test. (cherry picked from commit 592fe221735bdaa375b1834dd49ce125d0b600d8)
The releases/gcc-8 branch has been updated by Patrick Palka <ppalka@gcc.gnu.org>: https://gcc.gnu.org/g:a34c19926b87f9cadd5ea84f5c5b93ae76b14558 commit r8-10855-ga34c19926b87f9cadd5ea84f5c5b93ae76b14558 Author: Patrick Palka <ppalka@redhat.com> Date: Wed Oct 7 10:49:00 2020 -0400 c++: Distinguish alignof and __alignof__ in cp_tree_equal [PR97273] cp_tree_equal currently considers alignof the same as __alignof__, but these operators are semantically different ever since r8-7957. In the testcase below, this causes the second static_assert to fail on targets where alignof(double) != __alignof__(double) because the specialization table (which uses cp_tree_equal as its equality predicate) conflates the two dependent specializations integral_constant<__alignof__(T)> and integral_constant<alignof(T)>. This patch makes cp_tree_equal distinguish between these two operators by inspecting the ALIGNOF_EXPR_STD_P flag. gcc/cp/ChangeLog: PR c++/88115 PR libstdc++/97273 * tree.c (cp_tree_equal) <case ALIGNOF_EXPR>: Return false if ALIGNOF_EXPR_STD_P differ. gcc/testsuite/ChangeLog: PR c++/88115 PR libstdc++/97273 * g++.dg/template/alignof3.C: New test. (cherry picked from commit 592fe221735bdaa375b1834dd49ce125d0b600d8)
Fixed.