This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug libstdc++/57885] unordered_map find slower in 4.8.1 than 4.7.3 with integer key
- From: "jhand at austin dot rr.com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 15 Jul 2013 16:32:37 +0000
- Subject: [Bug libstdc++/57885] unordered_map find slower in 4.8.1 than 4.7.3 with integer key
- Auto-submitted: auto-generated
- References: <bug-57885-4 at http dot gcc dot gnu dot org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57885
--- Comment #3 from Jim Hand <jhand at austin dot rr.com> ---
As a simplification, I tried compiling the following code into assembly with
gcc-4.6.3 and gcc-4.8.1:
#include <unordered_map>
bool contains(std::unordered_map<int, int> a) {
    return a.find(5) != a.end();
}
gcc-4.6.3 generates the following assembly:
.LFB574:
.cfi_startproc
movq 8(%rdi), %rcx
movq 16(%rdi), %rdi
xorl %edx, %edx
movl $5, %eax
divq %rdi
xorl %eax, %eax
movq (%rcx,%rdi,8), %rsi
movq (%rcx,%rdx,8), %rdx
testq %rdx, %rdx
jne .L6
jmp .L2
.p2align 4,,10
.p2align 3
.L11:
movq 8(%rdx), %rdx
testq %rdx, %rdx
je .L10
.L6:
cmpl $5, (%rdx)
jne .L11
cmpq %rdx, %rsi
setne %al
.L2:
rep
ret
.p2align 4,,10
.p2align 3
.L10:
xorl %eax, %eax
ret
.cfi_endproc
gcc-4.8.1 generates the following assembly:
.LFB1323:
.cfi_startproc
movq 8(%rdi), %r8
xorl %edx, %edx
movl $5, %eax
divq %r8
movq (%rdi), %rax
movq (%rax,%rdx,8), %rax
movq %rdx, %r9
testq %rax, %rax
je .L7
movq (%rax), %rcx
movl 8(%rcx), %esi
.p2align 4,,10
.p2align 3
.L3:
cmpl $5, %esi
je .L5
movq (%rcx), %rcx
testq %rcx, %rcx
je .L7
movl 8(%rcx), %esi
xorl %edx, %edx
movslq %esi, %rax
divq %r8
cmpq %rdx, %r9
je .L3
.L7:
xorl %eax, %eax
ret
.p2align 4,,10
.p2align 3
.L5:
movl $1, %eax
ret
.cfi_endproc
In the gcc-4.8.1 code, I see two divq instructions that I think are coming from
line 345 of _Mod_range_hashing in bits/hashtable_policy.h:
343 result_type
344 operator()(first_argument_type __num, second_argument_type __den) const
345 { return __num % __den; }
I would think that the bucket index (hash modulo bucket count) only needs to
be computed once for this lookup, but I could be misinterpreting the assembly
output. It does look to me like the 4.8.1 output is doing a lot more work than
the 4.6.3 output, but I'll leave it for you to explore.
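
To make the difference between the two listings easier to discuss, here is a
rough C++ sketch of what each lookup loop appears to be doing. This is only my
reading of the assembly above, not the libstdc++ source: Node, bucket_index,
g_mod_count and both contains_* functions are hypothetical names invented for
illustration, and the 4.6.3-like version is simplified (the real code also
compares the result against an end() sentinel). The sketch assumes a non-empty
bucket array.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node type for illustration; not the libstdc++ definition.
struct Node {
    int key;
    Node* next;
};

// Counts modulo operations, standing in for the divq instructions.
static std::size_t g_mod_count = 0;

// Maps a key to a bucket, like the __num % __den in _Mod_range_hashing.
// Assumes bucket_count != 0.
static std::size_t bucket_index(int key, std::size_t bucket_count) {
    ++g_mod_count;
    return static_cast<std::size_t>(key) % bucket_count;
}

// 4.6.3-like loop: each bucket heads its own null-terminated chain, so the
// bucket index is computed once and the walk stops at the first null pointer.
bool contains_per_bucket_chain(const std::vector<Node*>& buckets, int key) {
    std::size_t n = bucket_index(key, buckets.size());
    for (Node* p = buckets[n]; p != nullptr; p = p->next)
        if (p->key == key)
            return true;
    return false;
}

// 4.8.1-like loop: all nodes sit in one singly linked list and before[n]
// points at the node *preceding* the first node of bucket n. Without a stored
// hash code, the only way to notice that the walk has left bucket n is to
// recompute the bucket index of the next node, which is the second divq.
bool contains_single_list(const std::vector<Node*>& before, int key) {
    std::size_t n = bucket_index(key, before.size());
    Node* prev = before[n];
    if (prev == nullptr)
        return false;  // empty bucket
    for (Node* p = prev->next; p != nullptr; p = p->next) {
        if (p->key == key)
            return true;
        if (p->next == nullptr ||
            bucket_index(p->next->key, before.size()) != n)
            return false;  // end of list, or next node is in another bucket
    }
    return false;
}
```

With the keys 1, 5 and 9 in the same bucket of a 4-bucket table, looking up 5
costs one modulo in the first version and two in the second, and the second
version pays one more modulo for every additional node it has to step over.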