[Bug libstdc++/80335] New: perf of copying std::optional<trivial>
glisse at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Apr 5 23:42:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80335
Bug ID: 80335
Summary: perf of copying std::optional<trivial>
Product: gcc
Version: 7.0.1
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: glisse at gcc dot gnu.org
Target Milestone: ---
I was surprised recently, while profiling an application, to see
boost::optional::assign rank relatively high (about 4%, even though I use it
in only one place). After checking, std::optional in libstdc++ appears to have
the same issue. Copy-assigning an optional<int> involves two conditions (is
the lhs initialized, is the rhs initialized), whereas a plain memcpy/memmove
would be perfectly safe for a trivial type like int (although sanitizers and
valgrind might occasionally complain about reading uninitialized memory).
#include <optional>
typedef std::optional<int> O;
void f1(O&a, O const&b){ a=b; }
void f2(O&a, O const&b){ __builtin_memmove(&a,&b,sizeof(O)); }
yields, for f1:
cmpb $0, 4(%rdi)
movzbl 4(%rsi), %eax
je .L2
testb %al, %al
jne .L8
movb $0, 4(%rdi)
ret
.p2align 4,,10
.p2align 3
.L2:
testb %al, %al
je .L9
movl (%rsi), %eax
movb $1, 4(%rdi)
movl %eax, (%rdi)
ret
.p2align 4,,10
.p2align 3
.L9:
ret
.p2align 4,,10
.p2align 3
.L8:
movl (%rsi), %eax
movl %eax, (%rdi)
ret
vs, for f2:
movq (%rsi), %rax
movq %rax, (%rdi)
ret
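To make the comparison concrete, here is a minimal sketch of the two
strategies on a hypothetical stand-in type. ToyOptional, assign_branchy,
assign_bytewise, same_state and check_all_cases are illustrative names, not
libstdc++ internals, and the payload is zero-initialized here so that the
bytewise copy never reads indeterminate bytes (the real std::optional gives no
such guarantee, which is what the sanitizer remark above is about).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for optional<int>: payload followed by an "engaged" flag.
struct ToyOptional {
    int  value   = 0;      // zero-initialized to keep the bytewise copy well-defined
    bool engaged = false;
};

// Case-by-case assignment: tests both "is lhs engaged" and "is rhs engaged".
inline void assign_branchy(ToyOptional& lhs, const ToyOptional& rhs) {
    if (lhs.engaged) {
        if (rhs.engaged) lhs.value = rhs.value;   // both engaged: assign payload
        else             lhs.engaged = false;     // only lhs engaged: reset
    } else if (rhs.engaged) {
        lhs.value   = rhs.value;                  // only rhs engaged: construct
        lhs.engaged = true;
    }                                             // neither engaged: nothing to do
}

// Unconditional copy of the whole object representation, no branches.
inline void assign_bytewise(ToyOptional& lhs, const ToyOptional& rhs) {
    std::memmove(&lhs, &rhs, sizeof(ToyOptional));
}

// Two toy optionals are equivalent if both are empty or both hold the same value.
inline bool same_state(const ToyOptional& a, const ToyOptional& b) {
    return a.engaged == b.engaged && (!a.engaged || a.value == b.value);
}

// Runs both strategies over all four engaged/empty combinations.
inline void check_all_cases() {
    for (int l = 0; l < 2; ++l) {
        for (int r = 0; r < 2; ++r) {
            ToyOptional x, y, src;
            if (l) { x.value = 1; x.engaged = true; y = x; }
            if (r) { src.value = 2; src.engaged = true; }
            assign_branchy(x, src);
            assign_bytewise(y, src);
            assert(same_state(x, y));   // both strategies agree
        }
    }
}
```

The two functions leave the destination in the same observable state in all
four cases; they differ only in whether the stale payload bytes of an empty
destination are overwritten.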
I am wondering under what conditions we could implement copying this way:
- type small enough: don't want an expensive copy for an empty optional
- type "trivial": no need to run the destructor, copy assignment, etc
- not using a sanitizer, not going to use valgrind: that cannot really be
tested...
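The first two conditions above could be expressed as a compile-time
predicate, roughly as sketched below. The name can_copy_bytewise and the
16-byte cutoff kMaxBytewiseSize are assumptions made for illustration; the
report only says "small enough". The third condition (sanitizers, valgrind)
is left out, since it is not something the library can test for.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Assumed cutoff for "small enough"; the right value would need measuring.
constexpr std::size_t kMaxBytewiseSize = 16;

// True when an optional<T> could plausibly be copied as raw bytes:
// T is trivially copyable (which also implies a trivial destructor)
// and small enough that copying an empty optional stays cheap.
template <class T>
constexpr bool can_copy_bytewise =
    std::is_trivially_copyable<T>::value && sizeof(T) <= kMaxBytewiseSize;

// Hypothetical large trivial payload, used only to exercise the size limit.
struct Big { char data[1024]; };

static_assert(can_copy_bytewise<int>, "small trivial type qualifies");
static_assert(!can_copy_bytewise<std::string>, "non-trivial type is excluded");
static_assert(!can_copy_bytewise<Big>, "large trivial type is excluded");
```

An implementation could then select between the branching assignment and the
bytewise copy based on this predicate.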