[Bug middle-end/35560] Missing CSE/PRE for memory operations involved in virtual call.
witold.baryluk+gcc at gmail dot com
gcc-bugzilla@gcc.gnu.org
Fri Dec 30 21:46:30 GMT 2022
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=35560
Witold Baryluk <witold.baryluk+gcc at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |witold.baryluk+gcc at gmail dot com
--- Comment #15 from Witold Baryluk <witold.baryluk+gcc at gmail dot com> ---
I know this is a pretty old bug, but I was exploring some assembly output from
gcc and clang on godbolt, and stumbled onto the same issue.
https://godbolt.org/z/qPzMhWse1
class A {
 public:
  virtual int f7(int x) const;
};

int g(const A * const a, int x) {
  int r = 0;
  for (int i = 0; i < 10000; i++)
    r += a->f7(x);
  return r;
}
(The same happens without the loop, when just calling a->f7 multiple times.)
g(A const*, int):
        push    r13
        mov     r13d, esi
        push    r12
        xor     r12d, r12d
        push    rbp
        mov     rbp, rdi
        push    rbx
        mov     ebx, 10000
        sub     rsp, 8
.L2:
        mov     rax, QWORD PTR [rbp+0]  # a vtable deref
        mov     esi, r13d
        mov     rdi, rbp
        call    [QWORD PTR [rax]]       # f7 indirect call
        add     r12d, eax
        dec     ebx
        jne     .L2
        add     rsp, 8
        pop     rbx
        pop     rbp
        mov     eax, r12d
        pop     r12
        pop     r13
        ret
I was expecting the loads in mov rax, QWORD PTR [rbp+0] and
call [QWORD PTR [rax]] to be hoisted out of the loop (the f7 address loaded
into a register once, and the loop body reduced to a call through that
register).
A bit sad.
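
To make the expected transformation concrete, here is a minimal source-level
sketch. It does not use real C++ virtual dispatch: Obj, VTable, f7_impl,
g_reload and g_hoisted are all hypothetical names, and the vtable is emulated
by hand with a plain function pointer. g_reload mirrors the per-iteration
loads seen in the assembly above; g_hoisted mirrors the form I was hoping for.
The two are equivalent only under the assumption that no call changes the
object's vtable pointer, which is exactly what this bug is about.

```cpp
#include <cassert>

// Hand-rolled stand-in for a compiler-generated vtable (hypothetical names).
struct Obj;
struct VTable {
    int (*f7)(const Obj* self, int x);
};
struct Obj {
    const VTable* vptr;  // plays the role of the hidden pointer at [rbp+0]
    int bias;
};

static int f7_impl(const Obj* self, int x) { return x + self->bias; }
static const VTable obj_vtable = { &f7_impl };

// Analogue of the emitted loop: vptr and the f7 slot are read on every
// iteration before the indirect call.
int g_reload(const Obj* a, int x) {
    assert(a != nullptr);
    int r = 0;
    for (int i = 0; i < 10000; i++)
        r += a->vptr->f7(a, x);
    return r;
}

// Analogue of the hoisted loop: both loads happen once, and the loop only
// calls through the cached pointer. Assumes the callee never changes a->vptr.
int g_hoisted(const Obj* a, int x) {
    assert(a != nullptr);
    int (*const target)(const Obj*, int) = a->vptr->f7;
    int r = 0;
    for (int i = 0; i < 10000; i++)
        r += target(a, x);
    return r;
}
```

With an object set up as Obj o{&obj_vtable, 3}, both g_reload(&o, 4) and
g_hoisted(&o, 4) return the same sum, since f7_impl never touches vptr.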
Has there been any recent work on this optimization?
Are there at least some cases where it is valid to do CSE, or to transform the
code so that the loads are moved out of the loop?