[Bug ipa/115097] Strange suboptimal codegen specifically at -O2 when copying struct type
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed May 15 07:05:20 GMT 2024
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115097
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |ipa
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
So actually it seems that the reason is ICF plus inlining:
t.ii:2:3: optimized: Semantic equality hit:A test1(A&)/0->A test2(A&&)/1
t.ii:2:3: optimized: Assembler symbol names:_Z5test1R1A/0->_Z5test2O1A/1
t.ii:2:3: optimized: Semantic equality hit:A test1(A&)/0->A test3(const A&)/2
t.ii:2:3: optimized: Assembler symbol names:_Z5test1R1A/0->_Z5test3RK1A/2
t.ii:2:3: optimized: Semantic equality hit:A test1(A&)/0->A test4(const A&&)/3
t.ii:2:3: optimized: Assembler symbol names:_Z5test1R1A/0->_Z5test4OK1A/3
optimized: Inlined A test1(A&)/4 into A test2(A&&)/1 which now has time
4.000000 and size 5, net change of -1.
optimized: Inlined A test1(A&)/5 into A test3(const A&)/2 which now has time
4.000000 and size 5, net change of -1.
optimized: Inlined A test1(A&)/6 into A test4(const A&&)/3 which now has time
4.000000 and size 5, net change of -1.
For some reason we "optimize" the functions to the following in IPA ICF:
struct A test4 (const struct A & a)
{
  struct A retval.6;

  <bb 2> [local count: 1073741824]:
  retval.6 = test1 (a_2(D)); [tail call]
  return retval.6;
}

struct A test3 (const struct A & a)
{
  struct A retval.5;

  <bb 2> [local count: 1073741824]:
  retval.5 = test1 (a_2(D)); [tail call]
  return retval.5;
}

struct A test2 (struct A & a)
{
  struct A retval.4;

  <bb 2> [local count: 1073741824]:
  retval.4 = test1 (a_2(D)); [tail call]
  return retval.4;
}
and then we inline them back, introducing the extra copy. Why do we use
tail-calls here instead of aliases? Why do we lack cost modeling here?
Why do we inline back? It looks like a pointless exercise to me ...
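To make the alias vs. tail-call distinction concrete, here is a minimal
sketch using the GNU alias attribute (a GCC/Clang extension, ELF targets
assumed). The names Blob, copy_impl, copy_alias and copy_thunk are
hypothetical and not taken from the testcase; extern "C" is used only so
the alias target can be named without mangling.

```cpp
#include <cassert>

// Hypothetical stand-in for a struct that is returned by value.
struct Blob {
    int v[4];
};

// The one real body.
extern "C" Blob copy_impl(const Blob *b) { return *b; }

// Alias: a second symbol name for the very same code. A call to
// copy_alias costs nothing beyond a call to copy_impl.
extern "C" Blob copy_alias(const Blob *b) __attribute__((alias("copy_impl")));

// Wrapper (thunk): a distinct function that only forwards. This has the
// same shape as the "retval = test1 (a); [tail call]" bodies in the dump.
extern "C" Blob copy_thunk(const Blob *b) { return copy_impl(b); }

// Small self-check that both forms return the same data.
void check_alias_and_thunk() {
    Blob b{{1, 2, 3, 4}};
    assert(copy_alias(&b).v[0] == 1);
    assert(copy_thunk(&b).v[3] == 4);
}
```

The sketch only shows what the two forms look like at the source level;
it does not claim to explain why ICF picked wrappers in this case.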
With -fdisable-ipa-inline we get
_Z5test2O1A:
.LFB5:
        .cfi_startproc
        jmp     _Z5test1R1A
so that's at least reasonable and what's expected I suppose. So one
could argue the bug is in the inliner, which introduces the extra
copy (IIRC there's a bug about this), but still.
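
For anyone wanting to look at this locally, a testcase of roughly the
following shape matches the signatures and mangled names in the dumps
above. It is a sketch only: the definition of A is an assumption (the
actual struct from the report is not quoted in this comment), and
same() and self_check() are hypothetical helpers.

```cpp
#include <cassert>

// Assumed definition; only the name A is known from the dumps.
struct A {
    int data[32];
};

// Four functions that differ only in the reference type of the
// parameter; each one returns a copy of its argument.
A test1(A &a) { return a; }
A test2(A &&a) { return a; }
A test3(const A &a) { return a; }
A test4(const A &&a) { return a; }

// Hypothetical helper: element-wise comparison of two A objects.
inline bool same(const A &x, const A &y) {
    for (int i = 0; i < 32; ++i)
        if (x.data[i] != y.data[i])
            return false;
    return true;
}

// Hypothetical helper: all four functions must return an equal copy.
void self_check() {
    A a{{1, 2, 3}};
    assert(same(test1(a), a));
    assert(same(test2(static_cast<A &&>(a)), a));
    assert(same(test3(a), a));
    assert(same(test4(static_cast<const A &&>(a)), a));
}
```

Comparing the assembly from g++ -O2 -S with and without
-fdisable-ipa-inline (or with -fno-ipa-icf) should show whether the
extra copy appears; whether this particular struct reproduces it is
not verified here.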