Bug 63533 - Function splitter causes unnecessary splits
Summary: Function splitter causes unnecessary splits
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: ipa (show other bugs)
Version: 5.0
: P3 enhancement
Target Milestone: 10.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2014-10-14 11:31 UTC by Yury Gribov
Modified: 2021-12-25 09:36 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2014-10-14 00:00:00


Attachments
Short repro (107 bytes, text/plain)
2014-10-14 11:31 UTC, Yury Gribov
Details
Draft patch (275 bytes, patch)
2014-10-14 11:32 UTC, Yury Gribov
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Yury Gribov 2014-10-14 11:31:38 UTC
Created attachment 33712 [details]
Short repro

The attached reprocase contains an innocent function which can not ever benefit from inlining. But fnsplit pass still jumps on it and splits it to two functions thus hurting performance and code size:
 gcc -O2 -fconserve-stack -S repro.i -o-
 ...g.part.0:
 ...
 g:
 ...
 	call	g.part.0
This can not be recovered later by inliner because of -fconserve-stack option. The reprocase is a reduced version of a real-world code (from Linux kernel).
Comment 1 Yury Gribov 2014-10-14 11:32:38 UTC
Created attachment 33713 [details]
Draft patch

Here is a draft patch. If this makes sense I'll regtest and send to gcc-patches.
Comment 2 Jan Hubicka 2014-10-14 14:32:09 UTC
If g is called with argument that is usually 0, then the partial inlining makes sense. I guess it is a problem that we try to apply stack size limit to prevent inlining of .part into the original function body. Pehraps inliner should special case this and bypass the limits?
Comment 3 Yury Gribov 2014-10-14 14:49:22 UTC
> If g is called with argument that is usually 0,
> then the partial inlining makes sense.

But note that there are zero callers of g in the file so no inlining can happen anyway. Frankly I was surprised to see this !address_taken there, what's the reasoning behind it?

> I guess it is a problem that we try to apply stack size limit
> to prevent inlining of .part into the original function body.

Exactly.

> Perhaps inliner should special case this and bypass the limits?

You mean ignore stack limits if we somehow know that callee was originally part of the caller? Probably could be done but looks somewhat hacky.
Comment 4 Andrew Pinski 2014-10-14 15:26:02 UTC
Note the section of the split function is also different from the original function if the user had supplied a section.
Comment 5 Andrew Pinski 2021-12-25 09:36:36 UTC
> The attached reprocase contains an innocent function which can not ever benefit from inlining.

That is not true any more.
In GCC 10+, we get:
g:
        testl   %edi, %edi
        jne     .L8
        subq    $1016, %rsp
        xorl    %eax, %eax
        movq    %rsp, %rdi
        call    f
        addq    $1016, %rsp
        ret
        .p2align 4,,10
        .p2align 3
.L8:
        jmp     g.part.0

g.part.0:
        subq    $8, %rsp
        xorl    %eax, %eax
        call    f
        xorl    %eax, %eax
        call    f
        xorl    %eax, %eax
        call    f
        xorl    %eax, %eax
        call    f
        xorl    %eax, %eax
        addq    $8, %rsp
        jmp     f

So we conserve stack space now due to two things, conditional shrink wrapping and being able to tail callto g.part.0. The performance miss is just one unconditional jump (there is another bug handling conditional tail calls already).

So all fixed in GCC 10+.