This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
Re: bad interaction between inlining and sibcall optimization?
- To: David Mosberger <davidm at hpl dot hp dot com>
- Subject: Re: bad interaction between inlining and sibcall optimization?
- From: Richard Henderson <rth at redhat dot com>
- Date: Mon, 26 Mar 2001 18:56:34 -0800
- Cc: gcc-bugs at gcc dot gnu dot org, davidm at napali dot hpl dot hp dot com
- References: <200103232104.NAA24193@napali.hpl.hp.com>
On Fri, Mar 23, 2001 at 01:04:08PM -0800, David Mosberger wrote:
> Attached below is a small test program that demonstrates what appears
> to be a bad interaction between inlining and the "sibcall"
> optimization. Specifically, if I compile the program with the gcc 3.0
> branch configured for IA-64 using the command "gcc -O3 fac.c", then
> fac1() gets inlined into fac() as expected, but the inlined recursive
> call to fac() is not being treated as a "sibcall". In contrast, if I
> drop the optimization level to -O2, both fac() and fac1() are sibcall
> optimized, but of course no inlining is done that way.
This is mildly amusing.
This fails because we generate too much silly stuff at the end
of the function that the mildly simplistic end-of-function
detection bits in sibcall.c cannot suss their way through.
Stripping unimportant bits, we have, after the call:
(insn/i 55 54 62 (set (reg:DF 352)
        (reg:DF 136 f8)) -1 (nil)
    (nil))
(insn/i 63 62 67 (set (reg:DF 344)
        (reg:DF 352)) -1 (nil)
    (nil))
(code_label/i 67 63 85 10 "" "" [1 uses])
(insn 70 68 76 (set (reg/i:DF 136 f8)
        (reg:DF 344)) -1 (nil)
    (nil))
(code_label 76 70 86 5 "" "" [1 uses])
(insn 77 86 0 (use (reg/i:DF 136 f8)) -1 (nil)
    (nil))
It's the presence of two labels that kills you -- the call
is not in a block that flows immediately to the exit.
> Is this something that can be fixed with a reasonable amount of
> effort?
Yep.
Assuming this detection problem is fixed, you'll get even better
results if you use C++ (which does inlining on trees instead of
on rtl). In rtl we only know to try tail-call optimization, since
the call in question originated in fac1. But after tree inlining
we see that fac is properly tail-recursive.
r~
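
The attached fac.c is not reproduced in this archive, so the following
is only a sketch of the kind of program under discussion, not the
original test case. The names fac and fac1 come from the thread;
the double arguments (suggested by the DF-mode registers in the dump),
the accumulator parameter, and the helpers fac_inlined and fac_loop are
assumptions made for illustration. It shows the mutual tail calls, what
fac looks like once fac1 is inlined by hand, and the loop that a
self tail call of that shape can in principle be turned into.

```c
/* Hypothetical reconstruction; the original fac.c is not shown here. */
static double fac1(double n, double acc);

/* fac ends in a tail call to fac1 (a sibling call, not a self call). */
double fac(double n, double acc)
{
    if (n <= 1.0)
        return acc;
    return fac1(n, acc);
}

/* fac1 ends in a tail call back into fac. */
static double fac1(double n, double acc)
{
    return fac(n - 1.0, n * acc);
}

/* Assumed picture of fac after fac1 has been inlined into it:
   the call in tail position is now a direct call to the function itself. */
double fac_inlined(double n, double acc)
{
    if (n <= 1.0)
        return acc;
    return fac_inlined(n - 1.0, n * acc);
}

/* The loop form that a self tail call like the one above corresponds to. */
double fac_loop(double n, double acc)
{
    while (n > 1.0) {
        acc = n * acc;
        n = n - 1.0;
    }
    return acc;
}
```

Calling each of the three with (10.0, 1.0) should give the same value.
Whether a given compiler actually emits a jump or a loop for any of them
depends on its version, target, and options.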