This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH i386] Allow sibcalls in no-PLT PIC
- From: Rich Felker <dalias at libc dot org>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: Richard Henderson <rth at redhat dot com>, Michael Matz <matz at suse dot de>, Jan Hubicka <hubicka at ucw dot cz>, Alexander Monakov <amonakov at ispras dot ru>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Uros Bizjak <ubizjak at gmail dot com>
- Date: Tue, 19 May 2015 21:06:02 -0400
- Subject: Re: [PATCH i386] Allow sibcalls in no-PLT PIC
- Authentication-results: sourceware.org; auth=none
- References: <alpine dot LSU dot 2 dot 20 dot 1505191637140 dot 27315 at wotan dot suse dot de> <20150519180659 dot GG17573 at brightrain dot aerifal dot cx> <555B87F4 dot 30908 at redhat dot com> <CAMe9rOpPa9edYxaHQ1BOSfq4VuLsEQZHB=hXxdagTXpqv2qc=A at mail dot gmail dot com> <555B8ACD dot 20503 at redhat dot com> <CAMe9rOq4-F7+LCxee82LzFS9WjvfErZCDoS=TraXPsakRnyJRQ at mail dot gmail dot com> <20150519201557 dot GK17573 at brightrain dot aerifal dot cx> <CAMe9rOokTXwzuqGu103mHBak_vAWkrbCsgxg5i-pxp1XMbXZuw at mail dot gmail dot com> <20150519205420 dot GL17573 at brightrain dot aerifal dot cx> <CAMe9rOrkmbuVkWQfD+L36etWjN4JPMakPib_No2WpUeCsdd+aA at mail dot gmail dot com>
On Tue, May 19, 2015 at 05:10:11PM -0700, H.J. Lu wrote:
> On Tue, May 19, 2015 at 1:54 PM, Rich Felker <dalias@libc.org> wrote:
> > On Tue, May 19, 2015 at 01:27:06PM -0700, H.J. Lu wrote:
> >> On Tue, May 19, 2015 at 1:15 PM, Rich Felker <dalias@libc.org> wrote:
> >> > On Tue, May 19, 2015 at 12:17:18PM -0700, H.J. Lu wrote:
> >> >> On Tue, May 19, 2015 at 12:11 PM, Richard Henderson <rth@redhat.com> wrote:
> >> >> > On 05/19/2015 12:06 PM, H.J. Lu wrote:
> >> >> >> On Tue, May 19, 2015 at 11:59 AM, Richard Henderson <rth@redhat.com> wrote:
> >> >> >>> On 05/19/2015 11:06 AM, Rich Felker wrote:
> >> >> >>>> I'm still mildly worried that concerns for supporting
> >> >> >>>> relaxation might lead to decisions not to optimize code in ways that
> >> >> >>>> would be difficult to relax (e.g. certain types of address load
> >> >> >>>> reordering or hoisting) but I don't understand GCC internals
> >> >> >>>> sufficiently to know if this concern is warranted or not.
> >> >> >>>
> >> >> >>> It is. The relaxation that HJ is working on requires that the reads from the
> >> >> >>> got not be hoisted. I'm not especially convinced that what he's working on is
> >> >> >>> a win.
> >> >> >>>
> >> >> >>> With LTO, the compiler can do the same job that he's attempting in the linker,
> >> >> >>> without an extra nop. Without LTO, leaving it to the linker means that you
> >> >> >>> can't hoist the load and hide the memory latency.
> >> >> >>>
> >> >> >>
> >> >> >> My relax approach won't take away any optimization done by compiler.
> >> >> >> It simply turns indirect branch into direct branch with a nop prefix at
> >> >> >> link-time. I am having a hard time to understand why we shouldn't do it.
> >> >> >
> >> >> > I well understand what you're doing.
> >> >> >
> >> >> > But my point is that the only time the compiler should present you with the
> >> >> > form of indirect branch you're looking for is when there's no place to hoist
> >> >> > the load.
> >> >> >
> >> >> > At which point, is it really worth adding a new relocation to the ABI? Is it
> >> >> > really worth adding new code to the linker that won't be exercised often?
> >> >>
> >> >> I believe there are plenty of indirect branches via GOT when compiling
> >> >> PIE/PIC with -fno-plt:
> >> >>
> >> >> [hjl@gnu-6 gcc]$ cat /tmp/x.c
> >> >> extern void foo (void);
> >> >>
> >> >> void
> >> >> bar (void)
> >> >> {
> >> >> foo ();
> >> >> }
> >> >> [hjl@gnu-6 gcc]$ ./xgcc -B./ -fPIC -O3 -S /tmp/x.c -fno-plt
> >> >> [hjl@gnu-6 gcc]$ cat x.s
> >> >> ..file "x.c"
> >> >> ..section .text.unlikely,"ax",@progbits
> >> >> ..LCOLDB0:
> >> >> ..text
> >> >> ..LHOTB0:
> >> >> ..p2align 4,,15
> >> >> ..globl bar
> >> >> ..type bar, @function
> >> >> bar:
> >> >> ..LFB0:
> >> >> ..cfi_startproc
> >> >> jmp *foo@GOTPCREL(%rip)
> >> >> ..cfi_endproc
> >> >> ..LFE0:
> >> >> ..size bar, .-bar
> >> >
> >> > I agree these exist. What I question is whether the savings from the
> >> > linker being able to relax this to a direct call in the case where the
> >> > programmer failed to let the compiler make it a direct call to begin
> >> > with (by using hidden or protected visibility) are worth the cost of
> >> > not being able to hoist the load out of loops or schedule it earlier
> >> > in cases where relaxation is not possible because the call target is
> >> > not defined in the same DSO.
> >>
> >> Just for fun. I compiled binutils as PIE with -fno-plt -flto:
> >>
> >> [hjl@gnu-mic-2 gas]$ file as-new
> >> as-new: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV),
> >> dynamically linked (uses shared libs), for GNU/Linux 2.6.32, not
> >> stripped
> >> [hjl@gnu-mic-2 gas]$
> >>
> >> There are 43:
> >>
> >> ff 25 21 93 2d 00 jmpq *0x2d9321(%rip) # 3d5f58 <_DYNAMIC+0x1e8>
> >>
> >> and 1983
> >>
> >> ff 15 eb f4 38 00 callq *0x38f4eb(%rip) # 3d60e0 <_DYNAMIC+0x370>
> >
> > How many of those would be relaxed? I suspect it depends a lot on
> > whether libbfd is static or shared.
>
> When shared libraries are enabled, there are 177 indirect branches
> to locally defined functions. Call to any locally defined functions,
> which aren't compiled with LTO, is indirect.
And are the above indirect calls/jumps (1983+43) candidates for
scheduling/hoisting the address load (that's not being done yet), or
are they the ones the compiler opted not to schedule/hoist? The win
from relaxation seems small here, but as long as you're not going to
block optimizations that would preclude relaxing, I don't see any
disadvantages to doing it.
Rich