This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH i386] Allow sibcalls in no-PLT PIC
- From: Rich Felker <dalias at libc dot org>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: Jan Hubicka <hubicka at ucw dot cz>, Alexander Monakov <amonakov at ispras dot ru>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Uros Bizjak <ubizjak at gmail dot com>
- Date: Fri, 15 May 2015 16:23:19 -0400
- References: <1430757479-14241-1-git-send-email-amonakov at ispras dot ru> <1430757479-14241-5-git-send-email-amonakov at ispras dot ru> <alpine dot LNX dot 2 dot 11 dot 1505151927170 dot 22867 at monopod dot intra dot ispras dot ru> <CAMe9rOq+gfTcCoY4AKefkZgysHr_XXOaCtOFWDro8OeT+mBUWQ at mail dot gmail dot com> <20150515194824 dot GB14415 at kam dot mff dot cuni dot cz> <CAMe9rOr3yQv-+cJxwp_YfNY=9G+FnV6=JdYh_T6kPDJtSA0fzg at mail dot gmail dot com>
On Fri, May 15, 2015 at 01:08:15PM -0700, H.J. Lu wrote:
> With relax branch in 32-bit, there are 2 cases:
>
> 1. PIC or PIE: We generate
>
> set up EBX
> relax call foo@PLT
>
> It is almost the same as we do now, except for the relax prefix.
> If foo is defined in another shared library or may be preempted,
> linker will generate
>
> call *foo@GOTPLT(%ebx)
>
> If foo turns out local, linker will output
>
> relax call foo
This does not address the initial and primary motivation for no-plt on
32-bit: eliminating the codegen cost of the ABI requirement that the
GOT register (ebx, or the equivalent on other targets) be set up for
calls to PLT entries. If instead you generated code that forms the
address of the GOT slot using an arbitrary register, and relaxed it to
a direct call (possibly rendering the register setup useless), it would
be comparable to the no-plt approach. So for example:
set up ecx (or whatever register)
relax call *foo@GOT(%ecx)
and relax to:
set up ecx (or whatever register; now useless)
relax call foo
But the no-plt approach is still superior in that the address load
from the GOT can be hoisted out of loops, etc., resulting in something
like:
call *%esi
This could be valuable in loops calling a math function repeatedly,
for example.
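As a rough source-level picture of that hoisting, here is a minimal sketch in C. It is only an illustration: with no-plt the compiler would do this itself on the GOT load, and the names sum_abs and fn are hypothetical. labs from libc stands in for a math function so the example needs no extra library.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: sums |v[i]| over an array by calling labs()
 * once per element. */
long sum_abs(const long *v, size_t n)
{
    /* Read the function's address once, before the loop. In no-plt PIC
     * code this corresponds to loading the GOT slot into a register a
     * single time instead of going through the PLT on every call. */
    long (*const fn)(long) = labs;
    long total = 0;

    for (size_t i = 0; i < n; i++)
        total += fn(v[i]);  /* indirect call via the saved address,
                               comparable to "call *%esi" */
    return total;
}
```

Whether the compiler keeps the address in a call-saved register across the loop depends on optimization level and register pressure, so this shows the intent rather than guaranteed output.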
Overall I'm still not a fan of the relaxation approach. There are very
few places where it would actually help that couldn't already be
improved more with use of visibility, and it can't give codegen as good
as the no-plt option.
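To make the visibility point concrete, here is a minimal sketch using GCC's visibility attribute. The function names are hypothetical. A hidden symbol cannot be preempted from outside the shared library, so calls to it from within the library can be direct without any linker relaxation.

```c
/* Hypothetical internal helper. Hidden visibility keeps it out of the
 * library's dynamic symbol table, so a call to it from inside the
 * library can be a plain direct call (no PLT entry, no GOT register). */
__attribute__((visibility("hidden")))
int helper_add(int a, int b)
{
    return a + b;
}

/* Hypothetical exported entry point. It keeps default visibility and
 * remains callable, and preemptible, from other modules. */
__attribute__((visibility("default")))
int api_add3(int a, int b, int c)
{
    return helper_add(helper_add(a, b), c);
}
```

This only covers calls that stay inside one library; calls to functions in other shared libraries still need the GOT, which is where no-plt applies.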
Rich