This is the mail archive of the
mailing list for the GCC project.
Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=
- From: Sriraman Tallam <tmsriram at google dot com>
- To: Sriraman Tallam <tmsriram at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, "H.J. Lu" <hjl dot tools at gmail dot com>, David Li <davidxl at google dot com>
- Date: Thu, 30 Apr 2015 20:26:00 -0700
- Subject: Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=
- References: <CAAs8HmxC4KQSc5EWiux7syOj8+ghs_LY8zY-D4-Wjp+ZhHiDuw at mail dot gmail dot com> <20150501032132 dot GA4302 at bubble dot grove dot modra dot org>
On Thu, Apr 30, 2015 at 8:21 PM, Alan Modra <firstname.lastname@example.org> wrote:
> On Thu, Apr 30, 2015 at 05:31:30PM -0700, Sriraman Tallam wrote:
>> This comes with caveats. This cannot be generally done for all
>> functions marked extern as it is impossible for the compiler to say if
>> a function is "truly extern" (defined in a shared library). If a
>> function is not truly extern (ends up defined in the final executable),
>> then calling it indirectly is a performance penalty as it could have
>> been a direct call. Further, the newly created GOT entries are fixed
>> up at start-up and do not get lazily bound.
> I've considered something similar for PowerPC (but didn't consider
> doing so for a subset of calls). Losing lazy symbol resolution is
> a real problem.
With the -fno-plt= option, you are choosing functions that are hot and
for which the PLT must be avoided. Losing lazy binding on these should
be perfectly fine because they would be called anyway.
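
To make the trade-off concrete, here is a toy model in plain C (not real
dynamic-linker code) of a lazily bound slot versus one fixed up at
start-up. All names (got_foo, resolve_foo, foo_lazy_stub, bind_now,
call_foo) are made up for illustration; resolver_runs just counts how
often the simulated symbol lookup happens.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_t)(int);

/* Counts simulated dynamic-linker symbol lookups. */
int resolver_runs = 0;

/* Stands in for the function defined in a shared library. */
int foo_impl(int x) { return x + 1; }

/* Stands in for the (expensive) symbol lookup. */
fn_t resolve_foo(void) { resolver_runs++; return foo_impl; }

int foo_lazy_stub(int x);

/* The simulated GOT slot. It starts out pointing at the lazy stub. */
fn_t got_foo = foo_lazy_stub;

/* Lazy binding: the first call resolves the symbol and patches the slot. */
int foo_lazy_stub(int x) { got_foo = resolve_foo(); return got_foo(x); }

/* Eager binding: the slot is fixed up once at start-up, called or not. */
void bind_now(void) { got_foo = resolve_foo(); }

/* Every call site is just an indirect call through the slot. */
int call_foo(int x) { assert(got_foo != NULL); return got_foo(x); }
```

In this model the lookup cost is paid either on the first call or at
start-up. For a hot function the first call is going to happen anyway,
so fixing up the slot at start-up only moves the cost earlier.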
> The other problem you cite of indirect calls that
> could be direct can be fixed in the linker relatively easily.
> Edit this code
>    0:   ff 15 00 00 00 00       callq  *0x0(%rip)        # 0x6
>                         2: R_X86_64_GOTPCREL    foo-0x4
>    6:   ff 25 00 00 00 00       jmpq   *0x0(%rip)        # 0xc
>                         8: R_X86_64_GOTPCREL    foo-0x4
> to this
>    c:   e8 00 00 00 00          callq  0x11
>                         d: R_X86_64_PC32        foo-0x4
>   11:   90                      nop
>   12:   e9 00 00 00 00          jmpq   0x17
>                         13: R_X86_64_PC32       foo-0x4
>   17:   90                      nop
> You may need to have gcc or gas add a marker reloc to say exactly
> where an instruction starts.
> Alan Modra
> Australia Development Lab, IBM