This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations


> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
> module using the GOT.  This is two instructions, one to get the address
> of the global from the GOT and the other to get the value.  If it turns
> out that the global gets defined in the executable at link-time, it still
> needs to go through the GOT as it is too late then to generate a direct
> access.
>
> Examples:
>
> foo.cc
> ------
> int a_glob;
> int main () {
>   return a_glob; // defined in this file
> }
>
> With -O2 -fpie -pie, the generated code accesses the global directly via a
> PC-relative insn:
>
> 5e0   <main>:
>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>
> foo.cc
> ------
>
> extern int a_glob;
> int main () {
>   return a_glob; // declared extern here, defined elsewhere
> }
>
> With -O2 -fpie -pie, the generated code accesses the global via the GOT
> using two memory loads:
>
> 6f0  <main>:
>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>    mov    (%rax),%eax
>
> This is true even if in the latter case the global was defined in the
> executable through a different file.
>
> Some experiments on Google benchmarks show that the extra memory loads
> affect performance by 1% to 5%.
>
> Solution - Copy Relocations:
>
> When the linker supports copy relocations, GCC can always assume that
> the global will be defined in the executable.  For globals that are truly
> extern (come from shared objects), the linker will create copy relocations
> and have them defined in the executable.  The result is that no global
> access needs to go through the GOT, which improves performance.
>
> This optimization only applies to undefined, non-weak global data.
> Undefined, weak global data access still must go through the GOT.
>
> This patch checks at configure time whether the linker supports PIE with
> copy relocs, which is enabled in the gold and bfd linkers in binutils 2.25,
> and enables this optimization if the linker support is available.
>
> gcc/
>
> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
> Linux/x86-64 linker supports PIE with copy reloc.
> * config.in: Regenerated.
> * configure: Likewise.
>
> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
> pc-relative address for undefined, non-weak, non-function
> symbol reference in 64-bit PIE if linker supports PIE with
> copy reloc.
>
> * doc/sourcebuild.texi: Document pie_copyreloc target.
>
> gcc/testsuite/
>
> * gcc.target/i386/pie-copyrelocs-1.c: New test.
> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>
> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
> New procedure.

It caused PR64189.

Dominique.
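
For readers following the quoted ChangeLog entry for
legitimate_pic_address_disp_p, the condition it states can be modelled as a
small predicate.  This is only a sketch under assumptions: struct sym_ref and
copyreloc_rule_applies are hypothetical names made up for illustration, not
GCC internals, and the linker_has_pie_copyreloc argument stands in for
whatever HAVE_LD_PIE_COPYRELOC records at configure time.  It models only
symbol references in 64-bit PIE code.

```c
#include <stdbool.h>

/* Hypothetical description of a symbol reference seen while compiling
   64-bit PIE code.  Not a GCC data structure.  */
struct sym_ref {
    bool is_defined;   /* defined in the current translation unit */
    bool is_weak;      /* declared weak */
    bool is_function;  /* refers to a function rather than data */
};

/* Returns true when the rule from the quoted ChangeLog applies: the
   reference is to an undefined, non-weak, non-function symbol and the
   linker supports PIE with copy relocs.  In that case a PC-relative
   address may be used instead of a load through the GOT.  */
static bool copyreloc_rule_applies(const struct sym_ref *sym,
                                   bool linker_has_pie_copyreloc)
{
    if (!linker_has_pie_copyreloc)
        return false;          /* no linker support: rule is disabled */
    if (sym->is_defined)
        return false;          /* rule only concerns undefined symbols */
    if (sym->is_weak)
        return false;          /* weak data still goes through the GOT */
    if (sym->is_function)
        return false;          /* rule covers data, not functions */
    return true;
}
```

With this model, the "extern int a_glob;" reference from the second quoted
example satisfies the rule when the linker support is present, while a weak
declaration of the same variable does not.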

