This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: RFC: Use 32-byte PLT to preserve bound registers


There is a typo in the pushq offset computation.  It should be

pushq_offset += ((unsigned char *) pushq_offset)[-6] == 0xf2 ? 1 : 0

instead of

pushq_offset += ((unsigned char *) pushq_offset)[6] == 0xf2 ? 1 : 0
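
To illustrate why the index is -6, here is a minimal sketch that applies the corrected computation to a toy PLT built from the byte patterns in the quoted proposal. The names build_toy_plt and pushq_offset_for_slot are hypothetical, instruction operands are zeroed, and offsets into a local array stand in for real addresses. The estimate from GOT[1] points 6 bytes into a PLT entry, so the byte at [-6] is the first byte of that entry, which is 0xf2 only for a 32-byte BND entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { PLT0_SIZE = 16, GOT_ENTRY_SIZE = 8, FIRST_PLT_GOT_SLOT = 3 };

/* 16-byte PLT1: jmpq *slot(%rip); pushq $index; jmpq PLT0 (operands zeroed). */
static const unsigned char plt1_16[16] = {
    0xff, 0x25, 0, 0, 0, 0,
    0x68, 0, 0, 0, 0,
    0xe9, 0, 0, 0, 0
};

/* 32-byte PLT1: bnd jmpq; pushq $index; bnd jmpq PLT0; two 7-byte nopl. */
static const unsigned char plt1_32[32] = {
    0xf2, 0xff, 0x25, 0, 0, 0, 0,
    0x68, 0, 0, 0, 0,
    0xf2, 0xe9, 0, 0, 0, 0,
    0x0f, 0x1f, 0x80, 0, 0, 0, 0,
    0x0f, 0x1f, 0x80, 0, 0, 0, 0
};

/* Toy PLT: PLT0, a 16-byte entry, a 32-byte BND entry, a 16-byte entry. */
static void build_toy_plt(unsigned char plt[80])
{
    memset(plt, 0, 80);                      /* PLT0 contents do not matter here */
    memcpy(plt + PLT0_SIZE, plt1_16, sizeof plt1_16);
    memcpy(plt + PLT0_SIZE + 16, plt1_32, sizeof plt1_32);
    memcpy(plt + PLT0_SIZE + 48, plt1_16, sizeof plt1_16);
}

/* got1 is the value prelink stored in GOT[1] (PLT offset 0x16 in this toy);
   slot is the index of the GOT entry being reset. */
static size_t pushq_offset_for_slot(const unsigned char *plt,
                                    size_t got1, size_t slot)
{
    /* Each 8-byte GOT slot corresponds to one 16-byte block of PLT. */
    size_t pushq_offset = got1 + (slot - FIRST_PLT_GOT_SLOT) * GOT_ENTRY_SIZE * 2;
    /* A BND entry starts with the 0xf2 prefix, which shifts pushq by 1 byte. */
    pushq_offset += plt[pushq_offset - 6] == 0xf2 ? 1 : 0;
    return pushq_offset;
}
```

In this toy layout GOT slots 3, 4 and 6 belong to the three entries (slot 5 is the second slot of the 32-byte entry), and the function returns the offset of the 0x68 pushq opcode for each of them.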

H.J.
----
On Mon, Nov 18, 2013 at 11:03 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> Here is a proposal to use 32-byte PLT to preserve bound registers.
> Any comments?
>
> BTW, we are working on another proposal to use a second PLT
> section with 8-byte or 16-byte memory overhead, instead of
> 24-byte overhead.
>
> --
> H.J.
> ---
> Intel MPX:
>
> http://software.intel.com/sites/default/files/319433-015.pdf
>
> introduces 4 bound registers, which will be used for parameter passing
> in x86-64.  Bound registers are cleared by branch instructions.  Branch
> instructions with a BND prefix keep the bound register contents.  This
> leads to 2 requirements for the 64-bit MPX run-time:
>
> 1. The dynamic linker (ld.so) should save and restore bound registers
> during symbol lookup.
> 2. Change the current 16-byte PLT0:
>
>   ff 35 08 00 00 00    pushq  GOT+8(%rip)
>   ff 25 10 00 00 00    jmpq  *GOT+16(%rip)
>   0f 1f 40 00        nopl   0x0(%rax)
>
> and 16-byte PLT1:
>
>   ff 25 00 00 00 00        jmpq   *name@GOTPCREL(%rip)
>   68 00 00 00 00           pushq  $index
>   e9 00 00 00 00           jmpq   PLT0
>
> both of which clear bound registers, so that they preserve bound registers.
>
> We use 2 new relocations:
>
> #define R_X86_64_PC32_BND  39 /* PC relative 32 bit signed with BND prefix */
> #define R_X86_64_PLT32_BND 40 /* 32 bit PLT address with BND prefix */
>
> to mark branch instructions with BND prefix.
>
> When the linker sees any R_X86_64_PC32_BND or R_X86_64_PLT32_BND relocation,
> it switches to a different PLT0:
>
>   ff 35 08 00 00 00    pushq  GOT+8(%rip)
>   f2 ff 25 10 00 00 00    bnd jmpq *GOT+16(%rip)
>   0f 1f 00        nopl   (%rax)
>
> to preserve bound registers for symbol lookup.  For a symbol with
> R_X86_64_PC32_BND or R_X86_64_PLT32_BND relocations, the linker will use
> a 32-byte PLT1:
>
>   f2 ff 25 00 00 00 00        bnd jmpq   *name@GOTPCREL(%rip)
>   68 00 00 00 00        pushq       $index
>   f2 e9 00 00 00 00           bnd jmpq   PLT0
>   0f 1f 80 00 00 00 00        nopl       0(%rax)
>   0f 1f 80 00 00 00 00        nopl       0(%rax)
>
> Prelink stores the offset of the pushq of the first PLT1 (plt_base + 0x16)
> in GOT[1], and GOT[1] is stored in GOT[3].  We can undo prelink in GOT by
> computing the corresponding pushq offset with
>
> GOT[1] + (GOT offset - &GOT[3]) * 2
>
> This depends on each pushq being 16 bytes apart and each GOT entry being
> 8 bytes.  To support prelink, each 16-byte block in PLT must have an 8-byte
> entry in GOT.  The linker allocates two 8-byte entries in GOT for each
> 32-byte PLT1.  Then we can undo prelink by computing the corresponding
> pushq offset with
>
> pushq_offset = GOT[1] + (GOT offset - &GOT[3]) * 2
> pushq_offset += ((unsigned char *) pushq_offset)[6] == 0xf2 ? 1 : 0
>
> For each symbol with R_X86_64_PC32_BND or R_X86_64_PLT32_BND
> relocations, this approach increases PLT size by 16 bytes and
> GOT size by 8 bytes.  That is 24 bytes in total.
>
> Pros: No additional sections are needed.
> Cons: 24-byte memory overhead for each symbol with BND relocation.
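
The 24-byte figure in the quoted proposal can be checked with a small size calculation. This is only a sketch: the struct and function names are hypothetical, and it assumes three reserved GOT entries ahead of the PLT slots (consistent with the use of &GOT[3] above), one 16-byte PLT0, one GOT slot per 16-byte PLT1, and two GOT slots per 32-byte PLT1.

```c
#include <assert.h>
#include <stddef.h>

struct plt_got_size {
    size_t plt_bytes;
    size_t got_bytes;
};

/* Sizes for a module with plain_syms 16-byte PLT1 entries and
   bnd_syms 32-byte PLT1 entries. */
static struct plt_got_size layout_size(size_t plain_syms, size_t bnd_syms)
{
    struct plt_got_size s;
    s.plt_bytes = 16 + 16 * plain_syms + 32 * bnd_syms;   /* PLT0 + PLT1s */
    s.got_bytes = 8 * (3 + plain_syms + 2 * bnd_syms);    /* reserved + slots */
    return s;
}

/* Extra bytes compared with giving every symbol a 16-byte PLT1. */
static size_t bnd_overhead(size_t plain_syms, size_t bnd_syms)
{
    struct plt_got_size with = layout_size(plain_syms, bnd_syms);
    struct plt_got_size without = layout_size(plain_syms + bnd_syms, 0);
    return (with.plt_bytes - without.plt_bytes)
         + (with.got_bytes - without.got_bytes);
}
```

Under these assumptions each symbol with a BND relocation costs 16 extra PLT bytes plus 8 extra GOT bytes, so bnd_overhead(10, 1) gives 24.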



-- 
H.J.

