This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: PING^3: [PATCH] x86: Force __x86_indirect_thunk_reg for function call via GOT


> On Sun, Mar 11, 2018 at 7:40 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> > On Mon, Mar 5, 2018 at 4:20 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> >> On Tue, Feb 27, 2018 at 11:39 AM, H.J. Lu <hongjiu.lu@intel.com> wrote:
> >>> For x86 targets, when -fno-plt is used, external functions are called
> >>> via GOT slot, in 64-bit mode:
> >>>
> >>>         [bnd] call/jmp *foo@GOTPCREL(%rip)
> >>>
> >>> and in 32-bit mode:
> >>>
> >>>         [bnd] call/jmp *foo@GOT[(%reg)]
> >>>
> >>> With -mindirect-branch=, they are converted to, in 64-bit mode:
> >>>
> >>>         pushq          foo@GOTPCREL(%rip)
> >>>         [bnd] jmp      __x86_indirect_thunk[_bnd]
> >>>
> >>> and in 32-bit mode:
> >>>
> >>>         pushl          foo@GOT[(%reg)]
> >>>         [bnd] jmp      __x86_indirect_thunk[_bnd]
> >>>
> >>> which are incompatible with CFI.  In 64-bit mode, since R11 is a scratch
> >>> register, we generate:
> >>>
> >>>         movq           foo@GOTPCREL(%rip), %r11
> >>>         [bnd] call/jmp __x86_indirect_thunk_[bnd_]r11
> >>>
> >>> instead.  We do it in ix86_output_indirect_branch so that we can use
> >>> the newly proposed R_X86_64_THUNK_GOTPCRELX relocation:
> >>>
> >>> https://groups.google.com/forum/#!topic/x86-64-abi/eED5lzn3_Mg
> >>>
> >>>         movq           foo@GOTPCREL_THUNK(%rip), %r11
> >>>         [bnd] call/jmp __x86_indirect_thunk_[bnd_]r11
> >>>
> >>> to load the GOT slot into R11.  If foo is defined locally, the linker can
> >>> convert
> >>>
> >>>         movq           foo@GOTPCREL_THUNK(%rip), %reg
> >>>         call/jmp       __x86_indirect_thunk_reg
> >>>
> >>> to
> >>>
> >>>         call/jmp       foo
> >>>         nop            0L(%rax)
> >>>
> >>> In 32-bit mode, since all caller-saved registers, EAX, EDX and ECX, may be
> >>> used to pass function parameters, there is no scratch register available.  For
> >>> -fno-plt -fno-pic -mindirect-branch=, we expand external function call
> >>> to:
> >>>
> >>>         movl           foo@GOT, %reg
> >>>         [bnd] call/jmp *%reg
> >>>
> >>> so that it can be converted to
> >>>
> >>>         movl           foo@GOT, %reg
> >>>         [bnd] call/jmp __x86_indirect_thunk_[bnd_]reg
> >>>
> >>> in ix86_output_indirect_branch.  Since this is performed during RTL
> >>> expansion, other instructions may be inserted between movl and call/jmp.
> >>> Linker optimization isn't always possible.

I suppose we can just combine those into patterns if we want to prevent gcc from
interleaving this with other instructions.  However, since this affects the ABI and
not only the return thunk, did you discuss the changes with the LLVM folks as well?

It would be nice to not have diverging solutions.

Honza

