This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] x86-64: Load external function address via GOT slot
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Uros Bizjak <ubizjak at gmail dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Jakub Jelinek <jakub at redhat dot com>, Richard Sandiford <rdsandiford at googlemail dot com>
- Date: Mon, 20 Jun 2016 13:33:47 -0700
- Subject: Re: [PATCH] x86-64: Load external function address via GOT slot
- Authentication-results: sourceware.org; auth=none
- References: <20160620170501 dot GA18281 at intel dot com> <CAFULd4ZU4LT044ukSvCrJKAtCJYWz9OD1o3wV_ggEDq0gZ27Rg at mail dot gmail dot com> <CAMe9rOqAZRwR4hrCK9_OH-mwQVnZuH1NTKg7BUaU2NONSWwCaA at mail dot gmail dot com> <CAFULd4aK2ODUjXvo6QHPweZS2RrDBuSMvu+vLJR70_60wiOMJg at mail dot gmail dot com> <87shw7d4nl dot fsf at googlemail dot com>
On Mon, Jun 20, 2016 at 12:46 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Uros Bizjak <ubizjak@gmail.com> writes:
>> On Mon, Jun 20, 2016 at 9:19 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Mon, Jun 20, 2016 at 12:13 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>>> On Mon, Jun 20, 2016 at 7:05 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>>>>> Hi,
>>>>>
>>>>> This patch implements the alternate code sequence recommended in
>>>>>
>>>>> https://groups.google.com/forum/#!topic/x86-64-abi/de5_KnLHxtI
>>>>>
>>>>> to load external function address via GOT slot with
>>>>>
>>>>> movq func@GOTPCREL(%rip), %rax
>>>>>
>>>>> so that linker won't create an PLT entry for extern function
>>>>> address.
>>>>>
>>>>> Tested on x86-64. OK for trunk?
>>>>
>>>>> + else if (ix86_force_load_from_GOT_p (op1))
>>>>> + {
>>>>> + /* Load the external function address via the GOT slot to
>>>>> + avoid PLT. */
>>>>> + op1 = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, op1),
>>>>> + (TARGET_64BIT
>>>>> + ? UNSPEC_GOTPCREL
>>>>> + : UNSPEC_GOT));
>>>>> + op1 = gen_rtx_CONST (Pmode, op1);
>>>>> + op1 = gen_const_mem (Pmode, op1);
>>>>> + /* This symbol must be referenced via a load from the Global
>>>>> + Offset Table. */
>>>>> + set_mem_alias_set (op1, ix86_GOT_alias_set ());
>>>>> + op1 = convert_to_mode (mode, op1, 1);
>>>>> + op1 = force_reg (mode, op1);
>>>>> + emit_insn (gen_rtx_SET (op0, op1));
>>>>> + /* Generate a CLOBBER so that there will be no REG_EQUAL note
>>>>> + on the last insn to prevent cse and fwprop from replacing
>>>>> + a GOT load with a constant. */
>>>>> + rtx tmp = gen_reg_rtx (Pmode);
>>>>> + emit_clobber (tmp);
>>>>> + return;
>>>>
>>>> Jeff, is this the recommended way to prevent CSE, as far as RTL
>>>> infrastructure is concerned? I didn't find any example of this
>>>> approach with other targets.
>>>>
>>>
>>> FWIW, the similar approach is used in ix86_expand_vector_move_misalign,
>>> ix86_expand_convert_uns_didf_sse and ix86_expand_vector_init_general
>>> as well as other targets:
>>>
>>> frv/frv.c: emit_clobber (op0);
>>> frv/frv.c: emit_clobber (op1);
>>> im32c/m32c.c: /* emit_clobber (gen_rtx_REG (HImode, R0L_REGNO)); */
>>> s390/s390.c: emit_clobber (addr);
>>> s390/s390.md: emit_clobber (reg0);
>>> s390/s390.md: emit_clobber (reg1);
>>> s390/s390.md: emit_clobber (reg0);
>>> s390/s390.md: emit_clobber (reg0);
>>> s390/s390.md: emit_clobber (reg1);
>>
>> These usages mark the whole register as being "clobbered"
>> (=undefined), before only a part of register is written, e.g.:
>>
>> emit_clobber (int_xmm);
>> emit_move_insn (gen_lowpart (DImode, int_xmm), input);
>>
>> They aren't used to prevent unwanted CSE.
>
> Since it's being called in the move expander, I thought the normal
> way of preventing the constant being rematerialised would be to reject
> it in the move define_insn predicates.
>
> FWIW, I agree that using a clobber for this is going to be fragile.
>
Here is the alternative from clobber.
--
H.J.
--
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a68983c..79999df 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2347,7 +2347,7 @@
(define_insn "*movsi_internal"
[(set (match_operand:SI 0 "nonimmediate_operand"
"=r,m ,*y,*y,?rm,?*y,*v,*v,*v,m ,?r ,?r,?*Yi,*k ,*rm")
- (match_operand:SI 1 "general_operand"
+ (match_operand:SI 1 "ix86_general_operand"
"g ,re,C ,*y,*y ,rm ,C ,*v,m ,*v,*Yj,*v,r ,*krm,*k"))]
"!(MEM_P (operands[0]) && MEM_P (operands[1]))"
{
@@ -2564,7 +2564,7 @@
(define_insn "*movqi_internal"
[(set (match_operand:QI 0 "nonimmediate_operand"
"=q,q ,q ,r,r ,?r,m ,k,k,r ,m,k")
- (match_operand:QI 1 "general_operand"
+ (match_operand:QI 1 "ix86_general_operand"
"q ,qn,qm,q,rn,qm,qn,r ,k,k,k,m"))]
"!(MEM_P (operands[0]) && MEM_P (operands[1]))"
{
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 06a0002..a471deb 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -367,6 +367,12 @@
}
})
+;; Return true if OP is general operand representable on ix86
+(define_predicate "ix86_general_operand"
+ (and (match_operand 0 "general_operand")
+ (ior (not (match_code "symbol_ref"))
+ (match_test "!ix86_force_load_from_GOT_p (op)"))))
+
;; Return true if size of VALUE can be stored in a sign
;; extended immediate field.
(define_predicate "x86_64_immediate_size_operand"
@@ -1036,6 +1042,9 @@
struct ix86_address parts;
int ok;
+ if (ix86_force_load_from_GOT_p (op))
+ return false;
+
if (!CONST_INT_P (op)
&& mode != VOIDmode
&& GET_MODE (op) != mode)