[RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=

Xinliang David Li davidxl@google.com
Tue Jun 2 21:25:00 GMT 2015


On Tue, Jun 2, 2015 at 1:56 PM, Ramana Radhakrishnan
<ramana.gcc@googlemail.com> wrote:
> On Tue, Jun 2, 2015 at 7:15 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> On Mon, Jun 1, 2015 at 1:33 PM, Ramana Radhakrishnan
>> <ramana.gcc@googlemail.com> wrote:
>>> On Mon, Jun 1, 2015 at 7:55 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> On Mon, Jun 1, 2015 at 11:41 AM, Ramana Radhakrishnan
>>>> <ramana.gcc@googlemail.com> wrote:
>>>>> On Mon, Jun 1, 2015 at 7:01 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>>> On Mon, Jun 1, 2015 at 1:24 AM, Ramana Radhakrishnan
>>>>>> <ramana.radhakrishnan@arm.com> wrote:
>>>>>>>
>>>>>>>>> Why isn't it just an indirect call in the cases that would require a GOT
>>>>>>>>> slot and a direct call otherwise ? I'm trying to work out what's so
>>>>>>>>> different on each target that mandates this to be in the target backend.
>>>>>>>>> Also it would be better to push the tests into gcc.dg if you can and
>>>>>>>>> check
>>>>>>>>> for the absence of a relocation so that folks at least see these as being
>>>>>>>>> UNSUPPORTED on their target.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> To be even more explicit, shouldn't this be handled similar to the way in
>>>>>>> which -fno-plt is handled in a target agnostic manner ? After all, if you
>>>>>>> can handle this for the command line, doing the same for a function which
>>>>>>> has been decorated with attribute((noplt)) should be simple.
>>>>>>
>>>>>> -fno-plt does not work for non-PIC code, having non-PIC code not use
>>>>>> PLT was my primary motivation.  Infact, if you go back in this thread,
>>>>>> I suggested to HJ if I should piggyback on -fno-plt.  I tried using
>>>>>> the -fno-plt implementation to do this by removing the flag_pic check
>>>>>> in calls.c, but that does not still work for non-PIC code.
>>>
>>> If you want __attribute__ ((noplt)) to work for non-PIC code, we
>>> should look to code it in the same place surely by making all
>>> __attribute__((noplt)) calls, indirect calls irrespective of whether
>>> it's fpic or not.
>>>
>>>
>>>>>
>>>>> You're missing my point, unless I'm missing something basic here - I
>>>>> should have been even more explicit and said -fPIC was a given in all
>>>>> this discussion.
>>>>>
>>>>> calls.c:229 has
>>>>>
>>>>> else if (flag_pic && !flag_plt && fndecl_or_type
>>>>>            && TREE_CODE (fndecl_or_type) == FUNCTION_DECL
>>>>>            && !targetm.binds_local_p (fndecl_or_type))
>>>>>
>>>>> why can't we merge the check in here for the attribute noplt ?
>>>>
>>>> We can and and please see this thread, that is the exact patch I proposed :
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02682.html
>>>>
>>>> However, there was one caveat.  I want this working without -fPIC too.
>>>> non-PIC code also generates PLT calls and I want them eliminated.
>>>>
>>>>>
>>>>> If a new attribute is added to the "GNU language" in this case, why
>>>>> isn't this being treated in the same way as the command line option
>>>>> has been treated ? All this means is that we add an attribute and a
>>>>> command line option to common code and then not implement it in a
>>>>> proper target agnostic fashion.
>>>>
>>>> You are right.  This is the way I wanted it too but I also wanted the
>>>> attribute to work without PIC. PLT calls are generated without -fPIC
>>>> and -fPIE too and I wanted a solution for that.  On looking at the
>>>> code in more detail,
>>>>
>>>> * -fno-plt is made to work with -fPIC, is there a reason to not make
>>>> it work for non-PIC code?  I can remove the flag_pic check from
>>>> calls.c
>>>
>>> I don't think that's right, you probably have to allow that along with
>>> (flag_pic || (decl && attribute_no_plt (decl)) - however it seems odd
>>> to me that the language extension allows this but the flag doesn't.
>>>
>>>> * Then, I add the generic attribute "noplt" and everything is fine.
>>>>
>>>> There is just one caveat with the above approach, for x86_64
>>>> (*call_insn) will not generate indirect-calls for *non-PIC* code
>>>> because constant_call_address_operand in predicates.md will evaluate
>>>> to false.  This can be fixed appropriately in ix86_output_call_insn in
>>>> i386.c.
>>>
>>> Yes, targets need to massage that into place but that's essentially
>>> the mechanics of retaining indirect calls in each backend. -fno-plt
>>> doesn't work for ARM / AArch64 with optimizers currently (and I
>>> suspect on most other targets) because our predicates are too liberal,
>>> fixed by treating "noplt" or -fno-plt as the equivalent of
>>> -mlong-calls.
>>>
>>>>
>>>>
>>>> Is this alright?  Sorry for the confusion, but the primary reason why
>>>> I did not do it the way you suggested is because we wanted "noplt"
>>>> attribute to work for non-PIC code also.
>>>
>>> If that is the case, then this is a slightly more complicated
>>> condition in the same place. We then always have indirect calls for
>>> functions that are marked noplt and just have target generate this
>>> appropriately.
>>
>> I have now modified this patch.
>
> Thanks for taking care of this. I'll have a read through tomorrow
> morning when I'm at my normal work machine.
>
>>
>> This patch does two things:
>>
>> 1) Adds new generic function attribute "no_plt" that is similar in
>> functionality  to -fno-plt except that it applies only to calls to
>> functions that are marked  with this attribute.
>> 2) For x86_64, it makes -fno-plt(and the attribute) also work for
>> non-PIC code by  directly generating an indirect call via a GOT entry.
>
> I'm sorry I'm going to push back again for the same reason.
>
> Other than forcing targets to tweak their call insn patterns, the act
> of generating the indirect call should remain in target independent
> code. Sorry, not having the same behaviour on all platforms for
> something like this is just a recipe for confusion.

Do you have a good suggestion on the way to implement this (non PIC
no-plt) in a clean and target independent way? Regarding the
'confusion' part, is it a matter of documentation (can be updated when
more targets start to support it more efficiently)?

David


>
> regards
> Ramana
>
>>
>> For PIC code, no_plt merely shadows the implementation of -fno-plt, no
>> surprises here.
>>
>> * c-family/c-common.c (no_plt): New attribute.
>> (handle_no_plt_attribute): New handler.
>> * calls.c (prepare_call_address): Check for no_plt
>> attribute.
>> * config/i386/i386.c (ix86_function_ok_for_sibcall): Check
>> for no_plt attribute.
>> (ix86_expand_call):  Ditto.
>> (nopic_no_plt_attribute): New function.
>> (ix86_output_call_insn): Output indirect call for non-pic
>> no plt calls.
>> * doc/extend.texi (no_plt): Document new attribute.
>> * testsuite/gcc.target/i386/noplt-1.c: New test.
>> * testsuite/gcc.target/i386/noplt-2.c: New test.
>> * testsuite/gcc.target/i386/noplt-3.c: New test.
>> * testsuite/gcc.target/i386/noplt-4.c: New test.
>>
>>
>> Please review.
>>
>> Thanks
>> Sri
>>
>>
>>>
>>> To be honest, this is trivial to implement in the ARM backend as one
>>> would just piggy back on the longcalls work - despite that, IMNSHO
>>> it's best done in a target independent manner.
>>>
>>> regards
>>> Ramana
>>>
>>>>
>>>> Thanks
>>>> Sri
>>>>
>>>>>
>>>>> regards
>>>>> Ramana
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>> I am not familiar with PLT calls for other targets.  I can move the
>>>>>>>> tests to gcc.dg but what relocation are you suggesting I check for?
>>>>>>>
>>>>>>>
>>>>>>> Move the test to gcc.dg, add a target_support_no_plt function in
>>>>>>> testsuite/lib/target-supports.exp and mark this as being supported only on
>>>>>>> x86 and use scan-assembler to scan for PLT relocations for x86. Other
>>>>>>> targets can add things as they deem fit.
>>>>>>
>>>>>>>
>>>>>>> In any case, on a large number of elf/ linux targets I would have thought
>>>>>>> the absence of a JMP_SLOT relocation would be good enough to check that this
>>>>>>> is working correctly.
>>>>>>>
>>>>>>> regards
>>>>>>> Ramana
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Sri
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ramana
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Also I think the PLT calls have EBX in call fusage wich is added by
>>>>>>>>>>> ix86_expand_call.
>>>>>>>>>>>    else
>>>>>>>>>>>      {
>>>>>>>>>>>        /* Static functions and indirect calls don't need the pic
>>>>>>>>>>> register.  */
>>>>>>>>>>>        if (flag_pic
>>>>>>>>>>>            && (!TARGET_64BIT
>>>>>>>>>>>                || (ix86_cmodel == CM_LARGE_PIC
>>>>>>>>>>>                    && DEFAULT_ABI != MS_ABI))
>>>>>>>>>>>            && GET_CODE (XEXP (fnaddr, 0)) == SYMBOL_REF
>>>>>>>>>>>            && ! SYMBOL_REF_LOCAL_P (XEXP (fnaddr, 0)))
>>>>>>>>>>>          {
>>>>>>>>>>>            use_reg (&use, gen_rtx_REG (Pmode,
>>>>>>>>>>> REAL_PIC_OFFSET_TABLE_REGNUM));
>>>>>>>>>>>            if (ix86_use_pseudo_pic_reg ())
>>>>>>>>>>>              emit_move_insn (gen_rtx_REG (Pmode,
>>>>>>>>>>> REAL_PIC_OFFSET_TABLE_REGNUM),
>>>>>>>>>>>                              pic_offset_table_rtx);
>>>>>>>>>>>          }
>>>>>>>>>>>
>>>>>>>>>>> I think you want to take that away from FUSAGE there just like we do
>>>>>>>>>>> for
>>>>>>>>>>> local calls
>>>>>>>>>>> (and in fact the code should already check flag_pic && flag_plt I
>>>>>>>>>>> suppose.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Done that now and patch attached.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Sri
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Honza
>>>>>>>>
>>>>>>>>
>>>>>>>



More information about the Gcc-patches mailing list