This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH, rs6000] Preserve link stack for 476 cpus


On Mon, Sep 12, 2011 at 12:07 PM, Peter Bergner <bergner@vnet.ibm.com> wrote:
> The Power ISA declares the "bcl 20,31,..." instruction as the preferred
> idiom for obtaining the next instruction address (NIA), which we use for
> computing the address of the GOT.  This special branch and link is *not*
> a subroutine call, meaning it won't be paired with a blr (subroutine return).
> Processors therefore are not supposed to update their internal link stack
> when executing one of these instructions, otherwise we'll mispredict the
> following blrs.
>
> The 476 processor has a bug where it doesn't ignore these "bcl 20,31,..."
> instructions, so we end up getting lots of mispredicts for -fPIC code.
> The following patch adds a -mpreserve-link-stack option that is enabled
> automatically for -mtune={476,476fp}, that changes the two types of GOT
> access code GCC produces.  The new code replaces the "bcl 20,31,..." with
> a "bl..., b..., blr" triplet.  I've included some old versus new code
> snippets for both types of GOT access code to illustrate how the code
> has changed.
>
>
> 1)      Normal Code:                    New 476 Code:
> ==============================================================================
>
>         bcl 20,31,$+4                           bl $+8
> .L3:                                    .L3:
>                                                 b $+8
>                                                 blr
>         mflr 9                                  mflr 9
>         addis 9,9,.LCTOC1-.L3@ha                addis 9,9,.LCTOC1-.L3@ha
>         addi 9,9,.LCTOC1-.L3@l                  addi 9,9,.LCTOC1-.L3@l
>
>
>
> 2)      Normal Code:                    New 476 Code:
> ==============================================================================
>
>         bcl 20,31,$+8                           bl $+12
>         .long _GLOBAL_OFFSET_TABLE_-$           b $+12
>                                                 .long _GLOBAL_OFFSET_TABLE_-$
>                                                 blr
>         mflr 9                                  mflr 9
>                                                 addi 9,9,4
>         lwz 3,0(9)                              lwz 3,0(9)
>
>
> I have bootstrapped and regtested the following patch with no regressions.
> To test the code even more, I modified the patch so that we default to always
> using -mpreserve-link-stack and that bootstrapped and regtested with no
> regressions too.
>
> Ok for mainline?
>
> Peter
>
>
>        * config/rs6000/rs6000.opt (mpreserve-link-stack): New option.
>        * config/rs6000/rs6000.c (rs6000_option_override_internal): Enable
>        TARGET_LINK_STACK for -mtune=476 and -mtune=476fp.
>        (rs6000_legitimize_tls_address): Emit the link stack preserving GOT
>        code if TARGET_LINK_STACK.
>        (rs6000_emit_load_toc_table): Likewise.
>        (output_function_profiler): Likewise.
>        (macho_branch_islands): Likewise.
>        (machopic_output_stub): Likewise.
>        * config/rs6000/rs6000.md (load_toc_v4_PIC_1, load_toc_v4_PIC_1b):
>        Convert to a define_expand.
>        (load_toc_v4_PIC_1_normal): New define_insn.
>        (load_toc_v4_PIC_1_476): Likewise.
>        (load_toc_v4_PIC_1b_normal): Likewise.
>        (load_toc_v4_PIC_1b_476): Likewise.
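
For readers following the thread, the mispredict described in the quoted
message can be illustrated with a toy model of a link stack.  This is only
a sketch under simplifying assumptions, not part of the patch: the depth,
the addresses (0x100, 0x200, ...) and every function name below are
hypothetical, and real hardware differs in detail.  It compares the
"bcl 20,31,$+4" idiom with the "bl $+8; b $+8; blr" triplet from example 1.

```c
#include <string.h>

/* Toy return-address predictor.  The depth is an arbitrary assumption. */
#define LINK_STACK_DEPTH 4

typedef struct {
    unsigned entries[LINK_STACK_DEPTH];
    int count;          /* number of valid entries */
    int mispredicts;    /* returns whose predicted target was wrong */
} link_stack;

/* A branch-and-link pushes its return address; the oldest entry is
   dropped when the stack is full. */
static void ls_push(link_stack *ls, unsigned return_addr)
{
    if (ls->count == LINK_STACK_DEPTH) {
        memmove(ls->entries, ls->entries + 1,
                (LINK_STACK_DEPTH - 1) * sizeof ls->entries[0]);
        ls->count--;
    }
    ls->entries[ls->count++] = return_addr;
}

/* A blr pops the predicted target and compares it with the real one. */
static void ls_blr(link_stack *ls, unsigned actual_target)
{
    unsigned predicted = ls->count > 0 ? ls->entries[--ls->count] : 0;
    if (predicted != actual_target)
        ls->mispredicts++;
}

/* Function f (at 0x200) obtains the NIA with "bcl 20,31,$+4".
   bcl_pushes != 0 models a processor that wrongly pushes on the bcl. */
int mispredicts_with_bcl(int bcl_pushes)
{
    link_stack ls = {{0}, 0, 0};
    ls_push(&ls, 0x104);          /* caller: bl f at 0x100 */
    if (bcl_pushes)
        ls_push(&ls, 0x204);      /* f: bcl 20,31,$+4 at 0x200 */
    ls_blr(&ls, 0x104);           /* f: blr back to the caller */
    return ls.mispredicts;
}

/* Function f obtains the NIA with the "bl $+8; b $+8; blr" triplet. */
int mispredicts_with_triplet(void)
{
    link_stack ls = {{0}, 0, 0};
    ls_push(&ls, 0x104);          /* caller: bl f at 0x100 */
    ls_push(&ls, 0x204);          /* f: bl $+8 at 0x200, LR = 0x204 */
    ls_blr(&ls, 0x204);           /* blr at 0x208 returns to 0x204 */
    /* b $+8 at 0x204 jumps to the mflr at 0x20c: no link stack activity */
    ls_blr(&ls, 0x104);           /* f: blr back to the caller */
    return ls.mispredicts;
}
```

In this model mispredicts_with_bcl(1) counts one mispredicted return,
while mispredicts_with_bcl(0) and mispredicts_with_triplet() count none,
because the triplet pairs its extra push with an extra blr.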

First, please choose a more informative option name.
-mpreserve-link-stack seems like something generally useful for all
processors and someone may randomly add the option.  It always is
useful to preserve the link stack -- that's why you're jumping through
hoops to fix this bug.  Maybe -mpreserve-ppc476-link-stack.

I would prefer that this patch were maintained by the chip vendors
distributing SDKs for PPC476 instead of complicating the FSF codebase.

Otherwise, please implement this the way the Xilinx FPU support is
handled in rs6000.opt, rs6000.h, ppc476.h and config.gcc, with
TARGET_LINK_STACK defined as 0 unless GCC is explicitly configured for
powerpc476.
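
A rough sketch of the shape this suggestion takes, not copied from the GCC
sources: CONFIGURED_FOR_PPC476, link_stack_option and
got_branch_sequence are hypothetical names used only for illustration;
TARGET_LINK_STACK is the macro named in the patch.  When the compiler is
not configured for the 476, the macro is the constant 0, so the
476-specific path is dead code regardless of any option value.

```c
/* Hypothetical stand-in for the variable an -m option would set. */
int link_stack_option = 0;

/* Hypothetical configure-time macro: assumed to be defined only when
   the compiler is built for powerpc476. */
#ifdef CONFIGURED_FOR_PPC476
# define TARGET_LINK_STACK (link_stack_option != 0)
#else
# define TARGET_LINK_STACK 0
#endif

/* Pick the first instruction of the GOT-address sequence (example 1
   in the quoted message). */
const char *got_branch_sequence(void)
{
    if (TARGET_LINK_STACK)
        return "bl $+8";          /* link stack preserving triplet */
    return "bcl 20,31,$+4";       /* normal idiom */
}
```

Built without the hypothetical CONFIGURED_FOR_PPC476 macro,
got_branch_sequence() returns the normal "bcl 20,31,$+4" idiom even if
link_stack_option is set to 1.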

Thanks, David

